Received SMART error for a disk. How to map to the right disk?

edtrumbull

New Member
Jun 24, 2022
I have received the following error message from one of my nodes:

Code:
This message was generated by the smartd daemon running on:

   host name:  ceph-00
   DNS domain: snc.as2inc.com

The following warning/error was logged by the smartd daemon:

Device: /dev/bus/0 [megaraid_disk_32] [SAT], ATA error count increased from 0 to 288

Device info:
ST8000NM000A-2KE101, S/N:WSD1D271, WWN:5-000c50-0c95f01f4, FW:SN03, 8.00 TB

When I look at the disks known to that host, I see the following:

ceph-disks.png

I don't see how to map either the serial number or the WWN of the disk that smartd is reporting to the /dev/sd# names that Ceph and the host know. I've looked in dmesg, the smartmontools output, and various other places, and cannot figure out how to make the connection. I know it must be possible, since smartd knows it. I'm sure I'm missing something... Could one of you point me in the right direction?
 
I suspect you're right.
But that does not really answer the question of how to find which drive the email points to.
 
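You could run lsblk -o NAME,FSTYPE,UUID,SIZE,STATE,TYPE,MOUNTPOINT,LABEL,MODEL,SERIAL,WWN and see if you can identify the serial or WWN there (the SERIAL and WWN columns have to be requested explicitly). Then you could compare which /dev/sdX that relates to in the web UI. Or, if the disks are hidden behind the RAID controller, you may need to use some Broadcom software.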
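If you would rather not compare the columns by eye, the lookup can be scripted. This is only a rough sketch, assuming lsblk is run as `lsblk -J -d -o NAME,SERIAL,WWN` and that its JSON output uses a top-level "blockdevices" list with "name", "serial" and "wwn" keys. The helper names `normalize_wwn` and `find_disk` are made up for this example. smartd prints the WWN with hyphens, while lsblk usually shows it as one hex string with a 0x prefix, so both are normalized before comparing.

```python
import json


def normalize_wwn(wwn):
    """Reduce a WWN to bare lowercase hex so different spellings compare equal."""
    text = (wwn or "").strip().lower().replace("-", "")
    return text[2:] if text.startswith("0x") else text


def find_disk(lsblk_json, serial=None, wwn=None):
    """Return the /dev path of the first disk matching the serial or WWN, else None.

    lsblk_json is assumed to be the text printed by:
        lsblk -J -d -o NAME,SERIAL,WWN
    """
    want_wwn = normalize_wwn(wwn)
    for dev in json.loads(lsblk_json).get("blockdevices", []):
        # Missing values come back as null in the JSON, hence the "or ''".
        if serial and (dev.get("serial") or "").strip() == serial:
            return "/dev/" + dev["name"]
        if want_wwn and normalize_wwn(dev.get("wwn")) == want_wwn:
            return "/dev/" + dev["name"]
    return None
```

You would call it with the captured lsblk output, for example `find_disk(output, serial="WSD1D271")` or `find_disk(output, wwn="5-000c50-0c95f01f4")`. If it returns None, the controller is probably presenting the disk to the OS in a way that hides the drive's own serial, and you are back to the controller tools.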
 
S/N:WSD1D271
There is your "identification". That number is written on your physical disk, so you "just" have to find the right one. If you have not written the S/N on your caddies, or do not have a picture of your machine with each S/N mapped directly to the drive bay it sits in, now is the time to do that.

For enterprise hardware, the S/N is often written directly on the caddy. Most RAID controllers also have an "identify" command that will blink the identify LED on your caddy, so that you just "see" the drive. Often, if the drive "really" fails, you will also have a red light. But "simple" SMART errors do not mean that the disk is already faulty. There can be a degradation that will eventually yield a failed disk, but not immediately.
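Before pulling a caddy, it can help to query the exact disk smartd complained about. The email addresses it as /dev/bus/0 [megaraid_disk_32], which corresponds to smartctl's `-d megaraid,N` device type, where N is generally the device ID the controller assigns to the physical disk. The sketch below only builds the matching `smartctl -i` command from the report line and does not run anything. The function name `smartctl_command` is made up for this example, and the regular expression assumes the line looks like the one in the email above.

```python
import re
import shlex

# Matches the device part of a smartd report line such as
# "Device: /dev/bus/0 [megaraid_disk_32] [SAT], ..."
_DEVICE_RE = re.compile(
    r"Device:\s+(?P<dev>\S+)\s+\[megaraid_disk_(?P<id>\d+)\]"
)


def smartctl_command(report_line):
    """Build a 'smartctl -i' command for the disk named in a smartd report line.

    Returns None when the line does not describe a MegaRAID-attached disk.
    """
    match = _DEVICE_RE.search(report_line)
    if match is None:
        return None
    # int() drops any zero padding in the ID before it is passed to smartctl.
    device_id = int(match.group("id"))
    argv = ["smartctl", "-i", "-d", f"megaraid,{device_id}", match.group("dev")]
    return shlex.join(argv)
```

The identity section printed by that command should include the serial number, so you can check it against S/N:WSD1D271 and note the controller device ID for use with the controller's own identify function.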
 
