Disk smart status no longer working

Running smartctl without the -d scsi gives: /dev/sdb: requires option '-d cciss,N'
The issue likely lies there, because the Proxmox VE backend currently doesn't pass any -d option. It relies on smartctl being able to auto-detect the device type, but that seems to fail in your case. Maybe it can't guess the numeric parameter, or there are multiple possibilities for it.

You said that it worked in 7.2-3. With the very same disk setup? You could try booting the kernel from before the upgrade (check /var/log/apt/history.log to see what the update did) to see if it works there. Our smartmontools package didn't change in quite a while, but you could try downgrading libpve-storage-perl to the version before the upgrade to see if it works there.
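
For anyone following along, here is a rough sketch of how one might look up the package versions involved before downgrading. It assumes the usual apt history entry format, name:arch (old, new). The function name pkg_versions is made up for this example, and rotated logs (history.log.*.gz) would need zgrep instead.

```shell
# Print every history entry for one package, e.g. "pkg:all (old, new)".
# pkg_versions is a hypothetical helper, not part of apt or Proxmox VE.
pkg_versions() {
    # $1 = package name, $2 = log file (defaults to the path mentioned above)
    grep -o "$1:[a-z0-9]* ([^)]*)" "${2:-/var/log/apt/history.log}"
}

# Example use on a PVE host (not run here):
#   pkg_versions libpve-storage-perl
#   apt install libpve-storage-perl=<old-version>   # placeholder: the "old" version printed above
```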
 
Fiona,

Sorry for the late reply. You are correct, the older version was running on an AMD system. The new version is running on an HP DL380p G8 server. I have SAS, SSD, and SATA drives in the HP server. All give the same unknown error.

What does 'Error getting S.M.A.R.T. data: Exit code: 1 (500)' mean? That's what I get when I click on any drive.
 
That's because the auto-detection doesn't work, so the smartctl command exits with status code 1. It seems like we need to pass -d explicitly in your case. Please open a bug report linking back to this thread so we can keep track of the issue: https://bugzilla.proxmox.com/
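
As background: the smartctl man page documents its exit status as a bit mask, so a status of 1 means only bit 0 is set (command line did not parse), which fits the "requires option" message quoted earlier in this thread. Below is a small sketch for decoding a status value. The function name is made up, and the descriptions are shortened paraphrases of the man page, not its exact wording.

```shell
# Decode a smartctl exit status (a bit mask) into readable lines.
# decode_smartctl_status is a hypothetical helper name.
decode_smartctl_status() {
    status=$1
    bit=0
    for meaning in \
        "command line did not parse" \
        "device open failed or device did not identify itself" \
        "a SMART or other command to the disk failed, or checksum error" \
        "SMART status check returned DISK FAILING" \
        "prefail attributes at or below threshold" \
        "attributes were at or below threshold in the past" \
        "device error log contains errors" \
        "device self-test log contains errors"
    do
        # Test whether this bit is set in the status value.
        if [ $(( (status >> bit) & 1 )) -eq 1 ]; then
            echo "bit $bit: $meaning"
        fi
        bit=$((bit + 1))
    done
}

# Example use after a real call (not run here):
#   smartctl -H /dev/sdb; decode_smartctl_status $?
```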
 
Fiona,

I installed an NVMe drive directly in a PCIe slot and had no issues with SMART there. It seems only drives connected to the HP 420i controller have the issue.
 

Attachment: grabilla.503944.png
 
I guess in that case (and most cases) auto-detection simply works. There is already an old bug report for a similar issue, which probably got lost; I have pinged it now: https://bugzilla.proxmox.com/show_bug.cgi?id=1127
 
Following. Same issue, same kind of system (HP DL380p, P420i in HBA mode). Works with "-d", errors out without it.
 
Hi All

I can confirm that all our HP servers running any HBA cards have this issue.
There are no SMART stats showing in PVE at all.
From the terminal, passing the -d option shows the stats.

  • HP DL360
  • Intel CPUs
  • H240 HBA

The same drives in a Dell using its PERC Mini controller all show up fine.
The same drives in a Supermicro using an LSI HBA all show up fine.
HP using their H240 or other series HBA: drive SMART stats don't show unless you pass the -d option.

We have never seen an HP with an HBA show SMART stats. We have been with PVE for about 3 years now and nothing so far.

Example below.

Code:
smartctl -H /dev/sde -d cciss,1
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.83-1-pve] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Percentage used endurance indicator: 1%
Grown defects during certification = 0
Total blocks reassigned during format = 0
Total new blocks reassigned = 0
Power on minutes since format = 1047022
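
If it is not obvious which N belongs to which disk, one option is to probe a range of indices and keep the ones smartctl can open. This is a rough sketch under the assumption that smartctl sets exit-status bit 0 or 1 for slots without a disk, so please verify that on your own hardware. probe_types, the SMARTCTL variable and the default upper bound of 15 are made up for this example.

```shell
# Command to run; can be overridden, e.g. for testing without hardware.
SMARTCTL=${SMARTCTL:-smartctl}

# probe_types DEVICE KIND [MAX]: print each "-d KIND,N" that smartctl can open.
# Hypothetical helper; KIND is e.g. cciss or megaraid.
probe_types() {
    dev=$1 kind=$2 max=${3:-15}
    n=0
    while [ "$n" -le "$max" ]; do
        rc=0
        "$SMARTCTL" -i -d "$kind,$n" "$dev" >/dev/null 2>&1 || rc=$?
        # Bits 0 and 1 mean bad command line or device open failure;
        # higher bits can be set even when the disk was reached.
        if [ $((rc & 3)) -eq 0 ]; then
            printf '%s\n' "-d $kind,$n"
        fi
        n=$((n + 1))
    done
}

# Example use (not run here):
#   probe_types /dev/sde cciss 7
```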

Happy to help, just let us know how we can assist.

Cheers
G
 
I can confirm the bug: PVE is not showing SMART values.
Likely the issue lies there, because the Proxmox VE backend doesn't pass any -d option currently.

I really, really hope this bug will be reviewed and fixed soon. :rolleyes:

PVE: pve-manager/8.0.4/d258a813cfa6b390 (running kernel: 6.2.16-6-pve)

Controller: RAID bus controller: Broadcom / LSI MegaRAID SAS 2108

Code:
smartctl -a /dev/sda
smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.2.16-6-pve] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               LSI
Product:              RAID 5/6 SAS 6G
Revision:             2.13
Compliance:           SPC-3
User Capacity:        599,584,145,408 bytes [599 GB]
Logical block size:   512 bytes
Physical block size:  4096 bytes
Logical Unit id:      0x6003005702218bf029211178105252f3
Serial number:        00f352521078112129f08b2102570003
Device type:          disk
Local Time is:        Tue Aug 15 17:48:56 2023 CEST
SMART support is:     Unavailable - device lacks SMART capability.

=== START OF READ SMART DATA SECTION ===
Current Drive Temperature:     0 C
Drive Trip Temperature:        0 C

Error Counter logging not supported

Device does not support Self Test logging

Using the -d option:

Code:
smartctl -a /dev/sda -d megaraid,4
smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.2.16-6-pve] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               SEAGATE
Product:              ST3600057SS
Revision:             1703
Compliance:           SPC-3
User Capacity:        600,127,266,816 bytes [600 GB]
Logical block size:   512 bytes
Rotation Rate:        15000 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000c50088c52caf
Serial number:        6SLA8HS60000N5391MMW
Device type:          disk
Transport protocol:   SAS (SPL-4)
Local Time is:        Tue Aug 15 17:52:14 2023 CEST
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Disabled or Not Supported

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     52 C
Drive Trip Temperature:        68 C

Accumulated power on time, hours:minutes 67855:17
Elements in grown defect list: 1925

...

Non-medium error count:      196

No Self-tests have been logged

/etc/smartd.conf

Code:
/dev/sdb -a -d megaraid,0 -m root -M exec /usr/share/smartmontools/smartd-runner
/dev/sdb -a -d megaraid,1 -m root -M exec /usr/share/smartmontools/smartd-runner
/dev/sda -a -d megaraid,4 -m root -M exec /usr/share/smartmontools/smartd-runner
/dev/sda -a -d megaraid,7 -m root -M exec /usr/share/smartmontools/smartd-runner
/dev/sdc -a -d megaraid,8 -m root -M exec /usr/share/smartmontools/smartd-runner
/dev/sdc -a -d megaraid,9 -m root -M exec /usr/share/smartmontools/smartd-runner
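
Writing these lines by hand gets tedious with many disks. Here is a small sketch that prints one line per device/type pair in the same format as above. smartd_lines is a made-up helper, and the pairs still have to come from your own controller layout.

```shell
# Read "DEVICE TYPE" pairs from stdin and print matching smartd.conf lines.
# smartd_lines is a hypothetical helper; the directives mirror the lines above.
smartd_lines() {
    while read -r dev type; do
        # Skip empty input lines.
        [ -n "$dev" ] || continue
        printf '%s\n' "$dev -a -d $type -m root -M exec /usr/share/smartmontools/smartd-runner"
    done
}

# Example use (prints to the terminal; review before copying into /etc/smartd.conf):
#   printf '%s\n' '/dev/sdb megaraid,0' '/dev/sda megaraid,4' | smartd_lines
```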
 
Hi, just to keep track of a possible fix: is a fix planned or already in the works?

Appreciate any feedback. :)
 
Hi,
unfortunately, I don't think anybody is currently working on it. You can subscribe to the issue on the bugzilla to receive updates when the status changes (and that also helps to see that more people are interested in a fix): https://bugzilla.proxmox.com/show_bug.cgi?id=1127
 
Good afternoon, I literally have the same error, but looking at more details it shows me the following:

disk002.png

disk001.png

How do you know if it is really damaged?
disk003.png
I appreciate your help.
 
Hi,
if ZFS says that the device is faulted and even smartctl on the device fails, I think you can be pretty sure there's an issue ;) You can check the system logs for more details, and of course you can also check the cable. For how to replace the ZFS device, see man zpool-replace or search the forums; there are many threads about that.
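
A rough sketch of the first checks, assuming the usual zpool status table layout (NAME STATE READ WRITE CKSUM). unhealthy_vdevs is a made-up helper, and the names in angle brackets are placeholders to fill in yourself.

```shell
# List vdevs whose state is FAULTED, UNAVAIL or REMOVED.
# Reads `zpool status` output on stdin; unhealthy_vdevs is a hypothetical helper.
unhealthy_vdevs() {
    # Table rows have at least 5 columns; the second column is the state.
    awk 'NF >= 5 && $2 ~ /^(FAULTED|UNAVAIL|REMOVED)$/ { print $1, $2 }'
}

# Example use on the affected host (not run here):
#   zpool status | unhealthy_vdevs
#   journalctl -k | grep -i <device>                  # kernel messages for the disk
#   zpool replace <pool> <old-device> <new-device>    # see man zpool-replace
```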
 
Bumping this thread.

I have this problem on multiple PVE 7.x systems with PERC H730 and H330 controllers. I can manually query the RAID status and disks via the command line, but the GUI shows UNKNOWN. The script posted earlier in this thread does not seem to work.
 
