Issues with HP P420i and SMART

This did not work for me on Proxmox 8.0.4 (non-commercial license). I migrated from ESXi and this is my first Proxmox installation. I had some issues with the SSD (Patriot Burst 480GB, one of the cheapest SSDs, 2 years old, SMART says OK) and told myself to try something new since I had to reinstall completely anyway. I will move this installation to a new SSD when it arrives and will let you know whether it works then. It may also be that I have not yet restarted the server after the update.
 
The script from udotirol works for me, but the one from lclements0 does not :-(.
It works here with the P420i on 3 x ML350p Gen8, HDDs only: 2 x PVE 7.2 + 1 x PVE 7.4.
The SSDs are on the embedded SATA ports, and one system has its SSDs on a cheap SATA PCIe card.

If you update to Proxmox 8 you don’t need the script anymore.
Only in HBA mode.
Legacy RAID always requires the wrapper script.
 
Just to mention my own experience with P420i,

I had been running Ceph with OSDs on Raid0 for a couple of years with no problem.
When building a new cluster recently, I decided to experiment with HBA mode.

I can confirm that with PVE 8 the hpsa driver is used by Linux automatically if the controller is in HBA mode.
It does work and performs fine.
BUT
it doesn't handle faulty disks very well at all.

I had a couple of disks with faults, and instead of marking them as faulty and refusing to use them,
the system simply locked up on retries and/or hung.
No idea what the issue is, and I don't have the time or expertise to delve into it, although I did see someone else
on a different thread report a similar experience.

Good healthy disks are fine, but I decided I am not going to use this mode in anger, and invested in H220 HBA cards.

It's a shame, because being able to use this integrated controller would free up a PCI slot for something else. It also seems a waste having it on the MB.
 
After this has been done, the Proxmox UI will show the correct SMART values.

The problem with this script is that the cciss number is the leading parameter and does not correlate with the drive letter. The same /dev/sda returns different drives depending on the cciss index:

Bash:
~# /usr/sbin/smartctl.orig -a -d cciss,2 /dev/sda | grep Serial
Serial number:        WBM76J4N0000K309099K
~# /usr/sbin/smartctl.orig -a -d cciss,15 /dev/sda | grep Serial
Serial number:        S5G0NC0WB01271
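
Since the cciss index rather than the device node selects the drive, one way to see which drive sits behind which index is to probe a range of indices and print the serial numbers. This is only a minimal sketch: the function name list_cciss_serials is made up for illustration, the smartctl.orig path is the one used in this thread, and the upper index is something you have to choose yourself.

```shell
# Sketch: print the serial number found behind each cciss index.
# SMARTCTL defaults to the renamed binary used in this thread (assumption);
# the device node and the highest index to probe are passed in by the caller.
SMARTCTL=${SMARTCTL:-/usr/sbin/smartctl.orig}

list_cciss_serials() {
    dev=$1
    max=$2
    n=0
    while [ "$n" -le "$max" ]; do
        # Indices with no drive behind them yield no serial line and are
        # skipped; "|| true" keeps a failing probe from aborting the loop.
        serial=$("$SMARTCTL" -i -d "cciss,$n" "$dev" 2>/dev/null |
            awk -F': *' 'tolower($1) == "serial number" { print $2; exit }') || true
        if [ -n "$serial" ]; then
            printf 'cciss,%s\t%s\n' "$n" "$serial"
        fi
        n=$((n + 1))
    done
}
```

Called as list_cciss_serials /dev/sda 23, it would probe indices 0 to 23 and print one tab-separated line per index that answers. The 23 is just an example bound, not something the controller reports.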


Bash:
~# ls /sys/bus/scsi/devices/0:0:*:0/block

'/sys/bus/scsi/devices/0:0:10:0/block':
sdj
'/sys/bus/scsi/devices/0:0:11:0/block':
sdk
'/sys/bus/scsi/devices/0:0:12:0/block':
sdl
'/sys/bus/scsi/devices/0:0:13:0/block':
sdm
'/sys/bus/scsi/devices/0:0:15:0/block':
sdn
'/sys/bus/scsi/devices/0:0:16:0/block':
sdp
'/sys/bus/scsi/devices/0:0:1:0/block':
sda
'/sys/bus/scsi/devices/0:0:23:0/block':
sdo
'/sys/bus/scsi/devices/0:0:2:0/block':
sde
'/sys/bus/scsi/devices/0:0:3:0/block':
sdb
'/sys/bus/scsi/devices/0:0:4:0/block':
sdc
'/sys/bus/scsi/devices/0:0:5:0/block':
sdf
'/sys/bus/scsi/devices/0:0:6:0/block':
sdd
'/sys/bus/scsi/devices/0:0:7:0/block':
sdg
'/sys/bus/scsi/devices/0:0:8:0/block':
sdi
'/sys/bus/scsi/devices/0:0:9:0/block':
sdh
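
The ls output above is sorted as text, so target 10 comes before target 1. A small sketch that prints the same information as address/device pairs in numeric order may be easier to read. The function name list_scsi_block_devices is hypothetical, and whether the target number matches the cciss index is not something this listing establishes.

```shell
# Sketch: print "H:C:T:L<TAB>device" for every SCSI device that has a block
# node, sorted numerically by host, channel and target.
list_scsi_block_devices() {
    sysfs=${1:-/sys/bus/scsi/devices}
    for path in "$sysfs"/*:*:*:*/block/*; do
        [ -e "$path" ] || continue      # glob matched nothing
        addr=${path%/block/*}           # strip "/block/sdX"
        addr=${addr##*/}                # keep only "H:C:T:L"
        printf '%s\t%s\n' "$addr" "${path##*/}"
    done | sort -t: -k1,1n -k2,2n -k3,3n
}
```

Running list_scsi_block_devices without arguments reads the live sysfs tree. The optional argument only exists so the function can be pointed at a different directory.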



For me, adding -d scsi worked:
Bash:
~# /usr/sbin/smartctl.orig -a /dev/sda
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.131+truenas] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

/dev/sda: requires option '-d cciss,N'
Please specify device type with the -d option.

Use smartctl -h to get a usage summary

~# /usr/sbin/smartctl.orig -a -d scsi /dev/sda
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.131+truenas] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               SEAGATE
Product:              ST2400MM0129
...
=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Grown defects during certification <not available>
Total blocks reassigned during format <not available>
Total new blocks reassigned <not available>
Power on minutes since format <not available>
Current Drive Temperature:     31 C
Drive Trip Temperature:        60 C

Accumulated power on time, hours:minutes 9339:15
Manufactured in week 46 of year 2022
Specified cycle count over device lifetime:  10000
Accumulated start-stop cycles:  41
Specified load-unload count over device lifetime:  300000
Accumulated load-unload cycles:  468
Elements in grown defect list: 0

Vendor (Seagate Cache) information
  Blocks sent to initiator = 518222792
  Blocks received from initiator = 90326216
  Blocks read from cache and sent to initiator = 11650559
  Number of read and write commands whose size <= segment size = 844128
  Number of read and write commands whose size > segment size = 219

Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 9339.25
  number of minutes until next internal SMART test = 41

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   64485493        0         0  64485493          0        264.691           0
write:         0        0         0         0          0         50.478           0
verify:   155988        0         0    155988          0          0.639           0

Non-medium error count:        0


[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']
No Self-tests have been logged

My wrapper:
Bash:
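#!/bin/bash
# Wrapper around the real smartctl binary: add "-d scsi" unless the
# caller already selected a device type.

SMARTCTL=/usr/sbin/smartctl.orig

# Return 0 if a device type option is already present in the arguments.
# smartctl accepts "-d TYPE" and "--device=TYPE".
contains_key_d() {
    for arg in "$@"; do
        case "$arg" in
            -d|-d?*|--device|--device=*) return 0 ;;
        esac
    done
    return 1
}

if contains_key_d "$@"; then
    exec "$SMARTCTL" "$@"
else
    exec "$SMARTCTL" -d scsi "$@"
fi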
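
For completeness, putting such a wrapper in place could look roughly like the sketch below. It assumes the layout implied by the smartctl.orig path above, that is, the real binary is renamed and the wrapper takes over its name. The function name install_smartctl_wrapper and both arguments are hypothetical.

```shell
# Sketch: put a wrapper in place of smartctl, keeping the real binary as
# "<path>.orig". Both paths are supplied by the caller (assumptions).
install_smartctl_wrapper() {
    real=$1        # e.g. /usr/sbin/smartctl
    wrapper=$2     # the wrapper script to install
    # Only move the binary on the first run, so a second run does not
    # overwrite the saved original with the wrapper itself.
    if [ ! -e "$real.orig" ]; then
        mv "$real" "$real.orig" || return 1
    fi
    install -m 0755 "$wrapper" "$real"
}
```

An upgrade of the smartmontools package would most likely replace the wrapper with the packaged binary again, so the wrapper may have to be put back afterwards. In that case the saved .orig copy is the older binary and would need refreshing by hand.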