[SOLVED] Issues with USB NVMe Drive in Proxmox - SMART Monitoring Inconsistency

PapaGigas

Member
Mar 18, 2023
I've installed a NAND NVMe drive in a USB case and I'm having trouble with SMART monitoring in Proxmox: smartd keeps emailing me warnings about the drive. The strangest part is that smartctl only works on the drive after I click on the "Disks" section in the PVE GUI; otherwise it fails. Can anyone help me troubleshoot this?

Before Clicking on PVE > Disks:
Code:
pve smartd: Opened configuration file /etc/smartd.conf
pve smartd: Configuration file /etc/smartd.conf parsed.
pve smartd: Device: /dev/sda [USB NVMe JMicron], opened
pve smartd: Device: /dev/sda [USB NVMe JMicron], NVMe Identify Controller failed
pve smartd: Unable to register NVMe device /dev/sda [USB NVMe JMicron] at line 162 of file /etc/smartd.conf
pve smartd: Device: /dev/sda [USB NVMe JMicron], not available
pve smartd: Monitoring 0 ATA/SATA, 0 SCSI/SAS and 0 NVMe devices

After Clicking on PVE > Disks:
Code:
pve smartd: Opened configuration file /etc/smartd.conf
pve smartd: Configuration file /etc/smartd.conf parsed.
pve smartd: Device: /dev/sda [USB NVMe JMicron], opened
pve smartd: Device: /dev/sda [USB NVMe JMicron], CT1000P3SSD8, S/N:2249E68EC61B, FW:P9CR30A
pve smartd: Device: /dev/sda [USB NVMe JMicron], is SMART capable. Adding to "monitor" list.
pve smartd: Device: /dev/sda [USB NVMe JMicron], state read from /var/lib/smartmontools/smartd.CT1000P3SSD8-2249E68EC61B.nvme.state
pve smartd: Monitoring 0 ATA/SATA, 0 SCSI/SAS and 1 NVMe devices
pve smartd: Device: /dev/sda [USB NVMe JMicron], state written to /var/lib/smartmontools/smartd.CT1000P3SSD8-2249E68EC61B.nvme.state

Has anyone experienced a similar issue, or does anyone have any insight into why the drive is only recognized properly after opening the "Disks" section in the PVE GUI? Any help or suggestions would be greatly appreciated!
 

Just to clarify, this drive is not being used with Proxmox VE itself. Instead, it's set up as a separate storage device. I opted for this setup because I ran out of available PCIe lanes and wanted to utilize this 1TB NVMe drive.

For reference, this is how it's configured in /etc/smartd.conf:

/dev/sda -d sntjmicron -a -m root -M exec /usr/share/smartmontools/smartd-runner
 
Before Clicking on PVE > Disks:
Code:
root@pve:~# smartctl -d sntjmicron -a /dev/sda
smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.8.8-2-pve] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

Read NVMe Identify Controller failed: scsi error medium or hardware error (serious)

After Clicking on PVE > Disks:
Code:
root@pve:~# smartctl -d sntjmicron -a /dev/sda
smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.8.8-2-pve] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       CT1000P3SSD8
Serial Number:                      2249E68EC61B
Firmware Version:                   P9CR30A
PCI Vendor/Subsystem ID:            0xc0a9
IEEE OUI Identifier:                0x00a075
Controller ID:                      1
NVMe Version:                       1.4
Number of Namespaces:               1
Namespace 1 Size/Capacity:          1,000,204,886,016 [1.00 TB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            6479a7 7010000241
Local Time is:                      Tue Jul  2 14:29:27 2024 WEST
Firmware Updates (0x12):            1 Slot, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x005e):     Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Log Page Attributes (0x06):         Cmd_Eff_Lg Ext_Get_Lg
Maximum Data Transfer Size:         64 Pages
Warning  Comp. Temp. Threshold:     85 Celsius
Critical Comp. Temp. Threshold:     95 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     6.00W  0.0000W       -    0  0  0  0        0       0
 1 +     3.00W  0.0000W       -    0  0  0  0        0       0
 2 +     1.50W  0.0000W       -    0  0  0  0        0       0
 3 -   0.0250W  0.0000W       -    3  3  3  3     5000    1900
 4 -   0.0030W       -        -    4  4  4  4    13000  100000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         1
 1 -    4096       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        37 Celsius
Available Spare:                    100%
Available Spare Threshold:          5%
Percentage Used:                    2%
Data Units Read:                    14,420,751 [7.38 TB]
Data Units Written:                 16,717,716 [8.55 TB]
Host Read Commands:                 368,026,524
Host Write Commands:                140,152,699
Controller Busy Time:               217
Power Cycles:                       324
Power On Hours:                     1,948
Unsafe Shutdowns:                   118
Media and Data Integrity Errors:    0
Error Information Log Entries:      3,470
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               37 Celsius
Temperature Sensor 2:               42 Celsius
Temperature Sensor 8:               37 Celsius

Error Information (NVMe Log 0x01, 16 of 16 entries)
No Errors Logged
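
To tell the two states above apart from a script, one option is to match on the strings that smartctl printed in each case. This is only a minimal sketch: the function name classify_smartctl_output is made up for illustration, and it assumes the failure message and the "Model Number:" line look like the outputs shown above.

```shell
#!/bin/sh
# classify_smartctl_output (hypothetical helper): reads smartctl output on
# stdin and prints "failed", "ok" or "unknown" based on the text it finds.
classify_smartctl_output() {
    # Read all of stdin once so it can be matched against several patterns.
    output=$(cat)
    case "$output" in
        # Message seen before clicking on PVE > Disks.
        *"Identify Controller failed"*) echo "failed" ;;
        # Line seen in the information section once the drive responds.
        *"Model Number:"*) echo "ok" ;;
        *) echo "unknown" ;;
    esac
}
```

It would be used as `smartctl -d sntjmicron -a /dev/sda | classify_smartctl_output`.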
 
I've also disabled autosuspend on the USB case:

Code:
root@pve:~# usb-devices

...

T:  Bus=09 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  2 Spd=480 MxCh= 0
D:  Ver= 2.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
P:  Vendor=152d ProdID=0583 Rev=51.02
S:  Manufacturer=Ugreen
S:  Product=Ugreen Storage Device
S:  SerialNumber=152D05831A0F
C:  #Ifs= 1 Cfg#= 1 Atr=80 MxPwr=500mA
I:  If#= 0 Alt= 1 #EPs= 4 Cls=08(stor.) Sub=06 Prot=62 Driver=uas
E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms

...

Code:
root@pve:~# echo "-1" > /sys/bus/usb/devices/9-1/power/autosuspend
root@pve:~# echo "-1" > /sys/bus/usb/devices/9-1/power/autosuspend_delay_ms
root@pve:~# echo "on" > /sys/bus/usb/devices/9-1/power/control
root@pve:~# nano /etc/udev/rules.d/99-usb-power.rules

ACTION=="add", SUBSYSTEM=="usb", ATTR{idVendor}=="152d", ATTR{idProduct}=="0583", ATTR{power/control}="on", ATTR{power/autosuspend}="-1", ATTR{power/autosuspend_delay_ms}="-1"

root@pve:~# udevadm control --reload-rules
root@pve:~# udevadm trigger
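
To double-check that the settings above actually stuck (for example after a replug or reboot), the sysfs attributes can be read back. A minimal sketch, assuming the device is still at /sys/bus/usb/devices/9-1 as on my system; show_usb_power is just an illustrative name, not an existing tool.

```shell
#!/bin/sh
# show_usb_power (hypothetical helper): print the power-management
# attributes of one USB device.
# $1 is the device's sysfs directory, e.g. /sys/bus/usb/devices/9-1
show_usb_power() {
    dev_dir=$1
    for attr in control autosuspend autosuspend_delay_ms; do
        file="$dev_dir/power/$attr"
        if [ -r "$file" ]; then
            # Print the attribute as name=value.
            printf '%s=%s\n' "$attr" "$(cat "$file")"
        else
            # The attribute file does not exist or is not readable.
            printf '%s=<missing>\n' "$attr"
        fi
    done
}
```

It would be called as `show_usb_power /sys/bus/usb/devices/9-1`; the bus path will differ on other machines.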

But without any success! :( smartd still sends me this email:

Code:
This message was generated by the smartd daemon running on:

   host name:  pve
   DNS domain: what.ever

The following warning/error was logged by the smartd daemon:

Device: /dev/sda [USB NVMe JMicron], failed to read NVMe SMART/Health Information

Device info:
CT1000P3SSD8, S/N:2249E68EC61B, FW:P9CR30A

For details see host's SYSLOG.
 
Well, I also tried reading from the drive to wake it up, but then I realized the reads were being served from the ARC (the ZFS read cache) and never reached the drive.

So I wrote a little script that writes 512 bytes to the drive every 10 minutes and runs as a service. :)

That solved the issue. Here's my solution in case anyone else has this problem:

Code:
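#!/bin/bash
# Write one 512-byte block to the USB drive every 10 minutes so it never idles.
# oflag=dsync forces the write out to the device instead of leaving it in cache.
while true; do
    dd if=/dev/zero of=/mnt/usb/keep_drive_active bs=512 count=1 oflag=dsync
    sleep 600
done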
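
For the "run it as a service" part, something along these lines should work with systemd. Treat it as a sketch under assumptions: the script path /usr/local/bin/keep-drive-active.sh and the unit name keep-drive-active.service are placeholders I picked for illustration, and the helper only writes the unit file; it does not enable it.

```shell
#!/bin/sh
# write_keepalive_unit (hypothetical helper): write a systemd unit that runs
# the keep-alive script above.
# $1 = directory to write the unit into (normally /etc/systemd/system)
# $2 = path of the script (placeholder default, adjust to where you saved it)
write_keepalive_unit() {
    unit_dir=$1
    script_path=${2:-/usr/local/bin/keep-drive-active.sh}
    cat > "$unit_dir/keep-drive-active.service" <<EOF
[Unit]
Description=Keep USB NVMe drive awake with periodic small writes

[Service]
Type=simple
ExecStart=$script_path
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}
```

After `write_keepalive_unit /etc/systemd/system`, the usual `systemctl daemon-reload` and `systemctl enable --now keep-drive-active.service` would start it now and at boot. The script needs to be executable (chmod +x).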
 