Enterprise SSD Showing 0B in size

minic119

I have 18 1.6TB SSDs that are all of a sudden showing 0B in Proxmox. When I hook them up to my Windows computer with a SAS-to-USB adapter, they show up as 1.81TB with no issue.

I had 7 of them go bad at one time and thought the disks had died in a power surge, so I replaced them with 1.92TB SSDs. But then 2 different servers did the same thing with 6 drives each. I only have 18 now because I took 1 apart to check for damage. There didn't look to be any, but once I had it apart I did not want to go through the trouble of putting it back together, since I thought the drives were bad. Now that I have an adapter to plug them in, I can see them on my Windows computer and know the drives aren't dead.

I can provide any logs needed to help diagnose the issue; I may just need the commands to get the exact logs you are looking for. Any help would be greatly appreciated. The Proxmox community is amazing, and my friend has had a couple of issues fixed through the help of the community. I just want to say thank you to the community for being helpful to those of us who don't have as much experience as others :)
 
I have 2 of the drives attached, and this is what shows in journalctl once I plug the drives in.


Code:
Jun 14 16:33:26 Saradomin kernel: mpt2sas_cm0: handle(0x15) sas_address(0x500003964c88062e) port_type(0x1)
Jun 14 16:33:26 Saradomin kernel: scsi 4:0:13:0: Direct-Access     TOSHIBA  PX02SMU020       MS02 PQ: 0 ANSI: 6
Jun 14 16:33:26 Saradomin kernel: scsi 4:0:13:0: SSP: handle(0x0015), sas_addr(0x500003964c88062e), phy(18), device_name(0x500003964c88062d)
Jun 14 16:33:26 Saradomin kernel: scsi 4:0:13:0: enclosure logical id (0x5003048001627dbf), slot(6)
Jun 14 16:33:26 Saradomin kernel: scsi 4:0:13:0: qdepth(254), tagged(1), scsi_level(7), cmd_que(1)
Jun 14 16:33:26 Saradomin kernel: scsi 4:0:13:0: Power-on or device reset occurred
Jun 14 16:33:26 Saradomin kernel: sd 4:0:13:0: Attached scsi generic sg11 type 0
Jun 14 16:33:26 Saradomin kernel: sd 4:0:13:0: [sdl] Unit Not Ready
Jun 14 16:33:26 Saradomin kernel: sd 4:0:13:0: [sdl] Sense Key : Hardware Error [current]
Jun 14 16:33:26 Saradomin kernel: sd 4:0:13:0: [sdl] Add. Sense: Diagnostic failure on component(c2)
Jun 14 16:33:26 Saradomin kernel: sd 4:0:13:0: [sdl] Read Capacity(16) failed: Result: hostbyte=DID_OK driverbyte=DRIVER_OK
Jun 14 16:33:26 Saradomin kernel: sd 4:0:13:0: [sdl] Sense Key : Hardware Error [current]
Jun 14 16:33:26 Saradomin kernel: sd 4:0:13:0: [sdl] Add. Sense: Diagnostic failure on component(c2)
Jun 14 16:33:26 Saradomin kernel: sd 4:0:13:0: [sdl] Read Capacity(10) failed: Result: hostbyte=DID_OK driverbyte=DRIVER_OK
Jun 14 16:33:26 Saradomin kernel: sd 4:0:13:0: [sdl] Sense Key : Hardware Error [current]
Jun 14 16:33:26 Saradomin kernel: sd 4:0:13:0: [sdl] Add. Sense: Diagnostic failure on component(c2)
Jun 14 16:33:26 Saradomin kernel: sd 4:0:13:0: [sdl] 0 512-byte logical blocks: (0 B/0 B)
Jun 14 16:33:26 Saradomin kernel: sd 4:0:13:0: [sdl] 0-byte physical blocks
Jun 14 16:33:26 Saradomin kernel: sd 4:0:13:0: [sdl] Test WP failed, assume Write Enabled
Jun 14 16:33:26 Saradomin kernel: sd 4:0:13:0: [sdl] Asking for cache data failed
Jun 14 16:33:26 Saradomin kernel: sd 4:0:13:0: [sdl] Assuming drive cache: write through
Jun 14 16:33:26 Saradomin kernel:  end_device-4:0:13: add: handle(0x0015), sas_addr(0x500003964c88062e)
Jun 14 16:33:26 Saradomin kernel: sd 4:0:13:0: [sdl] Attached SCSI disk
Jun 14 16:33:34 Saradomin pvedaemon[5360]: <root@pam> successful auth for user 'root@pam'
Jun 14 16:42:22 Saradomin pveproxy[211444]: worker exit
Jun 14 16:42:22 Saradomin pveproxy[5366]: worker 211444 finished
Jun 14 16:42:22 Saradomin pveproxy[5366]: starting 1 worker(s)
Jun 14 16:42:22 Saradomin pveproxy[5366]: worker 283961 started
Jun 14 16:43:52 Saradomin kernel: usb 3-12: new high-speed USB device number 6 using xhci_hcd
Jun 14 16:43:52 Saradomin kernel: usb 3-12: New USB device found, idVendor=0bda, idProduct=9210, bcdDevice=20.01
Jun 14 16:43:52 Saradomin kernel: usb 3-12: New USB device strings: Mfr=1, Product=2, SerialNumber=3
Jun 14 16:43:52 Saradomin kernel: usb 3-12: Product: B USB Drive
Jun 14 16:43:52 Saradomin kernel: usb 3-12: Manufacturer: Realtek
Jun 14 16:43:52 Saradomin kernel: usb 3-12: SerialNumber: 012345678918
Jun 14 16:43:52 Saradomin kernel: usb-storage 3-12:1.0: USB Mass Storage device detected
Jun 14 16:43:52 Saradomin kernel: scsi host12: usb-storage 3-12:1.0
Jun 14 16:43:52 Saradomin kernel: usbcore: registered new interface driver usb-storage
Jun 14 16:43:52 Saradomin kernel: usbcore: registered new interface driver uas
Jun 14 16:43:53 Saradomin kernel: scsi 12:0:0:0: Direct-Access     SABRENT  EC-U2SA          1.00 PQ: 0 ANSI: 6
Jun 14 16:43:53 Saradomin kernel: sd 12:0:0:0: Attached scsi generic sg12 type 0
Jun 14 16:43:53 Saradomin kernel: sd 12:0:0:0: [sdm] Read Capacity(10) failed: Result: hostbyte=DID_OK driverbyte=DRIVER_OK
Jun 14 16:43:53 Saradomin kernel: sd 12:0:0:0: [sdm] Sense Key : Illegal Request [current]
Jun 14 16:43:53 Saradomin kernel: sd 12:0:0:0: [sdm] Add. Sense: Invalid field in cdb
Jun 14 16:43:53 Saradomin kernel: sd 12:0:0:0: [sdm] 0 512-byte logical blocks: (0 B/0 B)
Jun 14 16:43:53 Saradomin kernel: sd 12:0:0:0: [sdm] 0-byte physical blocks
Jun 14 16:43:53 Saradomin kernel: sd 12:0:0:0: [sdm] Write Protect is off
Jun 14 16:43:53 Saradomin kernel: sd 12:0:0:0: [sdm] Mode Sense: 37 00 00 08
Jun 14 16:43:53 Saradomin kernel: sd 12:0:0:0: [sdm] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
Jun 14 16:43:53 Saradomin kernel: sd 12:0:0:0: [sdm] Attached SCSI disk
Jun 14 16:48:34 Saradomin pvedaemon[5360]: <root@pam> successful auth for user 'root@pam'
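
The lines that matter in a log like this are the ones carrying SCSI sense data and the failed capacity reads. A minimal sketch for pulling just those out of kernel log text; `sense_lines` is a made-up helper name, and the pattern only covers the message shapes seen in the log above:

```shell
# Hypothetical helper: keep only the SCSI sense / capacity-failure lines
# from kernel log text supplied on stdin.
sense_lines() {
    grep -E 'Unit Not Ready|Sense Key|Add\. Sense|Read Capacity\([0-9]+\) failed'
}

# Example with two lines copied from the log above; only the second is kept.
printf '%s\n' \
    'Jun 14 16:33:26 Saradomin kernel: sd 4:0:13:0: Attached scsi generic sg11 type 0' \
    'Jun 14 16:33:26 Saradomin kernel: sd 4:0:13:0: [sdl] Sense Key : Hardware Error [current]' \
    | sense_lines
```

On the host itself, `journalctl -k -b 0 --no-pager | sense_lines` should apply the same filter to the current boot's kernel messages.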
 
Would you be ok to run lsblk -o tran,name,type,size,vendor,model,label,rota,log-sec,phy-sec on one of the Proxmox servers having the problem (with the disks connected)?

It'll give us a bunch of detailed info about the disks themselves, and save a bunch of back and forth questions. :)
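
Once that output is in hand, the disks reporting zero capacity can be picked out mechanically. A small sketch, assuming the `-d`, `-n` and `-b` options of util-linux `lsblk` (whole devices only, no header, sizes in bytes); `zero_size_disks` is a hypothetical helper name, and the sample rows are illustrative rather than taken from a real system:

```shell
# Hypothetical helper: read "NAME SIZE TYPE" rows on stdin and print the
# names of whole disks whose reported size is 0 bytes.
zero_size_disks() {
    awk '$3 == "disk" && $2 == 0 { print $1 }'
}

# Illustrative rows (made up for the example); prints sdl and sdm.
printf '%s\n' \
    'sda 1600321314816 disk' \
    'sdl 0 disk' \
    'sdm 0 disk' \
    | zero_size_disks
```

On a real host the input would come from `lsblk -dnb -o NAME,SIZE,TYPE | zero_size_disks`.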
 
This is the output of that command. The one that says usb is connected via a SAS-to-USB adapter.
sas sdl disk 0B TOSHIBA PX02SMU020 0 512 512
usb sdm disk 0B SABRENT EC-U2SA 1 512 512

The other 16 affected disks are all Toshiba PX02SMU020 drives. Again, thank you for any help :)
 
Ahhh. Would you be ok to paste that in again, including the column headings, and try getting the formatting set (maybe use "code" or "quote") so the columns line up? :)
 
Code:
TRAN   NAME        TYPE    SIZE VENDOR   MODEL                      LABEL           ROTA LOG-SEC PHY-SEC
sas    sdl         disk      0B TOSHIBA  PX02SMU020                                    0     512     512
usb    sdm         disk      0B SABRENT  EC-U2SA                                       1     512     512
 
It's probably a good idea to try and get the firmware info out from the drives too.

The Toshiba there seems to be /dev/sdl, with the Sabrent being /dev/sdm.

So, I'd probably try using smartctl -a on them, and seeing if the firmware info is shown. Something like this should work:

Bash:
smartctl -a /dev/sdl
 
Code:
smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.8.4-2-pve] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               TOSHIBA
Product:              PX02SMU020
Revision:             MS02
Compliance:           SPC-4
LU is resource provisioned, LBPRZ=1
Rotation Rate:        Solid State Device
Form Factor:          2.5 inches
Logical Unit id:      0x500003964c88062c
Serial number:        5520A0B0T2AA
Device type:          disk
Transport protocol:   SAS (SPL-4)
Local Time is:        Fri Jun 14 22:08:14 2024 CDT
device Test Unit Ready  [medium or hardware error (serious)]
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

It says medium or hardware error (serious), but I was able to transfer some data to it when it was connected via USB to a Windows machine.
 
Note, I'm asking about the firmware because there are occasionally models of SSD which have strange (sometimes fatal) firmware bugs, and it's better to know and fix them before they bite (if possible). :)
 
I will have to go back to my work to plug the other drive into the drive bays to run smartctl on it; it won't run through the USB adapter.
 
In the proxmox computer, how are the drives generally attached to it?

Guessing it's generally through a proper HBA adapter of some sort rather than the usb adapter?
 
Yes, the drives are usually connected to the backplane in the front which connects to a SAS2 HBA card that has been flashed to IT mode. I re-ran the smartctl on the drive that is there with the -T permissive and this is the output I got:

Code:
smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.8.4-2-pve] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               TOSHIBA
Product:              PX02SMU020
Revision:             MS02
Compliance:           SPC-4
LU is resource provisioned, LBPRZ=1
Rotation Rate:        Solid State Device
Form Factor:          2.5 inches
Logical Unit id:      0x500003964c88062c
Serial number:        5520A0B0T2AA
Device type:          disk
Transport protocol:   SAS (SPL-4)
Local Time is:        Fri Jun 14 22:14:17 2024 CDT
device Test Unit Ready  [medium or hardware error (serious)]
SMART support is:     Unavailable - device lacks SMART capability.

=== START OF READ SMART DATA SECTION ===
Current Drive Temperature:     0 C
Drive Trip Temperature:        0 C

Error Counter logging not supported

Device does not support Self Test logging

Edit: Also, I believe smartctl is showing the wrong product, because these are 1.6TB drives, and the product name for the 1.6TB model (which is also what is on the drive labels) is PX02SMQ160.
 
Looking around the net, there doesn't seem to be any mention of firmware problems with the drives. They're so old that the Kioxia website (rebranded name for Toshiba storage) only mentions them in one document though. They're not even coming up in what I can find of their support section.

This is something I've not seen before:
LU is resource provisioned, LBPRZ=1

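Not sure if it's relevant. Doing some searching online, it looks like smartctl is reporting the drive's logical block provisioning mode here (with LBPRZ=1 meaning unmapped blocks read back as zeros), which is normal for SSDs that support unmap rather than a sign of anything custom. At least, that's my impression.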

As to more solid "what next?", that's a good question. My initial impressions are that the drive looks dead, but since you have it working under windows via a usb adapter it's clearly not.

Hmmm. What happens if you quick format the drive under windows? Does it complete a quick format and let you use it for writing a couple of GBs onto?
 
the drives are usually connected to the backplane in the front which connects to a SAS2 HBA card that has been flashed to IT mode.
Just on the off chance there's something we can do via the HBA side of things, would you be ok to grab the details of the PCIe cards in the system so we can figure out the HBA pieces?

Running lspci -nnk should give us the brief run down, and show what kernel module is being used by the HBA card. :)
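
On a box with many PCIe devices the full `lspci -nnk` listing can be long. A sketch of trimming it to just the SAS controller entries, assuming a `grep` that supports `-A` (GNU and BSD grep both do); `sas_controllers` is a hypothetical helper name:

```shell
# Hypothetical helper: from `lspci -nnk` text on stdin, keep each SAS
# controller line plus the three lines that follow it (subsystem, driver
# in use, available kernel modules).
sas_controllers() {
    grep -i -A 3 'Serial Attached SCSI controller'
}

# On the affected host:
#   lspci -nnk | sas_controllers
```

The "Kernel driver in use" line in the trimmed output is the part that says which module to look for in the kernel log.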
 
Yes, I did a quick format in Windows and transferred about 20GB to it with no issue, then opened the data on the drive through Windows. I also want to note that the drives that went bad in my server had been in use for about a year and a half before this happened. In the other 2 servers, the drives had only been in use for about 10 months, also as ZFS storage hosting VMs.
 
Code:
02:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] [1000:0072] (rev 03)
        Subsystem: Dell 6Gbps SAS HBA Adapter [1028:1f1c]
        Kernel driver in use: mpt3sas
        Kernel modules: mpt3sas
 
Cool. Sounds like you're running a Dell box of some kind? R620 or similar maybe? :) (I have two of those in a dual node cluster)

Now that we know it's using the mpt3sas driver, check if the kernel log is showing any weirdness when that driver loads:

Bash:
dmesg | grep -i mpt3sas

Half of me is hoping something strange shows up there, as it could mean something simple needs changing to fix everything. :D



Oh, I just realised something else. I wonder if the recent kernel update in Proxmox might be the cause? The Proxmox kernel recently updated from 6.5 series to 6.8 series. Lots of people have been having weird issues from it, including some strangeness with HBA adapters.

(My two R620's with H710 HBAs in them are fine though.)

How recently did this problem start for you?
 
I actually have a Supermicro X10 board in a Supermicro chassis, but the HBA card is out of an old Dell server. I suspected the kernel as well, since my friend's issue was the kernel and something to do with his HBA. But I checked on boot: the old kernel I had was 6.5, and I booted into that kernel with the same issues happening. Here is the output of the command:
Code:
[    3.395474] mpt3sas version 43.100.00.00 loaded
[    3.433528] mpt3sas 0000:02:00.0: can't disable ASPM; OS doesn't have ASPM control
[    3.769559] mpt3sas 0000:03:00.0: can't disable ASPM; OS doesn't have ASPM control
[   29.718962] mpt2sas_cm1: mpt3sas_transport_port_remove: removed: sas_addr(0x50001555480c2200)
[   29.719006] mpt2sas_cm1: mpt3sas_transport_port_remove: removed: sas_addr(0x50001555480c2201)
[   29.719041] mpt2sas_cm1: mpt3sas_transport_port_remove: removed: sas_addr(0x50001555480c2202)
[   29.719075] mpt2sas_cm1: mpt3sas_transport_port_remove: removed: sas_addr(0x50001555480c2203)
[   29.719110] mpt2sas_cm1: mpt3sas_transport_port_remove: removed: sas_addr(0x50001555480c2204)
[   29.719145] mpt2sas_cm1: mpt3sas_transport_port_remove: removed: sas_addr(0x50001555480c2205)
[   29.719179] mpt2sas_cm1: mpt3sas_transport_port_remove: removed: sas_addr(0x50001555480c2206)
[   29.720464] mpt2sas_cm1: mpt3sas_transport_port_remove: removed: sas_addr(0x50001555480c2207)
[   29.722199] mpt2sas_cm1: mpt3sas_transport_port_remove: removed: sas_addr(0x50001555480c2208)
[   29.723943] mpt2sas_cm1: mpt3sas_transport_port_remove: removed: sas_addr(0x50001555480c2209)
[   29.725766] mpt2sas_cm1: mpt3sas_transport_port_remove: removed: sas_addr(0x50001555480c220a)
[   29.727534] mpt2sas_cm1: mpt3sas_transport_port_remove: removed: sas_addr(0x50001555480c220b)
[   29.729308] mpt2sas_cm1: mpt3sas_transport_port_remove: removed: sas_addr(0x50001555480c220c)
[   29.731064] mpt2sas_cm1: mpt3sas_transport_port_remove: removed: sas_addr(0x50001555480c220d)
[   29.732752] mpt2sas_cm1: mpt3sas_transport_port_remove: removed: sas_addr(0x50001555480c220e)
[   29.734425] mpt2sas_cm1: mpt3sas_transport_port_remove: removed: sas_addr(0x50001555480c220f)
[   29.736081] mpt2sas_cm1: mpt3sas_transport_port_remove: removed: sas_addr(0x50001555480c223e)
[   29.737743] mpt2sas_cm1: mpt3sas_transport_port_remove: removed: sas_addr(0x50001555480c223f)
[ 2874.487813] mpt2sas_cm0: mpt3sas_transport_port_remove: removed: sas_addr(0x500003964c88053a)
[11023.498474] mpt2sas_cm0: mpt3sas_transport_port_remove: removed: sas_addr(0x500003964c88062e)

Edit: Also I have other drives that are connected to the same HBA and those drives are showing up just fine, would you like info on those drives as well?
 
Hmmm, that seems to be missing the later kernel message output.

Try this command instead, which should definitely output everything for the current boot:

Bash:
journalctl -b 0 | grep -i mpt3sas

Also I have other drives that are connected to the same HBA and those drives are showing up just fine, would you like info on those drives as well?

Not sure. Maybe skip it for now, and we'll possibly have a look later on for comparison. :)
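
One thing worth noting about that grep: with this SAS2008 card the mpt3sas driver logs many of its messages under an `mpt2sas_cm0:` prefix (as in the first log in this thread), and any line without the literal string `mpt3sas` is dropped by the filter. A sketch of a slightly wider match, where `hba_lines` is a hypothetical helper name:

```shell
# Hypothetical helper: keep kernel log lines mentioning either the
# mpt2sas or the mpt3sas naming, case-insensitively.
hba_lines() {
    grep -Ei 'mpt[23]sas'
}

# Line copied from the first log in this thread: a plain "mpt3sas" grep
# would skip it, the wider pattern keeps it.
printf '%s\n' \
    'Jun 14 16:33:26 Saradomin kernel: mpt2sas_cm0: handle(0x15) sas_address(0x500003964c88062e) port_type(0x1)' \
    | hba_lines
```

On the host, `journalctl -k -b 0 --no-pager | hba_lines` should show both kinds of line for the current boot.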
 