Invisible SATA disk

kana

Sep 22, 2020
Hi all,
I've got a strange phenomenon: a SATA disk is connected to my computer, but Proxmox is not able to see it, whereas a Debian live installation disk sees it fine.
Here's what I tried in Proxmox:
  • it's not listed with lsblk
  • nor with fdisk -l
  • ls /dev/sd* -> "ls: cannot access '/dev/sd*': No such file or directory"
  • "smartctl --scan" results in only the NVME drive
  • interestingly, there is one hint that it can see something after all (the invisible SATA drive is the only SATA drive in the system):
    dmesg | grep -i sata
    ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
  • to make sure that the disk is not assigned to a VM, I checked them all manually and additionally ran a script that loops
    "qm config $vm_id | grep -E '(scsi|sata|ide|virtio|pci)'"
    over all VMs; the disk is not assigned to any of them
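For anyone who wants to repeat the VM check from the last bullet, here is a minimal sketch of such a loop. It is not the original script: the function names are made up, it assumes that `qm list` prints the VM ID in the first column below a header line, and it anchors the pattern to config keys such as scsi0 or hostpci0 so that lines like "scsihw:" or "boot:" do not show up as false hits.

```shell
#!/bin/sh
# Sketch only, not the script used in the post above.

# Keep only the `qm config` lines (read from stdin) that define a disk
# or a passed-through device. The key list is an assumption.
filter_disk_lines() {
    grep -E '^(scsi|sata|ide|virtio|hostpci|usb)[0-9]+:' || true
}

# Walk all VM IDs reported by `qm list` (first column, header skipped)
# and print the matching config lines for each of them.
check_all_vms() {
    for vm_id in $(qm list | awk 'NR > 1 { print $1 }'); do
        echo "== VM $vm_id =="
        qm config "$vm_id" | filter_disk_lines
    done
}
```

Calling check_all_vms as root on the Proxmox host prints one section per VM. A disk that is passed through should then appear under its by-id path or PCI address.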
After booting the Debian live disk,
  • lsblk listed the drive right away
  • the partition manager saw it, and I could partition it there, etc.
The BIOS POST also shows it.

I think a HW issue can be ruled out. What could be misconfigured in Proxmox so that it cannot see the disk?
BTW, two USB sticks that I can see in live Debian (there I have /dev/sda, sdb and sdc) are invisible in Proxmox, too.

This is a mystery to me. Has anybody got an idea?
 
Hi,

can you post the output of 'dmesg' 'lsblk' and 'ls -l /dev' from the proxmox installation and from a working debian live disk?
it sounds like the sata controller/hba is either not recognized or the driver is broken (or something like that)
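A small sketch for collecting the three outputs into files so they can be attached. The function name and file names are only placeholders. Run it once on the Proxmox installation and once on the live system, then compare the results.

```shell
#!/bin/sh
# Sketch only: save dmesg, lsblk and `ls -l /dev` into one directory.
# $1 = output directory (default: diag-<hostname>); names are placeholders.
collect_diag() {
    out=${1:-diag-$(uname -n)}
    mkdir -p "$out"
    # A failing command must not abort the collection, hence `|| true`.
    dmesg      > "$out/dmesg.txt"  2>&1 || true
    lsblk      > "$out/lsblk.txt"  2>&1 || true
    ls -l /dev > "$out/ls-dev.txt" 2>&1 || true
    echo "$out"
}
```

With both directories at hand, something like `diff -u diag-pve/ls-dev.txt diag-debian/ls-dev.txt` shows which device nodes are missing. The directory names depend on the host names.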

BTW, two USB sticks that I can see in live Debian (there I have /dev/sda, sdb and sdc) are invisible in Proxmox, too.
this also sounds very weird
 
Hi,
yes of course!
I pasted lsblk here and attached the other outputs as files due to their size.

I just looked through the two dmesg logs, and there are the same messages concerning SATA and sda in both logs, except for one occurrence
[ 20.516242] sd 0:0:0:0: [sda] Synchronizing SCSI cache
which is unique to Proxmox.
It still doesn't make sense to me.

And I just noticed that nvme0n1 is not recognized either.

Proxmox:


root@pve:~# lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
zd0         230:0    0     1M  0 disk
zd16        230:16   0    32G  0 disk
├─zd16p1    230:17   0    32M  0 part
├─zd16p2    230:18   0    24M  0 part
├─zd16p3    230:19   0   256M  0 part
├─zd16p4    230:20   0    24M  0 part
├─zd16p5    230:21   0   256M  0 part
├─zd16p6    230:22   0     8M  0 part
├─zd16p7    230:23   0    96M  0 part
└─zd16p8    230:24   0  31.3G  0 part
zd32        230:32   0   636M  0 disk
├─zd32p1    230:33   0    16M  0 part
├─zd32p2    230:34   0 619.7M  0 part
└─zd32p128  259:7    0   239K  0 part
zd48        230:48   0    64G  0 disk
└─zd48p1    230:49   0    64G  0 part
zd64        230:64   0    32G  0 disk
zd80        230:80   0     1M  0 disk
zd96        230:96   0    80G  0 disk
├─zd96p1    230:97   0   100M  0 part
├─zd96p2    230:98   0    16M  0 part
├─zd96p3    230:99   0  79.1G  0 part
└─zd96p4    230:100  0   775M  0 part
zd112       230:112  0     1M  0 disk
zd128       230:128  0     1M  0 disk
zd144       230:144  0     4M  0 disk
nvme1n1     259:0    0 953.9G  0 disk
├─nvme1n1p1 259:2    0  1007K  0 part
├─nvme1n1p2 259:3    0     1G  0 part
└─nvme1n1p3 259:4    0 952.9G  0 part
root@pve:~#

Debian Live:


Code:
user@debian:~$ lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
loop0         7:0    0   2.8G  1 loop /usr/lib/live/mount/rootfs/filesystem.squashfs
                                      /run/live/rootfs/filesystem.squashfs
sda           8:0    0   7.3T  0 disk
└─sda1        8:1    0   7.3T  0 part
sdb           8:16   1  14.3G  0 disk
└─sdb1        8:17   1  14.3G  0 part
sdc           8:32   1  28.8G  0 disk
├─sdc1        8:33   1  28.8G  0 part
└─sdc2        8:34   1    32M  0 part
nvme0n1     259:0    0 447.1G  0 disk
├─nvme0n1p1 259:2    0 447.1G  0 part
└─nvme0n1p9 259:3    0     8M  0 part
nvme1n1     259:1    0 953.9G  0 disk
├─nvme1n1p1 259:4    0  1007K  0 part
├─nvme1n1p2 259:5    0     1G  0 part
└─nvme1n1p3 259:6    0 952.9G  0 part
 

I ran another experiment: I booted into the terminal from the Proxmox installation ISO, and it saw sda, sdb and sdc. fdisk -l also listed both the missing 8TB disk and the NVMe device.
lsblk does not seem to be available in the installer for some reason.

So Proxmox has everything it needs to see my disks, but I must have badly misconfigured it...
 
It still doesn't make sense to me.
it does not make much sense to me either... the only thing i could think of is that your udev rules are somehow messed up. did you maybe change something there?
also, could you post a 'journalctl -b' from the proxmox installation? maybe i can see something there
 
The udev rules were the right hint. I looked through the rules and found
/lib/udev/rules.d/60-block.rules
In there I saw this suspicious section and commented it out:
Code:
ACTION!="remove", SUBSYSTEM=="block", \
 KERNEL=="loop*|mmcblk*[0-9]|msblk*[0-9]|mspblk*[0-9]|nvme*|sd*|vd*|xvd*|bcache*|cciss*|dasd*|ubd*|ubi*|scm>
  OPTIONS+="watch"

This has the bad side effect that I can now reach my server only via ssh and no longer via the web interface, so it cannot be the final solution. But Proxmox does now list the missing devices:
Code:
root@pve:~# lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
sda           8:0    0   7.3T  0 disk
└─sda1        8:1    0   7.3T  0 part
sdb           8:16   1  14.3G  0 disk
└─sdb1        8:17   1  14.3G  0 part
zd0         230:0    0     1M  0 disk
zd16        230:16   0    32G  0 disk
├─zd16p1    230:17   0    32M  0 part
├─zd16p2    230:18   0    24M  0 part
├─zd16p3    230:19   0   256M  0 part
├─zd16p4    230:20   0    24M  0 part
├─zd16p5    230:21   0   256M  0 part
├─zd16p6    230:22   0     8M  0 part
├─zd16p7    230:23   0    96M  0 part
└─zd16p8    230:24   0  31.3G  0 part
zd32        230:32   0   636M  0 disk
├─zd32p1    230:33   0    16M  0 part
├─zd32p2    230:34   0 619.7M  0 part
└─zd32p128  259:7    0   239K  0 part
zd48        230:48   0    64G  0 disk
└─zd48p1    230:49   0    64G  0 part
zd64        230:64   0    32G  0 disk
zd80        230:80   0     1M  0 disk
zd96        230:96   0    80G  0 disk
├─zd96p1    230:97   0   100M  0 part
├─zd96p2    230:98   0    16M  0 part
├─zd96p3    230:99   0  79.1G  0 part
└─zd96p4    230:100  0   775M  0 part
zd112       230:112  0     1M  0 disk
zd128       230:128  0     1M  0 disk
zd144       230:144  0     4M  0 disk
nvme0n1     259:0    0 953.9G  0 disk
├─nvme0n1p1 259:2    0  1007K  0 part
├─nvme0n1p2 259:3    0     1G  0 part
└─nvme0n1p3 259:4    0 952.9G  0 part
nvme1n1     259:1    0 447.1G  0 disk
├─nvme1n1p1 259:5    0 447.1G  0 part
└─nvme1n1p9 259:6    0     8M  0 part

Having no idea what this rule is about, what could I do next to get both the disks and the web interface?
 
I did some research and now have a rough idea what udev rules are about. I also consulted an AI (which was helpful only to some extent).
What I did: I commented out only the "watch" part of that rule -> no effect
I also booted into the initramfs shell and could see the sda, sdb and sdc devices there, which is consistent with the theory that they get removed at some point during the boot process.
I then captured the journal as you suggested; the file is attached. sda is mentioned a lot in there, but there is no hint that it got removed at some point.
 

Hi,
sorry for the late answer.

The rule you posted is part of a normal installation, so it should not be removed.
If you haven't played around with udev in the past and did not install any external packages that might have changed it, i'm not sure why it's behaving like this

what you can do is set the 'udev_log' variable in /etc/udev/udev.conf to 'info' or 'debug' (make sure to remove the leading '#' to uncomment that line),
then reboot and send the journal/syslog from that boot (there should now be much more udev-related logging)
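A minimal sketch of that edit as a shell function, in case it helps. The function name is made up, and it assumes GNU sed (for -i and -E) and a stock udev.conf in which the setting is present as a commented-out "#udev_log=info" line. If no such line exists, the setting is appended.

```shell
#!/bin/sh
# Sketch only: set udev_log in a udev.conf-style file.
# $1 = log level (e.g. info or debug)
# $2 = config file (default: /etc/udev/udev.conf)
set_udev_log() {
    level=$1
    conf=${2:-/etc/udev/udev.conf}
    if grep -Eq '^[[:space:]]*#?[[:space:]]*udev_log=' "$conf"; then
        # Rewrite the existing line, dropping a leading '#' if present.
        sed -i -E "s|^[[:space:]]*#?[[:space:]]*udev_log=.*|udev_log=$level|" "$conf"
    else
        # No such line yet: append the setting.
        echo "udev_log=$level" >> "$conf"
    fi
}
```

Usage would be `set_udev_log debug` as root, followed by a reboot and `journalctl -b > journal.txt` (the file name is arbitrary) to capture the log of that boot.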
 
Hi @dcsapak,

sorry for the late answer here as well.
I followed your instructions and uncommented udev_log=info; that made no difference, but setting it to debug did.
The log is now 2 MB in size, so I'm pasting only the trailing excerpt here, filtered for "sda" (I can provide the whole log file if needed). The filtered part looks interesting enough:
sda seems to be added and then, for some reason, removed again later. Maybe you or somebody else can make sense of it.

Code:
Mar 02 19:32:07 pve (udev-worker)[818]: sda1: Starting '/lib/udev/vdev_id -d sda1'
Mar 02 19:32:07 pve (udev-worker)[818]: sda1: '/lib/udev/vdev_id -d sda1'(out) 'Error: Config file "/etc/zfs/vdev_id.conf" not found'
Mar 02 19:32:07 pve (udev-worker)[818]: sda1: Process '/lib/udev/vdev_id -d sda1' failed with exit code 1.
Mar 02 19:32:07 pve (udev-worker)[818]: sda1: /usr/lib/udev/rules.d/69-vdev.rules:6 Command "/lib/udev/vdev_id -d sda1" returned 1 (error), ignoring
Mar 02 19:32:07 pve (udev-worker)[818]: sda1: Setting permissions /dev/sda1, uid=0, gid=6, mode=0660
Mar 02 19:32:07 pve (udev-worker)[818]: sda1: Successfully created symlink '/dev/disk/by-id/ata-Samsung_SSD_870_QVO_8TB_S5SSNF0W712801K-part1' to '/dev/sda1'
Mar 02 19:32:07 pve (udev-worker)[818]: sda1: Successfully created symlink '/dev/disk/by-label/8TB' to '/dev/sda1'
Mar 02 19:32:07 pve (udev-worker)[818]: sda1: Successfully created symlink '/dev/disk/by-id/wwn-0x5002538f43734449-part1' to '/dev/sda1'
Mar 02 19:32:07 pve (udev-worker)[818]: sda1: Successfully created symlink '/dev/disk/by-partuuid/ddd901bb-f2c5-4ffd-9c4a-5bb9b0ed1389' to '/dev/sda1'
Mar 02 19:32:07 pve (udev-worker)[818]: sda1: Successfully created symlink '/dev/disk/by-uuid/78d24272-f660-4d70-9fa8-39ba61b3a091' to '/dev/sda1'
Mar 02 19:32:07 pve (udev-worker)[818]: sda1: Successfully created symlink '/dev/disk/by-path/pci-0000:03:00.1-ata-1-part1' to '/dev/sda1'
Mar 02 19:32:07 pve (udev-worker)[818]: sda1: Successfully created symlink '/dev/disk/by-path/pci-0000:03:00.1-ata-1.0-part1' to '/dev/sda1'
Mar 02 19:32:07 pve (udev-worker)[818]: sda1: Successfully created symlink '/dev/block/8:1' to '/dev/sda1'
Mar 02 19:32:07 pve (udev-worker)[818]: sda1: sd-device: Created db file '/run/udev/data/b8:1' for '/devices/pci0000:00/0000:00:02.1/0000:03:00.1/ata1/host0/target0:0:0/0:0:0:0/block/sda/sda1'
Mar 02 19:32:07 pve (udev-worker)[818]: sda1: Adding watch on '/dev/sda1'
Mar 02 19:32:07 pve (udev-worker)[818]: sda1: Device processed (SEQNUM=2777, ACTION=add)
Mar 02 19:32:07 pve (udev-worker)[818]: sda1: sd-device-monitor(worker): Passed 2240 byte to netlink monitor.
Mar 02 19:32:11 pve smartd[1073]: Device: /dev/sda, type changed from 'scsi' to 'sat'
Mar 02 19:32:11 pve smartd[1073]: Device: /dev/sda [SAT], opened
Mar 02 19:32:11 pve smartd[1073]: Device: /dev/sda [SAT], Samsung SSD 870 QVO 8TB, S/N:S5SSNF0W712801K, WWN:5-002538-f43734449, FW:SVQ02B6Q, 8.00 TB
Mar 02 19:32:11 pve smartd[1073]: Device: /dev/sda [SAT], found in smartd database 7.3/5319: Samsung based SSDs
Mar 02 19:32:11 pve smartd[1073]: Device: /dev/sda [SAT], can't monitor Current_Pending_Sector count - no Attribute 197
Mar 02 19:32:11 pve smartd[1073]: Device: /dev/sda [SAT], can't monitor Offline_Uncorrectable count - no Attribute 198
Mar 02 19:32:11 pve smartd[1073]: Device: /dev/sda [SAT], is SMART capable. Adding to "monitor" list.
Mar 02 19:32:11 pve smartd[1073]: Device: /dev/sda [SAT], state read from /var/lib/smartmontools/smartd.Samsung_SSD_870_QVO_8TB-S5SSNF0W712801K.ata.state
Mar 02 19:32:11 pve smartd[1073]: Device: /dev/sda [SAT], state written to /var/lib/smartmontools/smartd.Samsung_SSD_870_QVO_8TB-S5SSNF0W712801K.ata.state
Mar 02 19:32:17 pve systemd-udevd[715]: sda1: Device is queued (SEQNUM=4042, ACTION=remove)
Mar 02 19:32:17 pve systemd-udevd[715]: sda1: Device ready for processing (SEQNUM=4042, ACTION=remove)
Mar 02 19:32:17 pve systemd-udevd[715]: sda1: sd-device-monitor(manager): Passed 302 byte to netlink monitor.
Mar 02 19:32:17 pve (udev-worker)[818]: sda1: Processing device (SEQNUM=4042, ACTION=remove)
Mar 02 19:32:17 pve (udev-worker)[818]: sda1: Removing watch handle 84.
Mar 02 19:32:17 pve (udev-worker)[818]: sda1: /usr/lib/udev/rules.d/69-vdev.rules:6 Importing properties from results of '/lib/udev/vdev_id -d sda1'
Mar 02 19:32:17 pve (udev-worker)[818]: sda1: Starting '/lib/udev/vdev_id -d sda1'
Mar 02 19:32:17 pve systemd-udevd[715]: sda: Device is queued (SEQNUM=4044, ACTION=remove)
Mar 02 19:32:17 pve systemd-udevd[715]: sda: SEQNUM=4044 blocked by SEQNUM=4042
Mar 02 19:32:17 pve (udev-worker)[818]: sda1: '/lib/udev/vdev_id -d sda1'(out) 'Error: Config file "/etc/zfs/vdev_id.conf" not found'
Mar 02 19:32:17 pve (udev-worker)[818]: sda1: Process '/lib/udev/vdev_id -d sda1' failed with exit code 1.
Mar 02 19:32:17 pve (udev-worker)[818]: sda1: /usr/lib/udev/rules.d/69-vdev.rules:6 Command "/lib/udev/vdev_id -d sda1" returned 1 (error), ignoring
Mar 02 19:32:17 pve (udev-worker)[818]: sda1: No reference left for '/dev/disk/by-partuuid/ddd901bb-f2c5-4ffd-9c4a-5bb9b0ed1389', removing
Mar 02 19:32:17 pve (udev-worker)[818]: sda1: No reference left for '/dev/disk/by-path/pci-0000:03:00.1-ata-1.0-part1', removing
Mar 02 19:32:17 pve (udev-worker)[818]: sda1: No reference left for '/dev/disk/by-uuid/78d24272-f660-4d70-9fa8-39ba61b3a091', removing
Mar 02 19:32:17 pve (udev-worker)[818]: sda1: No reference left for '/dev/disk/by-label/8TB', removing
Mar 02 19:32:17 pve (udev-worker)[818]: sda1: No reference left for '/dev/disk/by-id/ata-Samsung_SSD_870_QVO_8TB_S5SSNF0W712801K-part1', removing
Mar 02 19:32:17 pve (udev-worker)[818]: sda1: No reference left for '/dev/disk/by-id/wwn-0x5002538f43734449-part1', removing
Mar 02 19:32:17 pve (udev-worker)[818]: sda1: No reference left for '/dev/disk/by-path/pci-0000:03:00.1-ata-1-part1', removing
Mar 02 19:32:17 pve (udev-worker)[818]: sda1: Device processed (SEQNUM=4042, ACTION=remove)
Mar 02 19:32:17 pve (udev-worker)[818]: sda1: sd-device-monitor(worker): Passed 2230 byte to netlink monitor.
Mar 02 19:32:17 pve systemd-udevd[715]: sda: Device ready for processing (SEQNUM=4044, ACTION=remove)
Mar 02 19:32:17 pve systemd-udevd[715]: sda: sd-device-monitor(manager): Passed 283 byte to netlink monitor.
Mar 02 19:32:17 pve (udev-worker)[818]: sda: Processing device (SEQNUM=4044, ACTION=remove)
Mar 02 19:32:17 pve (udev-worker)[818]: sda: Removing watch handle 70.
Mar 02 19:32:17 pve (udev-worker)[818]: sda: /usr/lib/udev/rules.d/69-vdev.rules:5 Importing properties from results of '/lib/udev/vdev_id -d sda'
Mar 02 19:32:17 pve (udev-worker)[818]: sda: Starting '/lib/udev/vdev_id -d sda'
Mar 02 19:32:17 pve (udev-worker)[818]: sda: '/lib/udev/vdev_id -d sda'(out) 'Error: Config file "/etc/zfs/vdev_id.conf" not found'
Mar 02 19:32:17 pve (udev-worker)[818]: sda: Process '/lib/udev/vdev_id -d sda' failed with exit code 1.
Mar 02 19:32:17 pve (udev-worker)[818]: sda: /usr/lib/udev/rules.d/69-vdev.rules:5 Command "/lib/udev/vdev_id -d sda" returned 1 (error), ignoring
Mar 02 19:32:17 pve (udev-worker)[818]: sda: No reference left for '/dev/disk/by-path/pci-0000:03:00.1-ata-1.0', removing
Mar 02 19:32:17 pve (udev-worker)[818]: sda: No reference left for '/dev/disk/by-path/pci-0000:03:00.1-ata-1', removing
Mar 02 19:32:17 pve (udev-worker)[818]: sda: No reference left for '/dev/disk/by-id/wwn-0x5002538f43734449', removing
Mar 02 19:32:17 pve (udev-worker)[818]: sda: No reference left for '/dev/disk/by-id/ata-Samsung_SSD_870_QVO_8TB_S5SSNF0W712801K', removing
Mar 02 19:32:17 pve (udev-worker)[818]: sda: Device processed (SEQNUM=4044, ACTION=remove)
Mar 02 19:32:17 pve (udev-worker)[818]: sda: sd-device-monitor(worker): Passed 1640 byte to netlink monitor.
Mar 02 19:32:18 pve kernel: sd 0:0:0:0: [sda] Synchronizing SCSI cache
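To get a quick timeline out of a debug log like this, the "Device processed" lines can be reduced to timestamp, device and action. This is only a sketch: the function name is made up, and the pattern assumes the journal line format shown in the excerpt above.

```shell
#!/bin/sh
# Sketch only: read journal text on stdin and print
# "<timestamp> <device> <action>" for each processed udev event.
udev_events() {
    sed -n -E 's/^([A-Z][a-z]{2} +[0-9]+ [0-9:]{8}) .*\]: ([^:]+): Device processed \(SEQNUM=[0-9]+, ACTION=([a-z]+)\)$/\1 \2 \3/p'
}
```

Usage: `journalctl -b | udev_events | grep ' sd'`. For the excerpt above this gives "Mar 02 19:32:07 sda1 add", "Mar 02 19:32:17 sda1 remove" and "Mar 02 19:32:17 sda remove". The device is removed about ten seconds after it was added, one second before the kernel's "Synchronizing SCSI cache" message. Since udev only processes events it receives, this suggests the removal is triggered somewhere else, but the excerpt alone does not show where.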