PVE 8.2 / Kernel 6.8.4-2 does not boot - cannot find root device

A new PVE kernel version 6.8.8-2-pve has been released by Proxmox, and it contains the fix for this problem. I just installed and test-booted it on my machine, and it indeed works fine and fixes the issue, so I'd consider this CLOSED.

If you were affected, then after installing and successfully test-booting this kernel you should see a line like this in your dmesg output:

Code:
root@linus:~# dmesg | grep VPD
[    3.483058] scsi 0:3:0:0: scsi_get_vpd_size: long VPD page 0 length: 516 bytes

This is the dev_warn_once from Martin's fix, see above.
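
To check for that message from a script rather than by eye, something like the following could work. It is only a sketch: the helper name `has_vpd_fix_message` is made up for illustration, and it assumes the message text matches the line quoted above. The post above only promises the line on machines that were affected, so a missing line on other hardware does not by itself mean the fix is absent.

```shell
#!/bin/sh
# Hypothetical helper: reads kernel log text on stdin and succeeds (exit 0)
# if it contains the "long VPD page" warning quoted above.
has_vpd_fix_message() {
    grep -q 'scsi_get_vpd_size: long VPD page'
}
```

Usage on a running system would be `dmesg | has_vpd_fix_message && echo "VPD warning present"`.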
 
I updated to 6.8.8-2-pve and now, my PERC H310 (Dell T420 server) is working without any issues!

Thanks pschneider1968 for your support
 
Apparently the bug is still there. A few days ago I updated my Proxmox home server and got the same result: no boot.
Buggy kernel: 6.8.8-2-pve
Kernel that works for me: 6.5.13-5-pve
M/B: AsRock Rack EP2C602-4L/D16
HDD: ST6000NE0021 (Seagate SATA 6 TB)

But I don't have any additional drive adapters, just the built-in SATA ports.

# pve-efiboot-tool kernel list
Manually selected kernels:
None.

Automatically selected kernels:
6.5.13-5-pve
6.8.8-2-pve

# lsscsi
[8:0:0:0] disk ATA ST6000NE0021-2EN EN02 /dev/sda
[14:0:0:0] process Marvell Console 1.01 -

# dmesg | grep "Attached"
[ 3.568778] sd 8:0:0:0: Attached scsi generic sg0 type 0
[ 3.569299] scsi 14:0:0:0: Attached scsi generic sg1 type 3
[ 3.600859] sd 8:0:0:0: [sda] Attached SCSI disk

# sg_vpd --all -HHHH /dev/sg0
00 00 00 08 00 80 83 89 b0 b1 b2 b9
....
00 b2 00 04 00 40 00 00

fetching VPD page failed: Illegal request
sg_vpd failed: Illegal request


Any suggestions?
 
I would suggest that you create a new bug report on the Proxmox Bugzilla, and also send a bug report to the Linux kernel mailing list with linux-scsi, linux-stable and Martin K. Petersen in CC. It is probably best to send it as a reply to the mail in which he announced the patch:

https://lore.kernel.org/linux-scsi/20240521023040.2703884-1-martin.petersen@oracle.com/

(It is easiest to follow LKML via the NNTP interface of lore.kernel.org - that's how I read the mailing lists)

I guess Martin and the other SCSI developers would be interested in the exact and full output of your last "sg_vpd" command, as that might show them exactly how your device deviates from the SCSI standard in a way that the scanning code still cannot deal with.

You should also check whether you have a kernel crash log from the failed boot in /var/lib/systemd/pstore/ and, if so, include it in the bug report. It might contain a full stack trace, which normally scrolls across the console too fast to notice or copy. Pstore will only work and be present if your machine has some mechanism, such as ERST via the BIOS, to persistently store early boot diagnostics across boots even after a kernel panic (this is more typically found in server hardware).
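
A minimal sketch of that pstore check, assuming the directory named above. The helper name `list_pstore_records` is hypothetical, and the remark about subdirectories is an assumption about how records may be archived:

```shell
#!/bin/sh
# Hypothetical helper: list crash records saved under the pstore directory.
# $1: directory to inspect (defaults to the path mentioned above).
list_pstore_records() {
    dir="${1:-/var/lib/systemd/pstore}"
    if [ ! -d "$dir" ]; then
        echo "no pstore directory at $dir" >&2
        return 1
    fi
    # Regular files only; records may sit in per-boot subdirectories.
    find "$dir" -type f | sort
}
```

Run `list_pstore_records` as root and attach whatever files it prints. For the VPD data, the command already shown in this thread can be redirected to a file for the report, for example `sg_vpd --all -HHHH /dev/sg0 > sg_vpd.txt 2>&1`.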
 
I have a Dell PowerEdge 730xd with a PERC H310 Mini. Running a 6.8.8-2 or 6.8.8-3 kernel, I encounter this problem (although not with exactly the same symptoms). If I revert to 6.5.13-5, everything works fine. So whatever the "bug" is, it's still there, or more than one bug has been introduced somewhere along the line in the 6.8.8 series.

Specifically, I get the "Timed out for waiting the udev queue being empty" message on boot and booting can take several (5+) minutes. However, my system boots because my boot disk is on a Dell BOSS-S1 card, so it isn't the boot disk that has the issue.

Where I run into a problem is with the VMs that access VDs on the H310. With the 6.8.8.x kernels I cannot reliably access the H310 virtual disks. Sometimes they will mount; other times the system will just hang (at the Proxmox level, requiring a cold start via the iDRAC) when attempting to access them. All the H310 VDs are connected to VMs as SCSI drives via passthrough, no SATA.

I also have a PERC H800 controller in this system, which connects to an MD1200 array. I have no issues at all with those VDs; it's just the H310.

I'm posting about two weeks after the last post on this issue because I do not reboot my Proxmox servers all that often, so I didn't encounter the problem until today, when I had to perform some maintenance, and this is where Google led me.
 
In my case, it was my Coral TPU (PCIe). Had to rebuild the driver. All is OK now.
Can you explain how you did that?

I've been facing the same issue since I updated to PVE 8.2.4: the system will not boot anymore.
Kernels 6.8.8-4-pve and 6.8.12-1-pve won't boot
Code:
/dev/root: Can't open blockdev
VFS: cannot open root device "/dev/mapper/pve-root" or unknown-block(0,0): error -6

If I select kernel 6.5.13-6-pve in GRUB, everything boots fine.

I also have a Google Coral in my Futro S740 (mainboard: D3544-S2).
Maybe the Coral is the problem.

How do I rebuild the driver?
Do I have to boot the new kernel to rebuild the driver? (I don't think it makes much sense to rebuild the driver while I'm on the old, working kernel.)
Can you explain how you did the rebuild?
 
The problem has been solved for me.
It seems someone has been able to solve this issue here.

In case someone needs it:
Select an older kernel in GRUB to boot the system anyway.
Then:

Code:
dpkg --list | grep proxmox-kernel

For me the newest one was 6.8.12-1-pve.

Code:
sudo update-initramfs -u -k 6.8.12-1-pve

Code:
sudo update-grub

Code:
reboot now

now everything works fine again.
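
The steps above can be sketched as a small script. This is a sketch under assumptions, not a verified fix: the helper name `newest_proxmox_kernel` is made up, and it assumes kernel packages are named `proxmox-kernel-<version>-pve` (optionally followed by `-signed`), as they appear elsewhere in this thread. It needs GNU `sort` for the `-V` version sort.

```shell
#!/bin/sh
# Hypothetical helper: reads `dpkg --list` output on stdin and prints the
# highest installed Proxmox kernel version, e.g. "6.8.12-1-pve".
newest_proxmox_kernel() {
    # Keep installed packages only ("ii"), and print the package name.
    awk '$1 == "ii" { print $2 }' \
        | sed -n 's/^proxmox-kernel-\([0-9][0-9.]*-[0-9][0-9]*-pve\).*$/\1/p' \
        | sort -V \
        | tail -n 1
}
```

Typical use, as root while booted into the older working kernel: `kver=$(dpkg --list | newest_proxmox_kernel)`, then `update-initramfs -u -k "$kver"` followed by `update-grub` and a reboot. Meta packages such as proxmox-kernel-6.8 and proxmox-kernel-helper are skipped because they do not match the version pattern.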
 
Has anyone else verified that this fix is working? Hopefully someone can confirm before I look at going back to one of the newer kernels.
 
I tried the fix above and unfortunately it did not work for me. I pinned the older kernel in the meantime. If anyone is still having the issue, I would love to find a fix. The issue only appeared after upgrading from 8.1-2 to 8.2-2.
Dell R420 with H310 PERC and SAS drives. During my testing I found that the Proxmox installer for 8.1-2 sees the drives and allows the Proxmox install, but the 8.2-2 installer doesn't see the drives. I installed 8.1-2, then upgraded to 8.2-2, and then had the issue where it "Failed to mount pveroot as root file system". This is a test server, but I would still love to figure it out.
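
For anyone who wants to pin a known-good kernel in the same way, `proxmox-boot-tool kernel pin` is the command PVE provides for it. The sketch below only prints the command after a basic sanity check of the version string, so it changes nothing by itself. `pin_command` is a made-up helper name, and 6.5.13-5-pve is simply the version reported as working earlier in this thread; substitute whichever kernel boots on your machine.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the pin command for a kernel version.
# Rejects strings that do not look like "<digits...>-pve".
pin_command() {
    case "$1" in
        [0-9]*-pve) printf 'proxmox-boot-tool kernel pin %s\n' "$1" ;;
        *) echo "unexpected kernel version: $1" >&2; return 1 ;;
    esac
}
```

Calling `pin_command 6.5.13-5-pve` prints the command to run as root. `proxmox-boot-tool kernel unpin` removes the pin again once a newer kernel is confirmed to boot.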
 
The latest kernel, 6.8.12-2, still does not work on my AsRock board (mentioned previously). I need to find a way to grab logs so that I can create a bug report and upload them. For now I'm sticking with 6.5.13-5.
 
I'm experiencing a similar issue. After performing a clean installation using the PVE 7.4-1 ISO, I could not upgrade the kernel to either 6.5.12-5-pve or 6.11.0-2-pve. Additionally, I couldn't install it from the 8.3-1 ISO because it did not detect the disks, which seems to be part of the problem.

I want to emphasize that this was a clean installation because this server had previously been upgraded from PVE version 5.x to 8.3. At one point, I successfully ran version 6.5.12-5-pve on this machine.

When attempting to upgrade from version 6.5.13-6-pve to a newer version, I end up at the initramfs prompt and have no success using the command "zpool import -N rpool".
[Attached screenshot: 1736651305425.png]

The machine specs that have the issue are:
  • Dell PowerEdge T320
    • 1x Intel(R) Xeon(R) CPU E5-2420 0 @ 1.90GHz
    • 64 GB RAM
      • 4x Micron Technology 36KSF2G72PZ-1G6E1
    • H330 in HBA Mode PCIe Slot 6
    • NetXtreme BCM5720 2-port Gigabit Ethernet PCIe Embedded
    • NetXtreme BCM5720 2-port Gigabit Ethernet PCIe Slot 4
    • iDRAC Enterprise
Firmware:
[Attached screenshot: 1736650147459.png]
Proxmox Package Versions:

proxmox-ve: 8.3.0 (running kernel: 6.5.13-6-pve)
pve-manager: 8.3.2 (running version: 8.3.2/3e76eec21c4a14a7)
proxmox-kernel-helper: 8.1.0
pve-kernel-5.15: 7.4-15
proxmox-kernel-6.11.0-2-pve: 6.11.0-2
proxmox-kernel-6.8: 6.8.12-5
proxmox-kernel-6.8.12-5-pve-signed: 6.8.12-5
proxmox-kernel-6.5.13-6-pve: 6.5.13-6
pve-kernel-6.2.11-2-pve: 6.2.11-2
pve-kernel-5.15.158-2-pve: 5.15.158-2
pve-kernel-5.15.102-1-pve: 5.15.102-1
ceph-fuse: 16.2.15+ds-0+deb12u1
corosync: 3.1.7-pve3
criu: 3.17.1-2+deb12u1
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx11
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-5
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.1
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.4
libpve-access-control: 8.2.0
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.0.10
libpve-cluster-perl: 8.0.10
libpve-common-perl: 8.2.9
libpve-guest-common-perl: 5.1.6
libpve-http-server-perl: 5.1.2
libpve-network-perl: 0.10.0
libpve-rs-perl: 0.9.1
libpve-storage-perl: 8.3.3
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.5.0-1
proxmox-backup-client: 3.3.2-1
proxmox-backup-file-restore: 3.3.2-2
proxmox-firewall: 0.6.0
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.3.1
proxmox-mini-journalreader: 1.4.0
proxmox-widget-toolkit: 4.3.3
pve-cluster: 8.0.10
pve-container: 5.2.3
pve-docs: 8.3.1
pve-edk2-firmware: 4.2023.08-4
pve-esxi-import-tools: 0.7.2
pve-firewall: 5.1.0
pve-firmware: 3.14-2
pve-ha-manager: 4.0.6
pve-i18n: 3.3.2
pve-qemu-kvm: 9.0.2-4
pve-xtermjs: 5.3.0-3
qemu-server: 8.3.3
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.6-pve1

[Attached screenshot: 1736652119917.png]

Are there any logs I can obtain to better understand the issue and possibly find a fix?

I will share some screenshots of the installation messages in the next post, as they may be helpful.

@fiona @t.lamprecht
 