Boot failure after Proxmox 6.x upgrade

David Hedbor

Member
May 15, 2019
EDIT: I "fixed" this by plugging in a USB drive and putting /boot and GRUB on it instead of booting from the ZFS pool. That's sufficient for now, but I'm still curious why the 5.x kernel fails to boot when 4.x worked.

I just upgraded 4 machines to Proxmox 6.x. Three booted fine, although one of them had its NICs renamed from enpXsY to ens1/ens3 for some reason.

The 4th one fails to boot using the 5.0 kernel. Fortunately it still boots with the 4.x kernel that remained after the upgrade. The system boots in BIOS mode. The 5.0 kernel fails with this error:

Screen Shot on 2019-08-01 at 18-20-19.png


The filesystem is a single ZFS pool of striped mirrors made up of 12 SSDs connected to controllers:

Code:
    NAME                        STATE     READ WRITE CKSUM
    rpool                       ONLINE       0     0     0
      mirror-0                  ONLINE       0     0     0
        wwn-0x55cd2e404c02a8c7  ONLINE       0     0     0
        wwn-0x55cd2e404c029779  ONLINE       0     0     0
      mirror-1                  ONLINE       0     0     0
        wwn-0x55cd2e404c029706  ONLINE       0     0     0
        wwn-0x55cd2e404c02a0ce  ONLINE       0     0     0
      mirror-2                  ONLINE       0     0     0
        wwn-0x55cd2e404c02a2d1  ONLINE       0     0     0
        wwn-0x55cd2e404c02a8fe  ONLINE       0     0     0
      mirror-3                  ONLINE       0     0     0
        wwn-0x55cd2e404c02973c  ONLINE       0     0     0
        wwn-0x55cd2e404c02a878  ONLINE       0     0     0
      mirror-4                  ONLINE       0     0     0
        wwn-0x55cd2e404c02a98c  ONLINE       0     0     0
        wwn-0x55cd2e404c02acf8  ONLINE       0     0     0
      mirror-5                  ONLINE       0     0     0
        wwn-0x55cd2e404c02971f  ONLINE       0     0     0
        wwn-0x55cd2e404c02a39b  ONLINE       0     0     0

The grub config looks like this (the 4.x one is identical except for a 4.x kernel from what I can see):
Code:
        menuentry 'Proxmox Virtual Environment GNU/Linux, with Linux 5.0.18-1-pve' --class proxmox --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-5.0.18-1-pve-advanced-8fee9614a9b903f6' {
                load_video
                insmod gzio
                if [ x$grub_platform = xxen ]; then insmod xzio; insmod lzopio; fi
                insmod part_gpt
                insmod part_gpt
                insmod part_gpt
                insmod part_gpt
                insmod part_gpt
                insmod part_gpt
                insmod part_gpt
                insmod part_gpt
                insmod part_gpt
                insmod part_gpt
                insmod part_gpt
                insmod part_gpt
                insmod zfs
                set root='hd8,gpt1'
                if [ x$feature_platform_search_hint = xy ]; then
                  search --no-floppy --fs-uuid --set=root --hint-bios=hd8,gpt1 --hint-efi=hd8,gpt1 --hint-baremetal=ahci8,gpt1  --hint-bios=hd11,gpt1 --hint-efi=hd11,gpt1 --hint-baremetal=ahci11,gpt1  --hint-bios=hd2,gpt1 --hint-efi=hd2,gpt1 --hint-baremetal=ahci2,gpt1  --hint-bios=hd9,gpt1 --hint-efi=hd9,gpt1 --hint-baremetal=ahci9,gpt1  --hint-bios=hd5,gpt1 --hint-efi=hd5,gpt1 --hint-baremetal=ahci5,gpt1  --hint-bios=hd0,gpt1 --hint-efi=hd0,gpt1 --hint-baremetal=ahci0,gpt1  --hint-bios=hd6,gpt1 --hint-efi=hd6,gpt1 --hint-baremetal=ahci6,gpt1  --hint-bios=hd3,gpt1 --hint-efi=hd3,gpt1 --hint-baremetal=ahci3,gpt1  --hint-bios=hd1,gpt1 --hint-efi=hd1,gpt1 --hint-baremetal=ahci1,gpt1  --hint-bios=hd10,gpt1 --hint-efi=hd10,gpt1 --hint-baremetal=ahci10,gpt1  --hint-bios=hd4,gpt1 --hint-efi=hd4,gpt1 --hint-baremetal=ahci4,gpt1  --hint-bios=hd7,gpt1 --hint-efi=hd7,gpt1 --hint-baremetal=ahci7,gpt1  8fee9614a9b903f6
                else
                  search --no-floppy --fs-uuid --set=root 8fee9614a9b903f6
                fi
                echo    'Loading Linux 5.0.18-1-pve ...'
                linux   /ROOT/pve-1@/boot/vmlinuz-5.0.18-1-pve root=ZFS=rpool/ROOT/pve-1 ro root=ZFS=rpool/ROOT/pve-1 boot=zfs quiet
                echo    'Loading initial ramdisk ...'
                initrd  /ROOT/pve-1@/boot/initrd.img-5.0.18-1-pve
        }

From what I've gathered, this can happen if the BIOS/GRUB doesn't find all the drives. What I don't understand is why, if that's the issue, the 4.x kernels have worked and still work fine, whereas the 5.x kernel doesn't load. Any help would be appreciated.
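One way to sanity-check the "GRUB doesn't see all the drives" theory is to count how many distinct BIOS disks the generated menu entry hints at and compare that with the 12 pool members. A rough sketch, assuming the config lives at /boot/grub/grub.cfg (adjust if yours differs); `count_bios_hints` is a helper name made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: count the distinct hdN disks named in --hint-bios
# options of a GRUB config. The default path is an assumption.
count_bios_hints() {
    cfg="${1:-/boot/grub/grub.cfg}"
    # -o prints only the matching part, one match per line;
    # sort -u collapses repeats so each disk is counted once.
    grep -o -- '--hint-bios=hd[0-9]*' "$cfg" | sort -u | wc -l | tr -d ' '
}
```

Calling `count_bios_hints /boot/grub/grub.cfg` prints a single number. This only reflects what was detected when the config was generated under Linux; what the BIOS actually exposes at boot time can still differ, and running `ls` at the GRUB prompt is the more direct check.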

I could perhaps get a USB drive and use it with systemd-boot using EFI but that'd be a last resort.
 
See various other reports in the forum: the GRUB ZFS implementation is sometimes unable to read certain files. Here the freshly written new kernel is unreadable, while the old kernel happens to be readable. This issue seems to trigger randomly (in other words, it can happen on any rewrite of any file in /boot, and it can also 'randomly' resolve itself again). PVE 6.x switched to systemd-boot with automatically synced ESPs, so depending on whether you still have free space/partitions on your disks, you can opt into that new scheme. You can also format a USB device and register it as a 'synced' ESP with 'pve-efiboot-tool format/init', like you said, as a last resort.
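For reference, registering a USB stick as a synced ESP comes down to the two subcommands mentioned above. A cautious sketch: `/dev/sdX2` is a placeholder rather than a real device, and `esp_register` is a wrapper name invented for this example. It only prints the commands unless `RUN=1` is set, since `format` overwrites whatever is on the partition.

```shell
#!/bin/sh
# Hypothetical wrapper around the two pve-efiboot-tool steps named in the post.
# Prints the commands by default; set RUN=1 to actually execute them (as root).
esp_register() {
    part="$1"   # ESP partition, e.g. /dev/sdX2 (placeholder, not a real device)
    if [ -z "$part" ]; then
        echo "usage: esp_register <partition>" >&2
        return 1
    fi
    for step in format init; do
        if [ "${RUN:-0}" = "1" ]; then
            pve-efiboot-tool "$step" "$part" || return 1
        else
            echo "pve-efiboot-tool $step $part"
        fi
    done
}
```

Running `esp_register /dev/sdX2` first and reading the printed commands before repeating it with `RUN=1` makes it harder to format the wrong partition by accident.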
 
Interesting, and somewhat weird. I haven't had any issues with any of the 4.x kernel upgrades in the past. In either case, I'm now using a USB drive to host /boot/ and nothing else, and it works (still using GRUB/MBR). Since I don't have UEFI booting set up, I don't think it's worth messing with UEFI for the time being.

Thanks.
 
pve5to6 upgrade here: one box worked, and a pretty similar box simply reports this after reboot:
"error: attempt to read or write outside of disk 'hd0'"

GRUB rescue mode starts, but I'm unable to recover from there.
 
well well - stupid security :)
after removing the bios protection (aka secure boot) - it boots fine on the other box too...
 
