[PVE-9 BETA] PVE Nested Upgrade from 8 to 9 breaks boot

jsterr

Renowned Member
Jul 24, 2020
I just tested the PVE 8 to PVE 9 upgrade on a freshly installed PVE 8 installation (nested, in a VM), so not on physical hardware. After upgrading and rebooting, the VM does not boot anymore and goes to the VM BIOS.

That's the proxmox-boot-tool status output before rebooting:

Code:
root@pve-2:~# proxmox-boot-tool status
Re-executing '/usr/sbin/proxmox-boot-tool' in new private mount namespace..
System currently booted with uefi
D557-E076 is configured with: grub (versions: 6.14.8-1-pve, 6.8.12-12-pve)
D558-30AB is configured with: grub (versions: 6.14.8-1-pve, 6.8.12-12-pve)

That's the config of the nested PVE VM:

Code:
root@training4:~# qm config 202
agent: 0
bios: ovmf
boot: order=scsi0;scsi1
cores: 6
cpu: host
efidisk0: local-zfs:vm-202-disk-0,efitype=4m,pre-enrolled-keys=1,size=1M
ide2: local:iso/pve-b-autoinstall.iso,media=cdrom,size=1536128K
memory: 12000
meta: creation-qemu=9.0.0,ctime=1724752781
name: pve-2
net0: virtio=BC:24:11:7F:F9:09,bridge=pvea
net1: virtio=BC:24:11:BA:F4:3C,bridge=vmbr1,firewall=1
net2: virtio=BC:24:11:F8:B1:36,bridge=vmbr0,firewall=1
net3: virtio=BC:24:11:73:41:9A,bridge=vmbr0,firewall=1
net4: virtio=BC:24:11:6F:44:91,bridge=vmbr1,firewall=1
net5: virtio=BC:24:11:FA:CB:77,bridge=vmbr1,firewall=1
numa: 0
ostype: l26
parent: preupgrade
scsi0: local-zfs:vm-202-disk-1,discard=on,size=50G,ssd=1
scsi1: local-zfs:vm-202-disk-2,discard=on,size=50G,ssd=1
scsi2: local-zfs:vm-202-disk-3,discard=on,size=50G,ssd=1
scsi3: local-zfs:vm-202-disk-4,discard=on,size=50G,ssd=1
scsi4: local-zfs:vm-202-disk-5,discard=on,size=50G,ssd=1
scsi5: local-zfs:vm-202-disk-6,discard=on,size=50G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=c04c53e8-8428-493c-8d68-be72a27ddc32
sockets: 1
vmgenid: cde717ed-2356-46a1-8227-91b23ccdde29

I then checked proxmox-boot-tool and did the following steps (the working sequence is repeated as a code block after the list):

1. update-initramfs -u && proxmox-boot-tool refresh (reboot -> still broken)
2. proxmox-boot-tool init /dev/sda2 && proxmox-boot-tool init /dev/sdb2 && proxmox-boot-tool refresh (reboot -> pve9 is booting)
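
For reference, step 2 as a copy-paste sequence. This is just a sketch of what worked here; /dev/sda2 and /dev/sdb2 are the two ESPs of this ZFS-mirror VM, so adjust the device names to your own layout (check with lsblk or proxmox-boot-tool status).

Code:
proxmox-boot-tool init /dev/sda2
proxmox-boot-tool init /dev/sdb2
proxmox-boot-tool refresh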

Is there maybe something missing in the upgrade tutorial?
 
The nested installation is using ZFS, right?
After upgrading and rebooting, the VM does not boot anymore and goes to the VM BIOS.
Can you make it boot by selecting an entry there manually? Is there even an entry for proxmox in the boot loader menu?
 
The nested installation is using ZFS, right?

Can you make it boot by selecting an entry there manually? Is there even an entry for proxmox in the boot loader menu?

Yes, those VMs are using two disks in a ZFS mirror. Manually booting does not work; it just goes back to the BIOS again. This is how it looks. I also only see one Proxmox entry.

[Screenshot: VM boot menu showing only a single Proxmox boot entry]
 
Yes, those VMs are using two disks in a ZFS mirror. Manually booting does not work; it just goes back to the BIOS again. This is how it looks. I also only see one Proxmox entry.

And selecting that did not make it work?

Will try to reproduce; while I tried some nested upgrades, I'm not sure any of them included a ZFS mirror.
 
And selecting that did not make it work?

Will try to reproduce; while I tried some nested upgrades, I'm not sure any of them included a ZFS mirror.

No, unfortunately not. Thanks! On my side it only worked after running proxmox-boot-tool init /dev/sda2 && proxmox-boot-tool init /dev/sdb2 && proxmox-boot-tool refresh before rebooting.
 
Same with btrfs as the root filesystem.
I blacklist the ZFS modules by default, since I don't need ZFS on my home servers...

I'm on the Minisforum MS-02, which sadly doesn't have legacy boot / CSM.

Now I have to find out how I can fix the bootloader :-)
 
Okay, I have a solution; it was actually pretty easy (a consolidated shell sketch follows the list):

1. Download the Proxmox VE 9 ISO (or 8) and mount it either via your KVM/IPMI or make a USB drive
2. Boot from it and select "Install Proxmox VE (Graphical, Debug Mode)" -> press CTRL+D when it asks...
3. Check your partitions: lsblk -o NAME,PARTTYPE,SIZE,FSTYPE,MOUNTPOINT
4. Mount your root partition: mount /dev/nvmeXnXpX /mnt
5. Bind-mount dev/proc/sys/run into /mnt: for d in dev proc sys run; do mount --rbind /$d /mnt/$d; done
6. Switch into your mounted OS: chroot /mnt /bin/bash
7. You are in your Proxmox now; you can fix the boot here, either with grub or with proxmox-boot-tool init /dev/nvmeXnXpX, refresh, etc.
8. You're done, reboot.
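
The same procedure as one shell sequence, purely as a sketch: /dev/nvme0n1p3 (root) and /dev/nvme0n1p2 (ESP) are hypothetical device names standing in for the /dev/nvmeXnXpX placeholders above, so check lsblk first.

Code:
# from the installer's debug shell
lsblk -o NAME,PARTTYPE,SIZE,FSTYPE,MOUNTPOINT
mount /dev/nvme0n1p3 /mnt                      # root partition (hypothetical name)
for d in dev proc sys run; do mount --rbind /$d /mnt/$d; done
chroot /mnt /bin/bash
# run the remaining commands inside the chroot
proxmox-boot-tool init /dev/nvme0n1p2          # ESP (hypothetical name)
proxmox-boot-tool refresh
exit
reboot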

Cheers :-)
 
This should be fixed now with the latest repository state, with the proxmox-kernel-helper package in version 9.0.2.
Basically, a proxmox-boot-tool re-init was missing when the grub/shim update got pulled in by the major upgrade, causing some fallout in certain setups managed by proxmox-boot-tool.
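
To check whether that fixed version is already installed (just a generic package query, nothing Proxmox-specific):

Code:
dpkg -s proxmox-kernel-helper | grep ^Version
# should report 9.0.2 or newer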
 
This should be fixed now with the latest repository state, with the proxmox-kernel-helper package in version 9.0.2.
Basically, a proxmox-boot-tool re-init was missing when the grub/shim update got pulled in by the major upgrade, causing some fallout in certain setups managed by proxmox-boot-tool.
Hi Thomas, it's sadly still not fixed for btrfs root systems. I'm just upgrading all servers to PVE 9 and only those on btrfs fail.
Cheers
 
Hi Thomas, it's sadly still not fixed for btrfs root systems. I'm just upgrading all servers to PVE 9 and only those on btrfs fail.
Cheers
Do you remember when the breaking systems were initially set up? There were changes to how booting is handled for btrfs installs during PVE 8.0, so it would help to know which ones cause problems for you.

* Are those UEFI or legacy BIOS setups?
* RAID configuration or single-disk installs?

EDIT:
* Was/is Secure Boot enabled on the systems?

If possible, could you share the /var/log/apt/term.log (and history.log) of such a failing upgrade?
Thanks!
 
Do you remember when the breaking systems were initially set up? There were changes to how booting is handled for btrfs installs during PVE 8.0, so it would help to know which ones cause problems for you.

* Are those UEFI or legacy BIOS setups?
* RAID configuration or single-disk installs?

EDIT:
* Was/is Secure Boot enabled on the systems?

If possible, could you share the /var/log/apt/term.log (and history.log) of such a failing upgrade?
Thanks!
Sorry for the late reply,
yes, UEFI + Secure Boot.

I installed the server on 10 Oct 2023 at 16:09:27, with kernel 6.2.16-3, Debian 12.2.0-14 (that's info from the oldest boot log).
So it was the Proxmox 8.0 ISO.

There is maybe some broken text in the term.log -> that was the ncurses window that asks if you want to restart services during installation...
Thanks Stoiko :-)
 


I installed the server on 10 Oct 2023 at 16:09:27, with kernel 6.2.16-3, Debian 12.2.0-14 (that's info from the oldest boot log).
So it was the Proxmox 8.0 ISO.
Thanks - will try reproducing it with the 8.0 ISO - afaics it's a single disk system?

There is maybe some broken text in the term.log
Yeah, it's not the most straightforward to read - but it does contain the most information - and in this case it points us to the issue with your system:
Code:
This system is booted via proxmox-boot-tool, running proxmox-boot-tool init for all configured bootdisks
Re-executing '/usr/sbin/proxmox-boot-tool' in new private mount namespace..
UUID="956C-8A9B" SIZE="1073741824" FSTYPE="vfat" PARTTYPE="c12a7328-f81f-11d2-ba4b-00a0c93ec93b" PKNAME="sda" MOUNTPOINT="/boot/efi"
E: '/dev/disk/by-uuid/956C-8A9B' is mounted on '/boot/efi' - exiting.

So your system seems to have been switched to using proxmox-boot-tool at some point (the installer only changed to do so for BTRFS with the 8.4 ISO).
This would not be an issue if /dev/sda2 were not permanently mounted on /boot/efi - I'd remove the entry from fstab, unmount it, and run proxmox-boot-tool reinit.

I hope this helps!
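
Roughly, those three steps could look like this. Just a sketch: the UUID 956C-8A9B comes from the log snippet above, so verify your own fstab entry and device names before touching anything.

Code:
nano /etc/fstab              # comment out the /boot/efi line (UUID=956C-8A9B ... /boot/efi ...)
umount /boot/efi
proxmox-boot-tool reinit
proxmox-boot-tool refresh    # optionally re-sync kernels/bootloader config afterwards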
 
Thanks - will try reproducing it with the 8.0 ISO - afaics it's a single disk system?


Yeah, it's not the most straightforward to read - but it does contain the most information - and in this case it points us to the issue with your system:
Code:
This system is booted via proxmox-boot-tool, running proxmox-boot-tool init for all configured bootdisks
Re-executing '/usr/sbin/proxmox-boot-tool' in new private mount namespace..
UUID="956C-8A9B" SIZE="1073741824" FSTYPE="vfat" PARTTYPE="c12a7328-f81f-11d2-ba4b-00a0c93ec93b" PKNAME="sda" MOUNTPOINT="/boot/efi"
E: '/dev/disk/by-uuid/956C-8A9B' is mounted on '/boot/efi' - exiting.

So your system seems to have been switched to using proxmox-boot-tool at some point (the installer only changed to do so for BTRFS with the 8.4 ISO).
This would not be an issue if /dev/sda2 were not permanently mounted on /boot/efi - I'd remove the entry from fstab, unmount it, and run proxmox-boot-tool reinit.

I hope this helps!
It's a RAID 1 (2 disks).

You're absolutely right, I saw this message on other servers and simply unmounted /boot/efi...

The problem is just that on such big upgrades, where you upgrade 600 packages, you won't see the log and simply reboot
:)
And bam, you have to recover with the ISO xD

Is there any reason not to switch to proxmox-boot-tool for all filesystems etc.?
Then it would be an easy fix; in such cases you'd simply unmount /boot/efi.

Cheers
:)
 
The problem is just that on such big upgrades, where you upgrade 600 packages, you won't see the log and simply reboot
:)
And bam, you have to recover with the ISO xD
Yes - I can relate to that, it bit me while testing as well... We do try to add checks for the more common problematic configurations to the `pve8to9` script, to prevent some of the issues which get reported but drown in the large output of a dist-upgrade to a new major version (using proxmox-boot-tool while having the ESP mounted on /boot/efi is nothing I've seen too often until now)...
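
Two quick checks that help catch this kind of thing before rebooting; just a sketch (the --full flag mirrors the earlier pveXtoY upgrade checkers):

Code:
# scan the apt log of the dist-upgrade for boot-tool related errors
grep -iE 'proxmox-boot-tool|^E:' /var/log/apt/term.log | tail -n 40
# and re-run the upgrade checklist script
pve8to9 --full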

Is there any reason not to switch to proxmox-boot-tool for all filesystems etc.?
Then it would be an easy fix; in such cases you'd simply unmount /boot/efi.
We do change which setups use proxmox-boot-tool from time to time, and there are also quite a few things changing in the boot loaders we use (grub2 and systemd-boot currently).
We might change our ISO to use proxmox-boot-tool for all systems - but this would not help with issues that happen while upgrading older systems:
* setups created before the change would remain as they are (most could opt in, but for some very long-running systems the partition layout would require a reboot, which not everyone can or wants to do)
* installing on top of Debian would not use proxmox-boot-tool
Both things (installing on top of Debian, and systems you can upgrade across major version changes) are quite important IMHO.

I hope this explains it!