[SOLVED] Proxmox VE 8 to 9 on OVH bare metal servers

jf2021

Hi. After struggling a little to upgrade from Proxmox 8 to 9 on bare metal servers hosted by OVH, I thought I'd explain how I succeeded.

First of all, this is not a generic tutorial, this is just how I solved the problem on my 2 servers (one "So you Start SYS-LE-2" and one "ADVANCE-1 | AMD EPYC 4244P"). It might be different depending on your config, your server model, your partition table, RAID, etc., so use it wisely. Back up first and don't do this in production...

The servers have two 1 TB SSDs, use software RAID, and have an LVM volume for /var/lib/vz. No ZFS.

First, I upgraded them following the official guide (https://pve.proxmox.com/wiki/Upgrade_from_8_to_9).

- Back up everything
- Shut down all VMs
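For the shutdown step, here is a minimal sketch (not from the official guide, adapt to your setup; it assumes all guests are local and none are HA-managed):
Code:
# ask every local VM, then every container, to shut down cleanly
for vmid in $(qm list | awk 'NR>1 {print $1}'); do qm shutdown "$vmid"; done
for ctid in $(pct list | awk 'NR>1 {print $1}'); do pct shutdown "$ctid"; done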

Check everything was OK
Code:
pve8to9
pve8to9 --full

Update to the latest Proxmox 8
Code:
apt update && apt dist-upgrade

Check we are on 8.4.1 or newer
Code:
pveversion
pve-manager/8.4.11/14a32011146091ed (running kernel: 6.8.12-13-pve)

Change the repositories
Code:
sed -i 's/bookworm/trixie/g' /etc/apt/sources.list
sed -i 's/bookworm/trixie/g' /etc/apt/sources.list.d/*

Upgrade
Code:
apt update
apt dist-upgrade
During the process, review the configuration file differences when prompted and decide how to handle them

Reboot
Code:
reboot

Everything went smoothly during the upgrade, but I wasn't able to boot after that. The boot process failed and ended up in the BIOS of the server without any message in the console to help...

-----------------
[EDITED ON 2025/08/29]
PLEASE SKIP THE REST OF THIS MESSAGE AND JUMP TO THE NEXT POST by @sbraz: THE OVH TEAM PROVIDED A PROPER SOLUTION TO THIS PROBLEM

If you have already applied my solution below, you should check your EFI boot order with efibootmgr and restore PXE to 1st place if it isn't there.
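Something like this (the entry IDs are just examples, use the ones shown by your own efibootmgr output):
Code:
# list the current entries and the BootOrder
efibootmgr
# example only: if PXE/network boot is Boot0001 and proxmox is Boot0004,
# put PXE back in first place
efibootmgr -o 0001,0004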

-----------------


So I rebooted in rescue mode and decided to reinstall and reconfigure grub-efi-amd64. I'm not sure all those steps were necessary, but anyway, this made my servers boot again.

So once in rescue mode

Identify the partitions and mount points in order to mount the root filesystem
Code:
lsblk -f

In my case, it looked like this:
nvme0n1
├─nvme0n1p1 vfat FAT16 EFI_SYSPART
├─nvme0n1p2 linux_raid_member 1.2 md2
│ └─md2 ext4 1.0 boot
├─nvme0n1p3 linux_raid_member 1.2 md3
│ └─md3 ext4 1.0 root
├─nvme0n1p4 swap 1 swap-nvme0n1p4
├─nvme0n1p5 linux_raid_member 1.2 md5
│ └─md5 LVM2_member LVM2 001
│ └─vg-data ext4 1.0 var-lib-vz
├─nvme0n1p6 linux_raid_member 1.2 md6
│ └─md6 ext4 1.0 var-log
└─nvme0n1p7 iso9660 Joliet Extension config-2
nvme1n1
├─nvme1n1p1 vfat FAT16 EFI_SYSPART
├─nvme1n1p2 linux_raid_member 1.2 md2
│ └─md2 ext4 1.0 boot
├─nvme1n1p3 linux_raid_member 1.2 md3
│ └─md3 ext4 1.0 root
├─nvme1n1p4 swap 1 swap-nvme1n1p4
├─nvme1n1p5 linux_raid_member 1.2 md5
│ └─md5 LVM2_member LVM2 001
│ └─vg-data ext4 1.0 var-lib-vz
└─nvme1n1p6 linux_raid_member 1.2 md6
└─md6 ext4 1.0 var-log

Prepare /mnt if not present
Code:
mkdir -p /mnt

I mounted the boot and root partitions according to the partition table
Code:
mount /dev/md3 /mnt
mount /dev/md2 /mnt/boot

Mounting the 1st EFI partition (the one on the first SSD)
Code:
mount /dev/nvme0n1p1 /mnt/boot/efi

Preparing for chroot
Code:
mount --bind /dev /mnt/dev
mount --bind /proc /mnt/proc
mount --bind /sys /mnt/sys
mount --bind /dev/pts /mnt/dev/pts

Chrooting to reinstall grub
Code:
chroot /mnt /bin/bash

Reinstalling grub-efi-amd64 and shim-signed bootloader
Code:
apt update
apt install --reinstall grub-efi-amd64 shim-signed
grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=proxmox --recheck --no-floppy
update-grub
grub-install /dev/nvme0n1
grub-install /dev/nvme1n1

I had some warnings saying
warning: EFI variables cannot be set on this system.
warning: You will have to complete the GRUB setup manually.

So I mounted efivarfs to check the entries in the boot manager
Code:
mount -t efivarfs efivarfs /sys/firmware/efi/efivars

Check boot manager entries
Code:
efibootmgr -v

If the proxmox entry is not present:
Code:
efibootmgr --create --disk /dev/nvme0n1 --part 1 --label "proxmox" --loader '\EFI\proxmox\grubx64.efi'
Otherwise, check that it's correctly set up. If it isn't, delete and recreate the proxmox entry properly:
Code:
efibootmgr -b <ID_OF_ENTRY_TO_REMOVE> -B
efibootmgr --create --disk /dev/nvme0n1 --part 1 --label "proxmox" --loader '\EFI\proxmox\grubx64.efi'

Unmount efivarfs
Code:
umount /sys/firmware/efi/efivars

Exit chroot
Code:
exit

Unmounting 1st EFI partition
Code:
umount /mnt/boot/efi

Mounting the 2nd EFI partition: the EFI partitions are not in the RAID, so for consistency I installed GRUB on both. Not sure if it's necessary.
Code:
mount /dev/nvme1n1p1 /mnt/boot/efi

Chrooting to reinstall GRUB on the 2nd EFI partition
Code:
chroot /mnt /bin/bash
grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=proxmox --recheck --no-floppy

Exit chroot
Code:
exit

Unmounting all mounts
Code:
umount -R /mnt

Rebooting, fingers crossed...
Code:
reboot

And that did the trick for me
 
Hi,
I work for OVHcloud on our OS images and I can indeed reproduce the problem on UEFI boot servers with PVE 8 installed on more than one disk with OVHcloud's installer. I'll first provide the solution and then explain why this issue occurs.

You do not need to change the EFI boot order or reinstall GRUB. The following script will sync your EFI System Partitions (ESPs), which will fix the boot. You need to run it after the upgrade, either before rebooting from PVE 8, or from the rescue system if you have already performed the upgrade and are unable to boot to disk.

The fix:
Bash:
#!/bin/bash

set -euo pipefail

overall_newest_mtime=0
while read -r partition; do
    mountpoint=$(mktemp -d)
    mount "${partition}" "${mountpoint}"
    newest_mtime=$(find "${mountpoint}" -type f -printf "%T@\n" | cut -d. -f1 | sort -n | tail -n1)
    if [[ $newest_mtime -gt $overall_newest_mtime ]]; then
        overall_newest_mtime=${newest_mtime}
        newest_esp=${partition}
        echo "${partition} is currently the newest ESP with a file modified at $(date -d @"${newest_mtime}" -Is)"
    fi
    umount "${mountpoint}"
    rmdir "${mountpoint}"
done < <(blkid -o device -t LABEL=EFI_SYSPART)

newest_esp_mountpoint=$(mktemp -d)
mount "${newest_esp}" "${newest_esp_mountpoint}"
while read -r partition; do
    if [[ "${partition}" == "${newest_esp}" ]]; then
        continue
    fi
    echo "Copying data from ${newest_esp} to ${partition}"
    mountpoint=$(mktemp -d)
    mount "${partition}" "${mountpoint}"
    rsync -ax "${newest_esp_mountpoint}/" "${mountpoint}/"
    umount "${mountpoint}"
    rmdir "${mountpoint}"
done < <(blkid -o device -t LABEL=EFI_SYSPART)
umount "${newest_esp_mountpoint}"
rmdir "${newest_esp_mountpoint}"
echo "Done synchronizing ESPs"

Here's the output on a server with 6 disks:
Code:
root@rescue12-customer-eu (ns123.ip-1-2-3.eu) ~ # ./sync_esps.sh
/dev/nvme0n1p1 is currently the newest ESP with a file modified at 2025-08-28T19:06:16+00:00
/dev/nvme3n1p1 is currently the newest ESP with a file modified at 2025-08-28T19:16:52+00:00
Copying data from /dev/nvme3n1p1 to /dev/nvme0n1p1
Copying data from /dev/nvme3n1p1 to /dev/nvme2n1p1
Copying data from /dev/nvme3n1p1 to /dev/nvme5n1p1
Copying data from /dev/nvme3n1p1 to /dev/nvme1n1p1
Copying data from /dev/nvme3n1p1 to /dev/nvme4n1p1
Done synchronizing ESPs

Detailed explanation:

The issue stems from the fact that we install Proxmox VE 8 with multiple ESPs and that they are not kept in sync. When you perform the update, only one of these ESPs gets updated, the one currently mounted at /boot/efi. On Linux OSes installed by OVHcloud, ESPs all have the same EFI_SYSPART label but only one of them is mounted at once.

Output from a server booted to disk with PVE 8 installed:
Code:
# grep LABEL=EFI_SYSPART /etc/fstab
LABEL=EFI_SYSPART    /boot/efi    vfat    defaults    0    1
# blkid -t LABEL=EFI_SYSPART
/dev/nvme0n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="8542-85E8" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="aabc9c09-bda5-4be0-bf1a-63df23f4f9eb"
/dev/nvme3n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="8582-5073" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="e644b85d-c397-42f1-8ac9-cee99f6ada16"
/dev/nvme2n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="856C-BF68" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="58fc96f9-d0db-426c-842d-67bc73358a3e"
/dev/nvme5n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="85AB-CC15" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="0e99c478-263a-467b-8d51-98ff1fd921f5"
/dev/nvme1n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="8557-D623" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="49543325-1ba6-4525-b8e5-99621eeaef6e"
/dev/nvme4n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="8597-1476" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="12a99ce6-d2bb-491e-807b-3701eb1a073b"
# findmnt /boot/efi
TARGET    SOURCE         FSTYPE OPTIONS
/boot/efi /dev/nvme1n1p1 vfat   rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro

Now, when the server reboots, either iPXE or rEFInd attempts to find EFI applications on its ESPs. They select EFI/proxmox/grubx64.efi but, most of the time, it seems they don't pick the same ESP as Linux.

Therefore, because PVE 8 → PVE 9 upgraded GRUB from 2.06 to 2.12, the server ends up loading an old grubx64.efi which is unable to load the recent GRUB modules installed to /boot.
In the KVM, you'll see something like this:
[screenshot of the GRUB error in the KVM console]

There is a bit more detail in the SOL console, where the following message appears:
Code:
error: symbol `grub_efi_set_variable_to_string' not found.

To fix this, we simply need to make sure we have a recent grubx64.efi version on all ESPs. The script given at the beginning of the post picks the ESP with the most recent files and copies its contents to all the other ESPs. If you want to do this manually, you can just copy grubx64.efi; my tests show that it is the only file modified by the upgrade.
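For example, if you are still on the upgraded system before rebooting, something along these lines (the device name of the stale ESP is hypothetical, take it from blkid -t LABEL=EFI_SYSPART and findmnt /boot/efi on your own server):
Bash:
# /boot/efi is the ESP that was updated by the upgrade;
# /dev/nvme0n1p1 stands for one of the stale ESPs (adapt the name)
mkdir -p /mnt/esp-stale
mount /dev/nvme0n1p1 /mnt/esp-stale
cp /boot/efi/EFI/proxmox/grubx64.efi /mnt/esp-stale/EFI/proxmox/grubx64.efi
umount /mnt/esp-stale
rmdir /mnt/esp-stale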

To prevent this from occurring in the future, we plan on creating the ESP on an md RAID device. In the meantime, we recommend ensuring all your ESPs are kept in sync, either manually or with a script.
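One possible way to automate the "with a script" option (just a sketch; the file and script paths are assumptions) is an APT hook that re-runs the sync script after every package operation:
Code:
# /etc/apt/apt.conf.d/99-sync-esps (hypothetical file, assumes the sync script
# from the beginning of this post was saved as /usr/local/sbin/sync_esps.sh)
DPkg::Post-Invoke { "/usr/local/sbin/sync_esps.sh || true"; };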

One last thing: we do not recommend altering the EFI boot order or adding entries to it. Doing so might break PXE boot and prevent your server from booting into rescue and obtaining microcode updates via iPXE. At OVHcloud, all bare metal servers should follow this boot process:
  • PXE boot (this should always be the first entry in the boot order)
  • execute the iPXE boot loader
  • iPXE obtains its boot script from our APIs
  • iPXE updates CPU microcodes
  • Then, depending on the boot state:
    • If the customer set the boot state to "Boot to disk":
      • if the server's efiBootloaderPath property is set, iPXE boots to disk by executing the corresponding EFI application
      • If it is not set, iPXE starts rEFInd, which boots to disk by executing the first EFI application it detects
    • If the customer set the boot state to the rescue system, iPXE loads it
 
I can confirm that it fixes the boot of our Proxmox instance.

@sbraz I started to experience this after a simple "apt upgrade". Do we have to do this at each Proxmox upgrade that touches GRUB? Will there be a more permanent fix in the future?
 
Hi @autra,
What PVE version are you using? Did you see a major GRUB update from 2.06 to 2.12 recently? Check zgrep -w "upgrade grub" /var/log/dpkg.log*

Minor GRUB upgrades should not break compatibility between the EFI application in /boot/efi and the GRUB modules in /boot.

The proper fix to avoid this in the future is to put the ESP on top of a md RAID array. It's something we plan on doing in the future for new OS installations but I can try to write a tutorial to help you do it on an existing installation, would you like that?
 
@autra

Don't use apt upgrade on PVE.
Always use apt full-upgrade or apt dist-upgrade, otherwise some PVE packages might not be updated properly.
 
AFAIK, the BIOS can't read md RAID.
That's why there is the Proxmox Boot Tool, which syncs each ESP boot partition.
If the RAID is created with metadata version 0.90 or 1.0, the superblock is located at the end of the partition. This usually means that the BIOS treats the ESP over RAID as a normal FAT partition. We haven't yet been able to test this on all the hardware; that's one of the reasons we currently don't do this.
What is your board model out of curiosity? Maybe I have one I could test.
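For illustration only (example device names, and not necessarily the exact layout we will end up shipping), creating such an array over two existing ESP partitions would look roughly like this:
Code:
# WARNING: this wipes the contents of both partitions, back them up first
mdadm --create /dev/md0 --level=1 --metadata=1.0 --raid-devices=2 /dev/nvme0n1p1 /dev/nvme1n1p1
mkfs.vfat -n EFI_SYSPART /dev/md0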
 
Hi @autra,
What PVE version are you using? Did you see a major GRUB update from 2.06 to 2.12 recently? Check zgrep -w "upgrade grub" /var/log/dpkg.log*
I'm using 8.4.13.

In those dpkg.log, I have:
Code:
/var/log/dpkg.log:2025-09-12 10:30:32 upgrade grub-efi-amd64:amd64 2.06-13+pmx2 2.06-13+pmx7
/var/log/dpkg.log:2025-09-12 10:30:32 upgrade grub-efi-amd64-bin:amd64 2.06-13+pmx2 2.06-13+pmx7
/var/log/dpkg.log:2025-09-12 10:30:33 upgrade grub-common:amd64 2.06-13+pmx2 2.06-13+pmx7
/var/log/dpkg.log.8.gz:2025-01-14 18:44:54 upgrade grub-efi-amd64:amd64 2.06-3~deb11u6 2.06-13+pmx2
/var/log/dpkg.log.8.gz:2025-01-14 18:44:54 upgrade grub-efi-amd64-bin:amd64 2.06-3~deb11u6 2.06-13+pmx2
/var/log/dpkg.log.8.gz:2025-01-14 18:44:54 upgrade grub-common:amd64 2.06-3~deb11u6 2.06-13+pmx2
So no major upgrade for a while.
The proper fix to avoid this in the future is to put the ESP on top of a md RAID array. It's something we plan on doing in the future for new OS installations but I can try to write a tutorial to help you do it on an existing installation, would you like that?
That'd be great, yes, thanks!
 
Hi @autra,
I don't really understand what happened in your case. If you want to troubleshoot this further, you should compare the contents of all your ESPs.

As for the md RAID conversion, I ended up writing a script instead of a tutorial because I figured it could be more useful. I tested it on 70 newly-installed Proxmox VE 8 servers with various hardware specs and it works fine but I might have missed corner cases.

Here's the script, PROVIDED "AS IS" WITH NO GUARANTEE, USE AT YOUR OWN RISK, PLEASE ONLY RUN IT IF YOU UNDERSTAND WHAT IT DOES: https://gist.github.com/sbraz/0b58302a244cf1767b931b95226fefb8
  • After executing it, you need to add the new RAID array to mdadm.conf and rebuild the initramfs so that it contains the updated mdadm.conf file. Otherwise, the system will not recognize the array and it might be assembled as /dev/md127 at the next boot, which could be confusing.
  • You also need to ensure your fstab's entry for /boot/efi is valid with the new layout. If you didn't modify your fstab since the installation, the ESP should be mounted with LABEL=EFI_SYSPART, which will still work because the new FAT filesystem uses this same label.

Example run from Proxmox VE 8:
Code:
# Run the script itself
root@test:~# ./esp_over_raid.sh 
Unmounted /boot/efi
/dev/nvme0n1p1 is the newest ESP with a file modified at 2025-09-24T20:48:02+00:00
Copied newest ESP contents to /root/efi_system_partition_data/files/
Backed up /dev/nvme0n1p1 as /root/efi_system_partition_data/nvme0n1p1
Wiping signatures from /dev/nvme0n1p1
Wiped signatures from /dev/nvme0n1p1
Backed up /dev/nvme1n1p1 as /root/efi_system_partition_data/nvme1n1p1
Wiping signatures from /dev/nvme1n1p1
Wiped signatures from /dev/nvme1n1p1
Creating RAID1 device /dev/md0
mdadm: size set to 523200K
mdadm: array /dev/md0 started.
Creating FAT filesystem on /dev/md0
Copying ESP contents to /dev/md0 mounted on /boot/efi
The ESP is now over RAID1:
NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
md0           9:0    0 510.9M  0 raid1 /boot/efi
├─nvme1n1p1 259:1    0   511M  0 part  
│ └─nvme1n1 259:0    0 419.2G  0 disk  
└─nvme0n1p1 259:7    0   511M  0 part  
  └─nvme0n1 259:6    0 419.2G  0 disk  
Script completed successfully, please check the /boot/efi entry in /etc/fstab, update mdadm.conf and rebuild the initramfs
# Update mdadm.conf, for instance using Debian's mkconf script
root@test:~# cp /etc/mdadm/mdadm.conf .
root@test:~# /usr/share/mdadm/mkconf force-generate
root@test:~# diff -u -I '^#' mdadm.conf /etc/mdadm/mdadm.conf 
--- mdadm.conf    2025-09-24 21:39:10.113286805 +0000
+++ /etc/mdadm/mdadm.conf    2025-09-24 21:42:29.457819759 +0000
@@ -18,6 +18,7 @@
 MAILADDR root
 
 # definitions of existing MD arrays
+ARRAY /dev/md0 UUID=4845c9d4:caf2d945:e58f4aa9:a69bb1cc
 ARRAY /dev/md/md2  metadata=1.2 UUID=c1b98502:0c9041aa:52f8b1ff:654bf40e name=md2
 ARRAY /dev/md/md3  metadata=1.2 UUID=76f7a921:78a279e2:5fdf9df1:cb4e0b06 name=md3
 ARRAY /dev/md/md5  metadata=1.2 UUID=e0b7ff4f:ead83485:705fa843:4f2607bc name=md5
# Rebuild the initramfs with the updated mdadm.conf file
root@test:~# update-initramfs -u
update-initramfs: Generating /boot/initrd.img-6.8.12-15-pve
setupcon is missing. Please install the 'console-setup' package.
Running hook script 'zz-proxmox-boot'..
Re-executing '/etc/kernel/postinst.d/zz-proxmox-boot' in new private mount namespace..
No /etc/kernel/proxmox-boot-uuids found, skipping ESP sync.
# Make sure the fstab's /boot/efi entry still works
root@test:~# grep /boot/efi /etc/fstab
LABEL=EFI_SYSPART    /boot/efi    vfat    defaults    0    1
root@test:~# umount /boot/efi
root@test:~# mount /boot/efi
# Check that the server can reboot
root@test:~# reboot

Some notes about the script:
  • It can be executed either from Proxmox itself or from the rescue system. The only difference is that if you run it from a system where /boot/efi is not mounted, it doesn't remount it at the end. If you run it from the rescue, you will need to chroot into Proxmox to update mdadm.conf and rebuild the initramfs.
  • It should work on all Linux distributions; I tested it on AlmaLinux 10 and Rocky Linux 10. It only requires bash, util-linux, dosfstools and rsync.
  • It relies on the mtime of files in the ESP to guess which one is up-to-date. In some cases, this might cause it to pick the wrong ESP as source. If that happens, just reinstall GRUB with grub-install --no-nvram (the --no-nvram option prevents changes to the boot order; failure to specify it may cause the server to boot to disk directly, in which case it's no longer possible to boot into rescue); see the sketch after this list.
  • It relies heavily on the EFI_SYSPART label. Do not use this label for anything other than ESPs or it will break these partitions.
  • It backs up all ESPs so you can restore them later (for instance using dd) if you want to revert to the old setup with multiple individual ESPs. Please note that each ESP backup takes up roughly 511 MiB of space.
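For reference, the grub-install --no-nvram reinstall mentioned above would look roughly like this (a sketch, run from the booted system or from a chroot with /boot/efi mounted):
Code:
# reinstall the EFI binary without touching the firmware boot entries or boot order
grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=proxmox --no-nvram
update-grub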
Also:
  • We're working on applying a similar configuration for all new installations. We're still performing tests to ensure it works on all the hardware we offer, but we hope to make the switch in the coming weeks.
  • If you installed Proxmox before June 2024, your server's efiBootloaderPath attribute is likely unset. This means that rEFInd scans the ESP for .efi files at boot. This is not optimal, as rEFInd may pick files other than grubx64.efi; it's also quite slow. To speed things up and let iPXE load GRUB on its own, set efiBootloaderPath to \efi\proxmox\grubx64.efi (case-insensitive) using this API route. Don't forget to escape backslashes in the JSON payload, it should look like this:
    JSON:
    {
      "efiBootloaderPath": "\\efi\\proxmox\\grubx64.efi"
    }
 