[SOLVED] Proxmox VE 8 to 9 on OVH bare metal servers

jf2021

Hi. After struggling a little to upgrade from Proxmox 8 to 9 on bare metal servers hosted by OVH, I thought I'd explain how I succeeded.

First of all, this is not a generic tutorial, this is just how I solved the problem on my 2 servers (one "So you Start SYS-LE-2" and one "ADVANCE-1 | AMD EPYC 4244P"). It might be different depending on your config, your server model, your partition table, RAID, etc., so use it wisely. Back up first and don't do this in production...

The servers have two 1 TB SSDs, use software RAID, and have an LVM volume for /var/lib/vz. No ZFS.

First, I upgraded them following the official guide (https://pve.proxmox.com/wiki/Upgrade_from_8_to_9).

- Back up everything
- Shut down all VMs
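For the shutdown step, here is a minimal sketch (not from the official guide, adapt to your setup; it assumes all guests are local and none are HA-managed):
Code:
# ask every local VM, then every container, to shut down cleanly
for vmid in $(qm list | awk 'NR>1 {print $1}'); do qm shutdown "$vmid"; done
for ctid in $(pct list | awk 'NR>1 {print $1}'); do pct shutdown "$ctid"; done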

Check everything was OK
Code:
pve8to9
pve8to9 --full

Update to the latest Proxmox 8
Code:
apt update && apt dist-upgrade

Check we are on 8.4.1 or newer
Code:
pveversion
pve-manager/8.4.11/14a32011146091ed (running kernel: 6.8.12-13-pve)

Change the repositories
Code:
sed -i 's/bookworm/trixie/g' /etc/apt/sources.list
sed -i 's/bookworm/trixie/g' /etc/apt/sources.list.d/*

Upgrade
Code:
apt update
apt dist-upgrade
During the process, review the configuration file differences when prompted and decide how to handle them

Reboot
Code:
reboot

Everything went smoothly during the upgrade, but I wasn't able to boot after that. The boot process failed and ended up in the BIOS of the server without any message in the console to help...

-----------------
[EDITED ON 2025/08/29]
PLEASE SKIP THE REST OF THIS MESSAGE AND JUMP TO THE NEXT POST by @sbraz: THE OVH TEAM PROVIDED A PROPER SOLUTION TO THIS PROBLEM

If you have already applied my solution below, you should check your EFI boot order with efibootmgr and restore PXE to 1st place if it isn't there.
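Something like this (the entry IDs are just examples, use the ones shown by your own efibootmgr output):
Code:
# list the current entries and the BootOrder
efibootmgr
# example only: if PXE/network boot is Boot0001 and proxmox is Boot0004,
# put PXE back in first place
efibootmgr -o 0001,0004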

-----------------


So I rebooted in rescue mode and decided to reinstall and reconfigure grub-efi-amd64. I'm not sure all those steps were necessary, but anyway, this made my servers boot again.

So once in rescue mode

Identify the partitions and mount points in order to mount the root filesystem
Code:
lsblk -f

In my case, it looked like this:
nvme0n1
├─nvme0n1p1 vfat FAT16 EFI_SYSPART
├─nvme0n1p2 linux_raid_member 1.2 md2
│ └─md2 ext4 1.0 boot
├─nvme0n1p3 linux_raid_member 1.2 md3
│ └─md3 ext4 1.0 root
├─nvme0n1p4 swap 1 swap-nvme0n1p4
├─nvme0n1p5 linux_raid_member 1.2 md5
│ └─md5 LVM2_member LVM2 001
│ └─vg-data ext4 1.0 var-lib-vz
├─nvme0n1p6 linux_raid_member 1.2 md6
│ └─md6 ext4 1.0 var-log
└─nvme0n1p7 iso9660 Joliet Extension config-2
nvme1n1
├─nvme1n1p1 vfat FAT16 EFI_SYSPART
├─nvme1n1p2 linux_raid_member 1.2 md2
│ └─md2 ext4 1.0 boot
├─nvme1n1p3 linux_raid_member 1.2 md3
│ └─md3 ext4 1.0 root
├─nvme1n1p4 swap 1 swap-nvme1n1p4
├─nvme1n1p5 linux_raid_member 1.2 md5
│ └─md5 LVM2_member LVM2 001
│ └─vg-data ext4 1.0 var-lib-vz
└─nvme1n1p6 linux_raid_member 1.2 md6
└─md6 ext4 1.0 var-log

Prepare /mnt if not present
Code:
mkdir -p /mnt

I mounted the boot and root partitions according to the partition table
Code:
mount /dev/md3 /mnt
mount /dev/md2 /mnt/boot

Mounting the 1st EFI partition (the one on the first SSD)
Code:
mount /dev/nvme0n1p1 /mnt/boot/efi

Preparing for chroot
Code:
mount --bind /dev /mnt/dev
mount --bind /proc /mnt/proc
mount --bind /sys /mnt/sys
mount --bind /dev/pts /mnt/dev/pts

Chrooting to reinstall grub
Code:
chroot /mnt /bin/bash

Reinstalling grub-efi-amd64 and shim-signed bootloader
Code:
apt update
apt install --reinstall grub-efi-amd64 shim-signed
grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=proxmox --recheck --no-floppy
update-grub
grub-install /dev/nvme0n1
grub-install /dev/nvme1n1

I had some warnings saying
warning: EFI variables cannot be set on this system.
warning: You will have to complete the GRUB setup manually.

So I mounted efivarfs to check the entries in the boot manager
Code:
mount -t efivarfs efivarfs /sys/firmware/efi/efivars

Check boot manager entries
Code:
efibootmgr -v

If the proxmox entry is not present:
Code:
efibootmgr --create --disk /dev/nvme0n1 --part 1 --label "proxmox" --loader '\EFI\proxmox\grubx64.efi'
Otherwise, check that it's correctly set up. If it isn't, delete and recreate the proxmox entry properly:
Code:
efibootmgr -b <ID_OF_ENTRY_TO_REMOVE> -B
efibootmgr --create --disk /dev/nvme0n1 --part 1 --label "proxmox" --loader '\EFI\proxmox\grubx64.efi'

Unmount efivarfs
Code:
umount /sys/firmware/efi/efivars

Exit chroot
Code:
exit

Unmounting 1st EFI partition
Code:
umount /mnt/boot/efi

Mounting the 2nd EFI partition: the EFI partitions are not in the RAID, so for consistency I installed GRUB on both. Not sure if it's necessary.
Code:
mount /dev/nvme1n1p1 /mnt/boot/efi

Chrooting to reinstall GRUB on the 2nd EFI partition
Code:
chroot /mnt /bin/bash
grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=proxmox --recheck --no-floppy

Exit chroot
Code:
exit

Unmounting all mounts
Code:
umount -R /mnt

Rebooting, fingers crossed...
Code:
reboot

And that did the trick for me
 
Hi,
I work for OVHcloud on our OS images and I can indeed reproduce the problem on UEFI boot servers with PVE 8 installed on more than one disk with OVHcloud's installer. I'll first provide the solution and then explain why this issue occurs.

You do not need to change the EFI boot order or reinstall GRUB. The following script will sync your EFI System Partitions (ESPs), which will fix the boot. You need to run it after the upgrade, either before rebooting from PVE 8, or from the rescue system if you have already performed the upgrade and are unable to boot to disk.

The fix:
Bash:
#!/bin/bash

set -euo pipefail

overall_newest_mtime=0
while read -r partition; do
    mountpoint=$(mktemp -d)
    mount "${partition}" "${mountpoint}"
    newest_mtime=$(find "${mountpoint}" -type f -printf "%T@\n" | cut -d. -f1 | sort -n | tail -n1)
    if [[ $newest_mtime -gt $overall_newest_mtime ]]; then
        overall_newest_mtime=${newest_mtime}
        newest_esp=${partition}
        echo "${partition} is currently the newest ESP with a file modified at $(date -d @"${newest_mtime}" -Is)"
    fi
    umount "${mountpoint}"
    rmdir "${mountpoint}"
done < <(blkid -o device -t LABEL=EFI_SYSPART)

newest_esp_mountpoint=$(mktemp -d)
mount "${newest_esp}" "${newest_esp_mountpoint}"
while read -r partition; do
    if [[ "${partition}" == "${newest_esp}" ]]; then
        continue
    fi
    echo "Copying data from ${newest_esp} to ${partition}"
    mountpoint=$(mktemp -d)
    mount "${partition}" "${mountpoint}"
    rsync -ax "${newest_esp_mountpoint}/" "${mountpoint}/"
    umount "${mountpoint}"
    rmdir "${mountpoint}"
done < <(blkid -o device -t LABEL=EFI_SYSPART)
umount "${newest_esp_mountpoint}"
rmdir "${newest_esp_mountpoint}"
echo "Done synchronizing ESPs"

Here's the output on a server with 6 disks:
Code:
root@rescue12-customer-eu (ns123.ip-1-2-3.eu) ~ # ./sync_esps.sh
/dev/nvme0n1p1 is currently the newest ESP with a file modified at 2025-08-28T19:06:16+00:00
/dev/nvme3n1p1 is currently the newest ESP with a file modified at 2025-08-28T19:16:52+00:00
Copying data from /dev/nvme3n1p1 to /dev/nvme0n1p1
Copying data from /dev/nvme3n1p1 to /dev/nvme2n1p1
Copying data from /dev/nvme3n1p1 to /dev/nvme5n1p1
Copying data from /dev/nvme3n1p1 to /dev/nvme1n1p1
Copying data from /dev/nvme3n1p1 to /dev/nvme4n1p1
Done synchronizing ESPs

Detailed explanation:

The issue stems from the fact that we install Proxmox VE 8 with multiple ESPs and that they are not kept in sync. When you perform the update, only one of these ESPs gets updated, the one currently mounted at /boot/efi. On Linux OSes installed by OVHcloud, ESPs all have the same EFI_SYSPART label but only one of them is mounted at once.

Output from a server booted to disk with PVE 8 installed:
Code:
# grep LABEL=EFI_SYSPART /etc/fstab
LABEL=EFI_SYSPART    /boot/efi    vfat    defaults    0    1
# blkid -t LABEL=EFI_SYSPART
/dev/nvme0n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="8542-85E8" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="aabc9c09-bda5-4be0-bf1a-63df23f4f9eb"
/dev/nvme3n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="8582-5073" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="e644b85d-c397-42f1-8ac9-cee99f6ada16"
/dev/nvme2n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="856C-BF68" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="58fc96f9-d0db-426c-842d-67bc73358a3e"
/dev/nvme5n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="85AB-CC15" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="0e99c478-263a-467b-8d51-98ff1fd921f5"
/dev/nvme1n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="8557-D623" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="49543325-1ba6-4525-b8e5-99621eeaef6e"
/dev/nvme4n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="8597-1476" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="12a99ce6-d2bb-491e-807b-3701eb1a073b"
# findmnt /boot/efi
TARGET    SOURCE         FSTYPE OPTIONS
/boot/efi /dev/nvme1n1p1 vfat   rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro

Now, when the server reboots, either iPXE or rEFInd attempts to find EFI applications on its ESPs. They select EFI/proxmox/grubx64.efi but, most of the time, it seems they don't pick the same ESP as Linux.

Therefore, because PVE 8 → PVE 9 upgraded GRUB from 2.06 to 2.12, the server ends up loading an old grubx64.efi which is unable to load the recent GRUB modules installed to /boot.
In the KVM, you'll see something like this:
[screenshot of the GRUB error in the KVM console]

There is a bit more detail in the SOL console, where the following message appears:
Code:
error: symbol `grub_efi_set_variable_to_string' not found.

To fix this, we simply need to make sure we have a recent grubx64.efi version on all ESPs. The script given at the beginning of the post picks the ESP with the most recent files and copies its contents to all the other ESPs. If you want to do this manually, you can just copy grubx64.efi; my tests show that it is the only file modified by the upgrade.
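For example, if you are still on the upgraded system before rebooting, something along these lines (the device name of the stale ESP is hypothetical, take it from blkid -t LABEL=EFI_SYSPART and findmnt /boot/efi on your own server):
Bash:
# /boot/efi is the ESP that was updated by the upgrade;
# /dev/nvme0n1p1 stands for one of the stale ESPs (adapt the name)
mkdir -p /mnt/esp-stale
mount /dev/nvme0n1p1 /mnt/esp-stale
cp /boot/efi/EFI/proxmox/grubx64.efi /mnt/esp-stale/EFI/proxmox/grubx64.efi
umount /mnt/esp-stale
rmdir /mnt/esp-stale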

To prevent this from occurring in the future, we plan on creating the ESP on an md RAID device. In the meantime, we recommend ensuring all your ESPs are kept in sync, either manually or with a script.
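One possible way to automate the "with a script" option (just a sketch; the file and script paths are assumptions) is an APT hook that re-runs the sync script after every package operation:
Code:
# /etc/apt/apt.conf.d/99-sync-esps (hypothetical file, assumes the sync script
# from the beginning of this post was saved as /usr/local/sbin/sync_esps.sh)
DPkg::Post-Invoke { "/usr/local/sbin/sync_esps.sh || true"; };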

One last thing: we do not recommend altering the EFI boot order or adding entries to it. Doing so might break PXE boot and prevent your server from booting into rescue and obtaining microcode updates via iPXE. At OVHcloud, all bare metal servers should follow this boot process:
  • PXE boot (this should always be the first entry in the boot order)
  • execute the iPXE boot loader
  • iPXE obtains its boot script from our APIs
  • iPXE updates CPU microcodes
  • Then, depending on the boot state:
    • If the customer set the boot state to "Boot to disk":
      • if the server's efiBootloaderPath property is set, iPXE boots to disk by executing the corresponding EFI application
      • If it is not set, iPXE starts rEFInd, which boots to disk by executing the first EFI application it detects
    • If the customer set the boot state to the rescue system, iPXE loads it
 
I can confirm that it fixes the boot of our Proxmox instance.

@sbraz I started to experience this after a simple "apt upgrade". Do we have to do this at each Proxmox upgrade that touches GRUB? Will there be a more permanent fix in the future?
 
Hi @autra,
What PVE version are you using? Did you see a major GRUB update from 2.06 to 2.12 recently? Check zgrep -w "upgrade grub" /var/log/dpkg.log*

Minor GRUB upgrades should not break compatibility between the EFI application in /boot/efi and the GRUB modules in /boot.

The proper fix to avoid this in the future is to put the ESP on top of a md RAID array. It's something we plan on doing in the future for new OS installations but I can try to write a tutorial to help you do it on an existing installation, would you like that?
 
@autra

Don't use apt upgrade on PVE.
Always use apt full-upgrade or apt dist-upgrade, otherwise some PVE packages might not be updated properly.
 
AFAIK, the BIOS can't read md RAID.
That's why there is the Proxmox Boot Tool, which syncs each ESP boot partition.
If the RAID is created with metadata version 0.90 or 1.0, the superblock is located at the end of the partition. This usually means that the BIOS treats the ESP over RAID as a normal FAT partition. We haven't yet been able to test this on all the hardware; that's one of the reasons we currently don't do this.
What is your board model out of curiosity? Maybe I have one I could test.
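For illustration only (example device names, and not necessarily the exact layout we will end up shipping), creating such an array over two existing ESP partitions would look roughly like this:
Code:
# WARNING: this wipes the contents of both partitions, back them up first
mdadm --create /dev/md0 --level=1 --metadata=1.0 --raid-devices=2 /dev/nvme0n1p1 /dev/nvme1n1p1
mkfs.vfat -n EFI_SYSPART /dev/md0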
 
Hi @autra,
What PVE version are you using? Did you see a major GRUB update from 2.06 to 2.12 recently? Check zgrep -w "upgrade grub" /var/log/dpkg.log*
I'm using 8.4.13.

In those dpkg.log, I have:
Code:
/var/log/dpkg.log:2025-09-12 10:30:32 upgrade grub-efi-amd64:amd64 2.06-13+pmx2 2.06-13+pmx7
/var/log/dpkg.log:2025-09-12 10:30:32 upgrade grub-efi-amd64-bin:amd64 2.06-13+pmx2 2.06-13+pmx7
/var/log/dpkg.log:2025-09-12 10:30:33 upgrade grub-common:amd64 2.06-13+pmx2 2.06-13+pmx7
/var/log/dpkg.log.8.gz:2025-01-14 18:44:54 upgrade grub-efi-amd64:amd64 2.06-3~deb11u6 2.06-13+pmx2
/var/log/dpkg.log.8.gz:2025-01-14 18:44:54 upgrade grub-efi-amd64-bin:amd64 2.06-3~deb11u6 2.06-13+pmx2
/var/log/dpkg.log.8.gz:2025-01-14 18:44:54 upgrade grub-common:amd64 2.06-3~deb11u6 2.06-13+pmx2
So no major upgrade for a while.
The proper fix to avoid this in the future is to put the ESP on top of a md RAID array. It's something we plan on doing in the future for new OS installations but I can try to write a tutorial to help you do it on an existing installation, would you like that?
That'd be great, yes, thanks!
 
Hi @autra,
I don't really understand what happened in your case. If you want to troubleshoot this further, you should compare the contents of all your ESPs.

As for the md RAID conversion, I ended up writing a script instead of a tutorial because I figured it could be more useful. I tested it on 70 newly-installed Proxmox VE 8 servers with various hardware specs and it works fine but I might have missed corner cases.

Here's the script, PROVIDED "AS IS" WITH NO GUARANTEE, USE AT YOUR OWN RISK, PLEASE ONLY RUN IT IF YOU UNDERSTAND WHAT IT DOES: https://gist.github.com/sbraz/0b58302a244cf1767b931b95226fefb8
  • After executing it, you need to add the new RAID array to mdadm.conf and rebuild the initramfs so that it contains the updated mdadm.conf file. Otherwise, the system will not recognize the array and it might be assembled as /dev/md127 at the next boot, which could be confusing.
  • You also need to ensure your fstab's entry for /boot/efi is valid with the new layout. If you didn't modify your fstab since the installation, the ESP should be mounted with LABEL=EFI_SYSPART, which will still work because the new FAT filesystem uses this same label.

Example run from Proxmox VE 8:
Code:
# Run the script itself
root@test:~# ./esp_over_raid.sh 
Unmounted /boot/efi
/dev/nvme0n1p1 is the newest ESP with a file modified at 2025-09-24T20:48:02+00:00
Copied newest ESP contents to /root/efi_system_partition_data/files/
Backed up /dev/nvme0n1p1 as /root/efi_system_partition_data/nvme0n1p1
Wiping signatures from /dev/nvme0n1p1
Wiped signatures from /dev/nvme0n1p1
Backed up /dev/nvme1n1p1 as /root/efi_system_partition_data/nvme1n1p1
Wiping signatures from /dev/nvme1n1p1
Wiped signatures from /dev/nvme1n1p1
Creating RAID1 device /dev/md0
mdadm: size set to 523200K
mdadm: array /dev/md0 started.
Creating FAT filesystem on /dev/md0
Copying ESP contents to /dev/md0 mounted on /boot/efi
The ESP is now over RAID1:
NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
md0           9:0    0 510.9M  0 raid1 /boot/efi
├─nvme1n1p1 259:1    0   511M  0 part  
│ └─nvme1n1 259:0    0 419.2G  0 disk  
└─nvme0n1p1 259:7    0   511M  0 part  
  └─nvme0n1 259:6    0 419.2G  0 disk  
Script completed successfully, please check the /boot/efi entry in /etc/fstab, update mdadm.conf and rebuild the initramfs
# Update mdadm.conf, for instance using Debian's mkconf script
root@test:~# cp /etc/mdadm/mdadm.conf .
root@test:~# /usr/share/mdadm/mkconf force-generate
root@test:~# diff -u -I '^#' mdadm.conf /etc/mdadm/mdadm.conf 
--- mdadm.conf    2025-09-24 21:39:10.113286805 +0000
+++ /etc/mdadm/mdadm.conf    2025-09-24 21:42:29.457819759 +0000
@@ -18,6 +18,7 @@
 MAILADDR root
 
 # definitions of existing MD arrays
+ARRAY /dev/md0 UUID=4845c9d4:caf2d945:e58f4aa9:a69bb1cc
 ARRAY /dev/md/md2  metadata=1.2 UUID=c1b98502:0c9041aa:52f8b1ff:654bf40e name=md2
 ARRAY /dev/md/md3  metadata=1.2 UUID=76f7a921:78a279e2:5fdf9df1:cb4e0b06 name=md3
 ARRAY /dev/md/md5  metadata=1.2 UUID=e0b7ff4f:ead83485:705fa843:4f2607bc name=md5
# Rebuild the initramfs with the updated mdadm.conf file
root@test:~# update-initramfs -u
update-initramfs: Generating /boot/initrd.img-6.8.12-15-pve
setupcon is missing. Please install the 'console-setup' package.
Running hook script 'zz-proxmox-boot'..
Re-executing '/etc/kernel/postinst.d/zz-proxmox-boot' in new private mount namespace..
No /etc/kernel/proxmox-boot-uuids found, skipping ESP sync.
# Make sure the fstab's /boot/efi entry still works
root@test:~# grep /boot/efi /etc/fstab
LABEL=EFI_SYSPART    /boot/efi    vfat    defaults    0    1
root@test:~# umount /boot/efi
root@test:~# mount /boot/efi
# Check that the server can reboot
root@test:~# reboot

Some notes about the script:
  • It can be executed either from Proxmox itself or from the rescue system. The only difference is that if you run it from a system where /boot/efi is not mounted, it doesn't remount it at the end. If you run it from the rescue, you will need to chroot into Proxmox to update mdadm.conf and rebuild the initramfs.
  • It should work on all Linux distributions; I tested it on AlmaLinux 10 and Rocky Linux 10. It only requires bash, util-linux, dosfstools and rsync.
  • It relies on the mtime of files in the ESP to guess which one is up-to-date. In some cases, this might cause it to pick the wrong ESP as source. If that happens, just reinstall GRUB with grub-install --no-nvram (the --no-nvram option prevents changes to the boot order; failure to specify it may cause the server to boot to disk directly, in which case it's no longer possible to boot into rescue); see the sketch after this list.
  • It relies heavily on the EFI_SYSPART label. Do not use this label for anything other than ESPs or it will break these partitions.
  • It backs up all ESPs so you can restore them later (for instance using dd) if you want to revert to the old setup with multiple individual ESPs. Please note that each ESP backup takes up roughly 511 MiB of space.
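For reference, the grub-install --no-nvram reinstall mentioned above would look roughly like this (a sketch, run from the booted system or from a chroot with /boot/efi mounted):
Code:
# reinstall the EFI binary without touching the firmware boot entries or boot order
grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=proxmox --no-nvram
update-grub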
Also:
  • We're working on applying a similar configuration for all new installations. We're still performing tests to ensure it works on all the hardware we offer, but we hope to make the switch in the coming weeks.
  • If you installed Proxmox before June 2024, your server's efiBootloaderPath attribute is likely unset. This means that rEFInd scans the ESP for .efi files at boot. This is not optimal, as rEFInd may pick files other than grubx64.efi; it's also quite slow. To speed things up and let iPXE load GRUB on its own, set efiBootloaderPath to \efi\proxmox\grubx64.efi (case-insensitive) using this API route. Don't forget to escape backslashes in the JSON payload, it should look like this:
    JSON:
    {
      "efiBootloaderPath": "\\efi\\proxmox\\grubx64.efi"
    }
 