[SOLVED] PVE 8 to 9 ... "Welcome to GRUB" then reboot (BTRFS)

ptyork

Member
Mar 23, 2023
I did an 8-9 upgrade on my "test" server and all went swimmingly. So I was confident in upgrading my "production" box.

The upgrade itself seemed to go fine. No errors. So I rebooted. And since then it's been completely down. Essentially I turn the server on and it posts. Then I briefly see the "Welcome to GRUB" message in the top right before the machine just reboots and eventually lands me in my BIOS screen.

I've tried using Super Grub2 to discover bootable options. I can see all options (the core Proxmox entries and the numerous kernel choices), but ALL of them fail to boot with an "Out of Memory" error. Briefly scanning the GRUB configs for them doesn't reveal anything obvious.

I was also able to boot up the system using System Rescue. I tried just doing an update-grub from there, but it's not making a difference. I removed the BIOS battery, cleared the BIOS, and updated the BIOS, all on the theory that it might be something with Secure Boot. I don't think it is.

It may just need some kind of bootloader reinstall, but I'm not smart or experienced enough to debug this. I'm likely to make things worse, so I'm hoping for a little help.

Hardware relevant to the server
- Intel i7-13700K + Asus Z790 motherboard
- Pair of 1TB SATA SSDs in BTRFS mirror as PVE boot drive
- 4x32GB DDR5

Yes I have backups, but a) they are a little out of date and b) I'd prefer to gain experience by fixing rather than nuking and reinstalling.

Thanks!
 
hm - there were a few posts about mismatched grub stages after an upgrade ...
I'd check if you can look through the ESP (EFI System Partition) and try the proxmox/shim, proxmox/grub or BOOTX64.efi entries on it. Most UEFIs have an option to select these (maybe that's what Super Grub2 does as well, but I'm not familiar with it).

If this doesn't work, orient yourself on:
https://pve.proxmox.com/wiki/Recover_From_Grub_Failure
(you'll need to adapt a few paths as the article is for LVM installs with ext4/xfs)
in the chroot try:
* proxmox-boot-tool status
If this tells you that proxmox-boot-tool is used for booting:
* proxmox-boot-tool reinit
* proxmox-boot-tool refresh
If you're not using proxmox-boot-tool (that depends a bit on when your system was originally set up):
* `grub-install /dev/XXX` (where XXX is the disk you're booting from)

should get the system in a bootable state.

If the system boots up successfully - please still share /var/log/apt/term.log (and history.log) from the upgrade - maybe we can find a common pattern and improve the upgrade guide/check script for users who upgrade in the future
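
To narrow down a long term.log before sharing it, something like the following could help. It is only a sketch: `scan_apt_log` is a hypothetical helper name, and the search patterns are assumptions about what a bootloader-related failure might look like in the log, so adjust them as needed.

```shell
#!/bin/sh
# Print (with line numbers) the lines of an apt log that look like
# bootloader failures. The patterns are illustrative assumptions.
scan_apt_log() {
    log_file="$1"
    grep -nE 'grub-install.*error|Failed: grub-install|Bootloader is not properly installed' "$log_file"
}

# Example call on the log mentioned above:
# scan_apt_log /var/log/apt/term.log
```

grep exits non-zero when nothing matches, so the function can also be used in an `if` to test whether the log contains any such line.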

Thanks!
 
Thanks Stoiko. So I'm partway there but there are some differences that I need clarity on (and ones that might help diagnose the issue).

So I have two drives, sda and sdb. sda3/sdb3 are the BTRFS mirror. The others are boot-related.

sda1/sdb1 is "BIOS Boot"
sda2/sdb2 is "EFI System"

So I'm ignoring partition 1.
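
If in doubt about which partition is which, the GPT partition type GUID can confirm it independently of the partition number. A minimal sketch, assuming input in the form produced by `lsblk -rno NAME,PARTTYPE`; `classify_partitions` is a hypothetical helper name:

```shell
#!/bin/sh
# Read "NAME PARTTYPE" lines on stdin and label each partition by its
# GPT type GUID. Lines without a type (e.g. the whole disk) are skipped.
classify_partitions() {
    awk 'NF >= 2 {
        guid = tolower($2)
        if (guid == "c12a7328-f81f-11d2-ba4b-00a0c93ec93b") role = "EFI System"
        else if (guid == "21686148-6449-6e6f-744e-656564454649") role = "BIOS boot"
        else role = "other"
        print $1, role
    }'
}

# Example call from a rescue shell:
# lsblk -rno NAME,PARTTYPE /dev/sda | classify_partitions
```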

The EFI partition has two directories: EFI/ and grub/. grub/ is empty. Only EFI has content:

/boot
  /EFI
    /BOOT
      BOOTx64.EFI
    /proxmox
      grubx64.EFI

There is no /efi (lowercase) directory as indicated in the wiki.

First, which should I run? `grub-install /dev/sda` (the disk) or `grub-install /dev/sda2` (the partition)? Or is the tool smart enough to find the right partition?

Second, regardless of which I run, I get:

Installing for x86_64-efi platform.
grub-install.real: error: cannot find EFI directory

I can only assume that this is because /boot/efi/EFI is obviously not the same as /boot/EFI.

So...should the EFI/boot drive be mounted differently? Or perhaps there's a need to manually create that directory structure? Or am I doing something grossly wrong?

Thanks!
 
I guess you have mounted your btrfs (sda3, sdb3) at a directory (for the remainder, assume /target).
* check if you have /etc/kernel/proxmox-boot-uuids (this is the check whether proxmox-boot-tool is used) and that its contents match the UUIDs of the ESPs (sda2, sdb2)
* if you have it - mount the proc, sys, and other filesystems mentioned in the wiki and run the proxmox-boot-tool commands I mentioned above (after chrooting)
* if you do not have it - mount /dev/sda2 at /target/boot/efi (plus proc, sys, ...), chroot, and run grub-install /dev/sda

if this does not match please post the output of mount and blkid from the rescue system
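
The decision in the list above can be sketched as a small dry-run helper. It only prints which commands from this thread apply and runs nothing. `suggest_boot_repair` is a hypothetical name, and the check is deliberately simplified: it only tests that the file exists and is non-empty, it does not compare the UUIDs against blkid.

```shell
#!/bin/sh
# Print the repair commands to run inside the chroot, depending on whether
# the installed system has a non-empty proxmox-boot-uuids file.
suggest_boot_repair() {
    target="$1"   # where the installed root filesystem is mounted, e.g. /target
    disk="$2"     # boot disk for the grub-install case, e.g. /dev/sda
    if [ -s "$target/etc/kernel/proxmox-boot-uuids" ]; then
        echo "proxmox-boot-tool reinit"
        echo "proxmox-boot-tool refresh"
    else
        echo "grub-install $disk"
    fi
}

# Example call from the rescue system:
# suggest_boot_repair /target /dev/sda
```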
 
Just a typo in the wiki. Said to mount the EFI partition to /boot instead of /boot/efi and I'm not smart enough to have caught it.

This is an old install so I do not use the proxmox-boot-tool, unfortunately.

I ran the grub-install semi-successfully. I got warnings about needing to manually set some configuration options.

Installing for x86_64-efi platform.
grub-install.real: warning: EFI variables cannot be set on this system.
grub-install.real: warning: You will have to complete the GRUB setup manually.
Installation finished. No error reported.

But I tried rebooting and I did at least get to a `grub rescue` prompt. It's now saying that "symbol 'grub_native_sectors' not found."

Tried again booting into the Proxmox Recovery Tool from the ISO and am still seeing an "rpool not found" message.

One step forward, for sure, but not quite there.

Thanks for your continued assistance.
 
Well, a couple more minutes of research and I fixed it. Turns out you also need to mount -o bind /sys/firmware/efi/efivars <target>.

Working. Finally. Thank you! I will search for and post the requested log files shortly.
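
For anyone hitting the same "EFI variables cannot be set" warning, a quick way to tell whether efivars is actually visible before running grub-install might look like this. It's a sketch only; `efivars_visible` is a hypothetical helper, and it merely checks that the directory exists and has entries.

```shell
#!/bin/sh
# Return 0 if the given directory exists and contains at least one entry.
# The default is the efivars path as seen from inside the chroot.
efivars_visible() {
    dir="${1:-/sys/firmware/efi/efivars}"
    [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

# Example call inside the chroot:
# efivars_visible || echo "efivars not mounted; grub-install cannot set EFI variables"
```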
 
Log files attached...

There was a failed kernel install prior to the upgrade. I'd done a full dist-upgrade on the 8.x repos the night before, but I ran one more right before upgrading just to be sure. I believe the kernel failure was from the "right before" upgrade. I don't know if it was related, but I was in the process of stopping VMs during the apt dist-upgrade (I know...dumb), so that may have been an issue that triggered the kernel install fail. Something locked? Nonetheless, I rebooted after that to make sure I was "clean" before doing the 8 to 9 upgrade process. So maybe it was related ... maybe not.

Thanks again!
 

Just to get the steps all in one place for anyone benefiting in the future...
  • Assuming booted onto system using recovery ISO.
  • Assuming you are logged in as root.
  • Assuming a boot disk of sda.
  • Assuming an EFI partition of sda1.
  • Assuming a PVE install partition of sda2.
mkdir /media/RESCUE
mount /dev/sda2 /media/RESCUE/
mount /dev/sda1 /media/RESCUE/boot/efi
mount -t proc proc /media/RESCUE/proc
mount -t sysfs sys /media/RESCUE/sys
mount -B /dev /media/RESCUE/dev
mount -B /run /media/RESCUE/run
mount -B /sys/firmware/efi/efivars /media/RESCUE/sys/firmware/efi/efivars

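I believe that you could also do the following INSTEAD of the last line, which may be required in certain situations. Note that with this path it has to be run after the chroot; before the chroot, the mount point would be /media/RESCUE/sys/firmware/efi/efivars: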
mount -t efivarfs efivarfs /sys/firmware/efi/efivars

Chroot into your PVE install.

chroot /media/RESCUE

Then update grub and install.

update-grub
grub-install /dev/sda

If no errors, reboot and hopefully all is well.
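
Since partition numbers differ between installs (earlier in this thread the ESP was sda2 and the BTRFS root sda3, not sda1/sda2 as assumed above), here is a small dry-run sketch that prints the same steps for whatever layout you pass in. It runs nothing itself. `print_rescue_steps` is a hypothetical helper, and `mkdir -p` is used in place of plain `mkdir` so an existing directory is not an error.

```shell
#!/bin/sh
# Print (not run) the rescue commands from the steps above for a given layout.
print_rescue_steps() {
    root_part="$1"             # partition holding the PVE root filesystem
    esp_part="$2"              # EFI System Partition
    disk="$3"                  # whole disk passed to grub-install
    mnt="${4:-/media/RESCUE}"  # mount point used in the rescue system
    cat <<EOF
mkdir -p $mnt
mount $root_part $mnt
mount $esp_part $mnt/boot/efi
mount -t proc proc $mnt/proc
mount -t sysfs sys $mnt/sys
mount -B /dev $mnt/dev
mount -B /run $mnt/run
mount -B /sys/firmware/efi/efivars $mnt/sys/firmware/efi/efivars
chroot $mnt
update-grub
grub-install $disk
EOF
}

# Layout described earlier in this thread: root on sda3, ESP on sda2.
print_rescue_steps /dev/sda3 /dev/sda2 /dev/sda
```

Review the printed commands against your own blkid output before typing any of them.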
 
Thanks - looking through them (and knowing what to search for) - shows:
Code:
Replacing config file /etc/default/grub with new version
Installing for x86_64-efi platform.
grub-install.real: error: cannot find EFI directory.
Failed: grub-install --target=x86_64-efi
WARNING: Bootloader is not properly installed, system may not be bootable
It seems that at that time your ESP (/dev/nvme0n1p2, /dev/sda2, or similar: the second partition of your install drive) was not mounted on /boot/efi. However, systems set up with LVM and ext4 expect that.
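
A check along these lines before an upgrade might catch that situation. It's a rough sketch, not part of any official check script: `is_mounted_at` is a hypothetical helper that reads a mounts table in /proc/mounts format and only tests whether anything is mounted at the given path.

```shell
#!/bin/sh
# Return 0 if something is mounted at the given mount point, judged from a
# mounts table in /proc/mounts format (fields: device mountpoint fstype ...).
is_mounted_at() {
    mountpoint_path="$1"
    mounts_file="${2:-/proc/mounts}"
    awk -v mp="$mountpoint_path" '$2 == mp { found = 1 } END { exit !found }' "$mounts_file"
}

# Example call on the running system before upgrading:
# is_mounted_at /boot/efi || echo "ESP is not mounted on /boot/efi"
```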

Thanks for the feedback regarding the wiki-article - it's been overdue for a refresh for a while:
https://pve.proxmox.com/wiki/Recover_From_Grub_Failure
I hope it's a bit clearer now