UEFI + Grub - General topic

glaeken2

Member
Jun 7, 2023
Hello fellow proxmians.

Today I'd like to address the lingering issue of the EFI boot process with grub, which I unfortunately have to mark as UNSTABLE.
The boot process, that is efi+grub+kernel, is the most crucial process in the whole Linux server world.
We do have a STABLE kernel, which shows that booting in 99.999% of cases CAN be done, yet efi, or efi+grub, fails way too often.
Now I'd like to interrupt for a moment to announce very clearly: I DO NOT WANT TO BYPASS THE PROBLEM. So all problem avoiders, skewers, misunderstanders and upsidedownturners, please stand aside.

The core of the problem:
- random, unforeseen grub boot failures after upgrades.

This probably does not concern people with one or two machines, or people who run Proxmox under their desks. So please don't stress, relax.

My main question is:
- is it possible to simplify the upgrade process of grub and its efi loaders, and to make it more stable?

I've implemented a whole testing procedure that runs after grub+efi updates, just to avoid unbootable machines, one of which is 600 km away. Yet it happened AGAIN. Fortunately, this time it was a local machine, so a few hours of downtime, not a big deal.
So, to make it short, I've really had enough of:
- efi not finding grubx64 because the drive was moved to a new machine,
- efi booting some other file because it decided to,
- grub not finding its modules,
- a too-small grubx64.efi because the installer didn't include the modules, for no fucking reason in the whole universe,
- grub being unable to find a drive because the ID does not match, but hey, the kernel is still there,
- shimx64 not booting grubx64 because today is a fuck you day,
- the machine not booting because the nvram was cleared and, oh look, there are no boot entries, and "i won't boot bootx64 because It'S OnLy FoR ReMoVeAbLe DeViCeS". Jesus fucking christ.
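
Some of the failure modes above (a missing loader, a grubx64.efi that is far too small, no removable-media fallback) can at least be detected before rebooting. Below is a minimal sketch of such a check, not a complete test procedure. Everything specific in it is an assumption: the ESP being mounted at /boot/efi, the vendor directory being EFI/proxmox, and the size thresholds, which are illustrative guesses rather than known-good values. It cannot catch nvram or drive-ID problems.

```python
from pathlib import Path

# ASSUMPTION: loader paths relative to the ESP root, with a rough minimum
# size in bytes. The vendor directory name and the thresholds are
# illustrative only; adjust them to what a known-good machine shows.
EXPECTED_LOADERS = {
    "EFI/proxmox/grubx64.efi": 1_000_000,  # a grub image without modules is much smaller
    "EFI/proxmox/shimx64.efi": 100_000,
    "EFI/BOOT/BOOTX64.EFI": 100_000,       # fallback used when nvram has no entries
}


def check_esp(esp_root, loaders=EXPECTED_LOADERS):
    """Return a list of human-readable problems found on the ESP.

    An empty list means every expected loader exists and is at least
    as large as its threshold. It says nothing about whether the
    firmware will actually pick that loader.
    """
    root = Path(esp_root)
    problems = []
    for rel_path, min_bytes in loaders.items():
        path = root / rel_path
        if not path.is_file():
            problems.append(f"missing: {rel_path}")
            continue
        size = path.stat().st_size
        if size < min_bytes:
            problems.append(
                f"suspiciously small ({size} bytes, expected >= {min_bytes}): {rel_path}"
            )
    return problems
```

Calling `check_esp("/boot/efi")` right after a grub upgrade and refusing to reboot while the returned list is non-empty would be one way to use it.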

This is not only related to Proxmox. It's a Linux problem in general, but on VMs it's easy to repair, which can't be said of hypervisors. And what is worse, it's random: you never know whether a grub update will work or not.
Maybe there is a way to introduce some local testbeds, "test boot" modes, or some other solution to make sure the kernel will boot.
For now, as a PROBLEM BYPASS, I will pin the grub package versions. But this is not a solution at all.
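
For anyone wanting to apply the same bypass, here is a small sketch that renders an apt preferences stanza for pinning. The package names and the version passed in are placeholders (assumptions, not taken from any particular install); the output would go into a file under /etc/apt/preferences.d/. A priority above 1000 makes apt keep that version even if it means a downgrade. `apt-mark hold` on the same packages is the simpler alternative.

```python
def render_pin(packages, version, priority=1001):
    """Render one apt_preferences stanza pinning `packages` to `version`.

    `packages` is a list of package names, `version` the exact version
    string (or a glob such as "2.06*") to stay on. Both are supplied by
    the caller; nothing here is looked up from the system.
    """
    if not packages:
        raise ValueError("at least one package name is required")
    return (
        f"Package: {' '.join(packages)}\n"
        f"Pin: version {version}\n"
        f"Pin-Priority: {priority}\n"
    )


# Hypothetical usage: package list and version are placeholders.
stanza = render_pin(["grub-efi-amd64", "grub-efi-amd64-bin", "grub-common"], "X.Y-Z")
```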