Proxmox 8.3.2 - No boot menu - unable to change kernel version.

NorViking

New Member
Nov 28, 2023
I have 3 nodes in my home lab, all running on Intel NUCs. They have been running fine for over a year.
One of the nodes crashed and the problem started on that node after I restored it from backup.

Everything seems to work fine. Ceph is running with no problems.

However, VMs crash shortly after I start them on that node. I believe it's because of the kernel version.
When I boot the server I don't get a boot menu. It just boots. I'm not able to change the kernel version.

Here is some information I've collected:

# uname -r
6.5.11-4-pve

# proxmox-boot-tool kernel list
Manually selected kernels:
None.

Automatically selected kernels:
6.5.11-4-pve
6.5.13-6-pve
6.8.12-5-pve

Pinned kernel:
6.8.12-5-pve


# cat /etc/kernel/proxmox-boot-pin
6.8.12-5-pve

# cat /etc/kernel/proxmox-boot-pin-uuids

# cat /etc/kernel/proxmox-boot-uuids-manual-kernels

# efibootmgr -v
BootCurrent: 0006
Timeout: 2 seconds
BootOrder: 0006,0005
Boot0000* Linux Boot Manager VenHw(99e275e7-75a0-4b37-a2e6-c5385e6c00cb)
Boot0001* Linux Boot Manager VenHw(99e275e7-75a0-4b37-a2e6-c5385e6c00cb)
Boot0002* Linux Boot Manager VenHw(99e275e7-75a0-4b37-a2e6-c5385e6c00cb)
Boot0005* Linux Boot Manager HD(2,GPT,e2e34c3b-cba5-4b5e-b3bf-ae728aa7853d,0x800,0x200000)/File(\EFI\SYSTEMD\SYSTEMD-BOOTX64.EFI)
Boot0006* UEFI OS HD(2,GPT,e2e34c3b-cba5-4b5e-b3bf-ae728aa7853d,0x800,0x200000)/File(\EFI\BOOT\BOOTX64.EFI)..BO

# proxmox-boot-tool kernel reinit
Re-executing '/usr/sbin/proxmox-boot-tool' in new private mount namespace..
 
I compared the following directories on the problem node and a working node
/etc/default/grub.d
/etc/grub.d
They were identical.

Then I checked /etc/kernel.
/etc/kernel/proxmox-boot-uuids differed. On the problem node it was blank.
On the working node it contained "C42D-05F9"

Hope this can help to solve the mystery.
 
Solved.
I found out that /etc/kernel/proxmox-boot-uuids should contain the UUID of the vfat EFI partition.
I got the UUID from "lsblk -f" and edited /etc/kernel/proxmox-boot-uuids so that it contained it.
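
For anyone hitting the same thing, the lookup can be sketched roughly like this. The helper name vfat_uuids is made up for illustration; it only filters lsblk's raw output for vfat filesystems. If more than one vfat partition shows up, which one is the ESP still has to be confirmed by hand.

```shell
#!/bin/sh
# Hypothetical helper (not part of Proxmox): reads lines of "FSTYPE UUID",
# as printed by `lsblk -rno FSTYPE,UUID`, on stdin and prints the UUID of
# every vfat filesystem, one per line.
vfat_uuids() {
    awk '$1 == "vfat" && NF >= 2 { print $2 }'
}

# Usage on the affected node (read-only, changes nothing):
#   lsblk -rno FSTYPE,UUID | vfat_uuids
```

After the UUID is in /etc/kernel/proxmox-boot-uuids, "proxmox-boot-tool status" should list that ESP. As far as I know, "proxmox-boot-tool init <partition>" is the supported way to register an ESP instead of editing the file by hand, but it also reinstalls the bootloader on that partition, so check the documentation before running it.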

Now things look a lot better.

# proxmox-boot-tool refresh
Running hook script 'proxmox-auto-removal'..
Running hook script 'zz-proxmox-boot'..
Re-executing '/etc/kernel/postinst.d/zz-proxmox-boot' in new private mount namespace..
Copying and configuring kernels on /dev/disk/by-uuid/FFAC-ADD4
Copying kernel and creating boot-entry for 6.5.11-4-pve
Copying kernel and creating boot-entry for 6.5.13-6-pve
Copying kernel and creating boot-entry for 6.8.12-5-pve

Rebooted and:

# uname -r
6.8.12-5-pve

I moved a few VMs to the node and they are now stable with the updated kernel.

Change request:
proxmox-boot-tool should give an error when /etc/kernel/proxmox-boot-uuids is empty. That could have saved me several days of troubleshooting.
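
Until something like that exists, a small pre-flight check can be run by hand before a refresh. This is only a sketch: boot_uuids_ok is a hypothetical function, not part of proxmox-boot-tool, and on setups where proxmox-boot-tool does not manage the ESP the file may legitimately be missing, so treat a warning as a hint rather than proof of a problem.

```shell
#!/bin/sh
# Hypothetical pre-flight check, not part of proxmox-boot-tool.
# Returns 0 if the given file exists and has at least one non-blank line,
# otherwise prints a warning to stderr and returns 1.
boot_uuids_ok() {
    file="${1:-/etc/kernel/proxmox-boot-uuids}"
    if [ ! -f "$file" ]; then
        echo "WARNING: $file does not exist" >&2
        return 1
    fi
    if ! grep -q '[^[:space:]]' "$file"; then
        echo "WARNING: $file is empty, no ESP would be updated" >&2
        return 1
    fi
    return 0
}

# Usage:
#   boot_uuids_ok && proxmox-boot-tool refresh
```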