Cannot pin kernel using proxmox-boot-tool

Thibaut

Hello,

I'm currently setting up a testing environment based on a two-node Proxmox cluster.
The two nodes are absolutely identical machines (micro-PCs).

I've followed the exact same procedure to install both nodes, booting from the proxmox-ve_8.3-1.iso.
Both installations went off without a hitch.

Then I went through some configuration settings and switched the repositories to the No-Subscription ones (from the GUI).
The actual upgrade was carried out from the command line, using apt update and apt -y upgrade.
Both systems were then rebooted, and a peculiar behavior appeared: after rebooting, node1 was running kernel 6.8.12-4-pve, while node2 was running kernel 6.8.12-5-pve.

I tried pinning kernel 6.8.12-5-pve on node1:

Bash:
root@node1:~# uname -r
6.8.12-4-pve

root@node1:~# proxmox-boot-tool kernel list
Manually selected kernels:
None.

Automatically selected kernels:
6.8.12-4-pve
6.8.12-5-pve

root@node1:~# proxmox-boot-tool status
Re-executing '/usr/sbin/proxmox-boot-tool' in new private mount namespace..
System currently booted with uefi
AD85-A0BC is configured with: uefi (versions: 6.8.12-4-pve, 6.8.12-5-pve)

root@node1:~# proxmox-boot-tool kernel pin 6.8.12-5-pve
Set kernel '6.8.12-5-pve' in /etc/kernel/proxmox-boot-pin.
Refresh the actual boot ESPs now? [yN] y
Running hook script 'proxmox-auto-removal'..
Running hook script 'zz-proxmox-boot'..
Re-executing '/etc/kernel/postinst.d/zz-proxmox-boot' in new private mount namespace..
Copying and configuring kernels on /dev/disk/by-uuid/AD85-A0BC
        Copying kernel and creating boot-entry for 6.8.12-4-pve
        Copying kernel and creating boot-entry for 6.8.12-5-pve

Rebooting the system didn't load the 6.8.12-5-pve kernel but the 6.8.12-4-pve one.
I found absolutely no way to pin the targeted kernel, even though I went as far as re-installing from scratch!

Since the objective is to create a cluster, it seems wiser to have all nodes running the same kernel version, am I wrong?
Moreover, I suppose the latest kernel version is automatically set as the default whenever an upgrade takes place, so it would become tedious to have to re-pin each and every new kernel version.
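
(Side note: according to the tool's own output, the pin is written to /etc/kernel/proxmox-boot-pin, so I assume it persists across kernel upgrades until explicitly cleared; something like this should confirm it, if I understand the tool correctly:)
Code:
# check the currently pinned version (path taken from the tool's output above)
cat /etc/kernel/proxmox-boot-pin

# clear the pin again later, if needed
proxmox-boot-tool kernel unpin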

In the end, I decided to go the other way around, pinning node2 to the 6.8.12-4-pve kernel, which worked as expected on the next node2 reboot. But, obviously, this is exactly the opposite of what I would want.

I'm out of ideas on where to go from here.
If anyone reading this has any lead I could follow, it would be most welcome.

Thanks.
 
Really strange... I could not reproduce it here.

And if you now run
Code:
proxmox-boot-tool kernel list
does the newer kernel show up as pinned?

Can you even manually boot the newer kernel from the boot menu?

Do you use Secure Boot?
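
You can check that, for example, with (assuming mokutil is installed):
Code:
mokutil --sb-state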
 
Thank you for taking the time to answer me.

Yes, the kernel is shown as pinned, although the active kernel is not the pinned one:
Code:
root@node1:~# proxmox-boot-tool kernel list
Manually selected kernels:
None.

Automatically selected kernels:
6.8.12-4-pve
6.8.12-5-pve

Pinned kernel:
6.8.12-5-pve
root@node1:~# uname -r
6.8.12-4-pve

Yes, manually selecting the 6.8.12-5-pve kernel with the keyboard in the boot menu does work.
But there is an arrow next to the 6.8.12-4-pve entry, and it stays there even when the other kernel is pinned.
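
(If it helps, I suppose the default entry could also be cross-checked from the running system with bootctl; a sketch, assuming the ESP is mounted at /boot/efi:)
Code:
# list the boot entries systemd-boot knows about and which one is marked as default
bootctl --esp-path=/boot/efi list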

No, secure boot is disabled:

Code:
root@node1:~# mokutil --sb-state
SecureBoot disabled
 
Basically, it looks the way it should. Please show me what's in the "loader.conf" on your EFI partition. To do that, mount one of your EFI partitions on /boot/efi (see the mount command after the example below):

In my example this is "sda2":
Code:
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
sda      8:0    0   35G  0 disk
├─sda1   8:1    0 1007K  0 part
├─sda2   8:2    0  512M  0 part /boot/efi
└─sda3   8:3    0 34.5G  0 part
sdb      8:16   0   35G  0 disk
├─sdb1   8:17   0 1007K  0 part
├─sdb2   8:18   0  512M  0 part
└─sdb3   8:19   0 34.5G  0 part
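
For example (using the "sda2" from my output above; adjust the device to match your own ESP):
Code:
mount /dev/sda2 /boot/efi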

Then show me the content of loader.conf:
Code:
cat /boot/efi/loader/loader.conf
The pinned kernel should always be referenced there. In our case, we want to see "proxmox-6.8.12-5-pve.conf". Then also show the content of this config file:

Code:
cat /boot/efi/loader/entries/proxmox-6.8.12-5-pve.conf

The content should look like this:
Code:
title    Proxmox Virtual Environment
version  6.8.12-5-pve
options  root=ZFS=rpool/ROOT/pve-1 boot=zfs
linux    /EFI/proxmox/6.8.12-5-pve/vmlinuz-6.8.12-5-pve
initrd   /EFI/proxmox/6.8.12-5-pve/initrd.img-6.8.12-5-pve

See also: https://pve.proxmox.com/wiki/Host_Bootloader#sysboot_systemd_boot
 
I mounted the EFI partition (which in my case is /dev/mmcblk0p2).

While the 6.8.12-5-pve kernel is pinned:

Code:
root@node1:~# proxmox-boot-tool kernel list
Manually selected kernels:
None.

Automatically selected kernels:
6.8.12-4-pve
6.8.12-5-pve

Pinned kernel:
6.8.12-5-pve

root@node1:~# cat /boot/efi/loader/loader.conf
timeout 3
default proxmox-6.8.12-5-pve.conf

When no kernel is pinned:

Code:
root@node1:~# proxmox-boot-tool kernel list
Manually selected kernels:
None.

Automatically selected kernels:
6.8.12-4-pve
6.8.12-5-pve

root@node1:~# cat /boot/efi/loader/loader.conf
timeout 3
default proxmox-*

Here are the contents of both files present under /boot/efi/loader/entries/:

Code:
root@node1:~# cat /boot/efi/loader/entries/proxmox-6.8.12-4-pve.conf
title    Proxmox Virtual Environment
version  6.8.12-4-pve
options  root=ZFS=rpool/ROOT/pve-1 boot=zfs
linux    /EFI/proxmox/6.8.12-4-pve/vmlinuz-6.8.12-4-pve
initrd   /EFI/proxmox/6.8.12-4-pve/initrd.img-6.8.12-4-pve

root@node1:~# cat /boot/efi/loader/entries/proxmox-6.8.12-5-pve.conf
title    Proxmox Virtual Environment
version  6.8.12-5-pve
options  root=ZFS=rpool/ROOT/pve-1 boot=zfs
linux    /EFI/proxmox/6.8.12-5-pve/vmlinuz-6.8.12-5-pve
initrd   /EFI/proxmox/6.8.12-5-pve/initrd.img-6.8.12-5-pve

I see the exact same behavior on node2, except that on node2 the pinned kernel is the one used at reboot, while on node1 I cannot get the pinned kernel to be activated. It's still a total mystery to me!
 
apt -y upgrade
NEVER upgrade Proxmox with that command.

You should either use apt-get dist-upgrade in the CLI or the >_ Upgrade button in the GUI (which does the same).
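
For reference, the usual sequence looks like this (this is just the standard CLI form; the GUI button is equivalent):
Code:
apt-get update
apt-get dist-upgrade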

Search these forums (& the official docs) for further info.

I only skimmed this thread, but the behavior you describe is probably caused by this.
 
Thank you for pointing out the best practice of using apt-get dist-upgrade. I redid the entire installation of both machines, using the recommended command when performing the Proxmox system update/upgrade.

Unfortunately the result was exactly the same: node1 is still stuck on kernel 6.8.12-4-pve.
 
I must agree, rather a weird experience. Have you checked & compared ALL BIOS settings on both PCs?

I assume all of the above is before attempting to cluster them.

I see you use ZFS on node1, is that also the case for node2? What drive/s do the nodes have, and how are they configured? Contrast & compare.
Were those drives completely wiped of all data & partitions before a fresh install?


After pinning the kernel, have you tried:
proxmox-boot-tool refresh

If you are courageous enough (I see you're on a fresh install anyway), you could try the following on node1: after successfully booting into 6.8.12-5, delete the 6.8.12-4 kernel & reboot. That will leave it no option!
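
Something along these lines (the package name below is only an example; check what is actually installed first):
Code:
# list the installed kernel packages
dpkg -l | grep -E 'proxmox-kernel|pve-kernel'

# remove the old one (adjust the name to what dpkg actually shows)
apt remove proxmox-kernel-6.8.12-4-pve-signed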

Good luck.
 
Thanks for the suggestion about removing the 6.8.12-4 kernel.
Up to now, the only way I have found to boot node1 into 6.8.12-5 is by manually selecting it on the startup screen right before the boot process starts.

I'll have to wait until Friday to regain physical access to the machine...
Once I've done the tests, I'll report the results here.
 
Good catch. I'm not sure about not installing to it (although that makes sense), but I suspect he may have the wrong boot device selected in the node1 BIOS. Alternatively (or in addition), he chose the wrong target for installation.

In any event, what he should do is disable (if possible) that mmcblk0 in the BIOS etc. & then re-install.
 
Hello @_gabriel and @gfngfn256, I indeed didn't mention that installing Proxmox on the eMMC device was deliberate!
I won't detail all the reasons behind this decision; let's simply say that, since this is a test system, I wanted it this way.

It isn't possible to install Proxmox on mmcblk devices "by mistake", since one has to run the Proxmox installer in debug mode and patch the /usr/share/perl5/Proxmox/Sys/Block.pm file (with Proxmox 8) in order to have mmcblk devices listed by the installer. This is all described and explained on this ibug.io page.

Here are the lsblk outputs from each node; you'll see there can be no mistake about the EFI partition being mmcblk0p2:
node1:
Code:
root@node1:~# lsblk -fp
NAME              FSTYPE     FSVER LABEL UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
/dev/sda                                                                                   
/dev/mmcblk0                                                                               
├─/dev/mmcblk0p1                                                                           
├─/dev/mmcblk0p2  vfat       FAT32       4E42-8D95                             878.5M    14% /boot/efi
└─/dev/mmcblk0p3  zfs_member 5000  rpool 419430437375642408                               
/dev/mmcblk0boot0                                                                         
/dev/mmcblk0boot1                                                                         
/dev/nvme0n1                                                                               
├─/dev/nvme0n1p1  swap       1           f13cc7d1-94f0-428c-b8aa-c86401b986aa                [SWAP]
└─/dev/nvme0n1p2  drbd       v08         abc376e6e82ceae                                   
  └─/dev/drbd0

node2:
Code:
root@node2:~# lsblk -fp
NAME              FSTYPE     FSVER LABEL UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
/dev/sda                                                                                   
/dev/mmcblk0                                                                               
├─/dev/mmcblk0p1                                                                           
├─/dev/mmcblk0p2  vfat       FAT32       AD85-A0BC                                         
└─/dev/mmcblk0p3  zfs_member 5000  rpool 8369975410939063214                               
/dev/mmcblk0boot0                                                                         
/dev/mmcblk0boot1                                                                         
/dev/nvme0n1                                                                               
├─/dev/nvme0n1p1  swap       1           ae6f222e-a93d-43df-b616-c101970e7c4d                [SWAP]
└─/dev/nvme0n1p2  drbd       v08         e0127859b8db2d7e                                 
  └─/dev/drbd0

Regarding the BIOS boot order, I've tried both "Linux Boot Manager" and "EFI OS" as the first entry; the boot process was identical on both nodes.
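
(In case it's relevant, the UEFI boot entries can also be inspected from within the OS; a quick check, assuming efibootmgr is installed:)
Code:
# show BootOrder, BootCurrent and the full device paths of each entry
efibootmgr -v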

As soon as I get the chance to try removing the 6.8.12-4 kernel from node1 I'll report here...

Thank you for your attention to what I'm experiencing.
 
You obviously realize you can't expect support for an unsupported & unconventional install. eMMC cards in general, apart from durability issues, are unreliable, error-prone & partition-unfriendly. Proxmox have excluded them from setup/install with good reason. I also like to tinker, test, learn & discover, but running an HV server on some mini-PC's unbranded eMMC card seems ridiculous to me. (In my experience many eMMC cards are less OS-install-friendly than SD cards. I put that down to the 8 data lanes used in eMMC vs the 4 on standard SD cards. But that's a rabbit-hole on its own.)

If you want to test whether the eMMC install is the cause of your problem, attach another drive (even USB to HD/SSD) to that PC & re-install & test.

Good luck.