Opt-in Linux 6.1 Kernel for Proxmox VE 7.x available

On a non-production server, I upgraded to the 6.1 kernel. Seems OK so far, but I did notice something interesting in the Netdata stats, suggesting that processes are now waiting a bit longer for CPU resources. Perhaps a future pve-firmware update will improve this? Regardless, I'll leave it as-is.

[Attached screenshots: 61kernel.jpg, cpu.jpg]
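For what it's worth, the kernel's pressure-stall (PSI) counters tell a similar story about tasks waiting on the CPU. This is just the stock procfs interface (assuming PSI is enabled in the kernel config, which it appears to be on the stock PVE kernels), nothing Netdata-specific:

Code:
# "some" = share of time at least one runnable task was stalled waiting for CPU
cat /proc/pressure/cpu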
 
Just tested upgrading to 6.1.10-1-pve (from another 6.1 kernel, either 6.1.0-1-pve or 6.1.2-1-pve, I'm not sure which) and I've seen a few VMs have their I/O to Ceph hang, with lots of IOPS that Proxmox doesn't show in the UI. This appears to only occur if the VM is set to use a virtio disk (not VirtIO SCSI). I am currently using krbd, so I assume that's also required, but I haven't tested without it. Specifically, after a number of hours of light I/O, eventually all I/O hangs: `rbd perf image iostat` shows substantial bandwidth to the RBD volume backing the VM's root disk, but the Proxmox UI shows 0 I/O. Inside the VM, all I/O is completely hung, though the VM otherwise operates fine.
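In case it helps anyone reproduce this, the commands I used to compare the two views were roughly the following (pool name and VMID are placeholders for my setup):

Code:
# per-image I/O as seen by Ceph (this shows the bandwidth the UI does not)
rbd perf image iostat <pool>

# check which disk bus the affected VM uses (virtio-blk vs. virtio-scsi)
qm config <vmid> | grep -E '^(virtio|scsi)'

# check whether the RBD storage has krbd enabled
grep -A 6 '^rbd:' /etc/pve/storage.cfg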
 
I had the problem with passthrough disks with kernel 6.1.10.
With kernel 6.1.14 the problem is fixed.
 
I rejoiced too soon; after 24 hours of runtime the error occurred again.
Unfortunately, I can no longer pin kernel 6.1.2, because it was removed during the update.
 
The kernel is not removed during an update, only if you use the autoremove command.
How can I pin the kernel if it is no longer displayed by proxmox-boot-tool after the update?
I have not run autoremove.

Code:
proxmox-boot-tool kernel list
Manually selected kernels:
None.

Automatically selected kernels:
5.15.85-1-pve
6.1.10-1-pve
6.1.14-1-pve
 
Update: I pinned the kernel even though it was not displayed.
Everything runs fine again.

Code:
proxmox-boot-tool kernel list
Manually selected kernels:
None.

Automatically selected kernels:
5.15.85-1-pve
6.1.10-1-pve
6.1.14-1-pve
6.1.2-1-pve

Pinned kernel:
6.1.2-1-pve
 
How can I pin the kernel if it is no longer displayed by proxmox-boot-tool after the update?

The list is not for what is installed and available to pin, but for what is selected to keep (for next autoremove), automatically or manually.

You can still pin the kernels that are not automatically selected, using the same version scheme.
You can check for installed kernels with e.g.: `apt list --installed | grep -P 'pve-kernel-\d+\.\d+\.\d+'`
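For example, to pin a kernel that is installed but not shown under the automatically selected ones (the version is just an example, take one from the apt output above):

Code:
proxmox-boot-tool kernel pin 6.1.2-1-pve
# if the boot entries are not refreshed automatically:
proxmox-boot-tool refresh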
 
I just rebooted into the latest kernel 6.1.15-1-pve and my VMs with memory hotplug (Linux Mint 21.1 with PCIe passthrough) fail to start properly. They run out of memory and crash (one CPU at 100%) before they get a chance to hotplug/activate all of the memory. Disabling memory hotplug is a work-around for now; see the example below.
I didn't expect a kernel update to have such an effect, but I also think I have rebooted after other no-subscription updates before. Am I the only one having this problem?
Did something change in this area? Can I set the amount of memory that is "pre-plugged" in some way? Is this an issue that can be fixed or worked around inside the Linux VMs?
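For reference, the work-around I mean is simply removing memory from the VM's hotplug list, e.g. (VMID 100 is just an example):

Code:
# keep disk/network/usb hotplug, drop memory hotplug
qm set 100 --hotplug disk,network,usb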

EDIT: Windows 10 Home with memory hotplug still works fine. I guess it's a Linux Mint issue that did not get triggered before. This is the virtual serial console of the VMs just after the GRUB menu:
Code:
error: out of memory.

Press any key to continue...
[    0.649534] shpchp 0000:05:01.0: pci_hp_register failed with error -16
[    0.650047] shpchp 0000:05:01.0: Slot initialization failed
[    0.652315] shpchp 0000:05:02.0: pci_hp_register failed with error -16
[    0.652810] shpchp 0000:05:02.0: Slot initialization failed
[    0.655294] shpchp 0000:05:03.0: pci_hp_register failed with error -16
[    0.655728] shpchp 0000:05:03.0: Slot initialization failed
[    0.657520] shpchp 0000:05:04.0: pci_hp_register failed with error -16
[    0.657918] shpchp 0000:05:04.0: Slot initialization failed
[    0.770160] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[    0.772236] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.15.0-67-generic #74-Ubuntu
[    0.774022] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[    0.775519] Call Trace:
[    0.775715]  <TASK>
[    0.775934]  show_stack+0x52/0x5c
[    0.776246]  dump_stack_lvl+0x4a/0x63
[    0.776859]  dump_stack+0x10/0x16
[    0.777099]  panic+0x149/0x321
[    0.777414]  mount_block_root+0x144/0x1dd
[    0.778023]  mount_root+0x10c/0x11c
[    0.778362]  prepare_namespace+0x13f/0x191
[    0.778721]  kernel_init_freeable+0x18c/0x1b5
[    0.779072]  ? rest_init+0x100/0x100
[    0.779365]  kernel_init+0x1b/0x150
[    0.779689]  ? rest_init+0x100/0x100
[    0.780008]  ret_from_fork+0x22/0x30
[    0.780309]  </TASK>
[    0.780570] Kernel Offset: 0x23000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[    0.781495] ---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0) ]---

EDIT: An Ubuntu Server VM (with passthrough but no hotplug) that would start with 576 MB before now gives the same out-of-memory error and needs 640 MB. A nested Proxmox (without passthrough but with memory hotplug) still starts successfully. Probably the Ubuntu kernel, after an update, needs a little too much memory before it can activate the hot-plugged memory during boot.

SOLVED: Downgrading pve-edk2-firmware to 3.20220526-1 fixes the problem, so it's not a kernel issue.
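In case someone else hits this: the downgrade itself is just installing the older version explicitly and (optionally) holding it until a fixed build shows up:

Code:
apt install pve-edk2-firmware=3.20220526-1
# optional, so the next upgrade does not pull the newer version back in
apt-mark hold pve-edk2-firmware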
 
The problems with passthrough disks on kernels 6.1.10 and 6.1.14 are gone with 6.1.15, and since yesterday it has also been running very well on kernel 6.2.2.
 
Hello,

normally I am a silent reader on this forum, however I now have a specific question.

I have 4 nodes in use, all on version 7.3-6 and kernel 5.15.85-1.

Three nodes are running on an AMD Epyc 7352, one on an Intel Xeon Gold 5118, all on ZFS without shared storage.

With the installed kernel I have the known problem of VMs freezing after live migration.

Which opt-in kernel can I use in production without running into big problems, while no longer having issues with live migrations?

Thanks in advance
 
Hi,
After some investigation, it seems to only happen if I migrate from Intel to AMD; all other combinations are working.

Some people solved the problem by switching to 5.19, as discussed here: https://bugzilla.proxmox.com/show_bug.cgi?id=4073

The question is: is it safe to move to 5.19, given that it is already EOL?
Those people switched from 5.15, and the issue in the bug report was not about migrations between CPUs from different vendors. The fixes from 5.19 should still be present in 6.1 and 6.2. Please note that, in general, live migration between CPUs from different vendors is not supported; see the requirements in the docs.

EDIT: you can still try upgrading of course, 6.2 is the current opt-in kernel, but again, no guarantees.
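If you still want to attempt cross-vendor migration anyway, the usual (still unsupported) mitigation is to give the VMs a generic baseline vCPU model instead of host, for example (the VMID is just an example):

Code:
# generic model, avoids exposing vendor-specific CPU features to the guest
qm set 100 --cpu kvm64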
 
