Opt-in Linux 5.19 Kernel for Proxmox VE 7.x available

I tried to install it on my N5105 server, and it crashes after booting with this 5.19 kernel.

To take note of:
There are general problems going on with the Intel N5105 (and probably others in that family):
"Main" thread: https://forum.proxmox.com/threads/vm-freezes-irregularly.111494
Bugzilla: https://bugzilla.proxmox.com/show_bug.cgi?id=4188

From what I have read so far, the majority (all?) of those with these CPUs/systems had a stable PVE host but unstable VMs. So an unstable PVE host would be something new with this kernel version, I guess? But don't hold me to that.

Anyway, I only wanted to note that there is more going on with those CPUs.
 
What crashes (host, VM, ..?), and do you have any specific error logs that show up before/during the crash? More details about the HW would also be nice to have.

This is the log.

The boot starts at Sep 19 18:34:07 and it crashes at Sep 19 18:36:24.

In the log we can see that the host tries to start container lxc-224 and crashes (18:34). With kernel 5.15 there is no problem: it starts after lxc-175 (18:43). The container config:

arch: amd64
cores: 2
features: nesting=1
hostname: Proxy
memory: 256
mp0: NVME-Data:224/vm-224-disk-1.raw,mp=/var/spool/squid/,size=5G
nameserver: 172.16.4.254
net0: name=eth0,bridge=vmbr4,gw=172.16.4.254,hwaddr=A2:03:79:B5:3F:B2,ip=172.16.4.222/24,type=veth
onboot: 1
ostype: debian
protection: 1
rootfs: NVME-Data:224/vm-224-disk-0.raw,size=2G
searchdomain: xxxx.fr
startup: order=3
swap: 128
 


I know, that's why I want to try this kernel...
 
Reading through this thread, I am a little apprehensive about upgrading from kernel 5.15.53-1-pve to the 5.19 kernel.
I am running the following CPUs and have experienced issues with VMs hanging when migrating backward.

64 x Intel(R) Xeon(R) Silver 4216 CPU @ 2.10GHz (2 Sockets)
32 x Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz (2 Sockets)
12 x Intel(R) Xeon(R) CPU E5-2603 v4 @ 1.70GHz (2 Sockets)
12 x Intel(R) Xeon(R) CPU E5-2603 v4 @ 1.70GHz (2 Sockets)

It has been suggested that I upgrade the kernel to possibly solve my migration issues. Below is the thread that I created. https://forum.proxmox.com/threads/proxmox-ha-cluster-failover-issue.115453/

Thanks in advance for any advice possible.
Regards
Lawrence
 

What advice exactly are you looking/hoping for?

I mean, the suggestion came from the Proxmox staff, and in this (short) thread two people have already reported that this kernel fixed the migration issues for them:
"nice with version 5.19 I can again migrate my VMs between my servers without the VMs crashing"
"It looks like it has resolved my migration issues to/from an i7-12700K and i7-8700K machine."

As far as I can see, the only two negative reports in this thread so far are one user with GPU passthrough and lagging games inside a gaming VM, and another user with problems on the Intel N5105, which has general problems, as I already mentioned in another post here.

What I can add regarding GPU passthrough and (game) lagging: I also have such a setup and do not have any problems with it on the 5.19 kernel.

You can always go back to an older kernel.
 
@Neobin

Thank you for your input. I agree with what you are saying but I am not very knowledgeable when it comes to any kernel config or issues.

I also understand that I can install the newer kernel and if I see issues, I can always roll back. I will take this on board and discuss with my manager before moving forward.
Your info is very much appreciated.

Lawrence
 
You should have no problem; my servers are similarly equipped (Intel(R) Xeon(R) Silver 4114 and Intel(R) Xeon(R) CPU E5-2640 v3).

And if you encounter problems booting, just select the old kernel.
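
If you want to know which older kernel to select, something like the following might help. It is only a rough sketch: `newest_kernel` is a made-up helper name, and the package names fed into it are examples in the style of `pveversion -v` output. It simply picks the highest installed version of a given series.

```shell
#!/bin/sh
# Sketch: pick the newest installed kernel of one series from a list of
# package names. newest_kernel is a hypothetical helper, not a Proxmox tool.
newest_kernel() {
    # $1 = series prefix such as "5.15"; package names are read from stdin
    grep -E "^pve-kernel-$1\.[0-9]+-[0-9]+-pve$" \
        | sed 's/^pve-kernel-//' \
        | sort -V \
        | tail -n 1
}

# Example input (assumed list of installed kernel packages)
printf '%s\n' \
    pve-kernel-5.19.7-1-pve \
    pve-kernel-5.15.53-1-pve \
    pve-kernel-5.15.35-2-pve \
    pve-kernel-5.4.189-2-pve \
    | newest_kernel 5.15
```

The printed version is the one to choose in the boot menu. If your proxmox-boot-tool provides the `kernel pin` subcommand, it should also be possible to make that choice persistent with it, but check the documentation for your installed version first.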
 
Passthrough of two AMD GPUs (with vendor-reset), audio, SATA and USB controllers works well on X570S. The initcall_blacklist=sysfb_init work-around for passthrough of the boot GPU is no longer needed; it was needed after 5.11.22 (because amdgpu crashes when unloading) until 5.15.35, and again after a recent (Debian 11.5?) update.
lm-sensors still does not detect the it8628 on the X570S AERO G, but the work-around still works and this is not related to Proxmox.
I do not notice any regressions, but I also don't see the wlan device of the mt7921e (driver in use); that probably requires kernel 5.19.8 (or 5.15.67).
Very nice! Seamless upgrade here as well.
Similar X570M Pro4 / Ryzen rig setup here, passing through the AMD boot GPU to a desktop VM.

Are you saying you can shut down the VM and the GPU returns to the Proxmox host now with 5.19?
If so, I may have to fiddle with commenting out some switches in my grub config!
 
Yes, that worked again. When amdgpu unloads gracefully it can also rebind, in my experience. Doing this repeatedly and restarting the VM many times does appear to become unstable eventually (or maybe I need longer pauses in between). In some kernel versions this works perfectly, in others it just doesn't. I can't tell why.
 
Does this have support for the Intel 12th generation Performance and Efficient cores?
 
FWIW, they already work and are supported just fine with our 5.15 kernel; most of the relevant changes got backported. I have actually been using an i7-12700K with P and E cores as my main workstation since February with the 5.15 kernel without any issues, with quite a lot of VM usage and compiling of resource-hungry stuff like ceph, kernel, qemu, ...

So a newer kernel should only improve performance and/or efficiency of scheduling, like the mentioned HFI. I did not run specific benchmarks to compare the 5.15 and 5.19 kernels, but I did not notice any performance change on a more subjective level between the two. That may be related to the type of my most common workloads (compiling stuff uses all cores to their limit anyway).

Is the 5.18 kernel needed in the VM that has Linux as well, or does PVE handle the managing of P and E cores?
No, I ran VMs with way older kernels (e.g. 4.15) on my Alder Lake based workstation just fine.
 
This works really well on my N5105. No problems encountered, and hardware transcoding now works as it should using Quick Sync in low-power mode (GuC/HuC) with Jellyfin in an LXC container.
 
NVIDIA boot gpu passthrough still fails due to BAR issues.
Specs:
ASRock B550 Pro4
Nvidia GeForce GTX 1050 Ti [ASUS Cerberus GTX 1050 Ti OC 4GB]

Code:
[  117.539136] vfio-pci 0000:06:00.0: BAR 3: can't reserve [mem 0xd0000000-0xd1ffffff 64bit pref]
[  117.539147] vfio-pci 0000:06:00.0: BAR 3: can't reserve [mem 0xd0000000-0xd1ffffff 64bit pref]
 
Does the initcall_blacklist=sysfb_init kernel parameter work-around still work?
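
Before testing that, it may be worth confirming the parameter really is on the running kernel's command line. A small sketch (`has_param` is a hypothetical helper name; it only does a whole-word match on the string it is given):

```shell
#!/bin/sh
# has_param is a hypothetical helper: succeeds if the exact parameter
# appears as a whole word in the given kernel command line string.
has_param() {
    # $1 = command line, $2 = parameter to look for
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Read the running kernel's command line (empty if /proc/cmdline is missing)
cmdline=$(cat /proc/cmdline 2>/dev/null || true)

if has_param "$cmdline" "initcall_blacklist=sysfb_init"; then
    echo "initcall_blacklist=sysfb_init is active"
else
    echo "initcall_blacklist=sysfb_init is not set"
fi
```

If it reports "not set" although you edited the bootloader config, the config was probably not regenerated or the host was not rebooted since.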
 
For Nvidia GPUs I think all that is needed is a basic cron script that detaches the card on startup to get boot GPU passthrough to work, no kernel parameter required. At least that is the case for me with a 1070 Ti:

Bash:
#!/bin/sh

echo 1 > /sys/bus/pci/devices/0000\:0x\:00.0/remove
echo 1 > /sys/bus/pci/rescan

For their case, replace x with 6.
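
A slightly more defensive variant of the same idea, as a sketch only: the PCI address is passed in instead of being hard-coded, and nothing is written unless you turn the dry run off. `DRY_RUN`, `pci_remove_path` and `detach_gpu` are names made up for this example; 0000:06:00.0 is the address from the vfio-pci log lines above.

```shell
#!/bin/sh
# Sketch of the same remove + rescan sequence with the PCI address as a
# parameter. DRY_RUN=1 (the default here) only prints the writes.
DRY_RUN=${DRY_RUN:-1}

pci_remove_path() {
    # $1 = full PCI address, e.g. 0000:06:00.0
    printf '/sys/bus/pci/devices/%s/remove\n' "$1"
}

detach_gpu() {
    path=$(pci_remove_path "$1")
    if [ "$DRY_RUN" = "1" ]; then
        # Show what would be written, touch nothing
        echo "echo 1 > $path"
        echo "echo 1 > /sys/bus/pci/rescan"
    else
        # Needs root: remove the device, then let the bus rediscover it
        echo 1 > "$path"
        echo 1 > /sys/bus/pci/rescan
    fi
}

detach_gpu "0000:06:00.0"
```

Run it once as is to check the paths, then with DRY_RUN=0 as root (for example from cron at boot) when they look right.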
 
Hi, I also have a big problem with the 5.19 kernel.
My Broadcom P210 10 GBit network cards suddenly go up to 87-89 degrees Celsius. After switching back to the 5.15 kernel I am back to about 38 degrees Celsius.
 
It worked fine for several days, but now I have this in one node:

[324845.995993] BUG: unable to handle page fault for address: ffffffff00000008
[324845.996316] #PF: supervisor instruction fetch in kernel mode
[324845.996610] #PF: error_code(0x0010) - not-present page
[324845.996854] PGD 14e67214067 P4D 14e67215067 PUD 0
[324845.997089] Oops: 0010 [#1] PREEMPT SMP NOPTI
[324845.997324] CPU: 99 PID: 935263 Comm: z_wr_iss Tainted: P O 5.19.7-1-pve #1
[324845.997558] Hardware name: iXsystems RS700-E10-RS12U-WOCPU005Z-IXN/Z12PP-D32 Series, BIOS 0701 10/15/2021
[324845.997795] RIP: 0010:0xffffffff00000008
[324845.998036] Code: Unable to access opcode bytes at RIP 0xfffffffeffffffde.
[324845.998281] RSP: 0018:ff6cad692caf7d50 EFLAGS: 00010246
[324845.998525] RAX: 0000000000000010 RBX: 0000000000000000 RCX: ff46923f2038ddd8
[324845.998768] RDX: 0000000000000000 RSI: 0000000000001000 RDI: ff46923f2038e088
[324845.999008] RBP: ff6cad692caf7da0 R08: 0000000000001000 R09: ff46923f2038e068
[324845.999246] R10: 0000000000001000 R11: 0000000000000001 R12: ff46923f2038eb40
[324845.999484] R13: 0000000000080000 R14: ff46923f2038dca0 R15: 0000000000000000
[324845.999718] FS: 0000000000000000(0000) GS:ff469434ff6c0000(0000) knlGS:0000000000000000
[324845.999956] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[324846.000191] CR2: fffffffeffffffde CR3: 0000014e67210003 CR4: 0000000000773ee0
[324846.000430] PKRU: 55555554
[324846.000668] Call Trace:
[324846.000903] <TASK>
[324846.001136] zio_execute+0x92/0x160 [zfs]
[324846.001485] taskq_thread+0x29c/0x4d0 [spl]
[324846.001726] ? wake_up_q+0x90/0x90
[324846.001961] ? zio_gang_tree_free+0x70/0x70 [zfs]
[324846.002306] ? taskq_thread_spawn+0x60/0x60 [spl]
[324846.002550] kthread+0xee/0x120
[324846.002796] ? kthread_complete_and_exit+0x20/0x20
[324846.003030] ret_from_fork+0x1f/0x30
[324846.003274] </TASK>
[324846.003503] Modules linked in: xt_mac act_police cls_basic sch_ingress sch_htb nfsv3 nfs_acl veth rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache netfs ebtable_filter ebtables ip6table_raw ip6t_REJECT nf_reject_ipv6 ip6table_filter ip6_tables iptable_raw ipt_REJECT nf_reject_ipv4 xt_mark xt_set xt_physdev xt_addrtype xt_comment xt_tcpudp xt_multiport xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter bpfilter ip_set_hash_net ip_set sctp ip6_udp_tunnel udp_tunnel nf_tables 8021q garp mrp bonding softdog nfnetlink_log nfnetlink ipmi_ssif intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common i10nm_edac nfit x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd ast drm_vram_helper rapl drm_ttm_helper ttm drm_kms_helper i2c_algo_bit cdc_ether fb_sys_fops syscopyarea usbnet intel_cstate cmdlinepart pcspkr efi_pstore spi_nor sysfillrect input_leds
[324846.003556] joydev mii isst_if_mbox_pci mei_me sysimgblt ioatdma isst_if_mmio mtd mei isst_if_common intel_vsec intel_pch_thermal acpi_ipmi dca ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter acpi_pad mac_hid vhost_net vhost vhost_iotlb tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi drm sunrpc ip_tables x_tables autofs4 zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) btrfs blake2b_generic xor raid6_pq zstd_compress libcrc32c simplefb csiostor hid_generic usbmouse usbkbd usbhid hid uas usb_storage crc32_pclmul cxgb4 xhci_pci xhci_pci_renesas scsi_transport_fc tls spi_intel_pci i2c_i801 i40e ahci spi_intel i2c_smbus xhci_hcd libahci wmi
[324846.007594] CR2: ffffffff00000008
[324846.007924] ---[ end trace 0000000000000000 ]---
[324846.100485] RIP: 0010:0xffffffff00000008
[324846.101045] Code: Unable to access opcode bytes at RIP 0xfffffffeffffffde.
[324846.101563] RSP: 0018:ff6cad692caf7d50 EFLAGS: 00010246
[324846.102010] RAX: 0000000000000010 RBX: 0000000000000000 RCX: ff46923f2038ddd8
[324846.102478] RDX: 0000000000000000 RSI: 0000000000001000 RDI: ff46923f2038e088
[324846.102817] RBP: ff6cad692caf7da0 R08: 0000000000001000 R09: ff46923f2038e068
[324846.103166] R10: 0000000000001000 R11: 0000000000000001 R12: ff46923f2038eb40
[324846.103496] R13: 0000000000080000 R14: ff46923f2038dca0 R15: 0000000000000000
[324846.103819] FS: 0000000000000000(0000) GS:ff469434ff6c0000(0000) knlGS:0000000000000000
[324846.104151] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[324846.104478] CR2: fffffffeffffffde CR3: 0000014e67210003 CR4: 0000000000773ee0
[324846.104804] PKRU: 55555554
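
For a quick look at which modules show up in a trace like this, a small filter along these lines can help. It is just a sketch: `trace_modules` is a made-up helper, and it only looks for `symbol+0x.../0x... [module]` frames in whatever text it is given. The sample input is the call trace from the log above.

```shell
#!/bin/sh
# trace_modules is a hypothetical helper: reads kernel log text on stdin and
# prints the unique module names found in [brackets] after stack frames.
trace_modules() {
    grep -oE '[A-Za-z0-9_.]+\+0x[0-9a-f]+/0x[0-9a-f]+ \[[A-Za-z0-9_]+\]' \
        | sed -E 's/.*\[([A-Za-z0-9_]+)\]$/\1/' \
        | sort -u
}

# Call trace lines copied from the oops in this post
trace_modules <<'EOF'
[324846.001136] zio_execute+0x92/0x160 [zfs]
[324846.001485] taskq_thread+0x29c/0x4d0 [spl]
[324846.001726] ? wake_up_q+0x90/0x90
[324846.001961] ? zio_gang_tree_free+0x70/0x70 [zfs]
[324846.002306] ? taskq_thread_spawn+0x60/0x60 [spl]
[324846.002550] kthread+0xee/0x120
EOF
```

For this trace it lists spl and zfs, which fits the faulting task being z_wr_iss. That only narrows down where to look, it does not prove the cause.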

I can't even run pveversion -v on the node... this is from another node with same versions
proxmox-ve: 7.2-1 (running kernel: 5.19.7-1-pve)
pve-manager: 7.2-11 (running version: 7.2-11/b76d3178)
pve-kernel-helper: 7.2-12
pve-kernel-5.19: 7.2-11
pve-kernel-5.15: 7.2-10
pve-kernel-5.4: 6.4-18
pve-kernel-5.19.7-1-pve: 5.19.7-1
pve-kernel-5.15.53-1-pve: 5.15.53-1
pve-kernel-5.15.35-2-pve: 5.15.35-5
pve-kernel-5.4.189-2-pve: 5.4.189-2
pve-kernel-5.4.34-1-pve: 5.4.34-2
ceph-fuse: 14.2.21-1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: residual config
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve1
libproxmox-acme-perl: 1.4.2
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.2-4
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.2-2
libpve-guest-common-perl: 4.1-2
libpve-http-server-perl: 4.1-3
libpve-storage-perl: 7.2-8
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.2.6-1
proxmox-backup-file-restore: 2.2.6-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.5.1
pve-cluster: 7.2-2
pve-container: 4.2-2
pve-docs: 7.2-2
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-6
pve-firmware: 3.5-1
pve-ha-manager: 3.4.0
pve-i18n: 2.7-2
pve-qemu-kvm: 7.0.0-3
pve-xtermjs: 4.16.0-1
qemu-server: 7.2-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.5-pve1