Opt-in Linux 5.19 Kernel for Proxmox VE 7.x available

Hi everyone,

I'm also experiencing crashes/hangs of my VMs running on PVE 7.2 with kernel 5.15 on an Intel N5105 (Topton mini PC with 16GB RAM / 240GB SSD).

I tried to update to the 5.19 kernel with the command apt install pve-kernel-5.19, but I get a "package not found" error and thus can't go further.

A search for pve-kernel in the apt cache only gives pve-kernel-5.15 (I did run apt update).

Sorry for my dumb question, but what did I miss here?

Thank you!
 
Sorry for my dumb question but what did I miss here ?
It seems you have not enabled a valid Proxmox VE update repository.
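
For anyone hitting the same wall, a minimal sketch of enabling the no-subscription repository and installing the opt-in kernel (the repository file name below is my own choice; PVE 7.x is based on Debian Bullseye):

```shell
# Add the pve-no-subscription repository (file name is arbitrary).
echo "deb http://download.proxmox.com/debian/pve bullseye pve-no-subscription" \
  > /etc/apt/sources.list.d/pve-no-subscription.list

apt update
apt search pve-kernel-5.19   # the package should now show up
apt install pve-kernel-5.19  # then reboot into the new kernel
```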
 
It was indeed a dumb question, thank you for your answer!

About to try the update, fingers crossed!
 
Kernel 5.19 enables live migration between AMD hosts with different CPUs (AMD EPYC 7371 and AMD EPYC 7313), using both a custom QEMU CPU model and the default "EPYC" CPU type. Live migration worked on v6.4 and stopped working when the cluster was upgraded to 7.2 with kernel 5.15.
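
For reference, selecting the vCPU model used in a test like this can be sketched as follows (100 is a placeholder VM ID, and "epyc-base" a hypothetical custom model name):

```shell
# Use the default EPYC vCPU model for VM 100 (placeholder ID):
qm set 100 --cpu EPYC

# Or reference a custom model; "custom-" prefixes the model name defined in
# /etc/pve/virtual-guest/cpu-models.conf ("epyc-base" is hypothetical):
qm set 100 --cpu custom-epyc-base
```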
 
I seem to be having issues with the latest 5.19.17-1 kernel: it throws a watchdog CPU soft lockup bug a couple of minutes after boot. I'm on a Ryzen-based system with a 3700X. Downgrading to 5.15 or 5.19.7-2 seems to work fine. The issue seemed to pop up randomly, though; the latest kernel worked fine for quite a while after the initial upgrade, then today when I booted the machine this issue occurred out of nowhere.

Code:
proxmox-ve: 7.3-1 (running kernel: 5.19.7-2-pve)
pve-manager: 7.3-3 (running version: 7.3-3/c3928077)
pve-kernel-5.15: 7.2-14
pve-kernel-helper: 7.2-14
pve-kernel-5.11: 7.0-10
pve-kernel-5.4: 6.4-4
pve-kernel-5.19.7-2-pve: 5.19.7-2
pve-kernel-5.15.74-1-pve: 5.15.74-1
pve-kernel-5.15.64-1-pve: 5.15.64-1
pve-kernel-5.15.7-1-pve: 5.15.7-1
pve-kernel-5.13.19-6-pve: 5.13.19-15
pve-kernel-5.13.19-2-pve: 5.13.19-4
pve-kernel-5.11.22-7-pve: 5.11.22-12
pve-kernel-5.4.124-1-pve: 5.4.124-1
pve-kernel-5.4.78-2-pve: 5.4.78-2
pve-kernel-5.4.34-1-pve: 5.4.34-2
ceph-fuse: 14.2.21-1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: not correctly installed
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.2
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.2-5
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.3-1
libpve-guest-common-perl: 4.2-3
libpve-http-server-perl: 4.1-5
libpve-storage-perl: 7.3-1
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.2.7-1
proxmox-backup-file-restore: 2.2.7-1
proxmox-mini-journalreader: 1.3-1
proxmox-offline-mirror-helper: 0.5.0-1
proxmox-widget-toolkit: 3.5.3
pve-cluster: 7.3-1
pve-container: 4.4-2
pve-docs: 7.3-1
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-7
pve-firmware: 3.5-6
pve-ha-manager: 3.5.1
pve-i18n: 2.8-1
pve-qemu-kvm: 7.1.0-4
pve-xtermjs: 4.16.0-1
qemu-server: 7.3-1
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+2
vncterm: 1.7-1
zfsutils-linux: 2.1.6-pve1

Note: I have uninstalled the latest kernel for now, as the bug makes my system completely unusable.
 
I don't know how many times I've posted this...

If your CPU is AMD Zen 2 or Zen 3 (Ryzen 3xxx and 5xxx, and probably older and newer generations such as 2xxx and 7xxx, plus their Threadripper and Epyc counterparts), try the following first (just search for "ryzen crash", you will find all of this). Enjoy.

https://forum.proxmox.com/threads/k...shes-about-every-day.91803/page-5#post-412174
https://forum.proxmox.com/threads/o...ox-ve-7-x-available.115090/page-3#post-507337
https://forum.proxmox.com/threads/proxmox-keeps-crashing.117837/page-2#post-511345

@staff, please, create a note in Wiki or a Sticky-Post in forum for that.


keywords:
CPUs: amd ryzen threadripper epyc cpu soft lockup crash
affected chipsets/sockets: sWRX80 trx40 AM4 x570s x570 b550 a520 x470 b450 x370 b350 a320 a300
possibly affected chipsets: AM5 x670e x670 b650e b650 (not tested yet)

FIX: disable C-state C6. Some vendors call it something different, like "Power idle control" (set it to "Typical current idle" or anything that is not "Low").

In some cases you might need to update your BIOS/UEFI first. For example, some Gigabyte X570 boards are compatible with Zen 3/Ryzen 5xxx from BIOS F31 (January 2021), but the option above is only available from F34 (June 2021) or thereabouts.

Use your favourite search engine to find out how to disable C-state C6 on your specific mainboard or BIOS/UEFI version, and which updates you need.

FYI: this is not a hardware bug; the C6 C-state feature is simply not supported by older Linux kernels. The PVE kernel is a custom build, and up to 5.19.x the feature is not supported there yet.
Which exact Linux kernel does support C6, I don't know. I also don't know whether there is a pve-edge kernel out yet that supports the feature. I don't suggest installing the pve-edge kernel anyway; if you do, ask there for help, not here, since it is not officially supported by Proxmox.
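
To check from the host which C-states the kernel currently exposes (this only shows the kernel's view; the actual C6 disable described above happens in the BIOS/UEFI), a quick sysfs sketch:

```shell
# List the cpuidle states for cpu0: state name plus whether the kernel
# has that state disabled (1) or available (0).
for d in /sys/devices/system/cpu/cpu0/cpuidle/state*/; do
    printf '%s (disabled=%s)\n' "$(cat "$d/name")" "$(cat "$d/disable")"
done
```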
 
I seem to be having issues with the latest 5.19.17-1 kernel... [full report and pveversion output quoted above]

Are the BIOS/UEFI and all firmware (e.g. for the SSDs) up to date?
The exact and complete error message would be really helpful (not for me, but for the knowledgeable people), I assume.
 
An AMD CPU and "cpu soft lockup" is exactly what I described above. No need to go through all the debugging every time someone didn't use the search.
 
I don't know how many times to post that... [C-state C6 advice quoted in full above]
Thanks for the insight, but the issue still persists. I already had C-states completely disabled, and I have also just tried manually setting the power supply control setting to "Typical current idle"; the issue still persists. I don't think this is the same problem as in those posts, as the issue only occurs on the latest 5.19.17-1 kernel. The mobo is an Asus ROG STRIX X470-F; the BIOS is one behind the latest version, released in March. I guess I could try updating that as a last resort, but I'd rather just use an earlier kernel.

Here are the last entries in the syslog; I'm not sure how else to get debug info, as the terminal is completely unusable when this happens.

Code:
Nov 28 19:49:41 pve kernel: watchdog: BUG: soft lockup - CPU#2 stuck for 469s! [kworker/u64:5:313]
Nov 28 19:49:41 pve kernel: Modules linked in: tcp_diag udp_diag inet_diag rpcsec_gss_krb5 xt_hl ip6t_rt xt_LOG nf_log_syslog xt_recent xt_nat macvlan xt_MASQUERADE xfrm_user xfrm_algo iptable_nat nf_nat overlay veth ebtable_filter ebtables ip6table_raw ip6t_REJECT nf_reject_ipv6 ip6table_filter ip6_tables iptable_raw xt_mac ipt_REJECT nf_reject_ipv4 xt_physdev xt_addrtype xt_tcpudp xt_multiport xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_comment xt_NFLOG xt_limit xt_set xt_mark ip_set_hash_net ip_set softdog iptable_filter bpfilter bonding tls nfnetlink_log nfnetlink btrfs blake2b_generic xor zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) vhost_net vhost vhost_iotlb tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi raid6_pq intel_rapl_msr intel_rapl_common wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libcurve25519_generic libchacha iwlmvm ip6_udp_tunnel udp_tunnel
Nov 28 19:49:41 pve kernel:  edac_mce_amd mac80211 kvm_amd libarc4 snd_usb_audio snd_usbmidi_lib snd_hwdep snd_rawmidi kvm btusb btrtl snd_seq_device btbcm vfio_pci crct10dif_pclmul mc ghash_clmulni_intel vfio_pci_core btintel eeepc_wmi vfio_virqfd snd_pcm aesni_intel irqbypass btmtk iwlwifi asus_wmi vfio_iommu_type1 crypto_simd snd_timer platform_profile bluetooth sparse_keymap cryptd snd ecdh_generic joydev asus_wmi_sensors rapl pcspkr input_leds vfio efi_pstore video wmi_bmof mxm_wmi k10temp soundcore ccp cfg80211 ecc mac_hid amdgpu iommu_v2 gpu_sched drm_ttm_helper ttm drm_display_helper nfsd cec rc_core auth_rpcgss drm_kms_helper nfs_acl fb_sys_fops lockd syscopyarea sysfillrect grace hwmon_vid sysimgblt drm sunrpc ip_tables x_tables autofs4 simplefb dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c hid_generic igb usbmouse usbkbd uas usbhid usb_storage hid xhci_pci crc32_pclmul i2c_algo_bit xhci_pci_renesas ahci nvme i2c_piix4 dca xhci_hcd libahci nvme_core wmi gpio_amdpt z3fold
Nov 28 19:49:41 pve kernel:  zstd
Nov 28 19:49:41 pve kernel: CPU: 2 PID: 313 Comm: kworker/u64:5 Tainted: P        W  O L    5.19.17-1-pve #1
Nov 28 19:49:41 pve kernel: Hardware name: System manufacturer System Product Name/ROG STRIX X470-F GAMING, BIOS 5861 08/10/2021
Nov 28 19:49:41 pve kernel: Workqueue: btrfs-cache btrfs_work_helper [btrfs]
Nov 28 19:49:41 pve kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x87/0x300
Nov 28 19:49:41 pve kernel: Code: c1 e0 08 89 c2 41 8b 04 24 30 e4 09 d0 a9 00 01 ff ff 0f 85 f4 01 00 00 85 c0 74 12 41 8b 04 24 84 c0 74 0a f3 90 41 8b 04 24 <84> c0 75 f6 b8 01 00 00 00 66 41 89 04 24 5b 41 5c 41 5d 41 5e 41
Nov 28 19:49:41 pve kernel: RSP: 0018:ffff9f0882b13c10 EFLAGS: 00000202
Nov 28 19:49:41 pve kernel: RAX: 0000000000000101 RBX: ffff90d3937075b0 RCX: 0000000080240022
Nov 28 19:49:41 pve kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9f0882b13c98
Nov 28 19:49:41 pve kernel: RBP: ffff9f0882b13c38 R08: ffff90d3937075b0 R09: 0000000080240022
Nov 28 19:49:41 pve kernel: R10: 0000000040000000 R11: ffff90d4458a7f00 R12: ffff9f0882b13c98
Nov 28 19:49:41 pve kernel: R13: ffff90cdcd969c10 R14: 0000000000000001 R15: ffff90cdc42cf000
Nov 28 19:49:41 pve kernel: FS:  0000000000000000(0000) GS:ffff90d4cea80000(0000) knlGS:0000000000000000
Nov 28 19:49:41 pve kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 28 19:49:41 pve kernel: CR2: 000000c000258000 CR3: 000000063c906000 CR4: 0000000000350ee0
Nov 28 19:49:41 pve kernel: Call Trace:
Nov 28 19:49:41 pve kernel:  <TASK>
Nov 28 19:49:41 pve kernel:  _raw_spin_lock+0x29/0x30
Nov 28 19:49:41 pve kernel:  __btrfs_remove_free_space_cache+0x13/0x40 [btrfs]
Nov 28 19:49:41 pve kernel:  load_free_space_cache+0x372/0x3d0 [btrfs]
Nov 28 19:49:41 pve kernel:  caching_thread+0x350/0x580 [btrfs]
Nov 28 19:49:41 pve kernel:  ? pwq_adjust_max_active+0x8b/0x100
Nov 28 19:49:41 pve kernel:  btrfs_work_helper+0xd9/0x340 [btrfs]
Nov 28 19:49:41 pve kernel:  process_one_work+0x21f/0x3f0
Nov 28 19:49:41 pve kernel:  worker_thread+0x50/0x3e0
Nov 28 19:49:41 pve kernel:  ? rescuer_thread+0x3a0/0x3a0
Nov 28 19:49:41 pve kernel:  kthread+0xf0/0x120
Nov 28 19:49:41 pve kernel:  ? kthread_complete_and_exit+0x20/0x20
Nov 28 19:49:41 pve kernel:  ret_from_fork+0x22/0x30
Nov 28 19:49:41 pve kernel:  </TASK>
 
Ok, if you did the above, please open a new thread for your issue.
I see from your last log that the BTRFS cache helper is crashing. That might indeed be a different issue, but soft lockups with AMD CPUs are usually C-state C6 (in 99% of cases). There is also a possibility it's RAM-related if you are running non-ECC (unbuffered, the only kind available with Ryzen) memory, which is prone to giving issues with ZFS and BTRFS (is BTRFS officially supported now?).
 
On Celeron mini PCs, disabling mitigations and tweaking C-states is needed. Edit /etc/default/grub (e.g. with nano) and set:

Code:
GRUB_CMDLINE_LINUX_DEFAULT="intel_idle.max_cstate=1 processor.max_cstate=1 mitigations=off"
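
Applying that change non-interactively could look like this (a sketch: it assumes a GRUB-booted host, it overwrites any existing cmdline options, and mitigations=off trades security for stability):

```shell
# Replace the default kernel command line in /etc/default/grub.
sed -i 's/^GRUB_CMDLINE_LINUX_DEFAULT=.*/GRUB_CMDLINE_LINUX_DEFAULT="intel_idle.max_cstate=1 processor.max_cstate=1 mitigations=off"/' /etc/default/grub

update-grub   # regenerate the GRUB config
reboot        # the new cmdline takes effect on the next boot
```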

Hi everyone, I'm also experiencing crashes/hangs to my VMs running on PVE 7.2 kernel 5.15 with an Intel N5105... [question quoted above]
 
What's the proper method for switching back and forth between Linux 5.19 and 5.15 when doing various tests and benchmarks? The obvious path to 5.19 is the apt install method, which then yields both kernels in the GRUB boot menu, but it would be good to have the method for changing the default back handy in the original post for anyone finding this thread who may want to work with both kernels...
 
What's the proper method for changing back and forth between Linux 5.19 and 5.15 when doing various tests and benchmarks? ... [question quoted above]

https://pve.proxmox.com/wiki/Host_Bootloader#sysboot_kernel_pin
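
From that wiki section, the switching itself boils down to proxmox-boot-tool (the kernel version strings below are examples taken from this thread):

```shell
# Pin 5.15 as the permanent default:
proxmox-boot-tool kernel pin 5.15.74-1-pve

# Or boot 5.19 just once for a test, falling back afterwards:
proxmox-boot-tool kernel pin 5.19.17-1-pve --next-boot

# Show installed kernels and the current pin:
proxmox-boot-tool kernel list

# Remove the pin again (the newest installed kernel wins):
proxmox-boot-tool kernel unpin
```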
 
Hi everyone,

Just an update to tell you that, following the update to kernel 5.19, I no longer experience VM crashes with my Topton N5105 mini PC.

No issue encountered during the update.

My pfSense VM's uptime is 3 days now. With kernel 5.15, it would systematically crash after a day or so.

Have a nice day!
 
Hey all,

I just thought I'd add my experience with the new kernel.

After lockups and total crashing issues with my Dell OptiPlex 7050 (i7-7700), I opened a thread of my own and was advised to upgrade to the 5.19 kernel. With a couple of commands I had successfully upgraded my machine, and it's now been a week under heavy use without a single hint of instability.

Problem solved! Thank you Proxmox lads.

Chris
 
Hi guys. I just upgraded my first homelab server, a DeskMeet X300 + AMD 5600G, without any issue. I'm thinking about updating the cluster in my homelab, because live migration is not working and it freezes often (strange behavior).
 
Hello,

We had no problems with the 5.19.7-2 kernel.
With the new update to the 5.19.17-1 kernel, we had random freezes of the whole system on a 12th Gen Intel(R) Core(TM) i7-12700K.

The strange thing is, we have systems where it occurs and some where it doesn't.
But we have one system that always freezes within the first 5 minutes after a reboot.

The solution on this system was to add

options i915 enable_psr=0

in the file

/etc/modprobe.d/i915.conf
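
For completeness, the fix described above as commands (a sketch; update-initramfs makes sure the option also applies if i915 is loaded from the initramfs):

```shell
# Disable Panel Self Refresh in the i915 driver, as described above.
echo "options i915 enable_psr=0" > /etc/modprobe.d/i915.conf

update-initramfs -u -k all   # propagate the option into the initramfs
reboot
```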


Maybe this helps with identifying problems. Or maybe it's also a solution for other people.
I found this solution here: https://community.frame.work/t/hard-freezing-on-fedora-36-with-the-new-12th-gen-system/20675/144

best regards
Benedikt
 
I don't know how many times to post that... [C-state C6 advice quoted in full above]

Thank you for that!
I applied the BIOS settings and have had no issues since installing 5.19.
It has not solved the "BAR 0: can't reserve [mem..." issue, though.
I still need to rely on the hookscript to reset the PCI bus before starting the VM.
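
For anyone curious what such a hookscript can look like, a hypothetical sketch (the PCI address 0000:0a:00.0 is a placeholder, and whether a plain remove/rescan suffices depends on the device):

```shell
#!/bin/bash
# Hypothetical qemu-server hookscript: remove and rescan a PCI device
# before VM start, as a workaround for "BAR 0: can't reserve [mem ...]".
# Hookscripts are invoked with: $1 = VM ID, $2 = phase.
DEV="0000:0a:00.0"   # placeholder address; substitute your device's

if [ "$2" = "pre-start" ]; then
    echo 1 > "/sys/bus/pci/devices/${DEV}/remove"
    sleep 1
    echo 1 > /sys/bus/pci/rescan
fi
exit 0
```

Stored on a snippets-enabled storage, it would be attached with something like qm set <vmid> --hookscript local:snippets/pci-reset.sh (the script name is hypothetical).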
 
Yeah, I have this already...
and the vendor-reset module as well. It does not help.
To be fair, I have not tried without "video=efifb:off" or "video=simplefb:off".
I kept adding stuff to my grub conf file but did not clean up. I'll try that.
Thank you
 
