Opt-in Linux 6.14 Kernel for Proxmox VE 8 available on test & no-subscription

I have one strange error:
Kernel 6.14, LXC with MySQL on it.
/var/log/kern.log:3651:2025-04-10T10:35:40.646702+02:00 sp19 kernel: [236703.525309] audit: type=1400 audit(1744274140.640:2244): apparmor="DENIED" operation="create" class="net" namespace="root//lxc-1193_<-var-lib-lxc>" profile="/usr/sbin/mysqld" pid=2674395 comm="mysqld" family="unix" sock_type="stream" protocol=0 requested="create" denied="create" addr=none
This doesn't happen with the 6.11 kernel.
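A possible stopgap while this gets sorted (untested, a sketch only): add a raw AppArmor rule to the container config allowing unix stream socket creation, i.e. exactly the operation being denied above. The container ID 1193 below is taken from the log's namespace; use your own. This tweaks the container's profile and may not reach the nested /usr/sbin/mysqld profile inside the guest - in that case, disabling the mysqld profile inside the container (aa-disable /usr/sbin/mysqld) is the blunter stopgap.

Code:
# /etc/pve/lxc/1193.conf  (container ID from the log above; adjust to yours)
lxc.apparmor.raw: unix (create) type=stream,
# Blunter alternative - disables AppArmor confinement for this container entirely:
# lxc.apparmor.profile: unconfined

Restart the container afterwards (pct stop 1193 && pct start 1193) for the profile change to take effect.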
I have this same error and it's preventing Docker in LXC from working for me; I have to revert to 6.11 for now.
 

Maybe (not sure) this is linked to this bug; here & again here?

Edit: I now see it has been picked up here in the forums.
 
While booting I received this error message. It was not present under previous kernels:

[Thu Apr 17 13:57:49 2025] vmbr0: port 2(veth101i0) entered blocking state
[Thu Apr 17 13:57:49 2025] vmbr0: port 2(veth101i0) entered forwarding state
[Thu Apr 17 13:57:49 2025] ------------[ cut here ]------------
[Thu Apr 17 13:57:49 2025] WARNING: CPU: 11 PID: 10245 at net/bridge/br_netfilter_hooks.c:602 br_nf_local_in+0x1b9/0x1e0
[Thu Apr 17 13:57:49 2025] Modules linked in: cfg80211 veth ebtable_filter ebtables ip6table_raw ip6t_REJECT nf_reject_ipv6 ip6table_filter ip6_tables iptable_raw xt_NFLOG xt_limit xt_mac ipt_REJECT nf_reject_ipv4 xt_mark xt_set xt_physdev xt_addrtype xt_comment xt_multiport xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_tcpudp iptable_filter ip_set_hash_net ip_set sctp ip6_udp_tunnel udp_tunnel scsi_transport_iscsi softdog nf_tables nvme_fabrics nvme_keyring 8021q garp mrp sunrpc binfmt_misc bonding tls nfnetlink_log nfnetlink ipmi_ssif amd_atl intel_rapl_msr intel_rapl_common amd64_edac edac_mce_amd kvm_amd irdma i40e kvm vhost_net vhost ast ib_uverbs vhost_iotlb jc42 acpi_ipmi rapl ib_core ipmi_si wmi_bmof ccp pcspkr i2c_algo_bit k10temp ipmi_devintf ee1004 ptdma ipmi_msghandler joydev input_leds mac_hid tap efi_pstore dmi_sysfs ip_tables x_tables autofs4 zfs(PO) spl(O) btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 raid0 linear uas usb_storage
[Thu Apr 17 13:57:49 2025] hid_generic usbmouse usbhid hid rndis_host cdc_ether usbnet mii polyval_clmulni polyval_generic ghash_clmulni_intel xhci_pci sha256_ssse3 ice nvme sha1_ssse3 gnss ahci bnxt_en libie nvme_core libahci xhci_hcd i2c_piix4 i2c_smbus nvme_auth wmi aesni_intel crypto_simd cryptd
[Thu Apr 17 13:57:49 2025] CPU: 11 UID: 192 PID: 10245 Comm: systemd-network Tainted: P O 6.14.0-2-pve #1
[Thu Apr 17 13:57:49 2025] Tainted: [P]=PROPRIETARY_MODULE, [O]=OOT_MODULE
[Thu Apr 17 13:57:49 2025] Hardware name: Supermicro AS -1114S-WN10RT/H12SSW-NTR, BIOS 3.0 07/29/2024
[Thu Apr 17 13:57:49 2025] RIP: 0010:br_nf_local_in+0x1b9/0x1e0
[Thu Apr 17 13:57:49 2025] Code: df e8 4b 49 d9 ff 66 83 ab b8 00 00 00 08 eb 92 be 04 00 00 00 48 89 df e8 34 49 d9 ff 66 83 ab b8 00 00 00 04 e9 78 ff ff ff <0f> 0b e9 9b fe ff ff 0f 0b e9 e7 fe ff ff 4c 89 e7 e8 f1 d2 e7 ff
[Thu Apr 17 13:57:49 2025] RSP: 0018:ffffad9f80700990 EFLAGS: 00010202
[Thu Apr 17 13:57:49 2025] RAX: 0000000000000002 RBX: ffff9acf12fd9300 RCX: 0000000000000000
[Thu Apr 17 13:57:49 2025] RDX: ffffad9f80700a00 RSI: ffff9acf12fd9300 RDI: 0000000000000000
[Thu Apr 17 13:57:49 2025] RBP: ffffad9f807009b0 R08: 0000000000000000 R09: 0000000000000000
[Thu Apr 17 13:57:49 2025] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9ad194b7a500
[Thu Apr 17 13:57:49 2025] R13: ffffad9f80700a00 R14: 0000000000000001 R15: ffff9acfa928d180
[Thu Apr 17 13:57:49 2025] FS: 00007ce6ce3e0bc0(0000) GS:ffff9b4c4dd80000(0000) knlGS:0000000000000000
[Thu Apr 17 13:57:49 2025] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Thu Apr 17 13:57:49 2025] CR2: 00005750ac359cd0 CR3: 000000079f35c003 CR4: 0000000000f70ef0
[Thu Apr 17 13:57:49 2025] PKRU: 55555554
[Thu Apr 17 13:57:49 2025] Call Trace:
[Thu Apr 17 13:57:49 2025] <IRQ>
[Thu Apr 17 13:57:49 2025] ? show_regs+0x6c/0x80
[Thu Apr 17 13:57:49 2025] ? __warn+0x8d/0x150
[Thu Apr 17 13:57:49 2025] ? br_nf_local_in+0x1b9/0x1e0
[Thu Apr 17 13:57:49 2025] ? report_bug+0x182/0x1b0
[Thu Apr 17 13:57:49 2025] ? handle_bug+0x6e/0xb0
[Thu Apr 17 13:57:49 2025] ? exc_invalid_op+0x18/0x80
[Thu Apr 17 13:57:49 2025] ? asm_exc_invalid_op+0x1b/0x20
[Thu Apr 17 13:57:49 2025] ? br_nf_local_in+0x1b9/0x1e0
[Thu Apr 17 13:57:49 2025] nf_hook_slow+0x46/0x120
[Thu Apr 17 13:57:49 2025] br_pass_frame_up+0x146/0x1d0
[Thu Apr 17 13:57:49 2025] ? __pfx_br_netif_receive_skb+0x10/0x10
[Thu Apr 17 13:57:49 2025] br_handle_frame_finish+0x3ab/0x690
[Thu Apr 17 13:57:49 2025] ? __pfx_br_handle_frame_finish+0x10/0x10
[Thu Apr 17 13:57:49 2025] br_nf_hook_thresh+0x10a/0x120
[Thu Apr 17 13:57:49 2025] ? __pfx_br_handle_frame_finish+0x10/0x10
[Thu Apr 17 13:57:49 2025] br_nf_pre_routing_finish+0x17e/0x390
[Thu Apr 17 13:57:49 2025] ? __pfx_br_handle_frame_finish+0x10/0x10
[Thu Apr 17 13:57:49 2025] ? ipv4_conntrack_in+0x14/0x20 [nf_conntrack]
[Thu Apr 17 13:57:49 2025] br_nf_pre_routing+0x24b/0x5f0
[Thu Apr 17 13:57:49 2025] ? __pfx_br_nf_pre_routing_finish+0x10/0x10
[Thu Apr 17 13:57:49 2025] br_handle_frame+0x2a3/0x440
[Thu Apr 17 13:57:49 2025] ? __pfx_br_handle_frame_finish+0x10/0x10
[Thu Apr 17 13:57:49 2025] __netif_receive_skb_core.constprop.0+0x29a/0x1250
[Thu Apr 17 13:57:49 2025] ? sched_clock+0x10/0x30
[Thu Apr 17 13:57:49 2025] ? srso_alias_return_thunk+0x5/0xfbef5
[Thu Apr 17 13:57:49 2025] ? psi_task_change+0x89/0xc0
[Thu Apr 17 13:57:49 2025] ? srso_alias_return_thunk+0x5/0xfbef5
[Thu Apr 17 13:57:49 2025] __netif_receive_skb_one_core+0x3e/0xa0
[Thu Apr 17 13:57:49 2025] __netif_receive_skb+0x15/0x60
[Thu Apr 17 13:57:49 2025] process_backlog+0x90/0x160
[Thu Apr 17 13:57:49 2025] __napi_poll+0x33/0x1f0
[Thu Apr 17 13:57:49 2025] net_rx_action+0x20c/0x400
[Thu Apr 17 13:57:49 2025] handle_softirqs+0xda/0x2e0
[Thu Apr 17 13:57:49 2025] __do_softirq+0x10/0x18
[Thu Apr 17 13:57:49 2025] do_softirq.part.0+0x3f/0x80
[Thu Apr 17 13:57:49 2025] </IRQ>
[Thu Apr 17 13:57:49 2025] <TASK>
[Thu Apr 17 13:57:49 2025] __local_bh_enable_ip+0x6e/0x70
[Thu Apr 17 13:57:49 2025] __dev_queue_xmit+0x278/0x1010
[Thu Apr 17 13:57:49 2025] ? __alloc_skb+0x60/0x1b0
[Thu Apr 17 13:57:49 2025] ? srso_alias_return_thunk+0x5/0xfbef5
[Thu Apr 17 13:57:49 2025] ? alloc_skb_with_frags+0x61/0x240
[Thu Apr 17 13:57:49 2025] ? srso_alias_return_thunk+0x5/0xfbef5
[Thu Apr 17 13:57:49 2025] packet_xmit+0xae/0x120
[Thu Apr 17 13:57:49 2025] packet_sendmsg+0xabd/0x1980
[Thu Apr 17 13:57:49 2025] __sys_sendto+0x242/0x250
[Thu Apr 17 13:57:49 2025] __x64_sys_sendto+0x24/0x40
[Thu Apr 17 13:57:49 2025] x64_sys_call+0x1d04/0x2540
[Thu Apr 17 13:57:49 2025] do_syscall_64+0x7e/0x170
[Thu Apr 17 13:57:49 2025] ? srso_alias_return_thunk+0x5/0xfbef5
[Thu Apr 17 13:57:49 2025] ? __handle_mm_fault+0x840/0x10b0
[Thu Apr 17 13:57:49 2025] ? srso_alias_return_thunk+0x5/0xfbef5
[Thu Apr 17 13:57:49 2025] ? srso_alias_return_thunk+0x5/0xfbef5
[Thu Apr 17 13:57:49 2025] ? __count_memcg_events+0xc0/0x160
[Thu Apr 17 13:57:49 2025] ? srso_alias_return_thunk+0x5/0xfbef5
[Thu Apr 17 13:57:49 2025] ? count_memcg_events.constprop.0+0x2a/0x50
[Thu Apr 17 13:57:49 2025] ? srso_alias_return_thunk+0x5/0xfbef5
[Thu Apr 17 13:57:49 2025] ? handle_mm_fault+0xae/0x360
[Thu Apr 17 13:57:49 2025] ? srso_alias_return_thunk+0x5/0xfbef5
[Thu Apr 17 13:57:49 2025] ? do_user_addr_fault+0x1ec/0x830
[Thu Apr 17 13:57:49 2025] ? srso_alias_return_thunk+0x5/0xfbef5
[Thu Apr 17 13:57:49 2025] ? arch_exit_to_user_mode_prepare.constprop.0+0x22/0xd0
[Thu Apr 17 13:57:49 2025] ? srso_alias_return_thunk+0x5/0xfbef5
[Thu Apr 17 13:57:49 2025] ? irqentry_exit_to_user_mode+0x2d/0x1d0
[Thu Apr 17 13:57:49 2025] ? srso_alias_return_thunk+0x5/0xfbef5
[Thu Apr 17 13:57:49 2025] ? irqentry_exit+0x43/0x50
[Thu Apr 17 13:57:49 2025] ? srso_alias_return_thunk+0x5/0xfbef5
[Thu Apr 17 13:57:49 2025] ? exc_page_fault+0x96/0x1e0
[Thu Apr 17 13:57:49 2025] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[Thu Apr 17 13:57:49 2025] RIP: 0033:0x7ce6cecccff7
[Thu Apr 17 13:57:49 2025] Code: c7 c0 ff ff ff ff eb be 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 80 3d d5 a0 0f 00 00 41 89 ca 74 10 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 69 c3 55 48 89 e5 53 48 83 ec 38 44 89 4d d0
[Thu Apr 17 13:57:49 2025] RSP: 002b:00007ffc854642d8 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
[Thu Apr 17 13:57:49 2025] RAX: ffffffffffffffda RBX: 00005750d989a0e0 RCX: 00007ce6cecccff7
[Thu Apr 17 13:57:49 2025] RDX: 0000000000000141 RSI: 00005750d9896fd0 RDI: 0000000000000013
[Thu Apr 17 13:57:49 2025] RBP: 00007ffc854642e0 R08: 00005750d989a120 R09: 0000000000000014
[Thu Apr 17 13:57:49 2025] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000
[Thu Apr 17 13:57:49 2025] R13: 00005750d9896fd0 R14: 0000000000000141 R15: 0000000000000150
[Thu Apr 17 13:57:49 2025] </TASK>
[Thu Apr 17 13:57:49 2025] ---[ end trace 0000000000000000 ]---
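For what it's worth, that WARN is raised in br_netfilter's local-in hook, so it should only be reachable when bridge traffic is handed to iptables. A quick check (diagnostic only, not a fix) whether that path is active on an affected host:

Code:
lsmod | grep br_netfilter
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables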
 
I upgraded my EPYC node to 8.4 and opted in to kernel 6.14. This caused my TrueNAS VM to fail to start, with nothing in the logs.

"qm start 108" would simply hang forever. Shutdown of the host would never timeout and I had to power off. The VM would start if I removed the PCIe SATA controller PCIe hardware shared to it. Motherboard is an Asrock Rack SIENAD8-2L2T and PCIE7 slot is set to SATA where I have 5 disks that the TrueNAS VM uses normally.

I reverted to kernel 6.11 and the VM booted fine again.

Some more info here: https://forum.proxmox.com/threads/t...e-to-8-4-need-urgent-help.165189/#post-764699
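For anyone else rolling back: you can keep 6.14 installed and just pin the boot default to a 6.11 kernel with proxmox-boot-tool (the exact version string below is an example - take yours from the list output):

Code:
proxmox-boot-tool kernel list                # show installed kernels
proxmox-boot-tool kernel pin 6.11.11-2-pve   # version string is an example
reboot
# to return to the newest installed kernel later:
proxmox-boot-tool kernel unpin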
 
I have an LXC container that suddenly gets the error

Code:
Failed to open /dev/vhost-net: No such file or directory

On this kernel ... Does anyone have any ideas?
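In case it is something simple: /dev/vhost-net is created by the vhost_net module on the host, so it is worth checking whether the module just isn't loaded under this kernel (a sketch):

Code:
lsmod | grep vhost               # is vhost_net loaded on the host?
ls -l /dev/vhost-net             # does the device node exist?
modprobe vhost_net               # try loading it manually
echo vhost_net >> /etc/modules   # persist across reboots if that was it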
 
16 x AMD Opteron(tm) Processor 6380 (1 Socket)
PVE 8.4.1
ZFS, also as rootfs
Serial Attached SCSI controller: Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
Kernel 6.14.0-2-pve
VM: PBS 3.4.1-1 with kernel 6.14.0-2-pve
No issues so far.
 
Hm - not sure if I've had more issues since moving kernels.

A couple of VMs died, plus a server reboot... one just now.

Unsure if it's me though, as I'm also trying to get a stupid printer to work. A few extracts are below - also a line about a tainted kernel?

Code:
Apr 23 14:43:45 proxServ kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000
Apr 23 14:43:45 proxServ kernel: #PF: supervisor instruction fetch in kernel mode
Apr 23 14:43:45 proxServ kernel: #PF: error_code(0x0010) - not-present page


Code:
qemu:cpus_kick_thread: Invalid argumentkvm: warning: Spice: playback:0 (0x570618701c00): c>
Apr 23 14:44:22 proxServ QEMU[7402]: kvm: warning: Spice: record:0 (0x570618b4f0b0): channel->thread_id (0x742c6821a480) != pth>

Code:
kvm_amd: kvm [88219]: vcpu5, guest rIP: 0xfffff85e8d33b8e5 Unhandled WRMSR(0xc0010115) = 0x0,
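On the "Unhandled WRMSR" line specifically: on its own that is usually harmless guest MSR noise, and it can be silenced with the KVM module parameters below - note this only quiets the log and will not fix any crashes:

Code:
# /etc/modprobe.d/kvm.conf  (filename is arbitrary)
options kvm ignore_msrs=1 report_ignored_msrs=0
# apply with: update-initramfs -u -k all && reboot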
 
Yep - another CPU at 100%, then a VM crash - I think it's kernel related?
Code:
Apr 24 11:13:43 proxServ kernel: ------------[ cut here ]------------
Apr 24 11:13:43 proxServ kernel: WARNING: CPU: 9 PID: 88247 at arch/x86/kvm/svm/nested.c:1212 svm_free_nested+0xb2/0xe0 [kvm_amd]
Apr 24 11:13:43 proxServ kernel: Modules linked in: iptable_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tcp_di>
Apr 24 11:13:43 proxServ kernel:  vfio iommufd parport_pc ppdev lp parport efi_pstore dmi_sysfs ip_tables x_tables autofs4 hid_jabra u>
Apr 24 11:13:43 proxServ kernel: CPU: 9 UID: 0 PID: 88247 Comm: CPU 3/KVM Tainted: P      D    OE      6.14.0-2-pve #1
Apr 24 11:13:43 proxServ kernel: Tainted: [P]=PROPRIETARY_MODULE, [D]=DIE, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Apr 24 11:13:43 proxServ kernel: Hardware name: System manufacturer System Product Name/PRIME X570-PRO, BIOS 5013 03/22/2024
Apr 24 11:13:43 proxServ kernel: RIP: 0010:svm_free_nested+0xb2/0xe0 [kvm_amd]
Apr 24 11:13:43 proxServ kernel: Code: 00 00 00 48 c7 83 68 1a 00 00 00 00 00 00 48 c7 83 a0 1a 00 00 ff ff ff ff 5b 41 5c 5d 31 c0 31>
Apr 24 11:13:43 proxServ kernel: RSP: 0018:ffffad31befdf8b8 EFLAGS: 00010206
Apr 24 11:13:43 proxServ kernel: RAX: ffff9a280e0a4000 RBX: ffff9a2c75cd39c0 RCX: 0000000000000000
Apr 24 11:13:43 proxServ kernel: RDX: 000000000000004d RSI: 0000000000000000 RDI: ffff9a2c75cd39c0
Apr 24 11:13:43 proxServ kernel: RBP: ffffad31befdf8c8 R08: 0000000000000000 R09: 0000000000000000
Apr 24 11:13:43 proxServ kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
Apr 24 11:13:43 proxServ kernel: R13: 0000000000005d01 R14: ffff9a2c75cd39c0 R15: 0000000000000001
Apr 24 11:13:43 proxServ kernel: FS:  00007d088ddff6c0(0000) GS:ffff9a45af080000(0000) knlGS:0000000000000000
Apr 24 11:13:43 proxServ kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 24 11:13:43 proxServ kernel: CR2: 00007d0630004014 CR3: 000000076cb36000 CR4: 0000000000f50ef0
Apr 24 11:13:43 proxServ kernel: PKRU: 55555554
Apr 24 11:13:43 proxServ kernel: Call Trace:
Apr 24 11:13:43 proxServ kernel:  <TASK>
Apr 24 11:13:43 proxServ kernel:  ? show_regs+0x6c/0x80
Apr 24 11:13:43 proxServ kernel:  ? __warn+0x8d/0x150
Apr 24 11:13:43 proxServ kernel:  ? svm_free_nested+0xb2/0xe0 [kvm_amd]
Apr 24 11:13:43 proxServ kernel:  ? report_bug+0x182/0x1b0
Apr 24 11:13:43 proxServ kernel:  ? handle_bug+0x6e/0xb0
Apr 24 11:13:43 proxServ kernel:  ? exc_invalid_op+0x18/0x80
Apr 24 11:13:43 proxServ kernel:  ? asm_exc_invalid_op+0x1b/0x20
Apr 24 11:13:43 proxServ kernel:  ? svm_free_nested+0xb2/0xe0 [kvm_amd]
Apr 24 11:13:43 proxServ kernel:  ? svm_set_gif+0xd4/0x1d0 [kvm_amd]
Apr 24 11:13:43 proxServ kernel:  svm_set_efer+0x142/0x170 [kvm_amd]
Apr 24 11:13:43 proxServ kernel:  __set_sregs_common.constprop.0+0x270/0x520 [kvm]
Apr 24 11:13:43 proxServ kernel:  kvm_arch_vcpu_ioctl+0x151f/0x1a20 [kvm]
Apr 24 11:13:43 proxServ kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 24 11:13:43 proxServ kernel:  ? __slab_free+0xdf/0x2a0
Apr 24 11:13:43 proxServ kernel:  ? __sigqueue_free+0x3d/0xa0
Apr 24 11:13:43 proxServ kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 24 11:13:43 proxServ kernel:  kvm_vcpu_ioctl+0x70f/0xaa0 [kvm]
Apr 24 11:13:43 proxServ kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 24 11:13:43 proxServ kernel:  ? kvm_vcpu_ioctl+0x70f/0xaa0 [kvm]
Apr 24 11:13:43 proxServ kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 24 11:13:43 proxServ kernel:  ? get_sigframe+0x103/0x2f0
Apr 24 11:13:43 proxServ kernel:  __x64_sys_ioctl+0xa7/0xe0
Apr 24 11:13:43 proxServ kernel:  x64_sys_call+0xb45/0x2540
Apr 24 11:13:43 proxServ kernel:  do_syscall_64+0x7e/0x170
Apr 24 11:13:43 proxServ kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 24 11:13:43 proxServ kernel:  ? fpu__restore_sig+0x8e/0xc0
Apr 24 11:13:43 proxServ kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 24 11:13:43 proxServ kernel:  ? restore_sigcontext+0x187/0x1f0
Apr 24 11:13:43 proxServ kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 24 11:13:43 proxServ kernel:  ? __do_sys_rt_sigreturn+0xe2/0x100
Apr 24 11:13:43 proxServ kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 24 11:13:43 proxServ kernel:  ? arch_exit_to_user_mode_prepare.constprop.0+0x22/0xd0
Apr 24 11:13:43 proxServ kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 24 11:13:43 proxServ kernel:  ? syscall_exit_to_user_mode+0x38/0x1d0
Apr 24 11:13:43 proxServ kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 24 11:13:43 proxServ kernel:  ? do_syscall_64+0x8a/0x170
Apr 24 11:13:43 proxServ kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 24 11:13:43 proxServ kernel:  ? arch_exit_to_user_mode_prepare.constprop.0+0x22/0xd0
Apr 24 11:13:43 proxServ kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 24 11:13:43 proxServ kernel:  ? syscall_exit_to_user_mode+0x38/0x1d0
Apr 24 11:13:43 proxServ kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 24 11:13:43 proxServ kernel:  ? do_syscall_64+0x8a/0x170
Apr 24 11:13:43 proxServ kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 24 11:13:43 proxServ kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 24 11:13:43 proxServ kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 24 11:13:43 proxServ kernel:  ? arch_exit_to_user_mode_prepare.constprop.0+0x22/0xd0
Apr 24 11:13:43 proxServ kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 24 11:13:43 proxServ kernel:  ? syscall_exit_to_user_mode+0x38/0x1d0
Apr 24 11:13:43 proxServ kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 24 11:13:43 proxServ kernel:  ? do_syscall_64+0x8a/0x170
Apr 24 11:13:43 proxServ kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 24 11:13:43 proxServ kernel:  ? irqentry_exit+0x43/0x50
Apr 24 11:13:43 proxServ kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Apr 24 11:13:43 proxServ kernel:  ? exc_page_fault+0x96/0x1e0
Apr 24 11:13:43 proxServ kernel:  entry_SYSCALL_64_after_hwframe+0x76/0x7e
Apr 24 11:13:43 proxServ kernel: RIP: 0033:0x7d089915ad1b
Apr 24 11:13:43 proxServ kernel: Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89>
Apr 24 11:13:43 proxServ kernel: RSP: 002b:00007d088ddf9c40 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Apr 24 11:13:43 proxServ kernel: RAX: ffffffffffffffda RBX: 0000593a3c0f2000 RCX: 00007d089915ad1b
Apr 24 11:13:43 proxServ kernel: RDX: 00007d088ddf9dc0 RSI: 000000004140aecd RDI: 0000000000000029
Apr 24 11:13:43 proxServ kernel: RBP: 000000004140aecd R08: 0000000000000000 R09: 0000000000000000
Apr 24 11:13:43 proxServ kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00007d088ddf9dc0
Apr 24 11:13:43 proxServ kernel: R13: 0000593a3c0f21c0 R14: 0000593a3c0f21f0 R15: 00007d088d5ff000
Apr 24 11:13:43 proxServ kernel:  </TASK>
Apr 24 11:13:43 proxServ kernel: ---[ end trace 0000000000000000 ]---
 
Yep - another CPU at 100%, then a VM crash - I think it's kernel related?
It might be - but it could also be due to the newer kernel expecting some things that were fixed in a more recent BIOS/firmware version:
Hardware name: System manufacturer System Product Name/PRIME X570-PRO, BIOS 5013 03/22/2024
It seems there were some updates available for the board:
https://www.asus.com/de/motherboard...0-pro/helpdesk_bios?model2Name=PRIME-X570-PRO

Running a memory test (as available on the PVE ISOs) can also help to find memory issues.

If neither updating the BIOS nor checking the memory helps, please open a new thread (feel free to mention me, @Stoiko Ivanov).
 
I've updated the BIOS - given that I downgraded to the 6.11 kernel and still had another crash overnight - nothing in journalctl this time.
It's been rock solid on 6.8 for a long time, so I wouldn't have thought it would be hardware... We shall see!
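If the current journal is empty after a crash, the previous boot's kernel messages may still be there (assuming persistent journaling is enabled):

Code:
journalctl --list-boots             # enumerate recorded boots
journalctl -k -b -1 | tail -n 100   # kernel log from the previous boot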
 
GPU passthrough crashes my Linux VMs on kernels from 6.12 upwards (I have no idea what happens on Windows, I don't use it) - I found the problematic kernel commit:
Problematic commit

This commit produces an instant VM crash under certain GPU workloads (in my case, a simple YouTube video in the Brave browser), with the following report from the kernel:

Apr 11 17:28:32 pve QEMU[92966]: RAX=000018cc0e283840 RBX=0000000000000780 RCX=0000000000000780 RDX=0000000000000780
Apr 11 17:28:32 pve QEMU[92966]: RSI=000018cc0e283840 RDI=00007a71e5c0f000 RBP=00007a71ed884960 RSP=00007a71ed884960
Apr 11 17:28:32 pve QEMU[92966]: R8 =0000000000000780 R9 =0000000000000110 R10=00000000000003c0 R11=0000000000000800
Apr 11 17:28:32 pve QEMU[92966]: R12=0000000000000110 R13=00005597198ef358 R14=00007a71e5c0f000 R15=000018cc0e283840
Apr 11 17:28:32 pve QEMU[92966]: RIP=000055971ff041d0 RFL=00010202 [-------] CPL=3 II=0 A20=1 SMM=0 HLT=0
Apr 11 17:28:32 pve QEMU[92966]: ES =0000 0000000000000000 ffffffff 00c00000
Apr 11 17:28:32 pve QEMU[92966]: CS =0033 0000000000000000 ffffffff 00a0fb00 DPL=3 CS64 [-RA]
Apr 11 17:28:32 pve QEMU[92966]: SS =002b 0000000000000000 ffffffff 00c0f300 DPL=3 DS [-WA]
Apr 11 17:28:32 pve QEMU[92966]: DS =0000 0000000000000000 ffffffff 00c00000
Apr 11 17:28:32 pve QEMU[92966]: FS =0000 00007a71ed8876c0 ffffffff 00c00000
Apr 11 17:28:32 pve QEMU[92966]: GS =0000 0000000000000000 ffffffff 00c00000
Apr 11 17:28:32 pve QEMU[92966]: LDT=0000 0000000000000000 ffffffff 00c00000
Apr 11 17:28:32 pve QEMU[92966]: TR =0040 fffffe1d5342d000 00004087 00008b00 DPL=0 TSS64-busy
Apr 11 17:28:32 pve QEMU[92966]: GDT= fffffe1d5342b000 0000007f
Apr 11 17:28:32 pve QEMU[92966]: IDT= fffffe0000000000 00000fff
Apr 11 17:28:32 pve QEMU[92966]: CR0=80050033 CR2=00007a71e5c0f000 CR3=000000011ec9a004 CR4=00772ef0
Apr 11 17:28:32 pve QEMU[92966]: DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
Apr 11 17:28:32 pve QEMU[92966]: DR6=00000000ffff0ff0 DR7=0000000000000400
Apr 11 17:28:32 pve QEMU[92966]: EFER=0000000000000d01
Apr 11 17:28:32 pve QEMU[92966]: Code=cc cc cc cc 55 48 89 e5 48 89 f8 48 63 ca 48 89 f7 48 89 c6 <f3> a4 5d c3 cc cc cc cc cc cc cc cc cc cc cc cc 55 48 89 e5 89 55 fc f3 0f 6f 07 f3 0f 6f

I did a bisect from 6.11.11 to 6.12-rc1 and landed on commit f9e54c3a2f5b79ecc57c7bc7d0d3521e461a2101.
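(For anyone retracing this: against the upstream tree the bisect is roughly the sketch below, with a kernel build, boot and VM test between every step.)

Code:
git bisect start
git bisect bad v6.12-rc1     # first known-bad tag
git bisect good v6.11.11     # last known-good tag
# build, boot, test the VM, then mark the result and repeat:
git bisect good              # or: git bisect bad
# once git names the first bad commit:
git bisect reset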

I reverted that commit on 6.14.4 and the crashes are gone. Another user here, using an AMD card, also reported crashes; I got them with NVIDIA GPUs.

I'm assuming this is a kernel problem and not a QEMU fault, but I don't have the knowledge to answer that.
I'm now running 6.14.4 with that commit reverted and so far no crashes.

The revert patch is attached.
 

After diving in head first with 6.14, I somehow borked my cluster. Going back to 6.11 got everything back to "normal". I have 5 different mini PCs running in the cluster, all on 10Gb or LACP'd 2.5Gb (aka 5Gb) networking. I have no high traffic, and I committed the cardinal sin of not having corosync on a dedicated network - I'm almost afraid to try and change that because everything was working until now...

Something in 6.14 has caused the 5 nodes to lose contact with each other, and I get many retransmit-list errors. Once I reverted all nodes back to 6.11, things stabilized again; if any one of the nodes is on 6.14 and the rest on 6.11, it still has problems. FYI - I have two standalone devices updated to 6.14 without issue so far:

Apr 26 22:01:16 awowfox corosync[18076]: [TOTEM ] Retransmit List: 23
Apr 26 22:01:17 awowfox corosync[18076]: [TOTEM ] Retransmit List: 23
Apr 26 22:01:18 awowfox corosync[18076]: [TOTEM ] Retransmit List: 23
Apr 26 22:01:19 awowfox corosync[18076]: [TOTEM ] Retransmit List: 23
Apr 26 22:01:20 awowfox corosync[18076]: [TOTEM ] Retransmit List: 23

Apr 26 16:43:45 FoxN100 corosync[1128]: [QUORUM] Sync members[4]: 2 3 4 5
Apr 26 16:43:45 FoxN100 corosync[1128]: [QUORUM] Sync left[1]: 1
Apr 26 16:43:45 FoxN100 corosync[1128]: [TOTEM ] A new membership (2.27db) was formed. Members left: 1
Apr 26 16:43:45 FoxN100 corosync[1128]: [TOTEM ] Failed to receive the leave message. failed: 1
Apr 26 16:43:45 FoxN100 corosync[1128]: [QUORUM] Sync members[4]: 2 3 4 5
Apr 26 16:43:45 FoxN100 corosync[1128]: [QUORUM] Sync joined[3]: 2 3 5
Apr 26 16:43:45 FoxN100 corosync[1128]: [QUORUM] Sync left[4]: 1 2 3 5
Apr 26 16:43:45 FoxN100 corosync[1128]: [TOTEM ] A new membership (2.27e3) was formed. Members joined: 2 3 5 left: 2 3 5
Apr 26 16:43:45 FoxN100 corosync[1128]: [TOTEM ] Failed to receive the leave message. failed: 2 3 5
Apr 26 16:43:46 FoxN100 corosync[1128]: [QUORUM] Sync members[4]: 2 3 4 5
Apr 26 16:43:46 FoxN100 corosync[1128]: [QUORUM] Sync left[1]: 1
Apr 26 16:43:46 FoxN100 corosync[1128]: [TOTEM ] A new membership (2.27e7) was formed. Members
 
Hi,
GPU passthrough crashes my Linux VMs on kernels from 6.12 upwards ... with the following report from the kernel: [QEMU register dump, quoted in full above]
Are you sure this is the full log? No backtrace or further description of the error?
I did a bisect from 6.11.11 to 6.12-rc1 and landed on commit f9e54c3a2f5b79ecc57c7bc7d0d3521e461a2101 ... I'm now running 6.14.4 with that commit reverted and so far no crashes.
Seems like there is a follow-up for the problematic commit: 09dfc8a5f2ce8 ("vfio/pci: Fallback huge faults for unaligned pfn")

Could you try with that applied on top instead of the revert?
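(For reference, applying the follow-up on top would look roughly like this in a checkout of the kernel source used for the 6.14.4 build - a sketch assuming the commit cherry-picks cleanly:)

Code:
git remote add mainline https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
git fetch mainline
git cherry-pick 09dfc8a5f2ce8   # "vfio/pci: Fallback huge faults for unaligned pfn"
# rebuild and boot the patched kernel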
 
Hi,

Are you sure this is the full log? No backtrace or further description of the error?

Seems like there is a follow-up for the problematic commit: 09dfc8a5f2ce8 ("vfio/pci: Fallback huge faults for unaligned pfn")

Could you try with that applied on top instead of the revert?
Yes, that's pretty much it (it repeats 2 or 3 times) and the VM crashes.

I have. The revert patch is for both commits - I went down the rabbit hole. Vanilla 6.14.4 still crashed; I'm now running vanilla 6.14.4 with both commits reverted, and no more crashes at the moment.

I have opened a Bugzilla report about this.
 
