Kernel Bug: kernel NULL pointer dereference, address: 0000000000000000

bobloadmire

Proxmox crashed last night with an error I have never seen before. I updated to pve-manager/8.2.4/faa83925c9641325 (running kernel: 6.8.8-1-pve) just yesterday. Could that be related?
Logs below:

Code:
Jun 20 02:05:39 pve kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000
Jun 20 02:05:28 pve kernel: note: ffmpeg[522914] exited with irqs disabled
Jun 20 02:05:28 pve kernel: PKRU: 55555554
/
Jun 20 02:05:28 pve kernel: CR2: 0000000000000000
Jun 20 02:05:28 pve kernel:  snd_intel_dspcfg snd_intel_sdw_acpi btusb btrtl snd_hda_codec btintel btbcm snd_hda_core drm_buddy btmtk ttm snd_hwdep bluetooth drm_display_helper snd_pcm cmdlinepart mei_hdcp mei_pxp cec spi_nor snd_timer rapl ecdh_gene>
Jun 20 02:05:28 pve kernel: Modules linked in: tcp_diag inet_diag cmac nls_utf8 cifs cifs_arc4 nls_ucs2_utils rdma_cm iw_cm ib_cm ib_core cifs_md4 netfs cfg80211 veth ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables >
Jun 20 02:05:28 pve kernel: Call Trace:
/
Jun 20 02:05:28 pve kernel: Hardware name: Default string Default string/Default string, BIOS 1744NP12V10R006 09/25/2023
Jun 20 02:05:28 pve kernel: CPU: 1 PID: 522914 Comm: ffmpeg Tainted: P    B D    O       6.8.8-1-pve #1
Jun 20 02:05:28 pve kernel: Oops: 0002 [#2] PREEMPT SMP NOPTI
Jun 20 02:05:28 pve kernel: PGD 0 P4D 0
Jun 20 02:05:28 pve kernel: #PF: error_code(0x0002) - not-present page
Jun 20 02:05:28 pve kernel: #PF: supervisor write access in kernel mode
Jun 20 02:05:28 pve kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000
Jun 20 02:05:19 pve pvestatd[861]: malformed JSON string, neither tag, array, object, number, string or atom, at character offset 0 (before "(end of string)") at /usr/share/perl5/PVE/Tools.pm line 1050, <GEN2575449> chunk 1.
Jun 20 02:05:19 pve kernel: note: journal-offline[522670] exited with irqs disabled
/
Jun 20 02:05:19 pve kernel:  snd_intel_dspcfg snd_intel_sdw_acpi btusb btrtl snd_hda_codec btintel btbcm snd_hda_core drm_buddy btmtk ttm snd_hwdep bluetooth drm_display_helper snd_pcm cmdlinepart mei_hdcp mei_pxp cec spi_nor snd_timer rapl ecdh_gene>
Jun 20 02:05:19 pve kernel: Modules linked in: tcp_diag inet_diag cmac nls_utf8 cifs cifs_arc4 nls_ucs2_utils rdma_cm iw_cm ib_cm ib_core cifs_md4 netfs cfg80211 veth ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables >
Jun 20 02:05:19 pve kernel:  </TASK>
/
Jun 20 02:05:19 pve kernel: Code: 48 3d 00 f0 ff ff 77 48 c3 0f 1f 80 00 00 00 00 48 83 ec 18 89 7c 24 0c e8 73 6c f8 ff 8b 7c 24 0c 89 c2 b8 4a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 36 89 d7 89 44 24 0c e8 d3 6c f8 ff 8b 44 24
Jun 20 02:05:19 pve kernel: RIP: 0033:0x7c3600d1db3a
/
Jun 20 02:05:19 pve kernel: Hardware name: Default string Default string/Default string, BIOS 1744NP12V10R006 09/25/2023
Jun 20 02:05:19 pve kernel: CPU: 3 PID: 522670 Comm: journal-offline Tainted: P    B      O       6.8.8-1-pve #1
Jun 20 02:05:19 pve kernel: Oops: 0010 [#1] PREEMPT SMP NOPTI
Jun 20 02:05:19 pve kernel: PGD 0 P4D 0
Jun 20 02:05:19 pve kernel: #PF: error_code(0x0010) - not-present page
Jun 20 02:05:19 pve kernel: #PF: supervisor instruction fetch in kernel mode
Jun 20 02:05:19 pve kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000
Jun 20 02:05:19 pve kernel: BUG: non-zero pgtables_bytes on freeing mm: 57344
Jun 20 02:05:19 pve kernel: BUG: Bad rss-counter state mm:00000000d68669aa type:MM_ANONPAGES val:6990
/
Jun 20 02:05:19 pve kernel: Code: 06 00 00 c7 83 24 06 00 00 00 00 00 00 48 85 c0 74 2b c7 40 04 00 00 00 00 b8 08 00 00 00 0f 1f 44 00 00 48 8b 93 28 06 00 00 <c7> 04 02 00 00 00 00 48 83 c0 04 48 3d 10 01 00 00 75 e6 e8 51 c3
Jun 20 02:05:19 pve kernel: pvestatd[522662]: segfault at 8 ip 000055e941459717 sp 00007ffcab4be0e0 error 6 in perl[55e941325000+195000] likely on CPU 2 (core 8, socket 0)
Jun 20 02:05:19 pve kernel: mm/pgtable-generic.c:54: bad pmd 000000007e52757f(000000046a9c8467)
Jun 20 02:05:19 pve kernel: BUG: Bad rss-counter state mm:00000000ca8cfc6b type:MM_ANONPAGES val:229
Jun 20 02:05:19 pve kernel:  </TASK>
Jun 20 02:05:19 pve kernel: R13: 000055e942561e90 R14: 000055e9422e3fab R15: 0000000000000001
Jun 20 02:05:19 pve kernel: R10: 00007002b079fe50 R11: 0000000000000247 R12: 000055e947e33970
Jun 20 02:05:19 pve kernel: RBP: 00007ffcab4be100 R08: 000055e9422e3faf R09: 0000000000000009
Jun 20 02:05:19 pve kernel: RDX: 000055e942561e90 RSI: 000055e947e33970 RDI: 00007ffcab4be060
Jun 20 02:05:19 pve kernel: RAX: ffffffffffffffda RBX: 000055e947e7c5e0 RCX: 00007002b0862a17
Jun 20 02:05:19 pve kernel: RSP: 002b:00007ffcab4be058 EFLAGS: 00000247 ORIG_RAX: 000000000000003b
Jun 20 02:05:19 pve kernel: Code: Unable to access opcode bytes at 0x7002b08629ed.
Jun 20 02:05:19 pve kernel: RIP: 0033:0x7002b0862a17
/
Jun 20 02:05:19 pve kernel: Call Trace:
Jun 20 02:05:19 pve kernel: Hardware name: Default string Default string/Default string, BIOS 1744NP12V10R006 09/25/2023
Jun 20 02:05:19 pve kernel: CPU: 3 PID: 522620 Comm: pvestatd Tainted: P    B      O       6.8.8-1-pve #1
Jun 20 02:05:19 pve kernel: file:(null) fault:0x0 mmap:0x0 read_folio:0x0
Jun 20 02:05:19 pve kernel: addr:000055e946e76000 vm_flags:08100073 anon_vma:ffff9a4c7daeb820 mapping:0000000000000000 index:55e946e76
Jun 20 02:05:19 pve kernel: BUG: Bad page map in process pvestatd  pte:84000007f989ea05 pmd:46a9cd067
Jun 20 02:05:19 pve kernel:  </TASK>
Jun 20 02:05:19 pve kernel: R13: 000055e942561e90 R14: 000055e9422e3fab R15: 0000000000000001
Jun 20 02:05:19 pve kernel: R10: 00007002b079fe50 R11: 0000000000000247 R12: 000055e947e33970
Jun 20 02:05:19 pve kernel: RBP: 00007ffcab4be100 R08: 000055e9422e3faf R09: 0000000000000009
Jun 20 02:05:19 pve kernel: RDX: 000055e942561e90 RSI: 000055e947e33970 RDI: 00007ffcab4be060
Jun 20 02:05:19 pve kernel: RAX: ffffffffffffffda RBX: 000055e947e7c5e0 RCX: 00007002b0862a17
Jun 20 02:05:19 pve kernel: RSP: 002b:00007ffcab4be058 EFLAGS: 00000247 ORIG_RAX: 000000000000003b
Jun 20 02:05:19 pve kernel: Code: Unable to access opcode bytes at 0x7002b08629ed.
Jun 20 02:05:19 pve kernel: RIP: 0033:0x7002b0862a17
Jun 20 02:05:19 pve kernel:  entry_SYSCALL_64_after_hwframe+0x78/0x80
/
Jun 20 02:05:19 pve kernel: Call Trace:
Jun 20 02:05:19 pve kernel: Hardware name: Default string Default string/Default string, BIOS 1744NP12V10R006 09/25/2023
Jun 20 02:05:19 pve kernel: CPU: 3 PID: 522620 Comm: pvestatd Tainted: P    B      O       6.8.8-1-pve #1
Jun 20 02:05:19 pve kernel: file:(null) fault:0x0 mmap:0x0 read_folio:0x0
Jun 20 02:05:19 pve kernel: addr:000055e946e75000 vm_flags:08100073 anon_vma:ffff9a4c7daeb820 mapping:0000000000000000 index:55e946e75
Jun 20 02:05:19 pve kernel: BUG: Bad page map in process pvestatd  pte:84000007f989ca05 pmd:46a9cd067
Jun 20 02:05:19 pve kernel:  </TASK>
Jun 20 02:05:19 pve kernel: R13: 000055e942561e90 R14: 000055e9422e3fab R15: 0000000000000001
Jun 20 02:05:19 pve kernel: R10: 00007002b079fe50 R11: 0000000000000247 R12: 000055e947e33970
Jun 20 02:05:19 pve kernel: RBP: 00007ffcab4be100 R08: 000055e9422e3faf R09: 0000000000000009
Jun 20 02:05:19 pve kernel: RDX: 000055e942561e90 RSI: 000055e947e33970 RDI: 00007ffcab4be060
Jun 20 02:05:19 pve kernel: RAX: ffffffffffffffda RBX: 000055e947e7c5e0 RCX: 00007002b0862a17
Jun 20 02:05:19 pve kernel: RSP: 002b:00007ffcab4be058 EFLAGS: 00000247 ORIG_RAX: 000000000000003b
Jun 20 02:05:19 pve kernel: Code: Unable to access opcode bytes at 0x7002b08629ed.
Jun 20 02:05:19 pve kernel: RIP: 0033:0x7002b0862a17
/
Jun 20 02:05:19 pve kernel: Call Trace:
Jun 20 02:05:19 pve kernel: Hardware name: Default string Default string/Default string, BIOS 1744NP12V10R006 09/25/2023
Jun 20 02:05:19 pve kernel: CPU: 3 PID: 522620 Comm: pvestatd Tainted: P    B      O       6.8.8-1-pve #1
Jun 20 02:05:19 pve kernel: file:(null) fault:0x0 mmap:0x0 read_folio:0x0
Jun 20 02:05:19 pve kernel: addr:000055e946e6f000 vm_flags:08100073 anon_vma:ffff9a4c7daeb820 mapping:0000000000000000 index:55e946e6f
Jun 20 02:05:19 pve kernel: BUG: Bad page map in process pvestatd  pte:84000007f9bcba05 pmd:46a9cd067
Jun 20 02:05:19 pve kernel:  </TASK>
Jun 20 02:05:19 pve kernel: R13: 000055e942561e90 R14: 000055e9422e3fab R15: 0000000000000001
Jun 20 02:05:19 pve kernel: R10: 00007002b079fe50 R11: 0000000000000247 R12: 000055e947e33970
Jun 20 02:05:19 pve kernel: RBP: 00007ffcab4be100 R08: 000055e9422e3faf R09: 0000000000000009
Jun 20 02:05:19 pve kernel: RDX: 000055e942561e90 RSI: 000055e947e33970 RDI: 00007ffcab4be060
Jun 20 02:05:19 pve kernel: RAX: ffffffffffffffda RBX: 000055e947e7c5e0 RCX: 00007002b0862a17
Jun 20 02:05:19 pve kernel: RSP: 002b:00007ffcab4be058 EFLAGS: 00000247 ORIG_RAX: 000000000000003b
Jun 20 02:05:19 pve kernel: Code: Unable to access opcode bytes at 0x7002b08629ed.
/
Jun 20 02:05:19 pve kernel: PKRU: 55555554
Jun 20 02:05:19 pve kernel: CR2: ffffffffffffffd6 CR3: 0000000114706000 CR4: 0000000000f52ef0
Jun 20 02:05:19 pve kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 20 02:05:19 pve kernel: FS:  00007c35fe0006c0(0000) GS:ffff9a508f980000(0000) knlGS:0000000000000000
Jun 20 02:05:19 pve kernel: R13: ffff9a4913ec9848 R14: 0000000000000000 R15: 7fffffffffffffff
Jun 20 02:05:19 pve kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff9a49022af000
Jun 20 02:05:19 pve kernel: RBP: ffffb4ab8da63b00 R08: 0000000000000000 R09: 0000000000000000
Jun 20 02:05:19 pve kernel: RDX: 7fffffffffffffff RSI: ffffb4ab8da63bc0 RDI: ffff9a490ab3ce00
Jun 20 02:05:19 pve kernel: RAX: 0000000000000000 RBX: ffff9a490ab3ce00 RCX: 0000000000000000
Jun 20 02:05:19 pve kernel: RSP: 0018:ffffb4ab8da63ac0 EFLAGS: 00010246
Jun 20 02:05:19 pve kernel: Code: Unable to access opcode bytes at 0xffffffffffffffd6.
Jun 20 02:05:19 pve kernel: RIP: 0010:0x0
Jun 20 02:05:19 pve kernel: ---[ end trace 0000000000000000 ]---
Jun 20 02:05:19 pve kernel: CR2: 0000000000000000
Jun 20 02:05:19 pve kernel:  snd_intel_dspcfg snd_intel_sdw_acpi btusb btrtl snd_hda_codec btintel btbcm snd_hda_core drm_buddy btmtk ttm snd_hwdep bluetooth drm_display_helper snd_pcm cmdlinepart mei_hdcp mei_pxp cec spi_nor snd_timer rapl ecdh_gene>
Jun 20 02:05:19 pve kernel: Modules linked in: tcp_diag inet_diag cmac nls_utf8 cifs cifs_arc4 nls_ucs2_utils rdma_cm iw_cm ib_cm ib_core cifs_md4 netfs cfg80211 veth ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables >
Jun 20 02:05:19 pve kernel:  </TASK>
Jun 20 02:05:19 pve kernel: R13: 0000000000000000 R14: 00007fff5ad83390 R15: 00007c35fd800000
Jun 20 02:05:19 pve kernel: R10: 00007c3600ca7f86 R11: 0000000000000293 R12: 00005cfe3a739c60
Jun 20 02:05:19 pve kernel: RBP: 00005cfe3a827c30 R08: 0000000000000000 R09: 00007c35fe0006c0
Jun 20 02:05:19 pve kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000018
Jun 20 02:05:19 pve kernel: RAX: ffffffffffffffda RBX: 00005cfe3a824fc0 RCX: 00007c3600d1db3a
Jun 20 02:05:19 pve kernel: RSP: 002b:00007c35fdfffc40 EFLAGS: 00000293 ORIG_RAX: 000000000000004a
Jun 20 02:05:19 pve kernel: Code: 48 3d 00 f0 ff ff 77 48 c3 0f 1f 80 00 00 00 00 48 83 ec 18 89 7c 24 0c e8 73 6c f8 ff 8b 7c 24 0c 89 c2 b8 4a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 36 89 d7 89 44 24 0c e8 d3 6c f8 ff 8b 44 24
/
Jun 20 02:05:19 pve kernel: Call Trace:
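For anyone trying to read the "#PF: error_code(...)" lines in the log above, here is a minimal Python sketch that decodes the low bits of an x86 page-fault error code. The function name decode_pf_error_code is my own, and only the five lowest bits are handled. It is a reading aid for the two codes in this log (0x0002 and 0x0010), not a diagnosis of the crash.

```python
def decode_pf_error_code(code: int) -> dict:
    """Decode the low bits of an x86 page-fault error code.

    Bit 0: 0 = page not present, 1 = protection violation
    Bit 1: 1 = write access
    Bit 2: 0 = supervisor (kernel) mode, 1 = user mode
    Bit 3: 1 = reserved bit set in a page-table entry
    Bit 4: 1 = instruction fetch
    """
    if code & 0x10:
        access = "instruction fetch"
    elif code & 0x02:
        access = "write access"
    else:
        access = "read access"
    return {
        "page": "protection violation" if code & 0x01 else "not-present page",
        "access": access,
        "mode": "user" if code & 0x04 else "supervisor",
        "reserved_bit": bool(code & 0x08),
    }


if __name__ == "__main__":
    # The two error codes that appear in the log above.
    for code in (0x0002, 0x0010):
        print(f"0x{code:04x}", decode_pf_error_code(code))
```

Read this way, the ffmpeg oops (0x0002) is a kernel-mode write to a not-present page at address 0, and the journal-offline oops (0x0010) is a kernel-mode instruction fetch from address 0, which matches the "RIP: 0010:0x0" line.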
 
Same issue on a new install. The only VM running is pfsense. The firewall kept working without issue, but I could not reach proxmox over the network, and the console was full of "watchdog: BUG: soft lockup - CPU#2 stuck" type messages.


Jun 22 07:11:14 server01 kernel: BUG: scheduling while atomic: kworker/3:1/299145/0x00000002
Jun 22 07:11:14 server01 kernel: Modules linked in: veth tcp_diag inet_diag ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables iptable_filter nf_tables 8021q garp mrp bonding tls sunrpc nfnetlink_log nfnetlink binfmt_misc snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common intel_tcc_cooling x86_pkg_temp_thermal intel_powerclamp snd_sof_pci_intel_cnl snd_sof_intel_hda_common soundwire_intel snd_sof_intel_hda_mlink soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils coretemp snd_soc_hdac_hda snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi kvm_intel soundwire_generic_allocation soundwire_bus kvm snd_soc_core crct10dif_pclmul polyval_clmulni polyval_generic snd_compress ghash_clmulni_intel ac97_bus sha256_ssse3 snd_pcm_dmaengine sha1_ssse3 aesni_intel iwlmvm crypto_simd snd_hda_intel cryptd mac80211 btusb snd_intel_dspcfg btrtl libarc4 snd_intel_sdw_acpi i915 btintel
Jun 22 07:11:14 server01 kernel: snd_hda_codec btbcm cmdlinepart btmtk mei_pxp mei_hdcp rapl spi_nor bluetooth snd_hda_core ecdh_generic snd_hwdep mtd intel_cstate think_lmi drm_buddy ecc ttm pcspkr iwlwifi intel_wmi_thunderbolt intel_pmc_core snd_pcm ee1004 firmware_attributes_class wmi_bmof snd_timer drm_display_helper mei_me intel_vsec input_leds snd cec pmt_telemetry rc_core joydev soundcore cfg80211 mei pmt_class acpi_pad mac_hid acpi_tad zfs(PO) spl(O) vhost_net vhost vhost_iotlb tap vfio_pci vfio_pci_core irqbypass vfio_iommu_type1 vfio iommufd efi_pstore dmi_sysfs ip_tables x_tables autofs4 dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio usbmouse hid_generic usbkbd usbhid hid uas usb_storage btrfs blake2b_generic xor raid6_pq libcrc32c xhci_pci igb xhci_pci_renesas nvme nvme_core e1000e spi_intel_pci crc32_pclmul nvme_auth xhci_hcd spi_intel i2c_i801 i2c_algo_bit i2c_smbus dca video wmi
Jun 22 07:11:14 server01 kernel: CPU: 3 PID: 299145 Comm: kworker/3:1 Tainted: P O 6.8.8-1-pve #1
Jun 22 07:11:14 server01 kernel: Hardware name: LENOVO 10S10001US/3135, BIOS M1UKT75A 01/26/2024
Jun 22 07:11:14 server01 kernel: Workqueue: events e1000_watchdog_task [e1000e]
Jun 22 07:11:14 server01 kernel: Call Trace:
Jun 22 07:11:14 server01 kernel: <TASK>
Jun 22 07:11:14 server01 kernel: dump_stack_lvl+0x76/0xa0
Jun 22 07:11:14 server01 kernel: dump_stack+0x10/0x20
Jun 22 07:11:14 server01 kernel: __schedule_bug+0x64/0x80
Jun 22 07:11:14 server01 kernel: __schedule+0x10f1/0x15e0
Jun 22 07:11:14 server01 kernel: ? clockevents_program_event+0xb3/0x140
Jun 22 07:11:14 server01 kernel: ? tick_program_event+0x43/0xa0
Jun 22 07:11:14 server01 kernel: ? hrtimer_reprogram+0x88/0xe0
Jun 22 07:11:14 server01 kernel: ? hrtimer_start_range_ns+0x138/0x390
Jun 22 07:11:14 server01 kernel: schedule+0x33/0x110
Jun 22 07:11:14 server01 kernel: schedule_hrtimeout_range_clock+0xbc/0x130
Jun 22 07:11:14 server01 kernel: ? __pfx_hrtimer_wakeup+0x10/0x10
Jun 22 07:11:14 server01 kernel: schedule_hrtimeout_range+0x13/0x30
Jun 22 07:11:14 server01 kernel: usleep_range_state+0x65/0xa0
Jun 22 07:11:14 server01 kernel: e1000e_read_phy_reg_mdic+0x98/0x2a0 [e1000e]
Jun 22 07:11:14 server01 kernel: e1000e_update_stats+0x52b/0x730 [e1000e]
Jun 22 07:11:14 server01 kernel: e1000_watchdog_task+0xf7/0xa90 [e1000e]
Jun 22 07:11:14 server01 kernel: process_one_work+0x16a/0x350
Jun 22 07:11:14 server01 kernel: worker_thread+0x306/0x440
Jun 22 07:11:14 server01 kernel: ? __pfx_worker_thread+0x10/0x10
Jun 22 07:11:14 server01 kernel: kthread+0xef/0x120
Jun 22 07:11:14 server01 kernel: ? __pfx_kthread+0x10/0x10
Jun 22 07:11:14 server01 kernel: ret_from_fork+0x44/0x70
Jun 22 07:11:14 server01 kernel: ? __pfx_kthread+0x10/0x10
Jun 22 07:11:14 server01 kernel: ret_from_fork_asm+0x1b/0x30
Jun 22 07:11:14 server01 kernel: </TASK>
Jun 22 07:11:14 server01 kernel: vmbr0: port 1(eno2) entered blocking state
Jun 22 07:11:14 server01 kernel: vmbr0: port 1(eno2) entered forwarding state
Jun 22 07:11:41 server01 kernel: watchdog: BUG: soft lockup - CPU#3 stuck for 26s! [kworker/3:0:300043]
Jun 22 07:11:41 server01 kernel: Modules linked in: veth tcp_diag inet_diag ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables iptable_filter nf_tables 8021q garp mrp bonding tls sunrpc nfnetlink_log nfnetlink binfmt_misc snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common intel_tcc_cooling x86_pkg_temp_thermal intel_powerclamp snd_sof_pci_intel_cnl snd_sof_intel_hda_common soundwire_intel snd_sof_intel_hda_mlink soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils coretemp snd_soc_hdac_hda snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi kvm_intel soundwire_generic_allocation soundwire_bus kvm snd_soc_core crct10dif_pclmul polyval_clmulni polyval_generic snd_compress ghash_clmulni_intel ac97_bus sha256_ssse3 snd_pcm_dmaengine sha1_ssse3 aesni_intel iwlmvm crypto_simd snd_hda_intel cryptd mac80211 btusb snd_intel_dspcfg btrtl libarc4 snd_intel_sdw_acpi i915 btintel
Jun 22 07:11:41 server01 kernel: snd_hda_codec btbcm cmdlinepart btmtk mei_pxp mei_hdcp rapl spi_nor bluetooth snd_hda_core ecdh_generic snd_hwdep mtd intel_cstate think_lmi drm_buddy ecc ttm pcspkr iwlwifi intel_wmi_thunderbolt intel_pmc_core snd_pcm ee1004 firmware_attributes_class wmi_bmof snd_timer drm_display_helper mei_me intel_vsec input_leds snd cec pmt_telemetry rc_core joydev soundcore cfg80211 mei pmt_class acpi_pad mac_hid acpi_tad zfs(PO) spl(O) vhost_net vhost vhost_iotlb tap vfio_pci vfio_pci_core irqbypass vfio_iommu_type1 vfio iommufd efi_pstore dmi_sysfs ip_tables x_tables autofs4 dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio usbmouse hid_generic usbkbd usbhid hid uas usb_storage btrfs blake2b_generic xor raid6_pq libcrc32c xhci_pci igb xhci_pci_renesas nvme nvme_core e1000e spi_intel_pci crc32_pclmul nvme_auth xhci_hcd spi_intel i2c_i801 i2c_algo_bit i2c_smbus dca video wmi
Jun 22 07:11:41 server01 kernel: CPU: 3 PID: 300043 Comm: kworker/3:0 Tainted: P W O 6.8.8-1-pve #1
Jun 22 07:11:41 server01 kernel: Hardware name: LENOVO 10S10001US/3135, BIOS M1UKT75A 01/26/2024
Jun 22 07:11:41 server01 kernel: Workqueue: events linkwatch_event
Jun 22 07:11:41 server01 kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x7f/0x2d0
Jun 22 07:11:41 server01 kernel: Code: 00 00 f0 0f ba 2b 08 0f 92 c2 8b 03 0f b6 d2 c1 e2 08 30 e4 09 d0 3d ff 00 00 00 77 5f 85 c0 74 10 0f b6 03 84 c0 74 09 f3 90 <0f> b6 03 84 c0 75 f7 b8 01 00 00 00 66 89 03 5b 41 5c 41 5d 41 5e
Jun 22 07:11:41 server01 kernel: RSP: 0018:ffffacb5d2efbb50 EFLAGS: 00000202
Jun 22 07:11:41 server01 kernel: RAX: 0000000000000001 RBX: ffff8c46140d7448 RCX: 0000000000000000
Jun 22 07:11:41 server01 kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8c46140d7448
Jun 22 07:11:41 server01 kernel: RBP: ffffacb5d2efbb70 R08: 0000000000000000 R09: 0000000000000004
Jun 22 07:11:41 server01 kernel: R10: ffff8c47e4ad1204 R11: 0000000000000011 R12: ffff8c47e4ad113c
Jun 22 07:11:41 server01 kernel: R13: ffff8c46140d7448 R14: ffff8c47e4ad1000 R15: 0000000000000000
Jun 22 07:11:41 server01 kernel: FS: 0000000000000000(0000) GS:ffff8c557f580000(0000) knlGS:0000000000000000
Jun 22 07:11:41 server01 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 22 07:11:41 server01 kernel: CR2: 000005b256d49368 CR3: 000000104e636002 CR4: 00000000003726f0
Jun 22 07:11:41 server01 kernel: Call Trace:
Jun 22 07:11:41 server01 kernel: <IRQ>
Jun 22 07:11:41 server01 kernel: ? show_regs+0x6d/0x80
Jun 22 07:11:41 server01 kernel: ? watchdog_timer_fn+0x206/0x290
Jun 22 07:11:41 server01 kernel: ? __pfx_watchdog_timer_fn+0x10/0x10
Jun 22 07:11:41 server01 kernel: ? __hrtimer_run_queues+0x105/0x280
Jun 22 07:11:41 server01 kernel: ? clockevents_program_event+0xb3/0x140
Jun 22 07:11:41 server01 kernel: ? hrtimer_interrupt+0xf6/0x250
Jun 22 07:11:41 server01 kernel: ? __sysvec_apic_timer_interrupt+0x4e/0x150
Jun 22 07:11:41 server01 kernel: ? sysvec_apic_timer_interrupt+0x8d/0xd0
Jun 22 07:11:41 server01 kernel: </IRQ>
Jun 22 07:11:41 server01 kernel: <TASK>
Jun 22 07:11:41 server01 kernel: ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
Jun 22 07:11:41 server01 kernel: ? native_queued_spin_lock_slowpath+0x7f/0x2d0
Jun 22 07:11:41 server01 kernel: _raw_spin_lock+0x3f/0x60
Jun 22 07:11:41 server01 kernel: e1000e_get_stats64+0x23/0x140 [e1000e]
Jun 22 07:11:41 server01 kernel: dev_get_stats+0x5e/0x120
Jun 22 07:11:41 server01 kernel: rtnl_fill_stats+0x40/0x140
Jun 22 07:11:41 server01 kernel: rtnl_fill_ifinfo+0x921/0x16f0
Jun 22 07:11:41 server01 kernel: rtmsg_ifinfo_build_skb+0xa4/0x120
Jun 22 07:11:41 server01 kernel: rtmsg_ifinfo+0x4d/0xb0
Jun 22 07:11:41 server01 kernel: netdev_state_change+0x95/0xa0
Jun 22 07:11:41 server01 kernel: linkwatch_do_dev+0x5b/0x70
Jun 22 07:11:41 server01 kernel: __linkwatch_run_queue+0xdf/0x200
Jun 22 07:11:41 server01 kernel: linkwatch_event+0x31/0x40
Jun 22 07:11:41 server01 kernel: process_one_work+0x16a/0x350
Jun 22 07:11:41 server01 kernel: worker_thread+0x306/0x440
Jun 22 07:11:41 server01 kernel: ? __pfx_worker_thread+0x10/0x10
Jun 22 07:11:41 server01 kernel: kthread+0xef/0x120
Jun 22 07:11:41 server01 kernel: ? __pfx_kthread+0x10/0x10
Jun 22 07:11:41 server01 kernel: ret_from_fork+0x44/0x70
Jun 22 07:11:41 server01 kernel: ? __pfx_kthread+0x10/0x10
Jun 22 07:11:41 server01 kernel: ret_from_fork_asm+0x1b/0x30
Jun 22 07:11:41 server01 kernel: </TASK>
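Both traces above pass through the e1000e driver. In the first, the watchdog task ends up in usleep_range_state (a sleep) and the kernel reports "scheduling while atomic". In the second, a worker spins in _raw_spin_lock called from e1000e_get_stats64. A small Python sketch for pulling the module-tagged frames out of a pasted trace follows. The function name module_frames and the regular expression are my own, and SAMPLE is just a few lines copied from the log above.

```python
import re

# Matches a stack frame such as "func+0x98/0x2a0 [module]" and captures
# the function name and the module name in brackets.
FRAME_RE = re.compile(
    r"(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[(?P<module>\w+)\]"
)


def module_frames(log_text: str) -> list:
    """Return (function, module) pairs for frames that belong to a module."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append((match.group("func"), match.group("module")))
    return frames


SAMPLE = """\
Jun 22 07:11:14 server01 kernel: usleep_range_state+0x65/0xa0
Jun 22 07:11:14 server01 kernel: e1000e_read_phy_reg_mdic+0x98/0x2a0 [e1000e]
Jun 22 07:11:14 server01 kernel: e1000e_update_stats+0x52b/0x730 [e1000e]
Jun 22 07:11:14 server01 kernel: e1000_watchdog_task+0xf7/0xa90 [e1000e]
Jun 22 07:11:41 server01 kernel: _raw_spin_lock+0x3f/0x60
Jun 22 07:11:41 server01 kernel: e1000e_get_stats64+0x23/0x140 [e1000e]
Jun 22 07:11:41 server01 kernel: dev_get_stats+0x5e/0x120
"""

if __name__ == "__main__":
    for func, module in module_frames(SAMPLE):
        print(f"{module}: {func}")
```

Frames without a bracketed module name (core kernel functions) are skipped, so the output only lists the driver functions on the stack.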
 
I seem to have the same error, also on a Lenovo machine like @zaforic (an M920x Tiny in my case). After updating to Proxmox 8.2.4 / kernel 6.8.8-pve I got the same errors, and my cluster also broke down because the nodes no longer saw each other and quorum was lost.
Rebooting with the older kernel 6.8.4-pve, which had worked before, fixed the issue, and everything is working well again. So it seems there is some bug in 6.8.8 causing this.
Edit: possibly it's this bug: https://bugzilla.kernel.org/show_bug.cgi?id=218740
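If you want the node to keep booting the older kernel until a fix lands, Proxmox ships proxmox-boot-tool with "kernel list" and "kernel pin" subcommands. Below is a minimal Python sketch that only builds and prints those commands and does not run them. The helper name kernel_pin_commands and the placeholder version string are my own. Take the exact version string from the "kernel list" output on your own node.

```python
import shlex


def kernel_pin_commands(version: str, next_boot_only: bool = False) -> list:
    """Build the proxmox-boot-tool commands to list kernels and pin one.

    `version` must be a kernel version exactly as printed by
    `proxmox-boot-tool kernel list` on the node.
    """
    pin = ["proxmox-boot-tool", "kernel", "pin", version]
    if next_boot_only:
        # Pin for the next boot only instead of permanently.
        pin.append("--next-boot")
    return [["proxmox-boot-tool", "kernel", "list"], pin]


if __name__ == "__main__":
    # Placeholder only: replace with a version from `kernel list`.
    for cmd in kernel_pin_commands("<older-kernel-version>"):
        print(shlex.join(cmd))
```

Run the printed commands as root on the node, and remove the pin later with "proxmox-boot-tool kernel unpin" once a fixed kernel is installed.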
 
I have the same M920x with an Intel i350-T4. Since my last post I have not had any issues. I was plugging and unplugging network cables around the time the issues started. Below are the changes I have made since. I do not know whether they are related in any way, but maybe they will help.

1. Set "spec-ctrl" back to default for my pfsense VM. The processor is set to type HOST. The only other thing set is aes.
2. I turned off logging for pfBlockerNG running in pfsense. The disks are NVMe with LVM, but I saw a post about excessive disk I/O, and I do not need the logs.
3. Halted pfsense, then rebooted proxmox after these changes.

This might be related: https://forum.proxmox.com/threads/vm-cpu-issues-watchdog-bug-soft-lockup-cpu-7-stuck-for-22s.107379/
 
Jun 22 07:11:41 server01 kernel: FS: 0000000000000000(0000) GS:ffff8c557f580000(0000) knlGS:0000000000000000
Jun 22 07:11:41 server01 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 22 07:11:41 server01 kernel: CR2: 000005b256d49368 CR3: 000000104e636002 CR4: 00000000003726f0
Jun 22 07:11:41 server01 kernel: Call Trace:
Jun 22 07:11:41 server01 kernel: <IRQ>
Jun 22 07:11:41 server01 kernel: ? show_regs+0x6d/0x80
Jun 22 07:11:41 server01 kernel: ? watchdog_timer_fn+0x206/0x290
Jun 22 07:11:41 server01 kernel: ? __pfx_watchdog_timer_fn+0x10/0x10
Jun 22 07:11:41 server01 kernel: ? __hrtimer_run_queues+0x105/0x280
Jun 22 07:11:41 server01 kernel: ? clockevents_program_event+0xb3/0x140
Jun 22 07:11:41 server01 kernel: ? hrtimer_interrupt+0xf6/0x250
Jun 22 07:11:41 server01 kernel: ? __sysvec_apic_timer_interrupt+0x4e/0x150
Jun 22 07:11:41 server01 kernel: ? sysvec_apic_timer_interrupt+0x8d/0xd0
Jun 22 07:11:41 server01 kernel: </IRQ>
Jun 22 07:11:41 server01 kernel: <TASK>
Jun 22 07:11:41 server01 kernel: ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
Jun 22 07:11:41 server01 kernel: ? native_queued_spin_lock_slowpath+0x7f/0x2d0
Jun 22 07:11:41 server01 kernel: _raw_spin_lock+0x3f/0x60
Jun 22 07:11:41 server01 kernel: e1000e_get_stats64+0x23/0x140 [e1000e]
Jun 22 07:11:41 server01 kernel: dev_get_stats+0x5e/0x120
Jun 22 07:11:41 server01 kernel: rtnl_fill_stats+0x40/0x140
Jun 22 07:11:41 server01 kernel: rtnl_fill_ifinfo+0x921/0x16f0
Jun 22 07:11:41 server01 kernel: rtmsg_ifinfo_build_skb+0xa4/0x120
Jun 22 07:11:41 server01 kernel: rtmsg_ifinfo+0x4d/0xb0
Jun 22 07:11:41 server01 kernel: netdev_state_change+0x95/0xa0
Jun 22 07:11:41 server01 kernel: linkwatch_do_dev+0x5b/0x70
Jun 22 07:11:41 server01 kernel: __linkwatch_run_queue+0xdf/0x200
Jun 22 07:11:41 server01 kernel: linkwatch_event+0x31/0x40
Jun 22 07:11:41 server01 kernel: process_one_work+0x16a/0x350
Jun 22 07:11:41 server01 kernel: worker_thread+0x306/0x440
Jun 22 07:11:41 server01 kernel: ? __pfx_worker_thread+0x10/0x10
Jun 22 07:11:41 server01 kernel: kthread+0xef/0x120
Jun 22 07:11:41 server01 kernel: ? __pfx_kthread+0x10/0x10
Jun 22 07:11:41 server01 kernel: ret_from_fork+0x44/0x70
Jun 22 07:11:41 server01 kernel: ? __pfx_kthread+0x10/0x10
Jun 22 07:11:41 server01 kernel: ret_from_fork_asm+0x1b/0x30
Jun 22 07:11:41 server01 kernel: </TASK>
Similar to https://bbs.archlinux.org/viewtopic.php?pid=2165871#p2165871
https://github.com/torvalds/linux/commit/387f295cb2150ed164905b648d76dfcbd3621778
https://bugzilla.kernel.org/show_bug.cgi?id=218740
https://discussion.fedoraproject.org/t/system-crash-when-chaging-network-configuration/116219/9
When is kernel 6.8.10 or 6.9 coming for Proxmox?
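
For anyone who wants to check whether their own crash matches this pattern, here is a minimal Python sketch. It scans kernel log text for the signature visible in the traces above: a "scheduling while atomic" report whose call trace shows usleep_range_state called from e1000e_read_phy_reg_mdic. The function name matches_e1000e_pattern is made up for illustration and is not part of any Proxmox or kernel tool; the matched strings are copied from the logs in this thread and may look different on other kernel versions.

```python
import re

# Signatures copied from the traces posted in this thread.
ATOMIC_BUG = re.compile(r"BUG: scheduling while atomic")
SLEEP_FRAME = re.compile(r"usleep_range_state\+")
PHY_READ_FRAME = re.compile(r"e1000e_read_phy_reg_mdic\+.*\[e1000e\]")


def matches_e1000e_pattern(log_text: str) -> bool:
    """Return True if a 'scheduling while atomic' trace shows a sleep
    (usleep_range_state) called from the e1000e PHY register read."""
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        if not ATOMIC_BUG.search(line):
            continue
        saw_sleep = False
        # Inspect only the frames that belong to this trace.
        for frame in lines[i + 1:]:
            if "</TASK>" in frame:
                break
            if SLEEP_FRAME.search(frame):
                saw_sleep = True
            elif saw_sleep and PHY_READ_FRAME.search(frame):
                return True
    return False
```

To use it, save the kernel messages to a file (for example the output of journalctl -k) and pass the file contents to the function. A True result only means the log looks like the ones here; it does not prove the same root cause.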
 
Same issue after upgrading to kernel 6.8.8; it has happened twice within a week.

Jun 24 03:12:13 proxmox-node-01 kernel: e1000e 0000:00:1f.6 eno1: NIC Link is Down
Jun 24 03:12:13 proxmox-node-01 kernel: vmbr1: port 1(eno1) entered disabled state
Jun 24 03:12:21 proxmox-node-01 kernel: e1000e 0000:00:1f.6 eno1: NIC Link is Up 10 Mbps Half Duplex, Flow Control: Rx/Tx
Jun 24 03:12:21 proxmox-node-01 kernel: BUG: scheduling while atomic: kworker/6:0/274052/0x00000002
Jun 24 03:12:21 proxmox-node-01 kernel: Modules linked in: udp_diag tcp_diag inet_diag cfg80211 vfio_pci vfio_pci_core vfio_iommu_type1 vfio iommufd vet>
Jun 24 03:12:21 proxmox-node-01 kernel: int3403_thermal firmware_attributes_class mtd wmi_bmof mei i2c_algo_bit intel_vsec acpi_tad int340x_thermal_zon>
Jun 24 03:12:21 proxmox-node-01 kernel: CPU: 6 PID: 274052 Comm: kworker/6:0 Tainted: P O 6.8.8-1-pve #1
Jun 24 03:12:21 proxmox-node-01 kernel: Hardware name: LENOVO 11XJCTO1WW/3309, BIOS M4GKT29A 07/05/2023
Jun 24 03:12:21 proxmox-node-01 kernel: Workqueue: events e1000_watchdog_task [e1000e]
Jun 24 03:12:21 proxmox-node-01 kernel: Call Trace:
Jun 24 03:12:21 proxmox-node-01 kernel: <TASK>
Jun 24 03:12:21 proxmox-node-01 kernel: dump_stack_lvl+0x76/0xa0
Jun 24 03:12:21 proxmox-node-01 kernel: dump_stack+0x10/0x20
Jun 24 03:12:21 proxmox-node-01 kernel: __schedule_bug+0x64/0x80
Jun 24 03:12:21 proxmox-node-01 kernel: __schedule+0x10f1/0x15e0
Jun 24 03:12:21 proxmox-node-01 kernel: ? clockevents_program_event+0xb3/0x140
Jun 24 03:12:21 proxmox-node-01 kernel: ? tick_program_event+0x43/0xa0
Jun 24 03:12:21 proxmox-node-01 kernel: ? hrtimer_reprogram+0x88/0xe0
Jun 24 03:12:21 proxmox-node-01 kernel: ? hrtimer_start_range_ns+0x138/0x390
Jun 24 03:12:21 proxmox-node-01 kernel: schedule+0x33/0x110
Jun 24 03:12:21 proxmox-node-01 kernel: schedule_hrtimeout_range_clock+0xbc/0x130
Jun 24 03:12:21 proxmox-node-01 kernel: ? __pfx_hrtimer_wakeup+0x10/0x10
Jun 24 03:12:21 proxmox-node-01 kernel: schedule_hrtimeout_range+0x13/0x30
Jun 24 03:12:21 proxmox-node-01 kernel: usleep_range_state+0x65/0xa0
Jun 24 03:12:21 proxmox-node-01 kernel: e1000e_read_phy_reg_mdic+0x98/0x2a0 [e1000e]
Jun 24 03:12:21 proxmox-node-01 kernel: e1000e_update_stats+0x52b/0x730 [e1000e]
Jun 24 03:12:21 proxmox-node-01 kernel: e1000_watchdog_task+0xf7/0xa90 [e1000e]
Jun 24 03:12:21 proxmox-node-01 kernel: process_one_work+0x16a/0x350
Jun 24 03:12:21 proxmox-node-01 kernel: worker_thread+0x306/0x440
Jun 24 03:12:21 proxmox-node-01 kernel: ? __pfx_worker_thread+0x10/0x10
Jun 24 03:12:21 proxmox-node-01 kernel: kthread+0xef/0x120
Jun 24 03:12:21 proxmox-node-01 kernel: ? __pfx_kthread+0x10/0x10
Jun 24 03:12:21 proxmox-node-01 kernel: ret_from_fork+0x44/0x70
Jun 24 03:12:21 proxmox-node-01 kernel: ? __pfx_kthread+0x10/0x10
Jun 24 03:12:21 proxmox-node-01 kernel: ret_from_fork_asm+0x1b/0x30
Jun 24 03:12:21 proxmox-node-01 kernel: </TASK>
Jun 24 03:12:21 proxmox-node-01 kernel: vmbr1: port 1(eno1) entered blocking state
Jun 24 03:12:21 proxmox-node-01 kernel: vmbr1: port 1(eno1) entered forwarding state
Jun 24 03:12:41 proxmox-node-01 pvedaemon[508378]: <root@pam> successful auth for user 'monitoring@pve'
Jun 24 03:12:46 proxmox-node-01 kernel: watchdog: BUG: soft lockup - CPU#8 stuck for 23s! [pvestatd:1226]
Jun 24 03:12:46 proxmox-node-01 kernel: Modules linked in: udp_diag tcp_diag inet_diag cfg80211 vfio_pci vfio_pci_core vfio_iommu_type1 vfio iommufd vet>
Jun 24 03:12:46 proxmox-node-01 kernel: int3403_thermal firmware_attributes_class mtd wmi_bmof mei i2c_algo_bit intel_vsec acpi_tad int340x_thermal_zon>
Jun 24 03:12:46 proxmox-node-01 kernel: CPU: 8 PID: 1226 Comm: pvestatd Tainted: P W O 6.8.8-1-pve #1
Jun 24 03:12:46 proxmox-node-01 kernel: Hardware name: LENOVO 11XJCTO1WW/3309, BIOS M4GKT29A 07/05/2023
Jun 24 03:12:46 proxmox-node-01 kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x229/0x2d0
Jun 24 03:12:46 proxmox-node-01 kernel: Code: 4e 01 41 c1 e5 10 c1 e1 12 44 09 e9 89 c8 c1 e8 10 66 87 43 02 89 c2 c1 e2 10 81 fa ff ff 00 00 77 37 31 d>
Jun 24 03:12:46 proxmox-node-01 kernel: RSP: 0018:ffffb84bc55e3b40 EFLAGS: 00000202
Jun 24 03:12:46 proxmox-node-01 kernel: RAX: 0000000000240101 RBX: ffff8f3093ea7448 RCX: 0000000000240000
Jun 24 03:12:46 proxmox-node-01 kernel: RDX: 0000000000000000 RSI: 0000000000000101 RDI: ffff8f3093ea7448
Jun 24 03:12:46 proxmox-node-01 kernel: RBP: ffffb84bc55e3b60 R08: 0000000000000000 R09: 0000000000000000
Jun 24 03:12:46 proxmox-node-01 kernel: R10: 00000000000001ca R11: 0000000001000001 R12: ffff8f36184359c0
Jun 24 03:12:46 proxmox-node-01 kernel: R13: 0000000000000000 R14: 0000000000000008 R15: ffff8f30840a5ac8
Jun 24 03:12:46 proxmox-node-01 kernel: FS: 000070166cd4c740(0000) GS:ffff8f3618400000(0000) knlGS:0000000000000000
Jun 24 03:12:46 proxmox-node-01 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 24 03:12:46 proxmox-node-01 kernel: CR2: 00005815803846e9 CR3: 00000001037d0000 CR4: 0000000000f52ef0
Jun 24 03:12:46 proxmox-node-01 kernel: PKRU: 55555554
Jun 24 03:12:46 proxmox-node-01 kernel: Call Trace:
Jun 24 03:12:46 proxmox-node-01 kernel: <IRQ>
Jun 24 03:12:46 proxmox-node-01 kernel: ? show_regs+0x6d/0x80
Jun 24 03:12:46 proxmox-node-01 kernel: ? watchdog_timer_fn+0x206/0x290
Jun 24 03:12:46 proxmox-node-01 kernel: ? __pfx_watchdog_timer_fn+0x10/0x10
Jun 24 03:12:46 proxmox-node-01 kernel: ? __hrtimer_run_queues+0x105/0x280
Jun 24 03:12:46 proxmox-node-01 kernel: ? clockevents_program_event+0xb3/0x140
Jun 24 03:12:46 proxmox-node-01 kernel: ? hrtimer_interrupt+0xf6/0x250
Jun 24 03:12:46 proxmox-node-01 kernel: ? __sysvec_apic_timer_interrupt+0x4e/0x150
Jun 24 03:12:46 proxmox-node-01 kernel: ? sysvec_apic_timer_interrupt+0x8d/0xd0
Jun 24 03:12:46 proxmox-node-01 kernel: </IRQ>
Jun 24 03:12:46 proxmox-node-01 kernel: <TASK>
Jun 24 03:12:46 proxmox-node-01 kernel: ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
Jun 24 03:12:46 proxmox-node-01 kernel: ? native_queued_spin_lock_slowpath+0x229/0x2d0
Jun 24 03:12:46 proxmox-node-01 kernel: _raw_spin_lock+0x3f/0x60
Jun 24 03:12:46 proxmox-node-01 kernel: e1000e_get_stats64+0x23/0x140 [e1000e]
Jun 24 03:12:46 proxmox-node-01 kernel: dev_get_stats+0x5e/0x120
Jun 24 03:12:46 proxmox-node-01 kernel: dev_seq_printf_stats+0x49/0x100
Jun 24 03:12:46 proxmox-node-01 kernel: dev_seq_show+0x14/0x40
Jun 24 03:12:46 proxmox-node-01 kernel: seq_read_iter+0x2c6/0x4a0
Jun 24 03:12:46 proxmox-node-01 kernel: seq_read+0x11e/0x160
Jun 24 03:12:46 proxmox-node-01 kernel: proc_reg_read+0x69/0xb0
Jun 24 03:12:46 proxmox-node-01 kernel: vfs_read+0xad/0x390
Jun 24 03:12:46 proxmox-node-01 kernel: ? __pfx_proc_put_link+0x10/0x10
Jun 24 03:12:46 proxmox-node-01 kernel: ? __pfx_kfree_link+0x10/0x10
Jun 24 03:12:46 proxmox-node-01 kernel: ksys_read+0x73/0x100
Jun 24 03:12:46 proxmox-node-01 kernel: __x64_sys_read+0x19/0x30
Jun 24 03:12:46 proxmox-node-01 kernel: x64_sys_call+0x23f0/0x24b0
Jun 24 03:12:46 proxmox-node-01 kernel: do_syscall_64+0x81/0x170
Jun 24 03:12:46 proxmox-node-01 kernel: ? syscall_exit_to_user_mode+0x89/0x260
Jun 24 03:12:46 proxmox-node-01 kernel: ? do_syscall_64+0x8d/0x170
Jun 24 03:12:46 proxmox-node-01 kernel: ? exc_page_fault+0x94/0x1b0
Jun 24 03:12:46 proxmox-node-01 kernel: entry_SYSCALL_64_after_hwframe+0x78/0x80
Jun 24 03:12:46 proxmox-node-01 kernel: RIP: 0033:0x70166ce8319d
Jun 24 03:12:46 proxmox-node-01 kernel: Code: 31 c0 e9 c6 fe ff ff 50 48 8d 3d 66 54 0a 00 e8 49 ff 01 00 66 0f 1f 84 00 00 00 00 00 80 3d 41 24 0e 00 0>
Jun 24 03:12:46 proxmox-node-01 kernel: RSP: 002b:00007fff778e57a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
Jun 24 03:12:46 proxmox-node-01 kernel: RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000070166ce8319d
Jun 24 03:12:46 proxmox-node-01 kernel: RDX: 0000000000002000 RSI: 00005815805a92c0 RDI: 0000000000000008
Jun 24 03:12:46 proxmox-node-01 kernel: RBP: 0000000000002000 R08: 0000000000000000 R09: 000070166cf5dd30
Jun 24 03:12:46 proxmox-node-01 kernel: R10: 00005815805a92c0 R11: 0000000000000246 R12: 00005815805a92c0
Jun 24 03:12:46 proxmox-node-01 kernel: R13: 000058157a9ae2a0 R14: 0000000000000008 R15: 000058157cfd2c70
Jun 24 03:12:46 proxmox-node-01 kernel: </TASK>
 
