I'm running into a similar problem.
Linux version 6.8.12-10-pve (build@proxmox) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC PMX 6.8.12-10 (2025-04-18T07:39Z) ()
The motherboard is a server board (W680D4U-2L2T/G5), updated to the latest BIOS: 22.01, dated 10/01/2024
CPU i9-14900K
ECC memory fully tested
It happens at random, roughly every 3-4 days. Proxmox then hangs, and the only way to recover is a power cycle.
Any hints?
Thanks a lot in advance!
Code:
May 25 05:40:21 oa-nas kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000
May 25 05:40:21 oa-nas kernel: #PF: supervisor write access in kernel mode
May 25 05:40:21 oa-nas kernel: #PF: error_code(0x0002) - not-present page
May 25 05:40:21 oa-nas kernel: PGD 0 P4D 0
May 25 05:40:21 oa-nas kernel: Oops: 0002 [#1] PREEMPT SMP NOPTI
May 25 05:40:21 oa-nas kernel: CPU: 4 PID: 2285716 Comm: z_wr_int_3 Tainted: P O 6.8.12-10-pve #1
May 25 05:40:21 oa-nas kernel: Hardware name: W680D4U-2L2T/G5/W680D4U-2L2T/G5, BIOS 22.01 10/01/2024
May 25 05:40:21 oa-nas kernel: RIP: 0010:add_wait_queue_exclusive+0x3b/0x60
May 25 05:40:21 oa-nas kernel: Code: fb 83 0e 01 e8 76 b5 ff 00 49 8d 54 24 18 48 8d 4b 08 48 89 df 48 89 c6 48 8b 43 10 48 89 53 10 49 89 4c 24 18 49 89 44 24 20 <48> 89 10 e8 4d b6 ff 00 5b 41 5c 5d 31 c0 31 d>
May 25 05:40:21 oa-nas kernel: RSP: 0018:ffffb1a6ba817da8 EFLAGS: 00010046
May 25 05:40:21 oa-nas kernel: RAX: 0000000000000000 RBX: ffff9e631fc636c0 RCX: ffff9e631fc636c8
May 25 05:40:21 oa-nas kernel: RDX: ffffb1a6ba817e10 RSI: 0000000000000002 RDI: ffff9e631fc636c0
May 25 05:40:21 oa-nas kernel: RBP: ffffb1a6ba817db8 R08: 0000000000000000 R09: 0000000000000000
May 25 05:40:21 oa-nas kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffffb1a6ba817df8
May 25 05:40:21 oa-nas kernel: R13: ffff9e631fc636c0 R14: ffff9e63b4386a80 R15: ffff9e631fc63600
May 25 05:40:21 oa-nas kernel: FS: 0000000000000000(0000) GS:ffff9e81fea00000(0000) knlGS:0000000000000000
May 25 05:40:21 oa-nas kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 25 05:40:21 oa-nas kernel: CR2: 0000000000000000 CR3: 00000015b2b16003 CR4: 0000000000f72ef0
May 25 05:40:21 oa-nas kernel: PKRU: 55555554
May 25 05:40:21 oa-nas kernel: Call Trace:
May 25 05:40:21 oa-nas kernel: <TASK>
May 25 05:40:21 oa-nas kernel: ? show_regs+0x6d/0x80
May 25 05:40:21 oa-nas kernel: ? __die+0x24/0x80
May 25 05:40:21 oa-nas kernel: ? page_fault_oops+0x176/0x500
May 25 05:40:21 oa-nas kernel: ? do_user_addr_fault+0x2f5/0x660
May 25 05:40:21 oa-nas kernel: ? exc_page_fault+0x83/0x1b0
May 25 05:40:21 oa-nas kernel: ? asm_exc_page_fault+0x27/0x30
May 25 05:40:21 oa-nas kernel: ? add_wait_queue_exclusive+0x3b/0x60
May 25 05:40:21 oa-nas kernel: ? add_wait_queue_exclusive+0x1a/0x60
May 25 05:40:21 oa-nas kernel: taskq_thread+0x3fd/0x4c0 [spl]
May 25 05:40:21 oa-nas kernel: ? __pfx_default_wake_function+0x10/0x10
May 25 05:40:21 oa-nas kernel: ? __pfx_zio_execute+0x10/0x10 [zfs]
May 25 05:40:21 oa-nas kernel: ? __pfx_taskq_thread+0x10/0x10 [spl]
May 25 05:40:21 oa-nas kernel: kthread+0xef/0x120
May 25 05:40:21 oa-nas kernel: ? __pfx_kthread+0x10/0x10
May 25 05:40:21 oa-nas kernel: ret_from_fork+0x44/0x70
May 25 05:40:21 oa-nas kernel: ? __pfx_kthread+0x10/0x10
May 25 05:40:21 oa-nas kernel: ret_from_fork_asm+0x1b/0x30
May 25 05:40:21 oa-nas kernel: </TASK>
May 25 05:40:21 oa-nas kernel: Modules linked in: nft_chain_nat xt_MASQUERADE nf_nat nft_compat cfg80211 veth nf_conntrack_netlink nfnetlink_acct udp_diag tcp_diag inet_diag wireguard curve25519_x86_64 libchacha>
May 25 05:40:21 oa-nas kernel: drm_exec gpu_sched drm_suballoc_helper snd_sof drm_ttm_helper x86_pkg_temp_thermal intel_powerclamp snd_sof_utils snd_soc_hdac_hda snd_hda_ext_core snd_soc_acpi_intel_match snd_so>
May 25 05:40:21 oa-nas kernel: hid_generic usbkbd usbmouse cdc_ether usbnet mii usbhid hid igb nvme xhci_pci xhci_pci_renesas i2c_algo_bit crc32_pclmul intel_lpss_pci spi_intel_pci nvme_core i40e i2c_i801 spi_i>
May 25 05:40:21 oa-nas kernel: CR2: 0000000000000000
May 25 05:40:21 oa-nas kernel: ---[ end trace 0000000000000000 ]---
May 25 05:40:21 oa-nas kernel: RIP: 0010:add_wait_queue_exclusive+0x3b/0x60
May 25 05:40:21 oa-nas kernel: Code: fb 83 0e 01 e8 76 b5 ff 00 49 8d 54 24 18 48 8d 4b 08 48 89 df 48 89 c6 48 8b 43 10 48 89 53 10 49 89 4c 24 18 49 89 44 24 20 <48> 89 10 e8 4d b6 ff 00 5b 41 5c 5d 31 c0 31 d>
May 25 05:40:21 oa-nas kernel: RSP: 0018:ffffb1a6ba817da8 EFLAGS: 00010046
May 25 05:40:21 oa-nas kernel: RAX: 0000000000000000 RBX: ffff9e631fc636c0 RCX: ffff9e631fc636c8
May 25 05:40:21 oa-nas kernel: RDX: ffffb1a6ba817e10 RSI: 0000000000000002 RDI: ffff9e631fc636c0
May 25 05:40:21 oa-nas kernel: RBP: ffffb1a6ba817db8 R08: 0000000000000000 R09: 0000000000000000
May 25 05:40:21 oa-nas kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffffb1a6ba817df8
May 25 05:40:21 oa-nas kernel: R13: ffff9e631fc636c0 R14: ffff9e63b4386a80 R15: ffff9e631fc63600
May 25 05:40:21 oa-nas kernel: FS: 0000000000000000(0000) GS:ffff9e81fea00000(0000) knlGS:0000000000000000
May 25 05:40:21 oa-nas kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 25 05:40:21 oa-nas kernel: CR2: 0000000000000000 CR3: 00000015b2b16003 CR4: 0000000000f72ef0
May 25 05:40:21 oa-nas kernel: PKRU: 55555554
May 25 05:40:21 oa-nas kernel: note: z_wr_int_3[2285716] exited with irqs disabled
May 25 05:40:21 oa-nas kernel: note: z_wr_int_3[2285716] exited with preempt_count 2
May 25 05:40:24 oa-nas kernel: {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
May 25 05:40:24 oa-nas kernel: {1}[Hardware Error]: It has been corrected by h/w and requires no further action
May 25 05:40:24 oa-nas kernel: {1}[Hardware Error]: event severity: corrected
May 25 05:40:24 oa-nas kernel: {1}[Hardware Error]: Error 0, type: corrected
May 25 05:40:24 oa-nas kernel: {1}[Hardware Error]: fru_text: CorrectedErr
May 25 05:40:24 oa-nas kernel: {1}[Hardware Error]: section_type: memory error
May 25 05:40:24 oa-nas kernel: {1}[Hardware Error]: node:1 device:0
May 25 05:40:24 oa-nas kernel: {1}[Hardware Error]: error_type: 2, single-bit ECC
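To see whether the oops and the corrected ECC reports tend to show up together across the other hangs, something like the following could be run over saved kernel logs. This is only a rough sketch: the `summarize` function, the event names, and the idea of feeding it text saved from `journalctl -k` are my own assumptions, and the sample lines are copied from the log above.

```python
import re
from collections import Counter

# A few lines copied from the log above. In practice the text would come
# from saved kernel log output (assumed here, e.g. `journalctl -k > k.log`).
SAMPLE = """\
May 25 05:40:21 oa-nas kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000
May 25 05:40:21 oa-nas kernel: RIP: 0010:add_wait_queue_exclusive+0x3b/0x60
May 25 05:40:24 oa-nas kernel: {1}[Hardware Error]: event severity: corrected
May 25 05:40:24 oa-nas kernel: {1}[Hardware Error]: section_type: memory error
May 25 05:40:24 oa-nas kernel: {1}[Hardware Error]: error_type: 2, single-bit ECC
"""

# Matches "<Mon> <day> <HH:MM:SS> <host> kernel: <message>".
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s[\d:]{8})\s(\S+)\skernel:\s(.*)$")


def summarize(text):
    """Return (counts, events) for kernel BUG lines and APEI severity lines."""
    counts = Counter()
    events = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        stamp, _host, msg = match.groups()
        if msg.startswith("BUG:"):
            kind = "oops"
        elif "[Hardware Error]" in msg and "severity:" in msg:
            # e.g. "event severity: corrected" -> "hw_error_corrected"
            kind = "hw_error_" + msg.rsplit(":", 1)[1].strip()
        else:
            continue
        counts[kind] += 1
        events.append((stamp, kind))
    return counts, events


if __name__ == "__main__":
    counts, events = summarize(SAMPLE)
    for stamp, kind in events:
        print(stamp, kind)
    print(dict(counts))
```

It only counts one event per `BUG:` line and one per `event severity:` line, so the repeated register dump after `end trace` does not get counted twice.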