Random 6.8.4-2-pve kernel crashes

Hi
Has anyone upgraded to the latest kernel version, 6.8.8-1, and can say whether the server freezes stopped?
I am thinking about upgrading to the latest kernel version or downgrading to 6.5.13-5.

Thanks.
 
No... 6.5 is good. Technically it's not a freeze; there's a kernel stack dump first. The kernel faults only happened in 6.8.x...
I see. If you see a kernel stack dump + freeze again, could you please copy it/take a screenshot? The full dump would be very useful for troubleshooting.
If it is in the console, you'll likely only see the last part of it. In that case, try pressing Shift+PgUp to scroll to the beginning of the dump (most likely a line starting with BUG: ...), and take another screenshot of the beginning of the dump. If scrolling no longer works, please just provide as much of the output as possible.
Please also provide the exact kernel version you were running at the time of the freeze.
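
For anyone collecting this after a reset, here is a minimal sketch of how the first line of each dump could be pulled out of a saved kernel log. It assumes the log was saved as text, for example from `journalctl -k -b -1` (kernel messages of the previous boot, which needs a persistent journal). The marker list and the function name `find_dump_starts` are illustrative only and not part of any Proxmox tooling.

```python
import re

# Markers that begin the dumps quoted in this thread.
# Assumption: this list is illustrative, not exhaustive.
DUMP_START = re.compile(r"BUG:|Oops:|general protection fault|PANIC at|VERIFY3?\(")


def find_dump_starts(journal_lines):
    """Return (line_number, text) for each line that looks like the start of a kernel dump."""
    hits = []
    for number, line in enumerate(journal_lines, start=1):
        if DUMP_START.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


# Sample lines in the same format as the logs posted in this thread.
sample = [
    "Aug 04 21:39:23 proxbeast-pve kernel: BUG: kernel NULL pointer dereference, address: 0000000000000030",
    "Aug 04 21:39:23 proxbeast-pve kernel: Call Trace:",
    "Aug 30 01:32:47 pve2 kernel: PANIC at arc.c:6622:arc_write_done()",
]
for number, text in find_dump_starts(sample):
    print(f"{number}: {text}")
```

The exact kernel version asked for above is what `uname -r` prints on the affected host.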

Previously I had pinned 6.5 and that's been very stable. I only switched to this newer 6.8.8 recently.
[...]
No, but that freeze could just be a coincidence. I agree it's likely thermal related... Gonna play more with this...

OK. It's possible there are multiple issues at play here, making troubleshooting more complicated. Let me know what you can find out.
 
Can someone tell me why pinning the latest 6.5 kernel would be a problem? I need stable uptime more than anything.
 
Hi
Has anyone upgraded to the latest kernel version, 6.8.8-1, and can say whether the server freezes stopped?
I am thinking about upgrading to the latest kernel version or downgrading to 6.5.13-5.

Thanks.
Hello,

My server has been running version 6.8.8-1 for just over 3 days without freezing.

(AMD Ryzen 3600 - 2 nvme drives with mdadm in RAID1)
 
Hi,

I have 6.8.8-2-pve installed, and every time there is heavy read load on the NVMe I get a random reboot. I have three identical servers.
The other two run 6.8.4-3-pve without problems.

I will try 6.8.4-pve on the problem one later today.
 
Hi Guys,

I think I am running into this issue. I am running kernel `6.8.8-3-pve` and am seeing random system resets. There is no indication of a shutdown or anything else in the logs. The system just starts from the beginning, like a hardware reset. Monitoring does not show anything suspicious either: no extreme CPU, RAM or IO usage.

I am running 2 NVMe drives in a ZFS mirror. Any ideas how to debug the cause?

Kind regards
 
Hitting some related instability myself. Still validating that it's not a memory or hardware issue. Here are some logs.

Looks related to this:

https://lore.kernel.org/linux-kerne.../T/#medf0876e578ca978ef27174c8c777820a3739b75

Code:
Linux proxbeast-pve 6.8.8-4-pve #1 SMP PREEMPT_DYNAMIC PMX 6.8.8-4 (2024-07-26T11:15Z) x86_64 GNU/Linux

Aug 04 21:39:23 proxbeast-pve kernel: BUG: kernel NULL pointer dereference, address: 0000000000000030
Aug 04 21:39:23 proxbeast-pve kernel: #PF: supervisor read access in kernel mode
Aug 04 21:39:23 proxbeast-pve kernel: #PF: error_code(0x0000) - not-present page
Aug 04 21:39:23 proxbeast-pve kernel: PGD 0 P4D 0
Aug 04 21:39:23 proxbeast-pve kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
Aug 04 21:39:23 proxbeast-pve kernel: CPU: 10 PID: 1627660 Comm: stress-ng-appar Tainted: P        W  O       6.8.8-4-pve #1
Aug 04 21:39:23 proxbeast-pve kernel: Hardware name: Micro-Star International Co., Ltd. MS-7E10/MPG B650 EDGE WIFI (MS-7E10), BIOS 1.G1 07/09/2024
Aug 04 21:39:23 proxbeast-pve kernel: RIP: 0010:aafs_create.constprop.0+0x7f/0x130
Aug 04 21:39:23 proxbeast-pve kernel: Code: 4c 63 e0 48 83 c4 18 4c 89 e0 5b 41 5c 41 5d 41 5e 41 5f 5d 31 d2 31 c9 31 f6 31 ff 45 31 c0 45 31 c9 45 31 d2 e9 8c 1c c3 00 <4d> 8b 55 30 4d 8d ba a0 00 00 00 4c 89 55 c0 4c 89 ff e8 2a 13 a8
Aug 04 21:39:23 proxbeast-pve kernel: RSP: 0018:ffff9960ea65bb90 EFLAGS: 00010246
Aug 04 21:39:23 proxbeast-pve kernel: RAX: 0000000000000000 RBX: 00000000000041ed RCX: 0000000000000000
Aug 04 21:39:23 proxbeast-pve kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Aug 04 21:39:23 proxbeast-pve kernel: RBP: ffff9960ea65bbd0 R08: 0000000000000000 R09: 0000000000000000
Aug 04 21:39:23 proxbeast-pve kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff8fbe157b
Aug 04 21:39:23 proxbeast-pve kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Aug 04 21:39:23 proxbeast-pve kernel: FS:  00007048f1a94d00(0000) GS:ffff8a74ee900000(0000) knlGS:0000000000000000
Aug 04 21:39:23 proxbeast-pve kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 04 21:39:23 proxbeast-pve kernel: CR2: 0000000000000030 CR3: 00000005f6ba8000 CR4: 0000000000f50ef0
Aug 04 21:39:23 proxbeast-pve kernel: PKRU: 55555554
Aug 04 21:39:23 proxbeast-pve kernel: Call Trace:
Aug 04 21:39:23 proxbeast-pve kernel:  <TASK>
Aug 04 21:39:23 proxbeast-pve kernel:  ? show_regs+0x6d/0x80
Aug 04 21:39:23 proxbeast-pve kernel:  ? __die+0x24/0x80
Aug 04 21:39:23 proxbeast-pve kernel:  ? page_fault_oops+0x176/0x500
Aug 04 21:39:23 proxbeast-pve kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Aug 04 21:39:23 proxbeast-pve kernel:  ? perf_trace_run_bpf_submit+0x75/0xe0
Aug 04 21:39:23 proxbeast-pve kernel:  ? do_user_addr_fault+0x2f9/0x6b0
Aug 04 21:39:23 proxbeast-pve kernel:  ? exc_page_fault+0x83/0x1b0
Aug 04 21:39:23 proxbeast-pve kernel:  ? asm_exc_page_fault+0x27/0x30
Aug 04 21:39:23 proxbeast-pve kernel:  ? aafs_create.constprop.0+0x7f/0x130
Aug 04 21:39:23 proxbeast-pve kernel:  __aafs_profile_mkdir+0x3d6/0x480
Aug 04 21:39:23 proxbeast-pve kernel:  aa_replace_profiles+0x862/0x1270
Aug 04 21:39:23 proxbeast-pve kernel:  policy_update+0xe3/0x180
Aug 04 21:39:23 proxbeast-pve kernel:  profile_replace+0xbc/0x150
Aug 04 21:39:23 proxbeast-pve kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Aug 04 21:39:23 proxbeast-pve kernel:  ? rw_verify_area+0x47/0x140
Aug 04 21:39:23 proxbeast-pve kernel:  vfs_write+0xfd/0x480
Aug 04 21:39:23 proxbeast-pve kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Aug 04 21:39:23 proxbeast-pve kernel:  ? perf_trace_run_bpf_submit+0x75/0xe0
Aug 04 21:39:23 proxbeast-pve kernel:  ksys_write+0x73/0x100
Aug 04 21:39:23 proxbeast-pve kernel:  __x64_sys_write+0x19/0x30
Aug 04 21:39:23 proxbeast-pve kernel:  x64_sys_call+0x23e1/0x24b0
Aug 04 21:39:23 proxbeast-pve kernel:  do_syscall_64+0x81/0x170
Aug 04 21:39:23 proxbeast-pve kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Aug 04 21:39:23 proxbeast-pve kernel:  ? syscall_exit_to_user_mode_prepare+0x14d/0x1a0
Aug 04 21:39:23 proxbeast-pve kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Aug 04 21:39:23 proxbeast-pve kernel:  ? syscall_exit_to_user_mode+0x89/0x260
Aug 04 21:39:23 proxbeast-pve kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Aug 04 21:39:23 proxbeast-pve kernel:  ? do_syscall_64+0x8d/0x170
Aug 04 21:39:23 proxbeast-pve kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Aug 04 21:39:23 proxbeast-pve kernel:  ? do_syscall_64+0x8d/0x170
Aug 04 21:39:23 proxbeast-pve kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Aug 04 21:39:23 proxbeast-pve kernel:  ? do_syscall_64+0x8d/0x170
Aug 04 21:39:23 proxbeast-pve kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Aug 04 21:39:23 proxbeast-pve kernel:  ? syscall_exit_to_user_mode_prepare+0x14d/0x1a0
Aug 04 21:39:23 proxbeast-pve kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Aug 04 21:39:23 proxbeast-pve kernel:  ? syscall_exit_to_user_mode+0x89/0x260
Aug 04 21:39:23 proxbeast-pve kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Aug 04 21:39:23 proxbeast-pve kernel:  ? do_syscall_64+0x8d/0x170
Aug 04 21:39:23 proxbeast-pve kernel:  ? irqentry_exit+0x43/0x50
Aug 04 21:39:23 proxbeast-pve kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Aug 04 21:39:23 proxbeast-pve kernel:  entry_SYSCALL_64_after_hwframe+0x78/0x80
Aug 04 21:39:23 proxbeast-pve kernel: RIP: 0033:0x7048f2369240
Aug 04 21:39:23 proxbeast-pve kernel: Code: 40 00 48 8b 15 c1 9b 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 80 3d a1 23 0e 00 00 74 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 48 89
Aug 04 21:39:23 proxbeast-pve kernel: RSP: 002b:00007ffeb1ceac68 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
Aug 04 21:39:23 proxbeast-pve kernel: RAX: ffffffffffffffda RBX: 00005e0ab2e77a90 RCX: 00007048f2369240
Aug 04 21:39:23 proxbeast-pve kernel: RDX: 00000000000110da RSI: 00005e0ab2e77b30 RDI: 0000000000000053
Aug 04 21:39:23 proxbeast-pve kernel: RBP: 00000000000110da R08: 0000000000000007 R09: 00005e0ab2e77920
Aug 04 21:39:23 proxbeast-pve kernel: R10: 0000000000000000 R11: 0000000000000202 R12: 00005e0ab2e77b30
Aug 04 21:39:23 proxbeast-pve kernel: R13: 0000000000000053 R14: 000000000000000c R15: 000000000000000c
Aug 04 21:39:23 proxbeast-pve kernel:  </TASK>
Aug 04 21:39:23 proxbeast-pve kernel: Modules linked in: chacha_generic chacha_x86_64 libchacha xxhash_generic wp512 streebog_generic sm3_generic sm3_avx_x86_64 sm3 rmd160 poly1305_generic poly1305_x86_64 nhpoly1305_avx2 nhpoly1305_sse2 nhpoly1305 libpoly1305 michael_mic md4 algif_rng twofish_generic twofish_avx_x86_64 twofish_x86_64_3way twofish_x86_64 twofish_common sm4_generic sm4_aesni_avx2_x86_64 sm4_aesni_avx_x86_>
Aug 04 21:39:23 proxbeast-pve kernel:  iptable_filter ccm vmw_vsock_vmci_transport vsock vmw_vmci scsi_transport_iscsi nf_tables nvme_fabrics overlay binder_linux qrtr bonding tls cmac algif_hash algif_skcipher af_alg bnep softdog sunrpc nfnetlink_log nfnetlink binfmt_misc input_leds btusb btrtl btintel btbcm btmtk snd_usb_audio bluetooth snd_usbmidi_lib snd_ump snd_rawmidi ecdh_generic snd_seq_device ecc mc joydev inte>
Aug 04 21:39:23 proxbeast-pve kernel:  blake2b_generic xor raid6_pq libcrc32c hid_generic usbkbd usbmouse usbhid hid amdgpu amdxcp drm_exec gpu_sched drm_buddy i2c_algo_bit drm_suballoc_helper drm_ttm_helper ttm drm_display_helper cec nvme xhci_pci rc_core nvme_core ahci xhci_pci_renesas r8169 video crc32_pclmul xhci_hcd libahci i2c_piix4 realtek nvme_auth wmi gpio_amdpt z3fold zstd
Aug 04 21:39:23 proxbeast-pve kernel: CR2: 0000000000000030
Aug 04 21:39:23 proxbeast-pve kernel: ---[ end trace 0000000000000000 ]---
Aug 04 21:39:23 proxbeast-pve kernel: clocksource: Long readout interval, skipping watchdog check: cs_nsec: 2078136400 wd_nsec: 2078128295
Aug 04 21:39:23 proxbeast-pve kernel: RIP: 0010:aafs_create.constprop.0+0x7f/0x130
Aug 04 21:39:23 proxbeast-pve kernel: Code: 4c 63 e0 48 83 c4 18 4c 89 e0 5b 41 5c 41 5d 41 5e 41 5f 5d 31 d2 31 c9 31 f6 31 ff 45 31 c0 45 31 c9 45 31 d2 e9 8c 1c c3 00 <4d> 8b 55 30 4d 8d ba a0 00 00 00 4c 89 55 c0 4c 89 ff e8 2a 13 a8
Aug 04 21:39:23 proxbeast-pve kernel: RSP: 0018:ffff9960ea65bb90 EFLAGS: 00010246
Aug 04 21:39:23 proxbeast-pve kernel: RAX: 0000000000000000 RBX: 00000000000041ed RCX: 0000000000000000
Aug 04 21:39:23 proxbeast-pve kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Aug 04 21:39:23 proxbeast-pve kernel: RBP: ffff9960ea65bbd0 R08: 0000000000000000 R09: 0000000000000000
Aug 04 21:39:23 proxbeast-pve kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff8fbe157b
Aug 04 21:39:23 proxbeast-pve kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Aug 04 21:39:23 proxbeast-pve kernel: FS:  00007048f1a94d00(0000) GS:ffff8a74ee900000(0000) knlGS:0000000000000000
Aug 04 21:39:23 proxbeast-pve kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 04 21:39:23 proxbeast-pve kernel: CR2: 0000000000000030 CR3: 00000005f6ba8000 CR4: 0000000000f50ef0
Aug 04 21:39:23 proxbeast-pve kernel: PKRU: 55555554
Aug 04 21:39:23 proxbeast-pve kernel: note: stress-ng-appar[1627660] exited with irqs disabled
 
I think I am running into this issue. I am running kernel `6.8.8-3-pve` and am seeing random system resets. There is no indication of a shutdown or anything else in the logs. The system just starts from the beginning, like a hardware reset. Monitoring does not show anything suspicious either: no extreme CPU, RAM or IO usage.

I am running 2 NVMe drives in a ZFS mirror. Any ideas how to debug the cause?
Do you still see these sudden resets? If yes, can you send a journal that starts a few hours before the freeze and ends a few hours after it? Can you also check whether they happen with kernel 6.5?
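
A journal window like the one requested can be cut with `journalctl --since ... --until ...`. Here is a minimal sketch of computing that window, assuming you know roughly when the reset happened. The timestamp, the three-hour margins and the helper name `journalctl_window` are placeholders, not values from the affected server.

```python
import shlex
from datetime import datetime, timedelta


def journalctl_window(freeze_time, hours_before=3, hours_after=3):
    """Build a journalctl argument list covering a window around freeze_time."""
    fmt = "%Y-%m-%d %H:%M:%S"
    since = freeze_time - timedelta(hours=hours_before)
    until = freeze_time + timedelta(hours=hours_after)
    return [
        "journalctl", "--no-pager",
        "--since", since.strftime(fmt),
        "--until", until.strftime(fmt),
    ]


# Placeholder reset time; replace it with the time of your own freeze or reset.
freeze = datetime(2024, 8, 4, 21, 39, 23)
print(" ".join(shlex.quote(arg) for arg in journalctl_window(freeze)))
```

Redirect the output of the printed command to a file and attach that file to the post.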

Hitting some related instability myself. Still validating that it's not a memory or hardware issue. Here are some logs.

Looks related to this:

https://lore.kernel.org/linux-kerne.../T/#medf0876e578ca978ef27174c8c777820a3739b75
Thanks for the report and the pointer -- @fiona sent a patch [1] to backport the upstream fix to our kernel.

EDIT: Kernel 6.8.12-1-pve contains the fix [2].

[1] https://lists.proxmox.com/pipermail/pve-devel/2024-August/065034.html
[2] https://git.proxmox.com/?p=pve-kernel.git;a=commit;h=40e698c64b48c6c865117c7c604f54484b834aa8
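
To check whether a host is already on a kernel at or after 6.8.12-1-pve, the release string can be compared numerically. This is a minimal sketch that assumes release strings of the form `6.8.8-4-pve` as seen in the logs above. The helper names are made up for illustration, and the comparison only makes sense within the 6.8 series.

```python
import re

# First release reported in this thread to contain the fix (6.8.12-1-pve).
FIXED_IN = (6, 8, 12, 1)


def parse_pve_kernel(release):
    """Turn a release string such as '6.8.8-4-pve' into a tuple of ints."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)-(\d+)-pve", release)
    if match is None:
        raise ValueError(f"not a pve kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def at_least_fixed_release(release):
    """True if the release is numerically at or after 6.8.12-1-pve."""
    return parse_pve_kernel(release) >= FIXED_IN


# Releases mentioned in this thread.
for release in ("6.8.4-2-pve", "6.8.8-4-pve", "6.8.12-1-pve"):
    print(release, at_least_fixed_release(release))
```

On a running host, `platform.release()` from the Python standard library returns the same string as `uname -r` and can be passed to the helper.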
 
Hi, I'm currently seeing a crash every day.
Hardware is an Asus ExpertCenter PN64 with an Intel i7 12700H CPU.
I did everything I could find on the forums: updated the BIOS, firmware and microcode, disabled C-states, and a lot more.
I'm kinda lost...

Edit: it's not a hardware problem, because I had been running ESXi without any problems since last year.
 
Anything you can see in the console? Is it a hard freeze, or is there a kernel stack trace?

Try switching back to 6.5 and see how it goes. If that works for you, pin it.

FWIW I'm back on 6.8.12-1 after updating to the latest Intel microcode. It has been stable for around 3 days (but still counting).
 
Hi all, same case here. New installation with PVE 8.2.4 and kernel 6.8.12-1-pve on an Asus PRIME B760-PLUS with an Intel i7 13700K. We have a Windows 2019 VM on ZFS storage. We have had several kernel panics this week, and the server was only installed on Monday.

Code:
Aug 30 01:32:47 pve2 kernel: VERIFY3(remove_reference(hdr, hdr) > 0) failed (0 > 0)
Aug 30 01:32:47 pve2 kernel: PANIC at arc.c:6622:arc_write_done()
Aug 30 01:32:47 pve2 kernel: Showing stack for process 323116
Aug 30 01:32:47 pve2 kernel: CPU: 19 PID: 323116 Comm: z_wr_int_2 Tainted: P    B D    O       6.8.12-1-pve #1
Aug 30 01:32:47 pve2 kernel: Hardware name: ASUS System Product Name/PRIME B760-PLUS, BIOS 1661 06/25/2024
Aug 30 01:32:47 pve2 kernel: Call Trace:
Aug 30 01:32:47 pve2 kernel:  <TASK>
Aug 30 01:32:47 pve2 kernel:  dump_stack_lvl+0x76/0xa0
Aug 30 01:32:47 pve2 kernel:  dump_stack+0x10/0x20
Aug 30 01:32:47 pve2 kernel:  spl_dumpstack+0x29/0x40 [spl]
Aug 30 01:32:47 pve2 kernel:  spl_panic+0xfc/0x120 [spl]
Aug 30 01:32:47 pve2 kernel:  arc_write_done+0x44f/0x550 [zfs]
Aug 30 01:32:47 pve2 kernel:  zio_done+0x289/0x10b0 [zfs]
Aug 30 01:32:47 pve2 kernel:  zio_execute+0x88/0x130 [zfs]
Aug 30 01:32:47 pve2 kernel:  taskq_thread+0x27f/0x4c0 [spl]
Aug 30 01:32:47 pve2 kernel:  ? __pfx_default_wake_function+0x10/0x10
Aug 30 01:32:47 pve2 kernel:  ? __pfx_zio_execute+0x10/0x10 [zfs]
Aug 30 01:32:47 pve2 kernel:  ? __pfx_taskq_thread+0x10/0x10 [spl]
Aug 30 01:32:47 pve2 kernel:  kthread+0xef/0x120
Aug 30 01:32:47 pve2 kernel:  ? __pfx_kthread+0x10/0x10
Aug 30 01:32:47 pve2 kernel:  ret_from_fork+0x44/0x70
Aug 30 01:32:47 pve2 kernel:  ? __pfx_kthread+0x10/0x10
Aug 30 01:32:47 pve2 kernel:  ret_from_fork_asm+0x1b/0x30
Aug 30 01:32:47 pve2 kernel:  </TASK>
Aug 30 01:32:51 pve2 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000060
Aug 30 01:32:51 pve2 kernel: #PF: supervisor read access in kernel mode
Aug 30 01:32:51 pve2 kernel: #PF: error_code(0x0000) - not-present page
Aug 30 01:32:51 pve2 kernel: PGD 0 P4D 0 
Aug 30 01:32:51 pve2 kernel: Oops: 0000 [#2] PREEMPT SMP NOPTI
Aug 30 01:32:51 pve2 kernel: CPU: 6 PID: 440 Comm: zvol_tq-2 Tainted: P    B D    O       6.8.12-1-pve #1
Aug 30 01:32:51 pve2 kernel: Hardware name: ASUS System Product Name/PRIME B760-PLUS, BIOS 1661 06/25/2024
Aug 30 01:32:51 pve2 kernel: RIP: 0010:arc_buf_access+0x15/0x1c0 [zfs]
Aug 30 01:32:51 pve2 kernel: Code: 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 8b 1f <48> 81 7b 60 80 60 a3 c0 0f 84 f5 00 00 00 48 8>
Aug 30 01:32:51 pve2 kernel: RSP: 0018:ffff99a40b24bba0 EFLAGS: 00010282
Aug 30 01:32:51 pve2 kernel: RAX: ffff8bec52190498 RBX: 0000000000000000 RCX: 0000000000000000
Aug 30 01:32:51 pve2 kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8beb29df29e0
Aug 30 01:32:51 pve2 kernel: RBP: ffff99a40b24bbc8 R08: 0000000000000000 R09: 0000000000000000
Aug 30 01:32:51 pve2 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Aug 30 01:32:51 pve2 kernel: R13: ffff8bec52190498 R14: ffff8bde19577728 R15: ffff99a40b24bc48
Aug 30 01:32:51 pve2 kernel: FS:  0000000000000000(0000) GS:ffff8becfef00000(0000) knlGS:0000000000000000
Aug 30 01:32:51 pve2 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 30 01:32:51 pve2 kernel: CR2: 0000000000000060 CR3: 0000000181e28002 CR4: 0000000000f72ef0
Aug 30 01:32:51 pve2 kernel: PKRU: 55555554
Aug 30 01:32:51 pve2 kernel: Call Trace:
Aug 30 01:32:51 pve2 kernel:  <TASK>
Aug 30 01:32:51 pve2 kernel:  ? show_regs+0x6d/0x80
Aug 30 01:32:51 pve2 kernel:  ? __die+0x24/0x80
Aug 30 01:32:51 pve2 kernel:  ? page_fault_oops+0x176/0x4f0
Aug 30 01:32:51 pve2 kernel:  ? do_user_addr_fault+0x2ed/0x650
Aug 30 01:32:51 pve2 kernel:  ? exc_page_fault+0x83/0x1b0
Aug 30 01:32:51 pve2 kernel:  ? asm_exc_page_fault+0x27/0x30
Aug 30 01:32:51 pve2 kernel:  ? arc_buf_access+0x15/0x1c0 [zfs]
Aug 30 01:32:51 pve2 kernel:  dbuf_hold_impl+0x9b/0x750 [zfs]
Aug 30 01:32:51 pve2 kernel:  dmu_tx_check_ioerr+0x61/0x110 [zfs]
Aug 30 01:32:51 pve2 kernel:  dmu_tx_count_write+0x18c/0x1b0 [zfs]
Aug 30 01:32:51 pve2 kernel:  dmu_tx_hold_write_by_dnode+0x3a/0x60 [zfs]
Aug 30 01:32:51 pve2 kernel:  zvol_write+0x223/0x670 [zfs]
Aug 30 01:32:51 pve2 kernel:  zvol_write_task+0x12/0x30 [zfs]
Aug 30 01:32:51 pve2 kernel:  taskq_thread+0x27f/0x4c0 [spl]
Aug 30 01:32:51 pve2 kernel:  ? finish_task_switch.isra.0+0x90/0x2e0
Aug 30 01:32:51 pve2 kernel:  ? __pfx_default_wake_function+0x10/0x10
Aug 30 01:32:51 pve2 kernel:  ? __pfx_zvol_write_task+0x10/0x10 [zfs]
Aug 30 01:32:51 pve2 kernel:  ? __pfx_taskq_thread+0x10/0x10 [spl]
Aug 30 01:32:51 pve2 kernel:  kthread+0xef/0x120
Aug 30 01:32:51 pve2 kernel:  ? __pfx_kthread+0x10/0x10
Aug 30 01:32:51 pve2 kernel:  ret_from_fork+0x44/0x70
Aug 30 01:32:51 pve2 kernel:  ? __pfx_kthread+0x10/0x10
Aug 30 01:32:51 pve2 kernel:  ret_from_fork_asm+0x1b/0x30
Aug 30 01:32:51 pve2 kernel:  </TASK>
Aug 30 01:32:51 pve2 kernel: Modules linked in: tcp_diag inet_diag ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables iptable_filter softdog nf_tables sunrpc binfmt_misc bonding>
Aug 30 01:32:51 pve2 kernel:  snd_hda_core aesni_intel drm_buddy snd_hwdep cmdlinepart ttm crypto_simd snd_pcm cryptd spi_nor drm_display_helper mei_hdcp mei_pxp snd_timer rapl intel_cstate snd eeepc_wmi pcspk>
Aug 30 01:32:51 pve2 kernel: CR2: 0000000000000060
Aug 30 01:32:51 pve2 kernel: ---[ end trace 0000000000000000 ]---
Aug 30 01:32:51 pve2 kernel: RIP: 0010:do_dentry_open+0x2a0/0x570
Aug 30 01:32:51 pve2 kernel: Code: 89 53 14 f6 c2 04 74 12 48 8b 83 b0 00 00 00 48 83 78 08 00 0f 84 53 02 00 00 48 8b 8b d8 00 00 00 48 8b 41 68 48 85 c0 74 0e <48> 83 78 58 00 74 07 81 4b 14 00 00 40 00 8b 5>
Aug 30 01:32:51 pve2 kernel: RSP: 0018:ffff99a400a63a28 EFLAGS: 00010202
Aug 30 01:32:51 pve2 kernel: RAX: 000000ffff8bddd6 RBX: ffff8beafd7e4800 RCX: ffff8bddd6db6363
Aug 30 01:32:51 pve2 kernel: RDX: 00000000000a801d RSI: 0000000000008000 RDI: ffff8bddd6db61a0
Aug 30 01:32:51 pve2 kernel: RBP: ffff99a400a63a50 R08: 0000000000000000 R09: 0000000000000000
Aug 30 01:32:51 pve2 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff8bddd6db61a0
Aug 30 01:32:51 pve2 kernel: R13: 0000000000000000 R14: 0000000000000000 R15: ffff8beafd7e4898
Aug 30 01:32:51 pve2 kernel: FS:  0000000000000000(0000) GS:ffff8becfef00000(0000) knlGS:0000000000000000
Aug 30 01:32:51 pve2 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 30 01:32:51 pve2 kernel: CR2: 0000000000000060 CR3: 0000000181e28002 CR4: 0000000000f72ef0
Aug 30 01:32:51 pve2 kernel: PKRU: 55555554
Aug 30 01:32:51 pve2 kernel: note: zvol_tq-2[440] exited with irqs disabled
Aug 30 01:49:43 pve2 kernel: general protection fault, probably for non-canonical address 0xddda1255c0000058: 0000 [#3] PREEMPT SMP NOPTI
Aug 30 01:49:43 pve2 kernel: CPU: 6 PID: 325623 Comm: ps Tainted: P    B D    O       6.8.12-1-pve #1
Aug 30 01:49:43 pve2 kernel: Hardware name: ASUS System Product Name/PRIME B760-PLUS, BIOS 1661 06/25/2024
Aug 30 01:49:43 pve2 kernel: RIP: 0010:do_dentry_open+0x2a0/0x570
Aug 30 01:49:43 pve2 kernel: Code: 89 53 14 f6 c2 04 74 12 48 8b 83 b0 00 00 00 48 83 78 08 00 0f 84 53 02 00 00 48 8b 8b d8 00 00 00 48 8b 41 68 48 85 c0 74 0e <48> 83 78 58 00 74 07 81 4b 14 00 00 40 00 8b 5>
Aug 30 01:49:43 pve2 kernel: RSP: 0018:ffff99a4292fbc68 EFLAGS: 00010286
Aug 30 01:49:43 pve2 kernel: RAX: ddda1255c0000000 RBX: ffff8bec93bc8800 RCX: ffff8bddda125555
Aug 30 01:49:43 pve2 kernel: RDX: 00000000000a800d RSI: 0000000000000000 RDI: 0000000000000000
Aug 30 01:49:43 pve2 kernel: RBP: ffff99a4292fbc90 R08: 0000000000000000 R09: 0000000000000000
Aug 30 01:49:43 pve2 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff8bddda1253b8
Aug 30 01:49:43 pve2 kernel: R13: 0000000000000000 R14: ffffffffa3741a10 R15: ffff8bec93bc8898
Aug 30 01:49:43 pve2 kernel: FS:  00007baff481d480(0000) GS:ffff8becfef00000(0000) knlGS:0000000000000000
Aug 30 01:49:43 pve2 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 30 01:49:43 pve2 kernel: CR2: 00005707b046b018 CR3: 0000000e76316004 CR4: 0000000000f72ef0
Aug 30 01:49:43 pve2 kernel: PKRU: 55555554
Aug 30 01:49:43 pve2 kernel: Call Trace:
Aug 30 01:49:43 pve2 kernel:  <TASK>
Aug 30 01:49:43 pve2 kernel:  ? show_regs+0x6d/0x80
Aug 30 01:49:43 pve2 kernel:  ? die_addr+0x37/0xa0
Aug 30 01:49:43 pve2 kernel:  ? exc_general_protection+0x1db/0x480
Aug 30 01:49:43 pve2 kernel:  ? asm_exc_general_protection+0x27/0x30
Aug 30 01:49:43 pve2 kernel:  ? __pfx_proc_single_open+0x10/0x10
Aug 30 01:49:43 pve2 kernel:  ? do_dentry_open+0x2a0/0x570
Aug 30 01:49:43 pve2 kernel:  ? do_dentry_open+0x21d/0x570
Aug 30 01:49:43 pve2 kernel:  vfs_open+0x33/0x50
Aug 30 01:49:43 pve2 kernel:  path_openat+0xb1c/0x1190
Aug 30 01:49:43 pve2 kernel:  do_filp_open+0xaf/0x170
Aug 30 01:49:43 pve2 kernel:  do_sys_openat2+0xb3/0xe0
Aug 30 01:49:43 pve2 kernel:  __x64_sys_openat+0x6c/0xa0
Aug 30 01:49:43 pve2 kernel:  x64_sys_call+0x189a/0x24b0
Aug 30 01:49:43 pve2 kernel:  do_syscall_64+0x81/0x170
Aug 30 01:49:43 pve2 kernel:  ? do_syscall_64+0x8d/0x170
Aug 30 01:49:43 pve2 kernel:  ? exc_page_fault+0x94/0x1b0
Aug 30 01:49:43 pve2 kernel:  entry_SYSCALL_64_after_hwframe+0x78/0x80
Aug 30 01:49:43 pve2 kernel: RIP: 0033:0x7baff4c76f01
Aug 30 01:49:43 pve2 kernel: Code: 75 57 89 f0 25 00 00 41 00 3d 00 00 41 00 74 49 80 3d ea 26 0e 00 00 74 6d 89 da 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 93 00 00 00 48 8b 54 2>
Aug 30 01:49:43 pve2 kernel: RSP: 002b:00007ffe9ede30f0 EFLAGS: 00000202 ORIG_RAX: 0000000000000101
Aug 30 01:49:43 pve2 kernel: RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007baff4c76f01
Aug 30 01:49:43 pve2 kernel: RDX: 0000000000000000 RSI: 00007ffe9ede3180 RDI: 00000000ffffff9c
Aug 30 01:49:43 pve2 kernel: RBP: 00007ffe9ede3180 R08: 0000000000000000 R09: 0000000000000073
Aug 30 01:49:43 pve2 kernel: R10: 0000000000000000 R11: 0000000000000202 R12: 00007ffe9ede3180
Aug 30 01:49:43 pve2 kernel: R13: 00005707b0456858 R14: 0000000000000000 R15: 00007baff481c7d0
Aug 30 01:49:43 pve2 kernel:  </TASK>
Aug 30 01:49:43 pve2 kernel: Modules linked in: tcp_diag inet_diag ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables iptable_filter softdog nf_tables sunrpc binfmt_misc bonding>
Aug 30 01:49:43 pve2 kernel:  snd_hda_core aesni_intel drm_buddy snd_hwdep cmdlinepart ttm crypto_simd snd_pcm cryptd spi_nor drm_display_helper mei_hdcp mei_pxp snd_timer rapl intel_cstate snd eeepc_wmi pcspk>
Aug 30 01:49:43 pve2 kernel: ---[ end trace 0000000000000000 ]---
Aug 30 01:49:43 pve2 kernel: RIP: 0010:do_dentry_open+0x2a0/0x570
Aug 30 01:49:43 pve2 kernel: Code: 89 53 14 f6 c2 04 74 12 48 8b 83 b0 00 00 00 48 83 78 08 00 0f 84 53 02 00 00 48 8b 8b d8 00 00 00 48 8b 41 68 48 85 c0 74 0e <48> 83 78 58 00 74 07 81 4b 14 00 00 40 00 8b 5>
Aug 30 01:49:43 pve2 kernel: RSP: 0018:ffff99a400a63a28 EFLAGS: 00010202
Aug 30 01:49:43 pve2 kernel: RAX: 000000ffff8bddd6 RBX: ffff8beafd7e4800 RCX: ffff8bddd6db6363
Aug 30 01:49:43 pve2 kernel: RDX: 00000000000a801d RSI: 0000000000008000 RDI: ffff8bddd6db61a0
Aug 30 01:49:43 pve2 kernel: RBP: ffff99a400a63a50 R08: 0000000000000000 R09: 0000000000000000
Aug 30 01:49:43 pve2 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff8bddd6db61a0
Aug 30 01:49:43 pve2 kernel: R13: 0000000000000000 R14: 0000000000000000 R15: ffff8beafd7e4898
Aug 30 01:49:43 pve2 kernel: FS:  00007baff481d480(0000) GS:ffff8becfef00000(0000) knlGS:0000000000000000
Aug 30 01:49:43 pve2 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 30 01:49:43 pve2 kernel: CR2: 00005707b046b018 CR3: 0000000e76316004 CR4: 0000000000f72ef0
Aug 30 01:49:43 pve2 kernel: PKRU: 55555554

The attached file has all the kernel panics from our server. Hope this helps.
Thanks for everything, Fernando.
 

New installation
Is the hardware new as well? Have you tested the RAM etc.? The BIOS looks recent.

Also I see it has an onboard Realtek NIC. This could also be causing the problem.

Maybe try an older kernel and see if the errors disappear. You could also download an older PVE version from here.
 
Yes, new hardware, and tested with another OS without problems. When we installed the server, the BIOS update with Intel microcode 0x129 was still in beta.

30 minutes later, we had another kernel panic. I pinned 6.8.4-2-pve and rebooted.

If we have more crashes I'll try kernel 6.5.13-6-pve. Would another kernel version be better?

Thanks for everything, Fernando.
 
BIOS update of intel microcode 0x129
Yes, when I was looking at your post and noticed your i7 13700K CPU, the first thing I thought of was the reports of instability issues with that CPU.

Yes, new hardware and tested with other OS without problems
Stress tested?

As I said above, you've got an onboard Realtek NIC, which can also cause issues on Linux, sometimes depending on the kernel version.
 
Stress tested, I think not... Good point for the future ;)

We have updated the BIOS with Intel microcode 0x129 and hope that fixes the instability...

I had never heard anything about Realtek NIC problems, thanks for the info. If we have more problems we'll change it too.

Thanks for everything, Fernando.
 
