Hi all,
I'm having serious problems with random freezes. The server was freshly installed with Proxmox VE 8.1.10.x a couple of weeks ago.
Proxmox runs on two Samsung MZQL21T9HCJR-00A07 enterprise NVMe SSDs in a ZFS mirror (RAID1). The machine has an Intel i9-13900 and 64 GB of DDR5 ECC RAM.
Could this be a ZFS issue? The kernel log shows a ZFS panic in arc_write_done() right at the freeze.
A scrub completes without errors. Compression is on, dedup is off.
(I've attached the complete logs.)
Code:
[...]
Apr 21 08:36:35 srv02 kernel: VERIFY3(remove_reference(hdr, hdr) > 0) failed (0 > 0)
Apr 21 08:36:35 srv02 kernel: PANIC at arc.c:6610:arc_write_done()
Apr 21 08:36:35 srv02 kernel: Showing stack for process 779
Apr 21 08:36:35 srv02 kernel: CPU: 0 PID: 779 Comm: z_wr_int_0 Tainted: P O 6.5.13-5-pve #1
Apr 21 08:36:35 srv02 kernel: Hardware name: ASUSTeK COMPUTER INC. System Product Name/W680/MB DC, BIOS 2007-HET0003-24020101 02/01/2024
Apr 21 08:36:35 srv02 kernel: Call Trace:
Apr 21 08:36:35 srv02 kernel: <TASK>
Apr 21 08:36:35 srv02 kernel: dump_stack_lvl+0x48/0x70
Apr 21 08:36:35 srv02 kernel: dump_stack+0x10/0x20
Apr 21 08:36:35 srv02 kernel: spl_dumpstack+0x29/0x40 [spl]
Apr 21 08:36:35 srv02 kernel: spl_panic+0xfc/0x120 [spl]
Apr 21 08:36:35 srv02 kernel: arc_write_done+0x44f/0x550 [zfs]
Apr 21 08:36:35 srv02 kernel: zio_done+0x289/0x10b0 [zfs]
Apr 21 08:36:35 srv02 kernel: ? kfree+0x78/0x120
Apr 21 08:36:35 srv02 kernel: zio_execute+0x88/0x130 [zfs]
Apr 21 08:36:35 srv02 kernel: taskq_thread+0x27f/0x490 [spl]
Apr 21 08:36:35 srv02 kernel: ? __pfx_default_wake_function+0x10/0x10
Apr 21 08:36:35 srv02 kernel: ? __pfx_zio_execute+0x10/0x10 [zfs]
Apr 21 08:36:35 srv02 kernel: ? __pfx_taskq_thread+0x10/0x10 [spl]
Apr 21 08:36:35 srv02 kernel: kthread+0xef/0x120
Apr 21 08:36:35 srv02 kernel: ? __pfx_kthread+0x10/0x10
Apr 21 08:36:35 srv02 kernel: ret_from_fork+0x44/0x70
Apr 21 08:36:35 srv02 kernel: ? __pfx_kthread+0x10/0x10
Apr 21 08:36:35 srv02 kernel: ret_from_fork_asm+0x1b/0x30
Apr 21 08:36:35 srv02 kernel: </TASK>
[...]
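To pull just the assertion and panic lines out of a long kernel log, a small filter like the one below can help. This is only a sketch: the function name `zfs_panic_lines` is made up for illustration, and the pattern covers only the two message shapes seen in the excerpt above (`VERIFY...(` and `PANIC at`).

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption, not part of ZFS or Proxmox):
# keep only SPL/ZFS assertion and panic lines from a kernel log read on stdin.
zfs_panic_lines() {
    grep -E 'kernel: (VERIFY[0-9]*\(|PANIC at )'
}

# Example usage on a systemd host, reading the previous boot's kernel messages
# (needs a persistent journal; adjust -b to the boot that froze):
#   journalctl -k -b -1 | zfs_panic_lines
```

Run against the trace above, this should leave only the `VERIFY3(...)` line and the `PANIC at arc.c:6610:arc_write_done()` line.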
Code:
uname -a
Linux srv02 6.5.13-5-pve #1 SMP PREEMPT_DYNAMIC PMX 6.5.13-5 (2024-04-05T11:03Z) x86_64 GNU/Linux
Code:
cat /proc/cmdline
initrd=\EFI\proxmox\6.5.13-5-pve\initrd.img-6.5.13-5-pve root=ZFS=rpool/ROOT/pve-1 boot=zfs pcie_aspm.policy=performance split_lock_detect=off
Code:
cat /sys/module/zfs/parameters/zfs_arc_max
6714032128
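For reference, `zfs_arc_max` is given in bytes. A quick conversion (a sketch; `bytes_to_mib` is just a throwaway helper name) shows the ARC is capped at 6403 MiB, about 6.25 GiB, which is roughly a tenth of the 64 GB installed.

```shell
#!/bin/sh
# Hypothetical helper: convert a byte count to whole MiB using integer arithmetic.
bytes_to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

# Value copied from the zfs_arc_max output above.
bytes_to_mib 6714032128   # prints 6403
```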
Code:
zpool status
  pool: rpool
 state: ONLINE
  scan: scrub repaired 0B in 00:06:56 with 0 errors on Sun Apr 21 10:22:48 2024
config:

        NAME                                                 STATE     READ WRITE CKSUM
        rpool                                                ONLINE       0     0     0
          mirror-0                                           ONLINE       0     0     0
            nvme-eui.36344730571023490025385300000001-part3  ONLINE       0     0     0
            nvme-eui.36344730571023550025385300000001-part3  ONLINE       0     0     0

errors: No known data errors
Has anyone run into the same problem?