WARNING for zfs 2.2 kernel proxmox-kernel-6.5.11-7-pve-signed DO NOT ENABLE AUTOTRIM

glaeken2

New Member
Jun 7, 2023
Kernel: proxmox-kernel-6.5.11-7-pve-signed
zfs 2.2 (ZFS: Loaded module v2.2.0-pve4, pve-manager/8.1.3)

Setting autotrim=on on ZFS pools causes the vdev_autotrim kernel threads to go into the "D" (uninterruptible sleep) state, producing artificially high machine load. It may also have other unforeseen consequences.
Confirmed on multiple machines with multiple different configurations, including hardware, md, and ZFS RAID, in both raidz and mirror configurations.
Kernels confirmed affected: 6.5.11-6-pve and 6.5.11-7-pve
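Until this is fixed, the workaround sketch below checks whether autotrim is enabled and switches it off; a manually scheduled trim avoids the stuck kernel threads while still trimming the vdevs. The pool name "tank" is a placeholder.

```shell
# Check the autotrim property on all imported pools
zpool get autotrim

# Turn it off on an affected pool ("tank" is a placeholder pool name)
zpool set autotrim=off tank

# Manual, scheduled trims (e.g. from a cron job) still trim the vdevs
# without leaving a persistent vdev_autotrim thread running
zpool trim tank
```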

Code:
[900927.659938] INFO: task vdev_autotrim:7851 blocked for more than 241 seconds.
[900927.660540]       Tainted: P S O 6.5.11-6-pve #1
[900927.661230] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[900927.661854] task:vdev_autotrim state:D stack:0     pid:7851  ppid:2      flags:0x00004000
[900927.662496] Call Trace:
[900927.663141]  <TASK>
[900927.663770]  __schedule+0x3fd/0x1450
[900927.664379]  ? __wake_up_common_lock+0x8b/0xd0
[900927.664984]  schedule+0x63/0x110
[900927.665566]  cv_wait_common+0x109/0x140 [spl]
[900927.666190]  ? __pfx_autoremove_wake_function+0x10/0x10
[900927.666776]  __cv_wait+0x15/0x30 [spl]
[900927.667422]  vdev_autotrim_thread+0x797/0x9a0 [zfs]
[900927.668777]  ? __pfx_vdev_autotrim_thread+0x10/0x10 [zfs]
[900927.670120]  ? __pfx_thread_generic_wrapper+0x10/0x10 [spl]
[900927.670746]  thread_generic_wrapper+0x5c/0x70 [spl]
[900927.671418]  kthread+0xef/0x120
[900927.672030]  ? __pfx_kthread+0x10/0x10
[900927.672723]  ret_from_fork+0x44/0x70
[900927.673319]  ? __pfx_kthread+0x10/0x10
[900927.673909]  ret_from_fork_asm+0x1b/0x30
[900927.674488]  </TASK>
 
Oh god, I've found another ZFS issue.
ZFS 2.2 no longer honours the l2arc_mfuonly setting. It almost killed an NVMe drive with severe writes to L2ARC.
I know that ZFS does not honour "secondarycache=none" and "secondarycache=metadata", but l2arc_mfuonly worked in version 2.1.
In 2.2, l2arc_mfuonly does not work AND secondarycache= also doesn't work. I had to disconnect the cache devices from my pools to avoid killing the NVMe drives when constantly reading large amounts of random data.
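For anyone wanting to check their own setup, a sketch of how to inspect the relevant settings and detach the cache vdev; "tank" and "nvme0n1p1" are placeholder pool and device names.

```shell
# Current value of the module tunable (1 = only cache MFU data in L2ARC)
cat /sys/module/zfs/parameters/l2arc_mfuonly

# Per-dataset cache policy ("tank" is a placeholder pool/dataset name)
zfs get secondarycache tank

# Workaround: remove the cache vdev from the pool entirely
# ("nvme0n1p1" is a placeholder device name)
zpool remove tank nvme0n1p1
```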
 
I see such issues even with autotrim=off on my Proxmox Backup Server, since upgrading to kernel 6.5 and ZFS 2.2.

Code:
[160947.401342] INFO: task tokio-runtime-w:22541 blocked for more than 120 seconds.
[160947.401706]       Tainted: P           O       6.5.11-7-pve #1
[160947.401953] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[160947.402163] task:tokio-runtime-w state:D stack:0     pid:22541 ppid:1      flags:0x00000002
[160947.402179] Call Trace:
[160947.402182]  <TASK>
[160947.402209]  __schedule+0x3fd/0x1450
[160947.402217]  ? __queue_delayed_work+0x83/0xf0
[160947.402223]  ? _raw_spin_unlock_irq+0xe/0x50
[160947.402228]  schedule+0x63/0x110
[160947.402231]  wb_wait_for_completion+0x89/0xc0
[160947.402236]  ? __pfx_autoremove_wake_function+0x10/0x10
[160947.402253]  __writeback_inodes_sb_nr+0x9d/0xd0
[160947.402257]  writeback_inodes_sb+0x3c/0x60
[160947.402260]  sync_filesystem+0x3d/0xb0
[160947.402264]  __x64_sys_syncfs+0x49/0xb0
[160947.402281]  do_syscall_64+0x5b/0x90
[160947.402286]  ? syscall_exit_to_user_mode+0x37/0x60
[160947.402291]  ? do_syscall_64+0x67/0x90
[160947.402294]  ? __do_softirq+0xd4/0x303
[160947.402298]  ? handle_edge_irq+0xda/0x250
[160947.402310]  ? exit_to_user_mode_prepare+0x39/0x190
[160947.402314]  ? irqentry_exit_to_user_mode+0x17/0x20
[160947.402318]  ? irqentry_exit+0x43/0x50
[160947.402322]  ? common_interrupt+0x54/0xb0
[160947.402325]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[160947.402330] RIP: 0033:0x7f6d3471db57
[160947.402358] RSP: 002b:00007f6d25da42a8 EFLAGS: 00000202 ORIG_RAX: 0000000000000132
[160947.402361] RAX: ffffffffffffffda RBX: 00007f6d25da42f8 RCX: 00007f6d3471db57
[160947.402363] RDX: 00007f6b4ac9f512 RSI: 0000000000000007 RDI: 000000000000003f
[160947.402364] RBP: 000000000000003f R08: 0000000000000007 R09: 00007f6cbc022b60
[160947.402366] R10: ef9208519acecf7a R11: 0000000000000202 R12: 0000000000000001
[160947.402368] R13: 00007f6c48068710 R14: 000000000000001c R15: 00007f6cbc08a500
[160947.402372]  </TASK>
[160947.402373] Future hung task reports are suppressed, see sysctl kernel.hung_task_warnings
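Tasks blocked like the trace above sit in uninterruptible sleep, so a quick way to check whether a box is hitting this (standard ps/awk, nothing ZFS-specific) is:

```shell
# List processes and kernel threads currently in uninterruptible sleep ("D" state).
# A vdev_autotrim or tokio-runtime-w thread stuck here matches the traces above.
ps -eo pid,stat,comm | awk 'NR == 1 || $2 ~ /^D/'
```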
 
Please don't double post; if you're not sure it's the same issue, it's better to open a new thread.
 