ProxmoxVE 8.1 - Higher server load with kernel 6.5.11 compared to 6.2.16

holger_p.

Nov 26, 2023
Hi, I am new to this forum. I have been a home-lab user of Proxmox VE for quite a while and have now updated to Proxmox VE 8.1.
As several others have already mentioned, I am seeing higher server load with kernel 6.5.11-4 on my system (a ProLiant MicroServer Gen8).

With the same set of running VMs, the load increases under this kernel and drops again under the previous kernel 6.2.16-19 (see also the screenshot).

Current (since 25-Nov-2023, ~13:30):
Code:
root@pve:~# pveversion
pve-manager/8.1.3/b46aac3b42da5d15 (running kernel: 6.2.16-19-pve)
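
When comparing load graphs across reboots, it helps to record which kernel was actually running at the time. A minimal Python sketch that pulls the manager version and running kernel out of a `pveversion` line like the one above; the helper names `parse_pveversion` and `kernel_tuple` are made up for illustration and are not part of any Proxmox tooling:

```python
import re

def parse_pveversion(line):
    """Return (manager_version, running_kernel) from one pveversion line."""
    m = re.search(
        r"pve-manager/([\d.]+)/\w+ \(running kernel: ([^)]+)\)", line
    )
    if m is None:
        raise ValueError("not a pveversion line: %r" % line)
    return m.group(1), m.group(2)

def kernel_tuple(kernel):
    """Turn '6.2.16-19-pve' into (6, 2, 16, 19) so versions compare numerically."""
    return tuple(int(n) for n in re.findall(r"\d+", kernel))

line = "pve-manager/8.1.3/b46aac3b42da5d15 (running kernel: 6.2.16-19-pve)"
manager, kernel = parse_pveversion(line)
print(manager, kernel)

# The kernel reported as causing higher load sorts after the current one.
print(kernel_tuple("6.5.11-4-pve") > kernel_tuple(kernel))
```

The numeric tuple is only a convenience for sorting kernel strings of this shape; it makes no claim about how Proxmox itself orders package versions.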
 

Attachment: Server load kernel 6.5 versus 6.2.png
Issue is still reproducible with updated kernel (6.5.11-6) and ZFS packages (2.2.2-pve1):

Edit: I updated an existing GitHub issue about hanging autotrim processes.

Code:
root@pve:~# [ 8218.231630] INFO: task vdev_autotrim:133705 blocked for more than 362 seconds.
[ 8218.232450]       Tainted: P          IO       6.5.11-6-pve #1
[ 8218.233156] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 8218.233570] task:vdev_autotrim   state:D stack:0     pid:133705 ppid:2      flags:0x00004000
[ 8218.234422] Call Trace:
[ 8218.234558]  <TASK>
[ 8218.235053]  __schedule+0x3fd/0x1450
[ 8218.235275]  ? __wake_up_common_lock+0x8b/0xd0
[ 8218.235913]  schedule+0x63/0x110
[ 8218.236482]  cv_wait_common+0x109/0x140 [spl]
[ 8218.237144]  ? __pfx_autoremove_wake_function+0x10/0x10
[ 8218.237437]  __cv_wait+0x15/0x30 [spl]
[ 8218.237649]  vdev_autotrim_thread+0x797/0x9a0 [zfs]
[ 8218.238532]  ? __pfx_vdev_autotrim_thread+0x10/0x10 [zfs]
[ 8218.239042]  ? __pfx_thread_generic_wrapper+0x10/0x10 [spl]
[ 8218.239339]  thread_generic_wrapper+0x5f/0x70 [spl]
[ 8218.240000]  kthread+0xf2/0x120
[ 8218.240580]  ? __pfx_kthread+0x10/0x10
[ 8218.240844]  ret_from_fork+0x47/0x70
[ 8218.241050]  ? __pfx_kthread+0x10/0x10
[ 8218.241249]  ret_from_fork_asm+0x1b/0x30
[ 8218.241451]  </TASK>
[ 8218.241583] INFO: task vdev_autotrim:133957 blocked for more than 362 seconds.
[ 8218.242346]       Tainted: P          IO       6.5.11-6-pve #1
[ 8218.243038] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 8218.243458] task:vdev_autotrim   state:D stack:0     pid:133957 ppid:2      flags:0x00004000
[ 8218.244294] Call Trace:
[ 8218.244440]  <TASK>
[ 8218.244926]  __schedule+0x3fd/0x1450
[ 8218.245146]  ? __wake_up_common_lock+0x8b/0xd0
[ 8218.245772]  schedule+0x63/0x110
[ 8218.246504]  cv_wait_common+0x109/0x140 [spl]
[ 8218.247378]  ? __pfx_autoremove_wake_function+0x10/0x10
[ 8218.247698]  __cv_wait+0x15/0x30 [spl]
[ 8218.247960]  vdev_autotrim_thread+0x797/0x9a0 [zfs]
[ 8218.248969]  ? __pfx_vdev_autotrim_thread+0x10/0x10 [zfs]
[ 8218.249616]  ? __pfx_thread_generic_wrapper+0x10/0x10 [spl]
[ 8218.249938]  thread_generic_wrapper+0x5f/0x70 [spl]
[ 8218.250584]  kthread+0xf2/0x120
[ 8218.251172]  ? __pfx_kthread+0x10/0x10
[ 8218.251396]  ret_from_fork+0x47/0x70
[ 8218.251608]  ? __pfx_kthread+0x10/0x10
[ 8218.251808]  ret_from_fork_asm+0x1b/0x30
[ 8218.252014]  </TASK>
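
For anyone who needs to pull these reports out of a longer dmesg dump, here is a minimal Python sketch that extracts the hung-task headers (task name, PID, blocked time). It assumes the header format shown in the log above; the function name `find_hung_tasks` and the shortened sample text are illustrative only:

```python
import re

# Matches the hung-task report header, e.g.
# "[ 8218.231630] INFO: task vdev_autotrim:133705 blocked for more than 362 seconds."
HUNG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<name>.+?):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)

def find_hung_tasks(log_text):
    """Return one dict per hung-task header found in log_text."""
    reports = []
    for line in log_text.splitlines():
        m = HUNG_RE.search(line)
        if m:
            reports.append({
                "uptime": float(m.group("ts")),   # seconds since boot
                "task": m.group("name"),
                "pid": int(m.group("pid")),
                "blocked_secs": int(m.group("secs")),
            })
    return reports

# Shortened excerpt of the log above, including the fused shell prompt.
SAMPLE = """\
root@pve:~# [ 8218.231630] INFO: task vdev_autotrim:133705 blocked for more than 362 seconds.
[ 8218.232450]       Tainted: P          IO       6.5.11-6-pve #1
[ 8218.241583] INFO: task vdev_autotrim:133957 blocked for more than 362 seconds.
"""

for report in find_hung_tasks(SAMPLE):
    print(report)
```

Run against the full log, this would list both `vdev_autotrim` threads, which makes it easier to see whether the same task names keep recurring across boots.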
 
I see such issues even with autotrim=off on my Proxmox Backup Server, since upgrading to kernel 6.5 and ZFS 2.2.

Code:
[160947.401342] INFO: task tokio-runtime-w:22541 blocked for more than 120 seconds.
[160947.401706]       Tainted: P           O       6.5.11-7-pve #1
[160947.401953] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[160947.402163] task:tokio-runtime-w state:D stack:0     pid:22541 ppid:1      flags:0x00000002
[160947.402179] Call Trace:
[160947.402182]  <TASK>
[160947.402209]  __schedule+0x3fd/0x1450
[160947.402217]  ? __queue_delayed_work+0x83/0xf0
[160947.402223]  ? _raw_spin_unlock_irq+0xe/0x50
[160947.402228]  schedule+0x63/0x110
[160947.402231]  wb_wait_for_completion+0x89/0xc0
[160947.402236]  ? __pfx_autoremove_wake_function+0x10/0x10
[160947.402253]  __writeback_inodes_sb_nr+0x9d/0xd0
[160947.402257]  writeback_inodes_sb+0x3c/0x60
[160947.402260]  sync_filesystem+0x3d/0xb0
[160947.402264]  __x64_sys_syncfs+0x49/0xb0
[160947.402281]  do_syscall_64+0x5b/0x90
[160947.402286]  ? syscall_exit_to_user_mode+0x37/0x60
[160947.402291]  ? do_syscall_64+0x67/0x90
[160947.402294]  ? __do_softirq+0xd4/0x303
[160947.402298]  ? handle_edge_irq+0xda/0x250
[160947.402310]  ? exit_to_user_mode_prepare+0x39/0x190
[160947.402314]  ? irqentry_exit_to_user_mode+0x17/0x20
[160947.402318]  ? irqentry_exit+0x43/0x50
[160947.402322]  ? common_interrupt+0x54/0xb0
[160947.402325]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[160947.402330] RIP: 0033:0x7f6d3471db57
[160947.402358] RSP: 002b:00007f6d25da42a8 EFLAGS: 00000202 ORIG_RAX: 0000000000000132
[160947.402361] RAX: ffffffffffffffda RBX: 00007f6d25da42f8 RCX: 00007f6d3471db57
[160947.402363] RDX: 00007f6b4ac9f512 RSI: 0000000000000007 RDI: 000000000000003f
[160947.402364] RBP: 000000000000003f R08: 0000000000000007 R09: 00007f6cbc022b60
[160947.402366] R10: ef9208519acecf7a R11: 0000000000000202 R12: 0000000000000001
[160947.402368] R13: 00007f6c48068710 R14: 000000000000001c R15: 00007f6cbc08a500
[160947.402372]  </TASK>
[160947.402373] Future hung task reports are suppressed, see sysctl kernel.hung_task_warnings
 
The call trace is completely different (it appears to be related to a general filesystem sync, not ZFS autotrim), so this is most likely a different issue. Please provide more details about your hardware and file system configuration. What did the server load look like at the time the issue occurred?
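
One quick way to compare two traces like these is to look at which kernel modules appear in the frames. The following is a minimal Python sketch, assuming the frame format seen in the logs in this thread (`func+0xOFF/0xSIZE [module]`, with a leading `?` marking frames the unwinder is not sure about). The names `reliable_frames` and `involves_module` are hypothetical, and the two samples are shortened excerpts of the traces posted above:

```python
import re

# A frame line looks like "[ 8218.237649]  vdev_autotrim_thread+0x797/0x9a0 [zfs]".
# Frames without a trailing "[module]" belong to the core kernel.
FRAME_RE = re.compile(
    r"\]\s+(?P<unsure>\? )?(?P<func>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<module>\w+)\])?"
)

def reliable_frames(trace_text):
    """Return (function, module_or_None) for every frame not marked with '?'."""
    frames = []
    for line in trace_text.splitlines():
        m = FRAME_RE.search(line)
        if m and not m.group("unsure"):
            frames.append((m.group("func"), m.group("module")))
    return frames

def involves_module(trace_text, modules=("zfs", "spl")):
    """True if any reliable frame belongs to one of the given modules."""
    return any(mod in modules for _, mod in reliable_frames(trace_text))

AUTOTRIM = """\
[ 8218.235053]  __schedule+0x3fd/0x1450
[ 8218.235275]  ? __wake_up_common_lock+0x8b/0xd0
[ 8218.236482]  cv_wait_common+0x109/0x140 [spl]
[ 8218.237649]  vdev_autotrim_thread+0x797/0x9a0 [zfs]
"""

SYNCFS = """\
[160947.402209]  __schedule+0x3fd/0x1450
[160947.402231]  wb_wait_for_completion+0x89/0xc0
[160947.402260]  sync_filesystem+0x3d/0xb0
[160947.402264]  __x64_sys_syncfs+0x49/0xb0
"""

print(involves_module(AUTOTRIM))
print(involves_module(SYNCFS))
```

A trace with no `[zfs]` or `[spl]` frames does not prove ZFS is uninvolved, since the task may be waiting on writeback to a filesystem that sits on ZFS. It only shows that the task is blocked in a different code path than the autotrim thread.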