Server node Cluster Problem: INFO: task z_zvol:779 blocked for more than 120 seconds.

supermicro_server

Well-Known Member
Sep 13, 2017
107
7
58
44
Dear All,
we have a cluster, 3 nodes, with local storage (Intel SSD with Raid ZFS-2)
We use Proxmox VE 5.1
the servers are new, CPU dual Xeon E5-2630v4, 256 GB RAM, 8 SSD Intel, ... so the servers are strong!
the server run Well! very low iodelay!
However, sometimes, without any apparent reasons, we see strange (alarming) messages on dmege:

[Sat Mar 31 12:06:04 2018] INFO: task z_zvol:902 blocked for more than 120 seconds.
[Sat Mar 31 12:06:04 2018] Tainted: P O 4.13.13-6-pve #1
[Sat Mar 31 12:06:04 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sat Mar 31 12:06:04 2018] z_zvol D 0 902 2 0x00000000
[Sat Mar 31 12:06:04 2018] Call Trace:
[Sat Mar 31 12:06:04 2018] __schedule+0x3e0/0x870
[Sat Mar 31 12:06:04 2018] schedule+0x36/0x80
[Sat Mar 31 12:06:04 2018] taskq_wait_outstanding+0x8c/0xd0 [spl]
[Sat Mar 31 12:06:04 2018] ? wait_woken+0x80/0x80
[Sat Mar 31 12:06:04 2018] zvol_task_cb+0x1f9/0x5b0 [zfs]
[Sat Mar 31 12:06:04 2018] ? __schedule+0x3e8/0x870
[Sat Mar 31 12:06:04 2018] taskq_thread+0x2ae/0x4d0 [spl]
[Sat Mar 31 12:06:04 2018] ? wake_up_q+0x80/0x80
[Sat Mar 31 12:06:04 2018] kthread+0x10c/0x140
[Sat Mar 31 12:06:04 2018] ? taskq_thread_should_stop+0x70/0x70 [spl]
[Sat Mar 31 12:06:04 2018] ? kthread_create_on_node+0x70/0x70
[Sat Mar 31 12:06:04 2018] ret_from_fork+0x35/0x40
[Sat Mar 31 12:08:05 2018] INFO: task z_zvol:902 blocked for more than 120 seconds.
[Sat Mar 31 12:08:05 2018] Tainted: P O 4.13.13-6-pve #1
[Sat Mar 31 12:08:05 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sat Mar 31 12:08:05 2018] z_zvol D 0 902 2 0x00000000
[Sat Mar 31 12:08:05 2018] Call Trace:
[Sat Mar 31 12:08:05 2018] __schedule+0x3e0/0x870
[Sat Mar 31 12:08:05 2018] schedule+0x36/0x80
[Sat Mar 31 12:08:05 2018] taskq_wait_outstanding+0x8c/0xd0 [spl]
[Sat Mar 31 12:08:05 2018] ? wait_woken+0x80/0x80
[Sat Mar 31 12:08:05 2018] zvol_task_cb+0x1f9/0x5b0 [zfs]
[Sat Mar 31 12:08:05 2018] ? __schedule+0x3e8/0x870
[Sat Mar 31 12:08:05 2018] taskq_thread+0x2ae/0x4d0 [spl]
[Sat Mar 31 12:08:05 2018] ? wake_up_q+0x80/0x80
[Sat Mar 31 12:08:05 2018] kthread+0x10c/0x140
[Sat Mar 31 12:08:05 2018] ? taskq_thread_should_stop+0x70/0x70 [spl]
[Sat Mar 31 12:08:05 2018] ? kthread_create_on_node+0x70/0x70
[Sat Mar 31 12:08:05 2018] ret_from_fork+0x35/0x40
[Sat Mar 31 12:10:06 2018] INFO: task z_zvol:902 blocked for more than 120 seconds.
[Sat Mar 31 12:10:06 2018] Tainted: P O 4.13.13-6-pve #1
[Sat Mar 31 12:10:06 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sat Mar 31 12:10:06 2018] z_zvol D 0 902 2 0x00000000
[Sat Mar 31 12:10:06 2018] Call Trace:
[Sat Mar 31 12:10:06 2018] __schedule+0x3e0/0x870
[Sat Mar 31 12:10:06 2018] schedule+0x36/0x80
[Sat Mar 31 12:10:06 2018] taskq_wait_outstanding+0x8c/0xd0 [spl]
[Sat Mar 31 12:10:06 2018] ? wait_woken+0x80/0x80
[Sat Mar 31 12:10:06 2018] zvol_task_cb+0x1f9/0x5b0 [zfs]
[Sat Mar 31 12:10:06 2018] ? __schedule+0x3e8/0x870
[Sat Mar 31 12:10:06 2018] taskq_thread+0x2ae/0x4d0 [spl]
[Sat Mar 31 12:10:06 2018] ? wake_up_q+0x80/0x80
[Sat Mar 31 12:10:06 2018] kthread+0x10c/0x140
[Sat Mar 31 12:10:06 2018] ? taskq_thread_should_stop+0x70/0x70 [spl]
[Sat Mar 31 12:10:06 2018] ? kthread_create_on_node+0x70/0x70
[Sat Mar 31 12:10:06 2018] ret_from_fork+0x35/0x40
[Sat Mar 31 12:12:07 2018] INFO: task z_zvol:902 blocked for more than 120 seconds.
[Sat Mar 31 12:12:07 2018] Tainted: P O 4.13.13-6-pve #1
[Sat Mar 31 12:12:07 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sat Mar 31 12:12:07 2018] z_zvol D 0 902 2 0x00000000
[Sat Mar 31 12:12:07 2018] Call Trace:
[Sat Mar 31 12:12:07 2018] __schedule+0x3e0/0x870
[Sat Mar 31 12:12:07 2018] schedule+0x36/0x80
[Sat Mar 31 12:12:07 2018] taskq_wait_outstanding+0x8c/0xd0 [spl]
[Sat Mar 31 12:12:07 2018] ? wait_woken+0x80/0x80
[Sat Mar 31 12:12:07 2018] zvol_task_cb+0x1f9/0x5b0 [zfs]
[Sat Mar 31 12:12:07 2018] ? __schedule+0x3e8/0x870
[Sat Mar 31 12:12:07 2018] taskq_thread+0x2ae/0x4d0 [spl]
[Sat Mar 31 12:12:07 2018] ? wake_up_q+0x80/0x80
[Sat Mar 31 12:12:07 2018] kthread+0x10c/0x140
[Sat Mar 31 12:12:07 2018] ? taskq_thread_should_stop+0x70/0x70 [spl]
[Sat Mar 31 12:12:07 2018] ? kthread_create_on_node+0x70/0x70
[Sat Mar 31 12:12:07 2018] ret_from_fork+0x35/0x40
[Sat Mar 31 12:14:08 2018] INFO: task z_zvol:902 blocked for more than 120 seconds.
[Sat Mar 31 12:14:08 2018] Tainted: P O 4.13.13-6-pve #1
[Sat Mar 31 12:14:08 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sat Mar 31 12:14:08 2018] z_zvol D 0 902 2 0x00000000
[Sat Mar 31 12:14:08 2018] Call Trace:
[Sat Mar 31 12:14:08 2018] __schedule+0x3e0/0x870
[Sat Mar 31 12:14:08 2018] schedule+0x36/0x80
[Sat Mar 31 12:14:08 2018] taskq_wait_outstanding+0x8c/0xd0 [spl]
[Sat Mar 31 12:14:08 2018] ? wait_woken+0x80/0x80
[Sat Mar 31 12:14:08 2018] zvol_task_cb+0x1f9/0x5b0 [zfs]
[Sat Mar 31 12:14:08 2018] ? __schedule+0x3e8/0x870
[Sat Mar 31 12:14:08 2018] taskq_thread+0x2ae/0x4d0 [spl]
[Sat Mar 31 12:14:08 2018] ? wake_up_q+0x80/0x80
[Sat Mar 31 12:14:08 2018] kthread+0x10c/0x140
[Sat Mar 31 12:14:08 2018] ? taskq_thread_should_stop+0x70/0x70 [spl]
[Sat Mar 31 12:14:08 2018] ? kthread_create_on_node+0x70/0x70
[Sat Mar 31 12:14:08 2018] ret_from_fork+0x35/0x40
[Sat Mar 31 12:16:09 2018] INFO: task z_zvol:902 blocked for more than 120 seconds.
[Sat Mar 31 12:16:09 2018] Tainted: P O 4.13.13-6-pve #1
[Sat Mar 31 12:16:09 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sat Mar 31 12:16:09 2018] z_zvol D 0 902 2 0x00000000
[Sat Mar 31 12:16:09 2018] Call Trace:
[Sat Mar 31 12:16:09 2018] __schedule+0x3e0/0x870
[Sat Mar 31 12:16:09 2018] schedule+0x36/0x80
[Sat Mar 31 12:16:09 2018] taskq_wait_outstanding+0x8c/0xd0 [spl]
[Sat Mar 31 12:16:09 2018] ? wait_woken+0x80/0x80
[Sat Mar 31 12:16:09 2018] zvol_task_cb+0x1f9/0x5b0 [zfs]
[Sat Mar 31 12:16:09 2018] ? __schedule+0x3e8/0x870
[Sat Mar 31 12:16:09 2018] taskq_thread+0x2ae/0x4d0 [spl]
[Sat Mar 31 12:16:09 2018] ? wake_up_q+0x80/0x80
[Sat Mar 31 12:16:09 2018] kthread+0x10c/0x140
[Sat Mar 31 12:16:09 2018] ? taskq_thread_should_stop+0x70/0x70 [spl]
[Sat Mar 31 12:16:09 2018] ? kthread_create_on_node+0x70/0x70
[Sat Mar 31 12:16:09 2018] ret_from_fork+0x35/0x40
[Sat Mar 31 12:18:09 2018] INFO: task z_zvol:902 blocked for more than 120 seconds.
[Sat Mar 31 12:18:09 2018] Tainted: P O 4.13.13-6-pve #1
[Sat Mar 31 12:18:09 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sat Mar 31 12:18:09 2018] z_zvol D 0 902 2 0x00000000
[Sat Mar 31 12:18:09 2018] Call Trace:
[Sat Mar 31 12:18:09 2018] __schedule+0x3e0/0x870
[Sat Mar 31 12:18:09 2018] schedule+0x36/0x80
[Sat Mar 31 12:18:09 2018] taskq_wait_outstanding+0x8c/0xd0 [spl]
[Sat Mar 31 12:18:09 2018] ? wait_woken+0x80/0x80
[Sat Mar 31 12:18:09 2018] zvol_task_cb+0x1f9/0x5b0 [zfs]
[Sat Mar 31 12:18:09 2018] ? __schedule+0x3e8/0x870
[Sat Mar 31 12:18:09 2018] taskq_thread+0x2ae/0x4d0 [spl]
[Sat Mar 31 12:18:09 2018] ? wake_up_q+0x80/0x80
[Sat Mar 31 12:18:09 2018] kthread+0x10c/0x140
[Sat Mar 31 12:18:09 2018] ? taskq_thread_should_stop+0x70/0x70 [spl]
[Sat Mar 31 12:18:09 2018] ? kthread_create_on_node+0x70/0x70
[Sat Mar 31 12:18:09 2018] ret_from_fork+0x35/0x40
[Sat Mar 31 12:20:10 2018] INFO: task z_zvol:902 blocked for more than 120 seconds.
[Sat Mar 31 12:20:10 2018] Tainted: P O 4.13.13-6-pve #1
[Sat Mar 31 12:20:10 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sat Mar 31 12:20:10 2018] z_zvol D 0 902 2 0x00000000
[Sat Mar 31 12:20:10 2018] Call Trace:
[Sat Mar 31 12:20:10 2018] __schedule+0x3e0/0x870
[Sat Mar 31 12:20:10 2018] schedule+0x36/0x80
[Sat Mar 31 12:20:10 2018] taskq_wait_outstanding+0x8c/0xd0 [spl]
[Sat Mar 31 12:20:10 2018] ? wait_woken+0x80/0x80
[Sat Mar 31 12:20:10 2018] zvol_task_cb+0x1f9/0x5b0 [zfs]
[Sat Mar 31 12:20:10 2018] ? __schedule+0x3e8/0x870
[Sat Mar 31 12:20:10 2018] taskq_thread+0x2ae/0x4d0 [spl]
[Sat Mar 31 12:20:10 2018] ? wake_up_q+0x80/0x80
[Sat Mar 31 12:20:10 2018] kthread+0x10c/0x140
[Sat Mar 31 12:20:10 2018] ? taskq_thread_should_stop+0x70/0x70 [spl]
[Sat Mar 31 12:20:10 2018] ? kthread_create_on_node+0x70/0x70
[Sat Mar 31 12:20:10 2018] ret_from_fork+0x35/0x40
[Sat Mar 31 12:22:11 2018] INFO: task z_zvol:902 blocked for more than 120 seconds.
[Sat Mar 31 12:22:11 2018] Tainted: P O 4.13.13-6-pve #1
[Sat Mar 31 12:22:11 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sat Mar 31 12:22:11 2018] z_zvol D 0 902 2 0x00000000
[Sat Mar 31 12:22:11 2018] Call Trace:
[Sat Mar 31 12:22:11 2018] __schedule+0x3e0/0x870
[Sat Mar 31 12:22:11 2018] schedule+0x36/0x80
[Sat Mar 31 12:22:11 2018] taskq_wait_outstanding+0x8c/0xd0 [spl]
[Sat Mar 31 12:22:11 2018] ? wait_woken+0x80/0x80
[Sat Mar 31 12:22:11 2018] zvol_task_cb+0x1f9/0x5b0 [zfs]
[Sat Mar 31 12:22:11 2018] ? __schedule+0x3e8/0x870
[Sat Mar 31 12:22:11 2018] taskq_thread+0x2ae/0x4d0 [spl]
[Sat Mar 31 12:22:11 2018] ? wake_up_q+0x80/0x80
[Sat Mar 31 12:22:11 2018] kthread+0x10c/0x140
[Sat Mar 31 12:22:11 2018] ? taskq_thread_should_stop+0x70/0x70 [spl]
[Sat Mar 31 12:22:11 2018] ? kthread_create_on_node+0x70/0x70
[Sat Mar 31 12:22:11 2018] ret_from_fork+0x35/0x40
[Sat Mar 31 12:24:12 2018] INFO: task z_zvol:902 blocked for more than 120 seconds.
[Sat Mar 31 12:24:12 2018] Tainted: P O 4.13.13-6-pve #1
[Sat Mar 31 12:24:12 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sat Mar 31 12:24:12 2018] z_zvol D 0 902 2 0x00000000
[Sat Mar 31 12:24:12 2018] Call Trace:
[Sat Mar 31 12:24:12 2018] __schedule+0x3e0/0x870
[Sat Mar 31 12:24:12 2018] schedule+0x36/0x80
[Sat Mar 31 12:24:12 2018] taskq_wait_outstanding+0x8c/0xd0 [spl]
[Sat Mar 31 12:24:12 2018] ? wait_woken+0x80/0x80
[Sat Mar 31 12:24:12 2018] zvol_task_cb+0x1f9/0x5b0 [zfs]
[Sat Mar 31 12:24:12 2018] ? __schedule+0x3e8/0x870
[Sat Mar 31 12:24:12 2018] taskq_thread+0x2ae/0x4d0 [spl]
[Sat Mar 31 12:24:12 2018] ? wake_up_q+0x80/0x80
[Sat Mar 31 12:24:12 2018] kthread+0x10c/0x140
[Sat Mar 31 12:24:12 2018] ? taskq_thread_should_stop+0x70/0x70 [spl]
[Sat Mar 31 12:24:12 2018] ? kthread_create_on_node+0x70/0x70
[Sat Mar 31 12:24:12 2018] ret_from_fork+0x35/0x40

it seems that the server (storage) is busy and not answar...
somebody know what happen?
How can I study in deep the problem?
 
i forgot one important thing... we haven't any raid controller... the ssd are directly plugged on motherboard.
 
Hi fabian,
thanks for you answer.
about: github.com/zfsonlinux/zfs/issues/6330
it can be my case!

studied in deep the problem: the messages appares in the log (dmesg) 5 or 10 minutes later the write intensive operation. (move disk from the nfs storage to the server).
In my case, the server is not completly blocked.
but the bug ware closed a few days ago...

below my "as is" situation:

pveversion -v
proxmox-ve: 5.1-40 (running kernel: 4.13.13-6-pve)
pve-manager: 5.1-46 (running version: 5.1-46/ae8241d4)
pve-kernel-4.13.13-6-pve: 4.13.13-40
pve-kernel-4.13.4-1-pve: 4.13.4-26
corosync: 2.4.2-pve3
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.0-8
libpve-common-perl: 5.0-27
libpve-guest-common-perl: 2.0-14
libpve-http-server-perl: 2.0-8
libpve-storage-perl: 5.0-17
libqb0: 1.0.1-1
lvm2: 2.02.168-pve6
lxc-pve: 2.1.1-2
lxcfs: 2.0.8-2
novnc-pve: 0.6-4
proxmox-widget-toolkit: 1.0-11
pve-cluster: 5.0-20
pve-container: 2.0-19
pve-docs: 5.1-16
pve-firewall: 3.0-5
pve-firmware: 2.0-3
pve-ha-manager: 2.0-5
pve-i18n: 1.0-4
pve-libspice-server1: 0.12.8-3
pve-qemu-kvm: 2.9.1-8
pve-xtermjs: 1.0-2
qemu-server: 5.0-21
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.4-pve2~bpo9

do you know if the new packages of proxmox can fix the problem?
do you know if I had better to update my system to fix the problem?

thanks!
 
the fix for that specific issue was slated for 0.7.8 (but in the meantime there has been a regression fix released as 0.7.8, so it will probably be part of 0.7.9).
 
This looks suspiciously like the issue I'm running into on 4.4. Will 0.7.9 be coming in an update to 4.4?
 

About

The Proxmox community has been around for many years and offers help and support for Proxmox VE, Proxmox Backup Server, and Proxmox Mail Gateway.
We think our community is one of the best thanks to people like you!

Get your subscription!

The Proxmox team works very hard to make sure you are running the best software and getting stable updates and security enhancements, as well as quick enterprise support. Tens of thousands of happy customers have a Proxmox subscription. Get yours easily in our online shop.

Buy now!