INFO: task send_traverse:17000 blocked for more than 120 seconds.

mailinglists

Renowned Member
Mar 14, 2012
641
68
93
Hi guys,

on two PM (proxmox-ve: 5.4-1 (running kernel: 4.15.18-16-pve)) servers, I see messages like:

Code:
[Thu Jun 27 23:02:34 2019] INFO: task send_traverse:16723 blocked for more than 120 seconds.
[Thu Jun 27 23:02:34 2019]       Tainted: P           O     4.15.18-16-pve #1
[Thu Jun 27 23:02:34 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Thu Jun 27 23:02:34 2019] send_traverse   D    0 16723      2 0x80000000
[Thu Jun 27 23:02:34 2019] Call Trace:
[Thu Jun 27 23:02:34 2019]  __schedule+0x3e0/0x870
[Thu Jun 27 23:02:34 2019]  schedule+0x36/0x80
[Thu Jun 27 23:02:34 2019]  cv_wait_common+0x11e/0x140 [spl]
[Thu Jun 27 23:02:34 2019]  ? wait_woken+0x80/0x80
[Thu Jun 27 23:02:34 2019]  __cv_wait+0x15/0x20 [spl]
[Thu Jun 27 23:02:34 2019]  bqueue_enqueue+0x62/0xe0 [zfs]
[Thu Jun 27 23:02:34 2019]  send_cb+0x14a/0x190 [zfs]
[Thu Jun 27 23:02:34 2019]  traverse_visitbp+0x459/0x980 [zfs]
[Thu Jun 27 23:02:34 2019]  traverse_visitbp+0x5db/0x980 [zfs]
[Thu Jun 27 23:02:34 2019]  traverse_visitbp+0x5db/0x980 [zfs]
[Thu Jun 27 23:02:34 2019]  traverse_visitbp+0x5db/0x980 [zfs]
[Thu Jun 27 23:02:34 2019]  traverse_visitbp+0x5db/0x980 [zfs]
[Thu Jun 27 23:02:34 2019]  traverse_visitbp+0x5db/0x980 [zfs]
[Thu Jun 27 23:02:34 2019]  traverse_dnode+0xaa/0x1c0 [zfs]
[Thu Jun 27 23:02:34 2019]  traverse_visitbp+0x878/0x980 [zfs]
[Thu Jun 27 23:02:34 2019]  traverse_impl+0x1e3/0x450 [zfs]
[Thu Jun 27 23:02:34 2019]  ? dmu_send_impl+0x14d0/0x14d0 [zfs]
[Thu Jun 27 23:02:34 2019]  ? byteswap_record+0x2b0/0x2b0 [zfs]
[Thu Jun 27 23:02:34 2019]  traverse_dataset_resume+0x46/0x50 [zfs]
[Thu Jun 27 23:02:34 2019]  ? dmu_send_impl+0x14d0/0x14d0 [zfs]
[Thu Jun 27 23:02:34 2019]  send_traverse_thread+0x59/0xb0 [zfs]
[Thu Jun 27 23:02:34 2019]  thread_generic_wrapper+0x74/0x90 [spl]
[Thu Jun 27 23:02:34 2019]  kthread+0x105/0x140
[Thu Jun 27 23:02:34 2019]  ? __thread_exit+0x20/0x20 [spl]
[Thu Jun 27 23:02:34 2019]  ? kthread_create_worker_on_cpu+0x70/0x70
[Thu Jun 27 23:02:34 2019]  ret_from_fork+0x35/0x40

Code:
[Tue Jul  2 23:07:18 2019] INFO: task send_traverse:17000 blocked for more than 120 seconds.
[Tue Jul  2 23:07:18 2019]       Tainted: P           O     4.15.18-16-pve #1
[Tue Jul  2 23:07:18 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Tue Jul  2 23:07:18 2019] send_traverse   D    0 17000      2 0x80000000
[Tue Jul  2 23:07:18 2019] Call Trace:
[Tue Jul  2 23:07:18 2019]  __schedule+0x3e0/0x870
[Tue Jul  2 23:07:18 2019]  schedule+0x36/0x80
[Tue Jul  2 23:07:18 2019]  cv_wait_common+0x11e/0x140 [spl]
[Tue Jul  2 23:07:18 2019]  ? wait_woken+0x80/0x80
[Tue Jul  2 23:07:18 2019]  __cv_wait+0x15/0x20 [spl]
[Tue Jul  2 23:07:18 2019]  bqueue_enqueue+0x62/0xe0 [zfs]
[Tue Jul  2 23:07:18 2019]  send_cb+0x14a/0x190 [zfs]
[Tue Jul  2 23:07:18 2019]  traverse_visitbp+0x49c/0x980 [zfs]
[Tue Jul  2 23:07:18 2019]  traverse_visitbp+0x5db/0x980 [zfs]
[Tue Jul  2 23:07:18 2019]  traverse_visitbp+0x5db/0x980 [zfs]
[Tue Jul  2 23:07:18 2019]  traverse_visitbp+0x5db/0x980 [zfs]
[Tue Jul  2 23:07:18 2019]  traverse_dnode+0xaa/0x1c0 [zfs]
[Tue Jul  2 23:07:18 2019]  traverse_visitbp+0x70e/0x980 [zfs]
[Tue Jul  2 23:07:18 2019]  traverse_visitbp+0x5db/0x980 [zfs]
[Tue Jul  2 23:07:18 2019]  traverse_visitbp+0x5db/0x980 [zfs]
[Tue Jul  2 23:07:18 2019]  traverse_visitbp+0x5db/0x980 [zfs]
[Tue Jul  2 23:07:18 2019]  traverse_visitbp+0x5db/0x980 [zfs]
[Tue Jul  2 23:07:18 2019]  traverse_visitbp+0x5db/0x980 [zfs]
[Tue Jul  2 23:07:18 2019]  traverse_dnode+0xaa/0x1c0 [zfs]
[Tue Jul  2 23:07:18 2019]  traverse_visitbp+0x878/0x980 [zfs]
[Tue Jul  2 23:07:18 2019]  traverse_impl+0x1e3/0x450 [zfs]
[Tue Jul  2 23:07:18 2019]  ? dmu_send_impl+0x14d0/0x14d0 [zfs]
[Tue Jul  2 23:07:18 2019]  ? byteswap_record+0x2b0/0x2b0 [zfs]
[Tue Jul  2 23:07:18 2019]  traverse_dataset_resume+0x46/0x50 [zfs]
[Tue Jul  2 23:07:18 2019]  ? dmu_send_impl+0x14d0/0x14d0 [zfs]
[Tue Jul  2 23:07:18 2019]  send_traverse_thread+0x59/0xb0 [zfs]
[Tue Jul  2 23:07:18 2019]  thread_generic_wrapper+0x74/0x90 [spl]
[Tue Jul  2 23:07:18 2019]  kthread+0x105/0x140
[Tue Jul  2 23:07:18 2019]  ? __thread_exit+0x20/0x20 [spl]
[Tue Jul  2 23:07:18 2019]  ? kthread_create_worker_on_cpu+0x70/0x70
[Tue Jul  2 23:07:18 2019]  ret_from_fork+0x35/0x40

I got a "feeling" that this might be related to big ZFS network sends to an overloaded ZFS receiving server.
This is how it looks on ZFS backup / receiving server:

Code:
[Thu Jun 27 23:02:38 2019] INFO: task zfs:10360 blocked for more than 120 seconds.
[Thu Jun 27 23:02:38 2019]       Tainted: P          IO     4.15.18-16-pve #1
[Thu Jun 27 23:02:38 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Thu Jun 27 23:02:38 2019] zfs             D    0 10360  10353 0x80000000
[Thu Jun 27 23:02:38 2019] Call Trace:
[Thu Jun 27 23:02:38 2019]  __schedule+0x3e0/0x870
[Thu Jun 27 23:02:38 2019]  schedule+0x36/0x80
[Thu Jun 27 23:02:38 2019]  cv_wait_common+0x11e/0x140 [spl]
[Thu Jun 27 23:02:38 2019]  ? wait_woken+0x80/0x80
[Thu Jun 27 23:02:38 2019]  __cv_wait+0x15/0x20 [spl]
[Thu Jun 27 23:02:38 2019]  dmu_recv_stream+0x7c6/0xef0 [zfs]
[Thu Jun 27 23:02:38 2019]  zfs_ioc_recv_impl+0x1015/0x10f0 [zfs]
[Thu Jun 27 23:02:38 2019]  ? __enqueue_entity+0x5c/0x60
[Thu Jun 27 23:02:38 2019]  ? sched_clock+0x9/0x10
[Thu Jun 27 23:02:38 2019]  zfs_ioc_recv+0x1f8/0x330 [zfs]
[Thu Jun 27 23:02:38 2019]  ? security_capable+0x4e/0x70
[Thu Jun 27 23:02:38 2019]  ? ns_capable_common+0x68/0x80
[Thu Jun 27 23:02:38 2019]  ? capable+0x1c/0x20
[Thu Jun 27 23:02:38 2019]  ? priv_policy.isra.3.part.4+0x11/0x20 [zfs]
[Thu Jun 27 23:02:38 2019]  ? _cond_resched+0x1a/0x50
[Thu Jun 27 23:02:38 2019]  ? __kmalloc+0x178/0x200
[Thu Jun 27 23:02:38 2019]  ? strdup+0x45/0x70 [spl]
[Thu Jun 27 23:02:38 2019]  zfsdev_ioctl+0x5e5/0x670 [zfs]
[Thu Jun 27 23:02:38 2019]  do_vfs_ioctl+0xa6/0x620
[Thu Jun 27 23:02:38 2019]  ? handle_mm_fault+0xce/0x1b0
[Thu Jun 27 23:02:38 2019]  ? __do_page_fault+0x27e/0x500
[Thu Jun 27 23:02:38 2019]  SyS_ioctl+0x79/0x90
[Thu Jun 27 23:02:38 2019]  do_syscall_64+0x73/0x130
[Thu Jun 27 23:02:38 2019]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[Thu Jun 27 23:02:38 2019] RIP: 0033:0x7f9394e50017
[Thu Jun 27 23:02:38 2019] RSP: 002b:00007ffcfd678208 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[Thu Jun 27 23:02:38 2019] RAX: ffffffffffffffda RBX: 00007ffcfd67b950 RCX: 00007f9394e50017
[Thu Jun 27 23:02:38 2019] RDX: 00007ffcfd6783a0 RSI: 0000000000005a1b RDI: 0000000000000005
[Thu Jun 27 23:02:38 2019] RBP: 00007ffcfd6783a0 R08: 0000000000000003 R09: 0000000000020010
[Thu Jun 27 23:02:38 2019] R10: 000055bd860f5150 R11: 0000000000000246 R12: 0000000000000000
[Thu Jun 27 23:02:38 2019] R13: 00007ffcfd686548 R14: 00007ffcfd680090 R15: 0000000000000000
[Thu Jun 27 23:02:38 2019] INFO: task zfs:12319 blocked for more than 120 seconds.
[Thu Jun 27 23:02:38 2019]       Tainted: P          IO     4.15.18-16-pve #1
[Thu Jun 27 23:02:38 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Thu Jun 27 23:02:38 2019] zfs             D    0 12319  12310 0x80000000
[Thu Jun 27 23:02:38 2019] Call Trace:
[Thu Jun 27 23:02:38 2019]  __schedule+0x3e0/0x870
[Thu Jun 27 23:02:38 2019]  schedule+0x36/0x80
[Thu Jun 27 23:02:38 2019]  cv_wait_common+0x11e/0x140 [spl]
[Thu Jun 27 23:02:38 2019]  ? wait_woken+0x80/0x80
[Thu Jun 27 23:02:38 2019]  __cv_wait+0x15/0x20 [spl]
[Thu Jun 27 23:02:38 2019]  dmu_recv_stream+0x7c6/0xef0 [zfs]
[Thu Jun 27 23:02:38 2019]  zfs_ioc_recv_impl+0x1015/0x10f0 [zfs]
[Thu Jun 27 23:02:38 2019]  ? dbuf_rele+0x38/0x40 [zfs]
[Thu Jun 27 23:02:38 2019]  zfs_ioc_recv+0x1f8/0x330 [zfs]
[Thu Jun 27 23:02:38 2019]  ? security_capable+0x4e/0x70
[Thu Jun 27 23:02:38 2019]  ? ns_capable_common+0x68/0x80
[Thu Jun 27 23:02:38 2019]  ? capable+0x1c/0x20
[Thu Jun 27 23:02:38 2019]  ? priv_policy.isra.3.part.4+0x11/0x20 [zfs]
[Thu Jun 27 23:02:38 2019]  ? _cond_resched+0x1a/0x50
[Thu Jun 27 23:02:38 2019]  ? __kmalloc+0x178/0x200
[Thu Jun 27 23:02:38 2019]  ? strdup+0x45/0x70 [spl]
[Thu Jun 27 23:02:38 2019]  zfsdev_ioctl+0x5e5/0x670 [zfs]
[Thu Jun 27 23:02:38 2019]  do_vfs_ioctl+0xa6/0x620
[Thu Jun 27 23:02:38 2019]  ? handle_mm_fault+0xce/0x1b0
[Thu Jun 27 23:02:38 2019]  ? __do_page_fault+0x27e/0x500
[Thu Jun 27 23:02:38 2019]  SyS_ioctl+0x79/0x90
[Thu Jun 27 23:02:38 2019]  do_syscall_64+0x73/0x130
[Thu Jun 27 23:02:38 2019]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[Thu Jun 27 23:02:38 2019] RIP: 0033:0x7f895fbf2017
[Thu Jun 27 23:02:38 2019] RSP: 002b:00007fff20b19eb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[Thu Jun 27 23:02:38 2019] RAX: ffffffffffffffda RBX: 00007fff20b1d600 RCX: 00007f895fbf2017
[Thu Jun 27 23:02:38 2019] RDX: 00007fff20b1a050 RSI: 0000000000005a1b RDI: 0000000000000005
[Thu Jun 27 23:02:38 2019] RBP: 00007fff20b1a050 R08: 0000000000000003 R09: 0000000000020010
[Thu Jun 27 23:02:38 2019] R10: 0000563173759150 R11: 0000000000000246 R12: 0000000000000000
[Thu Jun 27 23:02:38 2019] R13: 00007fff20b281f8 R14: 00007fff20b21d40 R15: 0000000000000000
[Thu Jun 27 23:02:38 2019] INFO: task zfs:12461 blocked for more than 120 seconds.
[Thu Jun 27 23:02:38 2019]       Tainted: P          IO     4.15.18-16-pve #1
[Thu Jun 27 23:02:38 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Thu Jun 27 23:02:38 2019] zfs             D    0 12461  12428 0x80000000
[Thu Jun 27 23:02:38 2019] Call Trace:
[Thu Jun 27 23:02:38 2019]  __schedule+0x3e0/0x870
[Thu Jun 27 23:02:38 2019]  schedule+0x36/0x80
[Thu Jun 27 23:02:38 2019]  cv_wait_common+0x11e/0x140 [spl]
[Thu Jun 27 23:02:38 2019]  ? wait_woken+0x80/0x80
[Thu Jun 27 23:02:38 2019]  __cv_wait+0x15/0x20 [spl]
[Thu Jun 27 23:02:38 2019]  dmu_recv_stream+0x7c6/0xef0 [zfs]
[Thu Jun 27 23:02:38 2019]  zfs_ioc_recv_impl+0x1015/0x10f0 [zfs]
[Thu Jun 27 23:02:38 2019]  ? ___slab_alloc+0x2a2/0x4e0
[Thu Jun 27 23:02:38 2019]  zfs_ioc_recv+0x1f8/0x330 [zfs]
[Thu Jun 27 23:02:38 2019]  ? security_capable+0x4e/0x70
[Thu Jun 27 23:02:38 2019]  ? ns_capable_common+0x68/0x80
[Thu Jun 27 23:02:38 2019]  ? capable+0x1c/0x20
[Thu Jun 27 23:02:38 2019]  ? priv_policy.isra.3.part.4+0x11/0x20 [zfs]
[Thu Jun 27 23:02:38 2019]  ? _cond_resched+0x1a/0x50
[Thu Jun 27 23:02:38 2019]  ? __kmalloc+0x178/0x200
[Thu Jun 27 23:02:38 2019]  ? strdup+0x45/0x70 [spl]
[Thu Jun 27 23:02:38 2019]  zfsdev_ioctl+0x5e5/0x670 [zfs]
[Thu Jun 27 23:02:38 2019]  do_vfs_ioctl+0xa6/0x620
[Thu Jun 27 23:02:38 2019]  ? handle_mm_fault+0xce/0x1b0
[Thu Jun 27 23:02:38 2019]  ? __do_page_fault+0x27e/0x500
[Thu Jun 27 23:02:38 2019]  SyS_ioctl+0x79/0x90
[Thu Jun 27 23:02:38 2019]  do_syscall_64+0x73/0x130
[Thu Jun 27 23:02:38 2019]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[Thu Jun 27 23:02:38 2019] RIP: 0033:0x7f52b317e017
[Thu Jun 27 23:02:38 2019] RSP: 002b:00007fff318e6be8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[Thu Jun 27 23:02:38 2019] RAX: ffffffffffffffda RBX: 00007fff318ea330 RCX: 00007f52b317e017
[Thu Jun 27 23:02:38 2019] RDX: 00007fff318e6d80 RSI: 0000000000005a1b RDI: 0000000000000005
[Thu Jun 27 23:02:38 2019] RBP: 00007fff318e6d80 R08: 0000000000000003 R09: 0000000000020010
[Thu Jun 27 23:02:38 2019] R10: 0000556b1a0dc150 R11: 0000000000000246 R12: 0000000000000000
[Thu Jun 27 23:02:38 2019] R13: 00007fff318f4f28 R14: 00007fff318eea70 R15: 0000000000000000
[Fri Jun 28 05:03:07 2019] INFO: task zfs:14297 blocked for more than 120 seconds.
[Fri Jun 28 05:03:07 2019]       Tainted: P          IO     4.15.18-16-pve #1
[Fri Jun 28 05:03:07 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Jun 28 05:03:07 2019] zfs             D    0 14297  14287 0x80000000
[Fri Jun 28 05:03:07 2019] Call Trace:
[Fri Jun 28 05:03:07 2019]  __schedule+0x3e0/0x870
[Fri Jun 28 05:03:07 2019]  schedule+0x36/0x80
[Fri Jun 28 05:03:07 2019]  cv_wait_common+0x11e/0x140 [spl]
[Fri Jun 28 05:03:07 2019]  ? wait_woken+0x80/0x80
[Fri Jun 28 05:03:07 2019]  __cv_wait+0x15/0x20 [spl]
[Fri Jun 28 05:03:07 2019]  dmu_recv_stream+0x7c6/0xef0 [zfs]
[Fri Jun 28 05:03:07 2019]  zfs_ioc_recv_impl+0x1015/0x10f0 [zfs]
[Fri Jun 28 05:03:07 2019]  ? __enqueue_entity+0x5c/0x60
[Fri Jun 28 05:03:07 2019]  zfs_ioc_recv+0x1f8/0x330 [zfs]
[Fri Jun 28 05:03:07 2019]  ? security_capable+0x4e/0x70
[Fri Jun 28 05:03:07 2019]  ? ns_capable_common+0x68/0x80
[Fri Jun 28 05:03:07 2019]  ? capable+0x1c/0x20
[Fri Jun 28 05:03:07 2019]  ? priv_policy.isra.3.part.4+0x11/0x20 [zfs]
[Fri Jun 28 05:03:07 2019]  ? _cond_resched+0x1a/0x50
[Fri Jun 28 05:03:07 2019]  ? __kmalloc+0x178/0x200
[Fri Jun 28 05:03:07 2019]  ? strdup+0x45/0x70 [spl]
[Fri Jun 28 05:03:07 2019]  zfsdev_ioctl+0x5e5/0x670 [zfs]
[Fri Jun 28 05:03:07 2019]  do_vfs_ioctl+0xa6/0x620
[Fri Jun 28 05:03:07 2019]  ? handle_mm_fault+0xce/0x1b0
[Fri Jun 28 05:03:07 2019]  ? __do_page_fault+0x27e/0x500
[Fri Jun 28 05:03:07 2019]  SyS_ioctl+0x79/0x90
[Fri Jun 28 05:03:07 2019]  do_syscall_64+0x73/0x130
[Fri Jun 28 05:03:07 2019]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[Fri Jun 28 05:03:07 2019] RIP: 0033:0x7f9ac4f6f017
[Fri Jun 28 05:03:07 2019] RSP: 002b:00007ffe6e1f5848 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[Fri Jun 28 05:03:07 2019] RAX: ffffffffffffffda RBX: 00007ffe6e1f8f90 RCX: 00007f9ac4f6f017
[Fri Jun 28 05:03:07 2019] RDX: 00007ffe6e1f59e0 RSI: 0000000000005a1b RDI: 0000000000000005
[Fri Jun 28 05:03:07 2019] RBP: 00007ffe6e1f59e0 R08: 0000000000000003 R09: 0000000000020010
[Fri Jun 28 05:03:07 2019] R10: 0000559d853e8150 R11: 0000000000000246 R12: 0000000000000000
[Fri Jun 28 05:03:07 2019] R13: 00007ffe6e203b88 R14: 00007ffe6e1fd6d0 R15: 0000000000000000
[Fri Jun 28 05:03:07 2019] INFO: task zfs:14299 blocked for more than 120 seconds.
[Fri Jun 28 05:03:07 2019]       Tainted: P          IO     4.15.18-16-pve #1
[Fri Jun 28 05:03:07 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Jun 28 05:03:07 2019] zfs             D    0 14299  14286 0x80000000
[Fri Jun 28 05:03:07 2019] Call Trace:
[Fri Jun 28 05:03:07 2019]  __schedule+0x3e0/0x870
[Fri Jun 28 05:03:07 2019]  schedule+0x36/0x80
[Fri Jun 28 05:03:07 2019]  cv_wait_common+0x11e/0x140 [spl]
[Fri Jun 28 05:03:07 2019]  ? wait_woken+0x80/0x80
[Fri Jun 28 05:03:07 2019]  __cv_wait+0x15/0x20 [spl]
[Fri Jun 28 05:03:07 2019]  dmu_recv_stream+0x7c6/0xef0 [zfs]
[Fri Jun 28 05:03:07 2019]  zfs_ioc_recv_impl+0x1015/0x10f0 [zfs]
[Fri Jun 28 05:03:07 2019]  ? __enqueue_entity+0x5c/0x60
[Fri Jun 28 05:03:07 2019]  zfs_ioc_recv+0x1f8/0x330 [zfs]
[Fri Jun 28 05:03:07 2019]  ? security_capable+0x4e/0x70
[Fri Jun 28 05:03:07 2019]  ? ns_capable_common+0x68/0x80
[Fri Jun 28 05:03:07 2019]  ? capable+0x1c/0x20
[Fri Jun 28 05:03:07 2019]  ? priv_policy.isra.3.part.4+0x11/0x20 [zfs]
[Fri Jun 28 05:03:07 2019]  ? _cond_resched+0x1a/0x50
[Fri Jun 28 05:03:07 2019]  ? __kmalloc+0x178/0x200
[Fri Jun 28 05:03:07 2019]  ? strdup+0x45/0x70 [spl]
[Fri Jun 28 05:03:07 2019]  zfsdev_ioctl+0x5e5/0x670 [zfs]
[Fri Jun 28 05:03:07 2019]  do_vfs_ioctl+0xa6/0x620
[Fri Jun 28 05:03:07 2019]  ? handle_mm_fault+0xce/0x1b0
[Fri Jun 28 05:03:07 2019]  ? __do_page_fault+0x27e/0x500
[Fri Jun 28 05:03:07 2019]  SyS_ioctl+0x79/0x90
[Fri Jun 28 05:03:07 2019]  do_syscall_64+0x73/0x130
[Fri Jun 28 05:03:07 2019]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[Fri Jun 28 05:03:07 2019] RIP: 0033:0x7f9878c0a017
[Fri Jun 28 05:03:07 2019] RSP: 002b:00007ffd2e636db8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[Fri Jun 28 05:03:07 2019] RAX: ffffffffffffffda RBX: 00007ffd2e63a500 RCX: 00007f9878c0a017
[Fri Jun 28 05:03:07 2019] RDX: 00007ffd2e636f50 RSI: 0000000000005a1b RDI: 0000000000000005
[Fri Jun 28 05:03:07 2019] RBP: 00007ffd2e636f50 R08: 0000000000000003 R09: 0000000000020010
[Fri Jun 28 05:03:07 2019] R10: 000055e20dcb9150 R11: 0000000000000246 R12: 0000000000000000
[Fri Jun 28 05:03:07 2019] R13: 00007ffd2e6450f8 R14: 00007ffd2e63ec40 R15: 0000000000000000
[Fri Jun 28 05:03:07 2019] INFO: task zfs:15342 blocked for more than 120 seconds.
[Fri Jun 28 05:03:07 2019]       Tainted: P          IO     4.15.18-16-pve #1
[Fri Jun 28 05:03:07 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Jun 28 05:03:07 2019] zfs             D    0 15342  15338 0x80000000
[Fri Jun 28 05:03:07 2019] Call Trace:
[Fri Jun 28 05:03:07 2019]  __schedule+0x3e0/0x870
[Fri Jun 28 05:03:07 2019]  schedule+0x36/0x80
[Fri Jun 28 05:03:07 2019]  cv_wait_common+0x11e/0x140 [spl]
[Fri Jun 28 05:03:07 2019]  ? wait_woken+0x80/0x80
[Fri Jun 28 05:03:07 2019]  __cv_wait+0x15/0x20 [spl]
[Fri Jun 28 05:03:07 2019]  dmu_recv_stream+0x7c6/0xef0 [zfs]
[Fri Jun 28 05:03:07 2019]  zfs_ioc_recv_impl+0x1015/0x10f0 [zfs]
[Fri Jun 28 05:03:07 2019]  ? dbuf_rele+0x38/0x40 [zfs]
[Fri Jun 28 05:03:07 2019]  zfs_ioc_recv+0x1f8/0x330 [zfs]
[Fri Jun 28 05:03:07 2019]  ? security_capable+0x4e/0x70
[Fri Jun 28 05:03:07 2019]  ? ns_capable_common+0x68/0x80
[Fri Jun 28 05:03:07 2019]  ? capable+0x1c/0x20
[Fri Jun 28 05:03:07 2019]  ? priv_policy.isra.3.part.4+0x11/0x20 [zfs]
[Fri Jun 28 05:03:07 2019]  ? _cond_resched+0x1a/0x50
[Fri Jun 28 05:03:07 2019]  ? __kmalloc+0x178/0x200
[Fri Jun 28 05:03:07 2019]  ? strdup+0x45/0x70 [spl]
[Fri Jun 28 05:03:07 2019]  zfsdev_ioctl+0x5e5/0x670 [zfs]
[Fri Jun 28 05:03:07 2019]  do_vfs_ioctl+0xa6/0x620
[Fri Jun 28 05:03:07 2019]  ? handle_mm_fault+0xce/0x1b0
[Fri Jun 28 05:03:07 2019]  ? __do_page_fault+0x27e/0x500
[Fri Jun 28 05:03:07 2019]  SyS_ioctl+0x79/0x90
[Fri Jun 28 05:03:07 2019]  do_syscall_64+0x73/0x130
[Fri Jun 28 05:03:07 2019]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[Fri Jun 28 05:03:07 2019] RIP: 0033:0x7f2e18c34017
[Fri Jun 28 05:03:07 2019] RSP: 002b:00007ffd0bd96738 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[Fri Jun 28 05:03:07 2019] RAX: ffffffffffffffda RBX: 00007ffd0bd99e80 RCX: 00007f2e18c34017
[Fri Jun 28 05:03:07 2019] RDX: 00007ffd0bd968d0 RSI: 0000000000005a1b RDI: 0000000000000005
[Fri Jun 28 05:03:07 2019] RBP: 00007ffd0bd968d0 R08: 0000000000000003 R09: 0000000000020010
[Fri Jun 28 05:03:07 2019] R10: 000055c8f147e150 R11: 0000000000000246 R12: 0000000000000000
[Fri Jun 28 05:03:07 2019] R13: 00007ffd0bda4a78 R14: 00007ffd0bd9e5c0 R15: 0000000000000000
[Fri Jun 28 05:03:07 2019] INFO: task zfs:16490 blocked for more than 120 seconds.
[Fri Jun 28 05:03:07 2019]       Tainted: P          IO     4.15.18-16-pve #1
[Fri Jun 28 05:03:07 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Jun 28 05:03:07 2019] zfs             D    0 16490  16486 0x80000000
[Fri Jun 28 05:03:07 2019] Call Trace:
[Fri Jun 28 05:03:07 2019]  __schedule+0x3e0/0x870
[Fri Jun 28 05:03:07 2019]  schedule+0x36/0x80
[Fri Jun 28 05:03:07 2019]  cv_wait_common+0x11e/0x140 [spl]
[Fri Jun 28 05:03:07 2019]  ? wait_woken+0x80/0x80
[Fri Jun 28 05:03:07 2019]  __cv_wait+0x15/0x20 [spl]
[Fri Jun 28 05:03:07 2019]  dmu_recv_stream+0x7c6/0xef0 [zfs]
[Fri Jun 28 05:03:07 2019]  zfs_ioc_recv_impl+0x1015/0x10f0 [zfs]
[Fri Jun 28 05:03:07 2019]  ? dbuf_rele+0x38/0x40 [zfs]
[Fri Jun 28 05:03:07 2019]  zfs_ioc_recv+0x1f8/0x330 [zfs]
[Fri Jun 28 05:03:07 2019]  ? security_capable+0x4e/0x70
[Fri Jun 28 05:03:07 2019]  ? ns_capable_common+0x68/0x80
[Fri Jun 28 05:03:07 2019]  ? capable+0x1c/0x20
[Fri Jun 28 05:03:07 2019]  ? priv_policy.isra.3.part.4+0x11/0x20 [zfs]
[Fri Jun 28 05:03:07 2019]  ? _cond_resched+0x1a/0x50
[Fri Jun 28 05:03:07 2019]  ? __kmalloc+0x178/0x200
[Fri Jun 28 05:03:07 2019]  ? strdup+0x45/0x70 [spl]
[Fri Jun 28 05:03:07 2019]  zfsdev_ioctl+0x5e5/0x670 [zfs]
[Fri Jun 28 05:03:07 2019]  do_vfs_ioctl+0xa6/0x620
[Fri Jun 28 05:03:07 2019]  ? handle_mm_fault+0xce/0x1b0
[Fri Jun 28 05:03:07 2019]  ? __do_page_fault+0x27e/0x500
[Fri Jun 28 05:03:07 2019]  SyS_ioctl+0x79/0x90
[Fri Jun 28 05:03:07 2019]  do_syscall_64+0x73/0x130
[Fri Jun 28 05:03:07 2019]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[Fri Jun 28 05:03:07 2019] RIP: 0033:0x7f9bda53a017
[Fri Jun 28 05:03:07 2019] RSP: 002b:00007ffcd02e8328 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[Fri Jun 28 05:03:07 2019] RAX: ffffffffffffffda RBX: 00007ffcd02eba70 RCX: 00007f9bda53a017
[Fri Jun 28 05:03:07 2019] RDX: 00007ffcd02e84c0 RSI: 0000000000005a1b RDI: 0000000000000005
[Fri Jun 28 05:03:07 2019] RBP: 00007ffcd02e84c0 R08: 0000000000000003 R09: 0000000000020010
[Fri Jun 28 05:03:07 2019] R10: 0000562f3f27a150 R11: 0000000000000246 R12: 0000000000000000
[Fri Jun 28 05:03:07 2019] R13: 00007ffcd02f6668 R14: 00007ffcd02f01b0 R15: 0000000000000000
[Fri Jun 28 05:03:07 2019] INFO: task zfs:16847 blocked for more than 120 seconds.
[Fri Jun 28 05:03:07 2019]       Tainted: P          IO     4.15.18-16-pve #1
[Fri Jun 28 05:03:07 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Jun 28 05:03:07 2019] zfs             D    0 16847  16843 0x80000000
[Fri Jun 28 05:03:07 2019] Call Trace:
[Fri Jun 28 05:03:07 2019]  __schedule+0x3e0/0x870
[Fri Jun 28 05:03:07 2019]  schedule+0x36/0x80
[Fri Jun 28 05:03:07 2019]  cv_wait_common+0x11e/0x140 [spl]
[Fri Jun 28 05:03:07 2019]  ? wait_woken+0x80/0x80
[Fri Jun 28 05:03:07 2019]  __cv_wait+0x15/0x20 [spl]
[Fri Jun 28 05:03:07 2019]  dmu_recv_stream+0x7c6/0xef0 [zfs]
[Fri Jun 28 05:03:07 2019]  zfs_ioc_recv_impl+0x1015/0x10f0 [zfs]
[Fri Jun 28 05:03:07 2019]  ? __enqueue_entity+0x5c/0x60
[Fri Jun 28 05:03:07 2019]  zfs_ioc_recv+0x1f8/0x330 [zfs]
[Fri Jun 28 05:03:07 2019]  ? security_capable+0x4e/0x70
[Fri Jun 28 05:03:07 2019]  ? ns_capable_common+0x68/0x80
[Fri Jun 28 05:03:07 2019]  ? capable+0x1c/0x20
[Fri Jun 28 05:03:07 2019]  ? priv_policy.isra.3.part.4+0x11/0x20 [zfs]
[Fri Jun 28 05:03:07 2019]  ? _cond_resched+0x1a/0x50
[Fri Jun 28 05:03:07 2019]  ? __kmalloc+0x178/0x200
[Fri Jun 28 05:03:07 2019]  ? strdup+0x45/0x70 [spl]
[Fri Jun 28 05:03:07 2019]  zfsdev_ioctl+0x5e5/0x670 [zfs]
[Fri Jun 28 05:03:07 2019]  do_vfs_ioctl+0xa6/0x620
[Fri Jun 28 05:03:07 2019]  ? handle_mm_fault+0xce/0x1b0
[Fri Jun 28 05:03:07 2019]  ? __do_page_fault+0x27e/0x500
[Fri Jun 28 05:03:07 2019]  SyS_ioctl+0x79/0x90
[Fri Jun 28 05:03:07 2019]  do_syscall_64+0x73/0x130
[Fri Jun 28 05:03:07 2019]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[Fri Jun 28 05:03:07 2019] RIP: 0033:0x7f8baad58017
[Fri Jun 28 05:03:07 2019] RSP: 002b:00007ffebb55fa48 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[Fri Jun 28 05:03:07 2019] RAX: ffffffffffffffda RBX: 00007ffebb563190 RCX: 00007f8baad58017
[Fri Jun 28 05:03:07 2019] RDX: 00007ffebb55fbe0 RSI: 0000000000005a1b RDI: 0000000000000005
[Fri Jun 28 05:03:07 2019] RBP: 00007ffebb55fbe0 R08: 0000000000000003 R09: 0000000000020010
[Fri Jun 28 05:03:07 2019] R10: 0000563b0aaee150 R11: 0000000000000246 R12: 0000000000000000
[Fri Jun 28 05:03:07 2019] R13: 00007ffebb56dd88 R14: 00007ffebb5678d0 R15: 0000000000000000
[Sat Jun 29 01:03:23 2019] INFO: task zfs:24254 blocked for more than 120 seconds.
[Sat Jun 29 01:03:23 2019]       Tainted: P          IO     4.15.18-16-pve #1
[Sat Jun 29 01:03:23 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sat Jun 29 01:03:23 2019] zfs             D    0 24254  24242 0x80000000
[Sat Jun 29 01:03:23 2019] Call Trace:
[Sat Jun 29 01:03:23 2019]  __schedule+0x3e0/0x870
[Sat Jun 29 01:03:23 2019]  schedule+0x36/0x80
[Sat Jun 29 01:03:23 2019]  cv_wait_common+0x11e/0x140 [spl]
[Sat Jun 29 01:03:23 2019]  ? wait_woken+0x80/0x80
[Sat Jun 29 01:03:23 2019]  __cv_wait+0x15/0x20 [spl]
[Sat Jun 29 01:03:23 2019]  dmu_recv_stream+0x7c6/0xef0 [zfs]
[Sat Jun 29 01:03:23 2019]  zfs_ioc_recv_impl+0x1015/0x10f0 [zfs]
[Sat Jun 29 01:03:23 2019]  ? find_next_bit+0xb/0x10
[Sat Jun 29 01:03:23 2019]  ? cpumask_next_wrap+0x47/0x70
[Sat Jun 29 01:03:23 2019]  zfs_ioc_recv+0x1f8/0x330 [zfs]
[Sat Jun 29 01:03:23 2019]  ? security_capable+0x4e/0x70
[Sat Jun 29 01:03:23 2019]  ? ns_capable_common+0x68/0x80
[Sat Jun 29 01:03:23 2019]  ? capable+0x1c/0x20
[Sat Jun 29 01:03:23 2019]  ? priv_policy.isra.3.part.4+0x11/0x20 [zfs]
[Sat Jun 29 01:03:23 2019]  ? _cond_resched+0x1a/0x50
[Sat Jun 29 01:03:23 2019]  ? __kmalloc+0x178/0x200
[Sat Jun 29 01:03:23 2019]  ? strdup+0x45/0x70 [spl]
[Sat Jun 29 01:03:23 2019]  zfsdev_ioctl+0x5e5/0x670 [zfs]
[Sat Jun 29 01:03:23 2019]  do_vfs_ioctl+0xa6/0x620
[Sat Jun 29 01:03:23 2019]  ? handle_mm_fault+0xce/0x1b0
[Sat Jun 29 01:03:23 2019]  ? __do_page_fault+0x27e/0x500
[Sat Jun 29 01:03:23 2019]  SyS_ioctl+0x79/0x90
[Sat Jun 29 01:03:23 2019]  do_syscall_64+0x73/0x130
[Sat Jun 29 01:03:23 2019]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[Sat Jun 29 01:03:23 2019] RIP: 0033:0x7f789f379017
[Sat Jun 29 01:03:23 2019] RSP: 002b:00007ffd2c7f6ff8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[Sat Jun 29 01:03:23 2019] RAX: ffffffffffffffda RBX: 00007ffd2c7fa740 RCX: 00007f789f379017
[Sat Jun 29 01:03:23 2019] RDX: 00007ffd2c7f7190 RSI: 0000000000005a1b RDI: 0000000000000005
[Sat Jun 29 01:03:23 2019] RBP: 00007ffd2c7f7190 R08: 0000000000000003 R09: 0000000000020010
[Sat Jun 29 01:03:23 2019] R10: 0000557693405150 R11: 0000000000000246 R12: 0000000000000000
[Sat Jun 29 01:03:23 2019] R13: 00007ffd2c805338 R14: 00007ffd2c7fee80 R15: 0000000000000000
[Sat Jun 29 01:03:23 2019] INFO: task zfs:26034 blocked for more than 120 seconds.
[Sat Jun 29 01:03:23 2019]       Tainted: P          IO     4.15.18-16-pve #1
[Sat Jun 29 01:03:23 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sat Jun 29 01:03:23 2019] zfs             D    0 26034  26032 0x80000000
[Sat Jun 29 01:03:23 2019] Call Trace:
[Sat Jun 29 01:03:23 2019]  __schedule+0x3e0/0x870
[Sat Jun 29 01:03:23 2019]  schedule+0x36/0x80
[Sat Jun 29 01:03:23 2019]  cv_wait_common+0x11e/0x140 [spl]
[Sat Jun 29 01:03:23 2019]  ? wait_woken+0x80/0x80
[Sat Jun 29 01:03:23 2019]  __cv_wait+0x15/0x20 [spl]
[Sat Jun 29 01:03:23 2019]  dmu_recv_stream+0x7c6/0xef0 [zfs]
[Sat Jun 29 01:03:23 2019]  zfs_ioc_recv_impl+0x1015/0x10f0 [zfs]
[Sat Jun 29 01:03:23 2019]  ? __enqueue_entity+0x5c/0x60
[Sat Jun 29 01:03:23 2019]  ? __update_load_avg_blocked_se.isra.37+0xd1/0x150
[Sat Jun 29 01:03:23 2019]  ? __update_load_avg_blocked_se.isra.37+0xd1/0x150
[Sat Jun 29 01:03:23 2019]  zfs_ioc_recv+0x1f8/0x330 [zfs]
[Sat Jun 29 01:03:23 2019]  ? security_capable+0x4e/0x70
[Sat Jun 29 01:03:23 2019]  ? ns_capable_common+0x68/0x80
[Sat Jun 29 01:03:23 2019]  ? capable+0x1c/0x20
[Sat Jun 29 01:03:23 2019]  ? priv_policy.isra.3.part.4+0x11/0x20 [zfs]
[Sat Jun 29 01:03:23 2019]  ? _cond_resched+0x1a/0x50
[Sat Jun 29 01:03:23 2019]  ? __kmalloc+0x178/0x200
[Sat Jun 29 01:03:23 2019]  ? strdup+0x45/0x70 [spl]
[Sat Jun 29 01:03:23 2019]  zfsdev_ioctl+0x5e5/0x670 [zfs]
[Sat Jun 29 01:03:23 2019]  do_vfs_ioctl+0xa6/0x620
[Sat Jun 29 01:03:23 2019]  ? handle_mm_fault+0xce/0x1b0
[Sat Jun 29 01:03:23 2019]  ? __do_page_fault+0x27e/0x500
[Sat Jun 29 01:03:23 2019]  SyS_ioctl+0x79/0x90
[Sat Jun 29 01:03:23 2019]  do_syscall_64+0x73/0x130
[Sat Jun 29 01:03:23 2019]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[Sat Jun 29 01:03:23 2019] RIP: 0033:0x7fd1b076b017
[Sat Jun 29 01:03:23 2019] RSP: 002b:00007fff429b5fe8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[Sat Jun 29 01:03:23 2019] RAX: ffffffffffffffda RBX: 00007fff429b9730 RCX: 00007fd1b076b017
[Sat Jun 29 01:03:23 2019] RDX: 00007fff429b6180 RSI: 0000000000005a1b RDI: 0000000000000005
[Sat Jun 29 01:03:23 2019] RBP: 00007fff429b6180 R08: 0000000000000003 R09: 0000000000020010
[Sat Jun 29 01:03:23 2019] R10: 000055d973501150 R11: 0000000000000246 R12: 0000000000000000
[Sat Jun 29 01:03:23 2019] R13: 00007fff429c4328 R14: 00007fff429bde70 R15: 0000000000000000

While I did not do extensive analysis, I think there is no impact on production VM instances running on the Cluster. If it just that ZFS sending (backups) are slow from time to time, I do not think it is a problem. BTW zfs sends are successfull every time, according to logs and zfs list -t all on backup server.

Can anyone explain these messages?
Do you think I should be worried?
Do you think, slow network ZFS sends can interfere with VMs running on send nodes in production?
 
if zfs send completes, then yes, this is an artifact of overloading somewhere along the send receive pipeline. just monitor the situation on both ends to find out where the bottle neck is..
 
Hi Fabian, tnx for your response.
I know the bottle neck is on the receiving "backup" server.
Do you think, slow network ZFS sends can interfere with VMs running on send nodes in production?
 
Had a machine fail last night with a very similar error, same kernel version. It was a hard hang with the following printed on the console, but no stack trace or logs persisted to disk:

Code:
pve2 login: [34680.462104] INFO: task send_traverse:26615 blocked for more than 120 seconds.
[34680.547896]       Tainted: P           O     4.15.18-16-pve #1
[34680.617987] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

I don’t believe any zfs send/receives were running at that time.
 
the ZFS message itself is not a hang - it just says that a kernel task has not made progress for 120s ..
 
FYI, mine work just fine. Even backup server seems to keep up now, so I do not get any more of these messages.
 

About

The Proxmox community has been around for many years and offers help and support for Proxmox VE, Proxmox Backup Server, and Proxmox Mail Gateway.
We think our community is one of the best thanks to people like you!

Get your subscription!

The Proxmox team works very hard to make sure you are running the best software and getting stable updates and security enhancements, as well as quick enterprise support. Tens of thousands of happy customers have a Proxmox subscription. Get yours easily in our online shop.

Buy now!