Hi guys,
on two PM (proxmox-ve: 5.4-1 (running kernel: 4.15.18-16-pve)) servers, I see messages like:
I got a "feeling" that this might be related to big ZFS network sends to an overloaded ZFS receiving server.
This is how it looks on ZFS backup / receiving server:
While I did not do extensive analysis, I think there is no impact on production VM instances running on the Cluster. If it just that ZFS sending (backups) are slow from time to time, I do not think it is a problem. BTW zfs sends are successfull every time, according to logs and zfs list -t all on backup server.
Can anyone explain these messages?
Do you think I should be worried?
Do you think, slow network ZFS sends can interfere with VMs running on send nodes in production?
on two PM (proxmox-ve: 5.4-1 (running kernel: 4.15.18-16-pve)) servers, I see messages like:
Code:
[Thu Jun 27 23:02:34 2019] INFO: task send_traverse:16723 blocked for more than 120 seconds.
[Thu Jun 27 23:02:34 2019] Tainted: P O 4.15.18-16-pve #1
[Thu Jun 27 23:02:34 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Thu Jun 27 23:02:34 2019] send_traverse D 0 16723 2 0x80000000
[Thu Jun 27 23:02:34 2019] Call Trace:
[Thu Jun 27 23:02:34 2019] __schedule+0x3e0/0x870
[Thu Jun 27 23:02:34 2019] schedule+0x36/0x80
[Thu Jun 27 23:02:34 2019] cv_wait_common+0x11e/0x140 [spl]
[Thu Jun 27 23:02:34 2019] ? wait_woken+0x80/0x80
[Thu Jun 27 23:02:34 2019] __cv_wait+0x15/0x20 [spl]
[Thu Jun 27 23:02:34 2019] bqueue_enqueue+0x62/0xe0 [zfs]
[Thu Jun 27 23:02:34 2019] send_cb+0x14a/0x190 [zfs]
[Thu Jun 27 23:02:34 2019] traverse_visitbp+0x459/0x980 [zfs]
[Thu Jun 27 23:02:34 2019] traverse_visitbp+0x5db/0x980 [zfs]
[Thu Jun 27 23:02:34 2019] traverse_visitbp+0x5db/0x980 [zfs]
[Thu Jun 27 23:02:34 2019] traverse_visitbp+0x5db/0x980 [zfs]
[Thu Jun 27 23:02:34 2019] traverse_visitbp+0x5db/0x980 [zfs]
[Thu Jun 27 23:02:34 2019] traverse_visitbp+0x5db/0x980 [zfs]
[Thu Jun 27 23:02:34 2019] traverse_dnode+0xaa/0x1c0 [zfs]
[Thu Jun 27 23:02:34 2019] traverse_visitbp+0x878/0x980 [zfs]
[Thu Jun 27 23:02:34 2019] traverse_impl+0x1e3/0x450 [zfs]
[Thu Jun 27 23:02:34 2019] ? dmu_send_impl+0x14d0/0x14d0 [zfs]
[Thu Jun 27 23:02:34 2019] ? byteswap_record+0x2b0/0x2b0 [zfs]
[Thu Jun 27 23:02:34 2019] traverse_dataset_resume+0x46/0x50 [zfs]
[Thu Jun 27 23:02:34 2019] ? dmu_send_impl+0x14d0/0x14d0 [zfs]
[Thu Jun 27 23:02:34 2019] send_traverse_thread+0x59/0xb0 [zfs]
[Thu Jun 27 23:02:34 2019] thread_generic_wrapper+0x74/0x90 [spl]
[Thu Jun 27 23:02:34 2019] kthread+0x105/0x140
[Thu Jun 27 23:02:34 2019] ? __thread_exit+0x20/0x20 [spl]
[Thu Jun 27 23:02:34 2019] ? kthread_create_worker_on_cpu+0x70/0x70
[Thu Jun 27 23:02:34 2019] ret_from_fork+0x35/0x40
Code:
[Tue Jul 2 23:07:18 2019] INFO: task send_traverse:17000 blocked for more than 120 seconds.
[Tue Jul 2 23:07:18 2019] Tainted: P O 4.15.18-16-pve #1
[Tue Jul 2 23:07:18 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Tue Jul 2 23:07:18 2019] send_traverse D 0 17000 2 0x80000000
[Tue Jul 2 23:07:18 2019] Call Trace:
[Tue Jul 2 23:07:18 2019] __schedule+0x3e0/0x870
[Tue Jul 2 23:07:18 2019] schedule+0x36/0x80
[Tue Jul 2 23:07:18 2019] cv_wait_common+0x11e/0x140 [spl]
[Tue Jul 2 23:07:18 2019] ? wait_woken+0x80/0x80
[Tue Jul 2 23:07:18 2019] __cv_wait+0x15/0x20 [spl]
[Tue Jul 2 23:07:18 2019] bqueue_enqueue+0x62/0xe0 [zfs]
[Tue Jul 2 23:07:18 2019] send_cb+0x14a/0x190 [zfs]
[Tue Jul 2 23:07:18 2019] traverse_visitbp+0x49c/0x980 [zfs]
[Tue Jul 2 23:07:18 2019] traverse_visitbp+0x5db/0x980 [zfs]
[Tue Jul 2 23:07:18 2019] traverse_visitbp+0x5db/0x980 [zfs]
[Tue Jul 2 23:07:18 2019] traverse_visitbp+0x5db/0x980 [zfs]
[Tue Jul 2 23:07:18 2019] traverse_dnode+0xaa/0x1c0 [zfs]
[Tue Jul 2 23:07:18 2019] traverse_visitbp+0x70e/0x980 [zfs]
[Tue Jul 2 23:07:18 2019] traverse_visitbp+0x5db/0x980 [zfs]
[Tue Jul 2 23:07:18 2019] traverse_visitbp+0x5db/0x980 [zfs]
[Tue Jul 2 23:07:18 2019] traverse_visitbp+0x5db/0x980 [zfs]
[Tue Jul 2 23:07:18 2019] traverse_visitbp+0x5db/0x980 [zfs]
[Tue Jul 2 23:07:18 2019] traverse_visitbp+0x5db/0x980 [zfs]
[Tue Jul 2 23:07:18 2019] traverse_dnode+0xaa/0x1c0 [zfs]
[Tue Jul 2 23:07:18 2019] traverse_visitbp+0x878/0x980 [zfs]
[Tue Jul 2 23:07:18 2019] traverse_impl+0x1e3/0x450 [zfs]
[Tue Jul 2 23:07:18 2019] ? dmu_send_impl+0x14d0/0x14d0 [zfs]
[Tue Jul 2 23:07:18 2019] ? byteswap_record+0x2b0/0x2b0 [zfs]
[Tue Jul 2 23:07:18 2019] traverse_dataset_resume+0x46/0x50 [zfs]
[Tue Jul 2 23:07:18 2019] ? dmu_send_impl+0x14d0/0x14d0 [zfs]
[Tue Jul 2 23:07:18 2019] send_traverse_thread+0x59/0xb0 [zfs]
[Tue Jul 2 23:07:18 2019] thread_generic_wrapper+0x74/0x90 [spl]
[Tue Jul 2 23:07:18 2019] kthread+0x105/0x140
[Tue Jul 2 23:07:18 2019] ? __thread_exit+0x20/0x20 [spl]
[Tue Jul 2 23:07:18 2019] ? kthread_create_worker_on_cpu+0x70/0x70
[Tue Jul 2 23:07:18 2019] ret_from_fork+0x35/0x40
I got a "feeling" that this might be related to big ZFS network sends to an overloaded ZFS receiving server.
This is how it looks on ZFS backup / receiving server:
Code:
[Thu Jun 27 23:02:38 2019] INFO: task zfs:10360 blocked for more than 120 seconds.
[Thu Jun 27 23:02:38 2019] Tainted: P IO 4.15.18-16-pve #1
[Thu Jun 27 23:02:38 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Thu Jun 27 23:02:38 2019] zfs D 0 10360 10353 0x80000000
[Thu Jun 27 23:02:38 2019] Call Trace:
[Thu Jun 27 23:02:38 2019] __schedule+0x3e0/0x870
[Thu Jun 27 23:02:38 2019] schedule+0x36/0x80
[Thu Jun 27 23:02:38 2019] cv_wait_common+0x11e/0x140 [spl]
[Thu Jun 27 23:02:38 2019] ? wait_woken+0x80/0x80
[Thu Jun 27 23:02:38 2019] __cv_wait+0x15/0x20 [spl]
[Thu Jun 27 23:02:38 2019] dmu_recv_stream+0x7c6/0xef0 [zfs]
[Thu Jun 27 23:02:38 2019] zfs_ioc_recv_impl+0x1015/0x10f0 [zfs]
[Thu Jun 27 23:02:38 2019] ? __enqueue_entity+0x5c/0x60
[Thu Jun 27 23:02:38 2019] ? sched_clock+0x9/0x10
[Thu Jun 27 23:02:38 2019] zfs_ioc_recv+0x1f8/0x330 [zfs]
[Thu Jun 27 23:02:38 2019] ? security_capable+0x4e/0x70
[Thu Jun 27 23:02:38 2019] ? ns_capable_common+0x68/0x80
[Thu Jun 27 23:02:38 2019] ? capable+0x1c/0x20
[Thu Jun 27 23:02:38 2019] ? priv_policy.isra.3.part.4+0x11/0x20 [zfs]
[Thu Jun 27 23:02:38 2019] ? _cond_resched+0x1a/0x50
[Thu Jun 27 23:02:38 2019] ? __kmalloc+0x178/0x200
[Thu Jun 27 23:02:38 2019] ? strdup+0x45/0x70 [spl]
[Thu Jun 27 23:02:38 2019] zfsdev_ioctl+0x5e5/0x670 [zfs]
[Thu Jun 27 23:02:38 2019] do_vfs_ioctl+0xa6/0x620
[Thu Jun 27 23:02:38 2019] ? handle_mm_fault+0xce/0x1b0
[Thu Jun 27 23:02:38 2019] ? __do_page_fault+0x27e/0x500
[Thu Jun 27 23:02:38 2019] SyS_ioctl+0x79/0x90
[Thu Jun 27 23:02:38 2019] do_syscall_64+0x73/0x130
[Thu Jun 27 23:02:38 2019] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[Thu Jun 27 23:02:38 2019] RIP: 0033:0x7f9394e50017
[Thu Jun 27 23:02:38 2019] RSP: 002b:00007ffcfd678208 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[Thu Jun 27 23:02:38 2019] RAX: ffffffffffffffda RBX: 00007ffcfd67b950 RCX: 00007f9394e50017
[Thu Jun 27 23:02:38 2019] RDX: 00007ffcfd6783a0 RSI: 0000000000005a1b RDI: 0000000000000005
[Thu Jun 27 23:02:38 2019] RBP: 00007ffcfd6783a0 R08: 0000000000000003 R09: 0000000000020010
[Thu Jun 27 23:02:38 2019] R10: 000055bd860f5150 R11: 0000000000000246 R12: 0000000000000000
[Thu Jun 27 23:02:38 2019] R13: 00007ffcfd686548 R14: 00007ffcfd680090 R15: 0000000000000000
[Thu Jun 27 23:02:38 2019] INFO: task zfs:12319 blocked for more than 120 seconds.
[Thu Jun 27 23:02:38 2019] Tainted: P IO 4.15.18-16-pve #1
[Thu Jun 27 23:02:38 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Thu Jun 27 23:02:38 2019] zfs D 0 12319 12310 0x80000000
[Thu Jun 27 23:02:38 2019] Call Trace:
[Thu Jun 27 23:02:38 2019] __schedule+0x3e0/0x870
[Thu Jun 27 23:02:38 2019] schedule+0x36/0x80
[Thu Jun 27 23:02:38 2019] cv_wait_common+0x11e/0x140 [spl]
[Thu Jun 27 23:02:38 2019] ? wait_woken+0x80/0x80
[Thu Jun 27 23:02:38 2019] __cv_wait+0x15/0x20 [spl]
[Thu Jun 27 23:02:38 2019] dmu_recv_stream+0x7c6/0xef0 [zfs]
[Thu Jun 27 23:02:38 2019] zfs_ioc_recv_impl+0x1015/0x10f0 [zfs]
[Thu Jun 27 23:02:38 2019] ? dbuf_rele+0x38/0x40 [zfs]
[Thu Jun 27 23:02:38 2019] zfs_ioc_recv+0x1f8/0x330 [zfs]
[Thu Jun 27 23:02:38 2019] ? security_capable+0x4e/0x70
[Thu Jun 27 23:02:38 2019] ? ns_capable_common+0x68/0x80
[Thu Jun 27 23:02:38 2019] ? capable+0x1c/0x20
[Thu Jun 27 23:02:38 2019] ? priv_policy.isra.3.part.4+0x11/0x20 [zfs]
[Thu Jun 27 23:02:38 2019] ? _cond_resched+0x1a/0x50
[Thu Jun 27 23:02:38 2019] ? __kmalloc+0x178/0x200
[Thu Jun 27 23:02:38 2019] ? strdup+0x45/0x70 [spl]
[Thu Jun 27 23:02:38 2019] zfsdev_ioctl+0x5e5/0x670 [zfs]
[Thu Jun 27 23:02:38 2019] do_vfs_ioctl+0xa6/0x620
[Thu Jun 27 23:02:38 2019] ? handle_mm_fault+0xce/0x1b0
[Thu Jun 27 23:02:38 2019] ? __do_page_fault+0x27e/0x500
[Thu Jun 27 23:02:38 2019] SyS_ioctl+0x79/0x90
[Thu Jun 27 23:02:38 2019] do_syscall_64+0x73/0x130
[Thu Jun 27 23:02:38 2019] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[Thu Jun 27 23:02:38 2019] RIP: 0033:0x7f895fbf2017
[Thu Jun 27 23:02:38 2019] RSP: 002b:00007fff20b19eb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[Thu Jun 27 23:02:38 2019] RAX: ffffffffffffffda RBX: 00007fff20b1d600 RCX: 00007f895fbf2017
[Thu Jun 27 23:02:38 2019] RDX: 00007fff20b1a050 RSI: 0000000000005a1b RDI: 0000000000000005
[Thu Jun 27 23:02:38 2019] RBP: 00007fff20b1a050 R08: 0000000000000003 R09: 0000000000020010
[Thu Jun 27 23:02:38 2019] R10: 0000563173759150 R11: 0000000000000246 R12: 0000000000000000
[Thu Jun 27 23:02:38 2019] R13: 00007fff20b281f8 R14: 00007fff20b21d40 R15: 0000000000000000
[Thu Jun 27 23:02:38 2019] INFO: task zfs:12461 blocked for more than 120 seconds.
[Thu Jun 27 23:02:38 2019] Tainted: P IO 4.15.18-16-pve #1
[Thu Jun 27 23:02:38 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Thu Jun 27 23:02:38 2019] zfs D 0 12461 12428 0x80000000
[Thu Jun 27 23:02:38 2019] Call Trace:
[Thu Jun 27 23:02:38 2019] __schedule+0x3e0/0x870
[Thu Jun 27 23:02:38 2019] schedule+0x36/0x80
[Thu Jun 27 23:02:38 2019] cv_wait_common+0x11e/0x140 [spl]
[Thu Jun 27 23:02:38 2019] ? wait_woken+0x80/0x80
[Thu Jun 27 23:02:38 2019] __cv_wait+0x15/0x20 [spl]
[Thu Jun 27 23:02:38 2019] dmu_recv_stream+0x7c6/0xef0 [zfs]
[Thu Jun 27 23:02:38 2019] zfs_ioc_recv_impl+0x1015/0x10f0 [zfs]
[Thu Jun 27 23:02:38 2019] ? ___slab_alloc+0x2a2/0x4e0
[Thu Jun 27 23:02:38 2019] zfs_ioc_recv+0x1f8/0x330 [zfs]
[Thu Jun 27 23:02:38 2019] ? security_capable+0x4e/0x70
[Thu Jun 27 23:02:38 2019] ? ns_capable_common+0x68/0x80
[Thu Jun 27 23:02:38 2019] ? capable+0x1c/0x20
[Thu Jun 27 23:02:38 2019] ? priv_policy.isra.3.part.4+0x11/0x20 [zfs]
[Thu Jun 27 23:02:38 2019] ? _cond_resched+0x1a/0x50
[Thu Jun 27 23:02:38 2019] ? __kmalloc+0x178/0x200
[Thu Jun 27 23:02:38 2019] ? strdup+0x45/0x70 [spl]
[Thu Jun 27 23:02:38 2019] zfsdev_ioctl+0x5e5/0x670 [zfs]
[Thu Jun 27 23:02:38 2019] do_vfs_ioctl+0xa6/0x620
[Thu Jun 27 23:02:38 2019] ? handle_mm_fault+0xce/0x1b0
[Thu Jun 27 23:02:38 2019] ? __do_page_fault+0x27e/0x500
[Thu Jun 27 23:02:38 2019] SyS_ioctl+0x79/0x90
[Thu Jun 27 23:02:38 2019] do_syscall_64+0x73/0x130
[Thu Jun 27 23:02:38 2019] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[Thu Jun 27 23:02:38 2019] RIP: 0033:0x7f52b317e017
[Thu Jun 27 23:02:38 2019] RSP: 002b:00007fff318e6be8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[Thu Jun 27 23:02:38 2019] RAX: ffffffffffffffda RBX: 00007fff318ea330 RCX: 00007f52b317e017
[Thu Jun 27 23:02:38 2019] RDX: 00007fff318e6d80 RSI: 0000000000005a1b RDI: 0000000000000005
[Thu Jun 27 23:02:38 2019] RBP: 00007fff318e6d80 R08: 0000000000000003 R09: 0000000000020010
[Thu Jun 27 23:02:38 2019] R10: 0000556b1a0dc150 R11: 0000000000000246 R12: 0000000000000000
[Thu Jun 27 23:02:38 2019] R13: 00007fff318f4f28 R14: 00007fff318eea70 R15: 0000000000000000
[Fri Jun 28 05:03:07 2019] INFO: task zfs:14297 blocked for more than 120 seconds.
[Fri Jun 28 05:03:07 2019] Tainted: P IO 4.15.18-16-pve #1
[Fri Jun 28 05:03:07 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Jun 28 05:03:07 2019] zfs D 0 14297 14287 0x80000000
[Fri Jun 28 05:03:07 2019] Call Trace:
[Fri Jun 28 05:03:07 2019] __schedule+0x3e0/0x870
[Fri Jun 28 05:03:07 2019] schedule+0x36/0x80
[Fri Jun 28 05:03:07 2019] cv_wait_common+0x11e/0x140 [spl]
[Fri Jun 28 05:03:07 2019] ? wait_woken+0x80/0x80
[Fri Jun 28 05:03:07 2019] __cv_wait+0x15/0x20 [spl]
[Fri Jun 28 05:03:07 2019] dmu_recv_stream+0x7c6/0xef0 [zfs]
[Fri Jun 28 05:03:07 2019] zfs_ioc_recv_impl+0x1015/0x10f0 [zfs]
[Fri Jun 28 05:03:07 2019] ? __enqueue_entity+0x5c/0x60
[Fri Jun 28 05:03:07 2019] zfs_ioc_recv+0x1f8/0x330 [zfs]
[Fri Jun 28 05:03:07 2019] ? security_capable+0x4e/0x70
[Fri Jun 28 05:03:07 2019] ? ns_capable_common+0x68/0x80
[Fri Jun 28 05:03:07 2019] ? capable+0x1c/0x20
[Fri Jun 28 05:03:07 2019] ? priv_policy.isra.3.part.4+0x11/0x20 [zfs]
[Fri Jun 28 05:03:07 2019] ? _cond_resched+0x1a/0x50
[Fri Jun 28 05:03:07 2019] ? __kmalloc+0x178/0x200
[Fri Jun 28 05:03:07 2019] ? strdup+0x45/0x70 [spl]
[Fri Jun 28 05:03:07 2019] zfsdev_ioctl+0x5e5/0x670 [zfs]
[Fri Jun 28 05:03:07 2019] do_vfs_ioctl+0xa6/0x620
[Fri Jun 28 05:03:07 2019] ? handle_mm_fault+0xce/0x1b0
[Fri Jun 28 05:03:07 2019] ? __do_page_fault+0x27e/0x500
[Fri Jun 28 05:03:07 2019] SyS_ioctl+0x79/0x90
[Fri Jun 28 05:03:07 2019] do_syscall_64+0x73/0x130
[Fri Jun 28 05:03:07 2019] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[Fri Jun 28 05:03:07 2019] RIP: 0033:0x7f9ac4f6f017
[Fri Jun 28 05:03:07 2019] RSP: 002b:00007ffe6e1f5848 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[Fri Jun 28 05:03:07 2019] RAX: ffffffffffffffda RBX: 00007ffe6e1f8f90 RCX: 00007f9ac4f6f017
[Fri Jun 28 05:03:07 2019] RDX: 00007ffe6e1f59e0 RSI: 0000000000005a1b RDI: 0000000000000005
[Fri Jun 28 05:03:07 2019] RBP: 00007ffe6e1f59e0 R08: 0000000000000003 R09: 0000000000020010
[Fri Jun 28 05:03:07 2019] R10: 0000559d853e8150 R11: 0000000000000246 R12: 0000000000000000
[Fri Jun 28 05:03:07 2019] R13: 00007ffe6e203b88 R14: 00007ffe6e1fd6d0 R15: 0000000000000000
[Fri Jun 28 05:03:07 2019] INFO: task zfs:14299 blocked for more than 120 seconds.
[Fri Jun 28 05:03:07 2019] Tainted: P IO 4.15.18-16-pve #1
[Fri Jun 28 05:03:07 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Jun 28 05:03:07 2019] zfs D 0 14299 14286 0x80000000
[Fri Jun 28 05:03:07 2019] Call Trace:
[Fri Jun 28 05:03:07 2019] __schedule+0x3e0/0x870
[Fri Jun 28 05:03:07 2019] schedule+0x36/0x80
[Fri Jun 28 05:03:07 2019] cv_wait_common+0x11e/0x140 [spl]
[Fri Jun 28 05:03:07 2019] ? wait_woken+0x80/0x80
[Fri Jun 28 05:03:07 2019] __cv_wait+0x15/0x20 [spl]
[Fri Jun 28 05:03:07 2019] dmu_recv_stream+0x7c6/0xef0 [zfs]
[Fri Jun 28 05:03:07 2019] zfs_ioc_recv_impl+0x1015/0x10f0 [zfs]
[Fri Jun 28 05:03:07 2019] ? __enqueue_entity+0x5c/0x60
[Fri Jun 28 05:03:07 2019] zfs_ioc_recv+0x1f8/0x330 [zfs]
[Fri Jun 28 05:03:07 2019] ? security_capable+0x4e/0x70
[Fri Jun 28 05:03:07 2019] ? ns_capable_common+0x68/0x80
[Fri Jun 28 05:03:07 2019] ? capable+0x1c/0x20
[Fri Jun 28 05:03:07 2019] ? priv_policy.isra.3.part.4+0x11/0x20 [zfs]
[Fri Jun 28 05:03:07 2019] ? _cond_resched+0x1a/0x50
[Fri Jun 28 05:03:07 2019] ? __kmalloc+0x178/0x200
[Fri Jun 28 05:03:07 2019] ? strdup+0x45/0x70 [spl]
[Fri Jun 28 05:03:07 2019] zfsdev_ioctl+0x5e5/0x670 [zfs]
[Fri Jun 28 05:03:07 2019] do_vfs_ioctl+0xa6/0x620
[Fri Jun 28 05:03:07 2019] ? handle_mm_fault+0xce/0x1b0
[Fri Jun 28 05:03:07 2019] ? __do_page_fault+0x27e/0x500
[Fri Jun 28 05:03:07 2019] SyS_ioctl+0x79/0x90
[Fri Jun 28 05:03:07 2019] do_syscall_64+0x73/0x130
[Fri Jun 28 05:03:07 2019] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[Fri Jun 28 05:03:07 2019] RIP: 0033:0x7f9878c0a017
[Fri Jun 28 05:03:07 2019] RSP: 002b:00007ffd2e636db8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[Fri Jun 28 05:03:07 2019] RAX: ffffffffffffffda RBX: 00007ffd2e63a500 RCX: 00007f9878c0a017
[Fri Jun 28 05:03:07 2019] RDX: 00007ffd2e636f50 RSI: 0000000000005a1b RDI: 0000000000000005
[Fri Jun 28 05:03:07 2019] RBP: 00007ffd2e636f50 R08: 0000000000000003 R09: 0000000000020010
[Fri Jun 28 05:03:07 2019] R10: 000055e20dcb9150 R11: 0000000000000246 R12: 0000000000000000
[Fri Jun 28 05:03:07 2019] R13: 00007ffd2e6450f8 R14: 00007ffd2e63ec40 R15: 0000000000000000
[Fri Jun 28 05:03:07 2019] INFO: task zfs:15342 blocked for more than 120 seconds.
[Fri Jun 28 05:03:07 2019] Tainted: P IO 4.15.18-16-pve #1
[Fri Jun 28 05:03:07 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Jun 28 05:03:07 2019] zfs D 0 15342 15338 0x80000000
[Fri Jun 28 05:03:07 2019] Call Trace:
[Fri Jun 28 05:03:07 2019] __schedule+0x3e0/0x870
[Fri Jun 28 05:03:07 2019] schedule+0x36/0x80
[Fri Jun 28 05:03:07 2019] cv_wait_common+0x11e/0x140 [spl]
[Fri Jun 28 05:03:07 2019] ? wait_woken+0x80/0x80
[Fri Jun 28 05:03:07 2019] __cv_wait+0x15/0x20 [spl]
[Fri Jun 28 05:03:07 2019] dmu_recv_stream+0x7c6/0xef0 [zfs]
[Fri Jun 28 05:03:07 2019] zfs_ioc_recv_impl+0x1015/0x10f0 [zfs]
[Fri Jun 28 05:03:07 2019] ? dbuf_rele+0x38/0x40 [zfs]
[Fri Jun 28 05:03:07 2019] zfs_ioc_recv+0x1f8/0x330 [zfs]
[Fri Jun 28 05:03:07 2019] ? security_capable+0x4e/0x70
[Fri Jun 28 05:03:07 2019] ? ns_capable_common+0x68/0x80
[Fri Jun 28 05:03:07 2019] ? capable+0x1c/0x20
[Fri Jun 28 05:03:07 2019] ? priv_policy.isra.3.part.4+0x11/0x20 [zfs]
[Fri Jun 28 05:03:07 2019] ? _cond_resched+0x1a/0x50
[Fri Jun 28 05:03:07 2019] ? __kmalloc+0x178/0x200
[Fri Jun 28 05:03:07 2019] ? strdup+0x45/0x70 [spl]
[Fri Jun 28 05:03:07 2019] zfsdev_ioctl+0x5e5/0x670 [zfs]
[Fri Jun 28 05:03:07 2019] do_vfs_ioctl+0xa6/0x620
[Fri Jun 28 05:03:07 2019] ? handle_mm_fault+0xce/0x1b0
[Fri Jun 28 05:03:07 2019] ? __do_page_fault+0x27e/0x500
[Fri Jun 28 05:03:07 2019] SyS_ioctl+0x79/0x90
[Fri Jun 28 05:03:07 2019] do_syscall_64+0x73/0x130
[Fri Jun 28 05:03:07 2019] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[Fri Jun 28 05:03:07 2019] RIP: 0033:0x7f2e18c34017
[Fri Jun 28 05:03:07 2019] RSP: 002b:00007ffd0bd96738 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[Fri Jun 28 05:03:07 2019] RAX: ffffffffffffffda RBX: 00007ffd0bd99e80 RCX: 00007f2e18c34017
[Fri Jun 28 05:03:07 2019] RDX: 00007ffd0bd968d0 RSI: 0000000000005a1b RDI: 0000000000000005
[Fri Jun 28 05:03:07 2019] RBP: 00007ffd0bd968d0 R08: 0000000000000003 R09: 0000000000020010
[Fri Jun 28 05:03:07 2019] R10: 000055c8f147e150 R11: 0000000000000246 R12: 0000000000000000
[Fri Jun 28 05:03:07 2019] R13: 00007ffd0bda4a78 R14: 00007ffd0bd9e5c0 R15: 0000000000000000
[Fri Jun 28 05:03:07 2019] INFO: task zfs:16490 blocked for more than 120 seconds.
[Fri Jun 28 05:03:07 2019] Tainted: P IO 4.15.18-16-pve #1
[Fri Jun 28 05:03:07 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Jun 28 05:03:07 2019] zfs D 0 16490 16486 0x80000000
[Fri Jun 28 05:03:07 2019] Call Trace:
[Fri Jun 28 05:03:07 2019] __schedule+0x3e0/0x870
[Fri Jun 28 05:03:07 2019] schedule+0x36/0x80
[Fri Jun 28 05:03:07 2019] cv_wait_common+0x11e/0x140 [spl]
[Fri Jun 28 05:03:07 2019] ? wait_woken+0x80/0x80
[Fri Jun 28 05:03:07 2019] __cv_wait+0x15/0x20 [spl]
[Fri Jun 28 05:03:07 2019] dmu_recv_stream+0x7c6/0xef0 [zfs]
[Fri Jun 28 05:03:07 2019] zfs_ioc_recv_impl+0x1015/0x10f0 [zfs]
[Fri Jun 28 05:03:07 2019] ? dbuf_rele+0x38/0x40 [zfs]
[Fri Jun 28 05:03:07 2019] zfs_ioc_recv+0x1f8/0x330 [zfs]
[Fri Jun 28 05:03:07 2019] ? security_capable+0x4e/0x70
[Fri Jun 28 05:03:07 2019] ? ns_capable_common+0x68/0x80
[Fri Jun 28 05:03:07 2019] ? capable+0x1c/0x20
[Fri Jun 28 05:03:07 2019] ? priv_policy.isra.3.part.4+0x11/0x20 [zfs]
[Fri Jun 28 05:03:07 2019] ? _cond_resched+0x1a/0x50
[Fri Jun 28 05:03:07 2019] ? __kmalloc+0x178/0x200
[Fri Jun 28 05:03:07 2019] ? strdup+0x45/0x70 [spl]
[Fri Jun 28 05:03:07 2019] zfsdev_ioctl+0x5e5/0x670 [zfs]
[Fri Jun 28 05:03:07 2019] do_vfs_ioctl+0xa6/0x620
[Fri Jun 28 05:03:07 2019] ? handle_mm_fault+0xce/0x1b0
[Fri Jun 28 05:03:07 2019] ? __do_page_fault+0x27e/0x500
[Fri Jun 28 05:03:07 2019] SyS_ioctl+0x79/0x90
[Fri Jun 28 05:03:07 2019] do_syscall_64+0x73/0x130
[Fri Jun 28 05:03:07 2019] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[Fri Jun 28 05:03:07 2019] RIP: 0033:0x7f9bda53a017
[Fri Jun 28 05:03:07 2019] RSP: 002b:00007ffcd02e8328 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[Fri Jun 28 05:03:07 2019] RAX: ffffffffffffffda RBX: 00007ffcd02eba70 RCX: 00007f9bda53a017
[Fri Jun 28 05:03:07 2019] RDX: 00007ffcd02e84c0 RSI: 0000000000005a1b RDI: 0000000000000005
[Fri Jun 28 05:03:07 2019] RBP: 00007ffcd02e84c0 R08: 0000000000000003 R09: 0000000000020010
[Fri Jun 28 05:03:07 2019] R10: 0000562f3f27a150 R11: 0000000000000246 R12: 0000000000000000
[Fri Jun 28 05:03:07 2019] R13: 00007ffcd02f6668 R14: 00007ffcd02f01b0 R15: 0000000000000000
[Fri Jun 28 05:03:07 2019] INFO: task zfs:16847 blocked for more than 120 seconds.
[Fri Jun 28 05:03:07 2019] Tainted: P IO 4.15.18-16-pve #1
[Fri Jun 28 05:03:07 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Jun 28 05:03:07 2019] zfs D 0 16847 16843 0x80000000
[Fri Jun 28 05:03:07 2019] Call Trace:
[Fri Jun 28 05:03:07 2019] __schedule+0x3e0/0x870
[Fri Jun 28 05:03:07 2019] schedule+0x36/0x80
[Fri Jun 28 05:03:07 2019] cv_wait_common+0x11e/0x140 [spl]
[Fri Jun 28 05:03:07 2019] ? wait_woken+0x80/0x80
[Fri Jun 28 05:03:07 2019] __cv_wait+0x15/0x20 [spl]
[Fri Jun 28 05:03:07 2019] dmu_recv_stream+0x7c6/0xef0 [zfs]
[Fri Jun 28 05:03:07 2019] zfs_ioc_recv_impl+0x1015/0x10f0 [zfs]
[Fri Jun 28 05:03:07 2019] ? __enqueue_entity+0x5c/0x60
[Fri Jun 28 05:03:07 2019] zfs_ioc_recv+0x1f8/0x330 [zfs]
[Fri Jun 28 05:03:07 2019] ? security_capable+0x4e/0x70
[Fri Jun 28 05:03:07 2019] ? ns_capable_common+0x68/0x80
[Fri Jun 28 05:03:07 2019] ? capable+0x1c/0x20
[Fri Jun 28 05:03:07 2019] ? priv_policy.isra.3.part.4+0x11/0x20 [zfs]
[Fri Jun 28 05:03:07 2019] ? _cond_resched+0x1a/0x50
[Fri Jun 28 05:03:07 2019] ? __kmalloc+0x178/0x200
[Fri Jun 28 05:03:07 2019] ? strdup+0x45/0x70 [spl]
[Fri Jun 28 05:03:07 2019] zfsdev_ioctl+0x5e5/0x670 [zfs]
[Fri Jun 28 05:03:07 2019] do_vfs_ioctl+0xa6/0x620
[Fri Jun 28 05:03:07 2019] ? handle_mm_fault+0xce/0x1b0
[Fri Jun 28 05:03:07 2019] ? __do_page_fault+0x27e/0x500
[Fri Jun 28 05:03:07 2019] SyS_ioctl+0x79/0x90
[Fri Jun 28 05:03:07 2019] do_syscall_64+0x73/0x130
[Fri Jun 28 05:03:07 2019] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[Fri Jun 28 05:03:07 2019] RIP: 0033:0x7f8baad58017
[Fri Jun 28 05:03:07 2019] RSP: 002b:00007ffebb55fa48 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[Fri Jun 28 05:03:07 2019] RAX: ffffffffffffffda RBX: 00007ffebb563190 RCX: 00007f8baad58017
[Fri Jun 28 05:03:07 2019] RDX: 00007ffebb55fbe0 RSI: 0000000000005a1b RDI: 0000000000000005
[Fri Jun 28 05:03:07 2019] RBP: 00007ffebb55fbe0 R08: 0000000000000003 R09: 0000000000020010
[Fri Jun 28 05:03:07 2019] R10: 0000563b0aaee150 R11: 0000000000000246 R12: 0000000000000000
[Fri Jun 28 05:03:07 2019] R13: 00007ffebb56dd88 R14: 00007ffebb5678d0 R15: 0000000000000000
[Sat Jun 29 01:03:23 2019] INFO: task zfs:24254 blocked for more than 120 seconds.
[Sat Jun 29 01:03:23 2019] Tainted: P IO 4.15.18-16-pve #1
[Sat Jun 29 01:03:23 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sat Jun 29 01:03:23 2019] zfs D 0 24254 24242 0x80000000
[Sat Jun 29 01:03:23 2019] Call Trace:
[Sat Jun 29 01:03:23 2019] __schedule+0x3e0/0x870
[Sat Jun 29 01:03:23 2019] schedule+0x36/0x80
[Sat Jun 29 01:03:23 2019] cv_wait_common+0x11e/0x140 [spl]
[Sat Jun 29 01:03:23 2019] ? wait_woken+0x80/0x80
[Sat Jun 29 01:03:23 2019] __cv_wait+0x15/0x20 [spl]
[Sat Jun 29 01:03:23 2019] dmu_recv_stream+0x7c6/0xef0 [zfs]
[Sat Jun 29 01:03:23 2019] zfs_ioc_recv_impl+0x1015/0x10f0 [zfs]
[Sat Jun 29 01:03:23 2019] ? find_next_bit+0xb/0x10
[Sat Jun 29 01:03:23 2019] ? cpumask_next_wrap+0x47/0x70
[Sat Jun 29 01:03:23 2019] zfs_ioc_recv+0x1f8/0x330 [zfs]
[Sat Jun 29 01:03:23 2019] ? security_capable+0x4e/0x70
[Sat Jun 29 01:03:23 2019] ? ns_capable_common+0x68/0x80
[Sat Jun 29 01:03:23 2019] ? capable+0x1c/0x20
[Sat Jun 29 01:03:23 2019] ? priv_policy.isra.3.part.4+0x11/0x20 [zfs]
[Sat Jun 29 01:03:23 2019] ? _cond_resched+0x1a/0x50
[Sat Jun 29 01:03:23 2019] ? __kmalloc+0x178/0x200
[Sat Jun 29 01:03:23 2019] ? strdup+0x45/0x70 [spl]
[Sat Jun 29 01:03:23 2019] zfsdev_ioctl+0x5e5/0x670 [zfs]
[Sat Jun 29 01:03:23 2019] do_vfs_ioctl+0xa6/0x620
[Sat Jun 29 01:03:23 2019] ? handle_mm_fault+0xce/0x1b0
[Sat Jun 29 01:03:23 2019] ? __do_page_fault+0x27e/0x500
[Sat Jun 29 01:03:23 2019] SyS_ioctl+0x79/0x90
[Sat Jun 29 01:03:23 2019] do_syscall_64+0x73/0x130
[Sat Jun 29 01:03:23 2019] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[Sat Jun 29 01:03:23 2019] RIP: 0033:0x7f789f379017
[Sat Jun 29 01:03:23 2019] RSP: 002b:00007ffd2c7f6ff8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[Sat Jun 29 01:03:23 2019] RAX: ffffffffffffffda RBX: 00007ffd2c7fa740 RCX: 00007f789f379017
[Sat Jun 29 01:03:23 2019] RDX: 00007ffd2c7f7190 RSI: 0000000000005a1b RDI: 0000000000000005
[Sat Jun 29 01:03:23 2019] RBP: 00007ffd2c7f7190 R08: 0000000000000003 R09: 0000000000020010
[Sat Jun 29 01:03:23 2019] R10: 0000557693405150 R11: 0000000000000246 R12: 0000000000000000
[Sat Jun 29 01:03:23 2019] R13: 00007ffd2c805338 R14: 00007ffd2c7fee80 R15: 0000000000000000
[Sat Jun 29 01:03:23 2019] INFO: task zfs:26034 blocked for more than 120 seconds.
[Sat Jun 29 01:03:23 2019] Tainted: P IO 4.15.18-16-pve #1
[Sat Jun 29 01:03:23 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sat Jun 29 01:03:23 2019] zfs D 0 26034 26032 0x80000000
[Sat Jun 29 01:03:23 2019] Call Trace:
[Sat Jun 29 01:03:23 2019] __schedule+0x3e0/0x870
[Sat Jun 29 01:03:23 2019] schedule+0x36/0x80
[Sat Jun 29 01:03:23 2019] cv_wait_common+0x11e/0x140 [spl]
[Sat Jun 29 01:03:23 2019] ? wait_woken+0x80/0x80
[Sat Jun 29 01:03:23 2019] __cv_wait+0x15/0x20 [spl]
[Sat Jun 29 01:03:23 2019] dmu_recv_stream+0x7c6/0xef0 [zfs]
[Sat Jun 29 01:03:23 2019] zfs_ioc_recv_impl+0x1015/0x10f0 [zfs]
[Sat Jun 29 01:03:23 2019] ? __enqueue_entity+0x5c/0x60
[Sat Jun 29 01:03:23 2019] ? __update_load_avg_blocked_se.isra.37+0xd1/0x150
[Sat Jun 29 01:03:23 2019] ? __update_load_avg_blocked_se.isra.37+0xd1/0x150
[Sat Jun 29 01:03:23 2019] zfs_ioc_recv+0x1f8/0x330 [zfs]
[Sat Jun 29 01:03:23 2019] ? security_capable+0x4e/0x70
[Sat Jun 29 01:03:23 2019] ? ns_capable_common+0x68/0x80
[Sat Jun 29 01:03:23 2019] ? capable+0x1c/0x20
[Sat Jun 29 01:03:23 2019] ? priv_policy.isra.3.part.4+0x11/0x20 [zfs]
[Sat Jun 29 01:03:23 2019] ? _cond_resched+0x1a/0x50
[Sat Jun 29 01:03:23 2019] ? __kmalloc+0x178/0x200
[Sat Jun 29 01:03:23 2019] ? strdup+0x45/0x70 [spl]
[Sat Jun 29 01:03:23 2019] zfsdev_ioctl+0x5e5/0x670 [zfs]
[Sat Jun 29 01:03:23 2019] do_vfs_ioctl+0xa6/0x620
[Sat Jun 29 01:03:23 2019] ? handle_mm_fault+0xce/0x1b0
[Sat Jun 29 01:03:23 2019] ? __do_page_fault+0x27e/0x500
[Sat Jun 29 01:03:23 2019] SyS_ioctl+0x79/0x90
[Sat Jun 29 01:03:23 2019] do_syscall_64+0x73/0x130
[Sat Jun 29 01:03:23 2019] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[Sat Jun 29 01:03:23 2019] RIP: 0033:0x7fd1b076b017
[Sat Jun 29 01:03:23 2019] RSP: 002b:00007fff429b5fe8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[Sat Jun 29 01:03:23 2019] RAX: ffffffffffffffda RBX: 00007fff429b9730 RCX: 00007fd1b076b017
[Sat Jun 29 01:03:23 2019] RDX: 00007fff429b6180 RSI: 0000000000005a1b RDI: 0000000000000005
[Sat Jun 29 01:03:23 2019] RBP: 00007fff429b6180 R08: 0000000000000003 R09: 0000000000020010
[Sat Jun 29 01:03:23 2019] R10: 000055d973501150 R11: 0000000000000246 R12: 0000000000000000
[Sat Jun 29 01:03:23 2019] R13: 00007fff429c4328 R14: 00007fff429bde70 R15: 0000000000000000
While I did not do extensive analysis, I think there is no impact on production VM instances running on the Cluster. If it just that ZFS sending (backups) are slow from time to time, I do not think it is a problem. BTW zfs sends are successfull every time, according to logs and zfs list -t all on backup server.
Can anyone explain these messages?
Do you think I should be worried?
Do you think, slow network ZFS sends can interfere with VMs running on send nodes in production?