New install, task pve-startall-de:993 blocked for more than 120 seconds.

jpnel · Dec 21, 2020

Hi,

I have a new 3 node install that isn't yet running any VM or containers, yet seems rather unstable.

A short while after booting, I'll start to get messages such as:

Code:

[  243.066934] INFO: task pve-startall-de:993 blocked for more than 120 seconds.
[  243.067064]       Tainted: P           O      5.4.65-1-pve #1
[  243.067098] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  243.067143] pve-startall-de D    0   993      1 0x00004000
[  243.067146] Call Trace:
[  243.067158]  __schedule+0x2e6/0x6f0
[  243.067160]  ? _cond_resched+0x19/0x30
[  243.067163]  schedule+0x33/0xa0
[  243.067164]  rwsem_down_read_slowpath+0x370/0x470
[  243.067169]  ? link_path_walk.part.42+0x29e/0x540
[  243.067170]  down_read+0x95/0xa0
[  243.067172]  path_openat+0xa33/0x16f0
[  243.067176]  ? generic_file_read_iter+0xa06/0xd70
[  243.067179]  ? _cond_resched+0x19/0x30
[  243.067180]  ? down_read+0x12/0xa0
[  243.067183]  ? _copy_to_user+0x2b/0x40
[  243.067186]  do_filp_open+0x93/0x100
[  243.067189]  ? __alloc_fd+0x46/0x150
[  243.067191]  do_sys_open+0x177/0x280
[  243.067193]  __x64_sys_openat+0x20/0x30
[  243.067197]  do_syscall_64+0x57/0x190
[  243.067199]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  243.067202] RIP: 0033:0x7f870d9d1c8b
[  243.067209] Code: Bad RIP value.
[  243.067211] RSP: 002b:00007ffe5962f030 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
[  243.067213] RAX: ffffffffffffffda RBX: 00007ffe5962f2b0 RCX: 00007f870d9d1c8b
[  243.067213] RDX: 0000000000080000 RSI: 000055ebf5338590 RDI: 00000000ffffff9c
[  243.067214] RBP: 000055ebf530a260 R08: 00007ffe5962f2b0 R09: 00000000ffffffff
[  243.067215] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[  243.067216] R13: 000055ebf5338590 R14: 0000000000000000 R15: 0000000000000000
[  363.886927] INFO: task pve-startall-de:993 blocked for more than 241 seconds.
[  363.886977]       Tainted: P           O      5.4.65-1-pve #1
[  363.887009] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  363.887053] pve-startall-de D    0   993      1 0x00004000
[  363.887055] Call Trace:
[  363.887066]  __schedule+0x2e6/0x6f0
[  363.887069]  ? _cond_resched+0x19/0x30
[  363.887071]  schedule+0x33/0xa0
[  363.887073]  rwsem_down_read_slowpath+0x370/0x470
[  363.887078]  ? link_path_walk.part.42+0x29e/0x540
[  363.887079]  down_read+0x95/0xa0
[  363.887081]  path_openat+0xa33/0x16f0
[  363.887085]  ? generic_file_read_iter+0xa06/0xd70
[  363.887088]  ? _cond_resched+0x19/0x30
[  363.887089]  ? down_read+0x12/0xa0
[  363.887093]  ? _copy_to_user+0x2b/0x40
[  363.887095]  do_filp_open+0x93/0x100
[  363.887098]  ? __alloc_fd+0x46/0x150
[  363.887100]  do_sys_open+0x177/0x280
[  363.887103]  __x64_sys_openat+0x20/0x30
[  363.887106]  do_syscall_64+0x57/0x190
[  363.887108]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  363.887111] RIP: 0033:0x7f870d9d1c8b
[  363.887118] Code: Bad RIP value.
[  363.887119] RSP: 002b:00007ffe5962f030 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
[  363.887121] RAX: ffffffffffffffda RBX: 00007ffe5962f2b0 RCX: 00007f870d9d1c8b
[  363.887122] RDX: 0000000000080000 RSI: 000055ebf5338590 RDI: 00000000ffffff9c
[  363.887123] RBP: 000055ebf530a260 R08: 00007ffe5962f2b0 R09: 00000000ffffffff
[  363.887123] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[  363.887124] R13: 000055ebf5338590 R14: 0000000000000000 R15: 0000000000000000
[  484.712688] INFO: task pve-startall-de:993 blocked for more than 362 seconds.
[  484.712739]       Tainted: P           O      5.4.65-1-pve #1
[  484.712771] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  484.712815] pve-startall-de D    0   993      1 0x00004000
[  484.712819] Call Trace:
[  484.712830]  __schedule+0x2e6/0x6f0
[  484.712833]  ? _cond_resched+0x19/0x30
[  484.712835]  schedule+0x33/0xa0
[  484.712837]  rwsem_down_read_slowpath+0x370/0x470
[  484.712841]  ? link_path_walk.part.42+0x29e/0x540
[  484.712843]  down_read+0x95/0xa0
[  484.712844]  path_openat+0xa33/0x16f0
[  484.712849]  ? generic_file_read_iter+0xa06/0xd70
[  484.712851]  ? _cond_resched+0x19/0x30
[  484.712852]  ? down_read+0x12/0xa0
[  484.712856]  ? _copy_to_user+0x2b/0x40
[  484.712858]  do_filp_open+0x93/0x100
[  484.712861]  ? __alloc_fd+0x46/0x150
[  484.712863]  do_sys_open+0x177/0x280
[  484.712866]  __x64_sys_openat+0x20/0x30
[  484.712869]  do_syscall_64+0x57/0x190
[  484.712871]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  484.712874] RIP: 0033:0x7f870d9d1c8b
[  484.712880] Code: Bad RIP value.
[  484.712881] RSP: 002b:00007ffe5962f030 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
[  484.712883] RAX: ffffffffffffffda RBX: 00007ffe5962f2b0 RCX: 00007f870d9d1c8b
[  484.712884] RDX: 0000000000080000 RSI: 000055ebf5338590 RDI: 00000000ffffff9c
[  484.712885] RBP: 000055ebf530a260 R08: 00007ffe5962f2b0 R09: 00000000ffffffff
[  484.712885] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[  484.712886] R13: 000055ebf5338590 R14: 0000000000000000 R15: 0000000000000000
[  605.531891] INFO: task pve-startall-de:993 blocked for more than 483 seconds.
[  605.531941]       Tainted: P           O      5.4.65-1-pve #1
[  605.531974] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  605.532017] pve-startall-de D    0   993      1 0x00004000
[  605.532020] Call Trace:
[  605.532032]  __schedule+0x2e6/0x6f0
[  605.532035]  ? _cond_resched+0x19/0x30
[  605.532037]  schedule+0x33/0xa0
[  605.532039]  rwsem_down_read_slowpath+0x370/0x470
[  605.532043]  ? link_path_walk.part.42+0x29e/0x540
[  605.532045]  down_read+0x95/0xa0
[  605.532047]  path_openat+0xa33/0x16f0
[  605.532050]  ? generic_file_read_iter+0xa06/0xd70
[  605.532053]  ? _cond_resched+0x19/0x30
[  605.532054]  ? down_read+0x12/0xa0
[  605.532057]  ? _copy_to_user+0x2b/0x40
[  605.532060]  do_filp_open+0x93/0x100
[  605.532063]  ? __alloc_fd+0x46/0x150
[  605.532065]  do_sys_open+0x177/0x280
[  605.532067]  __x64_sys_openat+0x20/0x30
[  605.532071]  do_syscall_64+0x57/0x190
[  605.532073]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  605.532076] RIP: 0033:0x7f870d9d1c8b
[  605.532082] Code: Bad RIP value.
[  605.532083] RSP: 002b:00007ffe5962f030 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
[  605.532085] RAX: ffffffffffffffda RBX: 00007ffe5962f2b0 RCX: 00007f870d9d1c8b
[  605.532086] RDX: 0000000000080000 RSI: 000055ebf5338590 RDI: 00000000ffffff9c
[  605.532087] RBP: 000055ebf530a260 R08: 00007ffe5962f2b0 R09: 00000000ffffffff
[  605.532088] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[  605.532089] R13: 000055ebf5338590 R14: 0000000000000000 R15: 0000000000000000

The particular services that get stuck, however, aren't always predictable. On other nodes, for example, I'll get messages like the following

Code:

[ 5075.820911] INFO: task pvesr:12479 blocked for more than 362 seconds.
[ 5075.820980]       Tainted: P           O      5.4.65-1-pve #1
[ 5075.821013] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 5075.821057] pvesr           D    0 12479      1 0x00000004
[ 5075.821060] Call Trace:
[ 5075.821071]  __schedule+0x2e6/0x6f0
[ 5075.821076]  ? filename_parentat.isra.57.part.58+0xf7/0x180
[ 5075.821078]  schedule+0x33/0xa0
[ 5075.821083]  rwsem_down_write_slowpath+0x2ed/0x4a0
[ 5075.821085]  down_write+0x3d/0x40
[ 5075.821087]  filename_create+0x8e/0x180
[ 5075.821089]  do_mkdirat+0x59/0x110
[ 5075.821092]  __x64_sys_mkdir+0x1b/0x20
[ 5075.821095]  do_syscall_64+0x57/0x190
[ 5075.821097]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 5075.821100] RIP: 0033:0x7fc47e10b0d7
[ 5075.821106] Code: Bad RIP value.
[ 5075.821107] RSP: 002b:00007ffccf3ab0b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000053
[ 5075.821110] RAX: ffffffffffffffda RBX: 0000556c84bdc260 RCX: 00007fc47e10b0d7
[ 5075.821110] RDX: 0000556c841fe3d4 RSI: 00000000000001ff RDI: 0000556c88f95af0
[ 5075.821111] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007fc47e15cb80
[ 5075.821112] R10: 0000000000000000 R11: 0000000000000246 R12: 0000556c860192e8
[ 5075.821113] R13: 0000556c88f95af0 R14: 0000556c88c3d6d0 R15: 00000000000001ff
[ 5075.821116] INFO: task vzdump:12531 blocked for more than 362 seconds.
[ 5075.821153]       Tainted: P           O      5.4.65-1-pve #1
[ 5075.821185] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 5075.821228] vzdump          D    0 12531      1 0x00000000
[ 5075.821230] Call Trace:
[ 5075.821233]  __schedule+0x2e6/0x6f0
[ 5075.821235]  ? filename_parentat.isra.57.part.58+0xf7/0x180
[ 5075.821237]  schedule+0x33/0xa0
[ 5075.821239]  rwsem_down_write_slowpath+0x2ed/0x4a0
[ 5075.821242]  down_write+0x3d/0x40
[ 5075.821244]  filename_create+0x8e/0x180
[ 5075.821246]  do_mkdirat+0x59/0x110
[ 5075.821248]  __x64_sys_mkdir+0x1b/0x20
[ 5075.821250]  do_syscall_64+0x57/0x190
[ 5075.821252]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 5075.821253] RIP: 0033:0x7f5be74fd0d7
[ 5075.821255] Code: Bad RIP value.
[ 5075.821256] RSP: 002b:00007ffdd1df4088 EFLAGS: 00000246 ORIG_RAX: 0000000000000053
[ 5075.821257] RAX: ffffffffffffffda RBX: 0000557228b58260 RCX: 00007f5be74fd0d7
[ 5075.821258] RDX: 0000557226c223d4 RSI: 00000000000001ff RDI: 000055722d0ddb30
[ 5075.821259] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007f5be754eb80
[ 5075.821260] R10: 0000000000000000 R11: 0000000000000246 R12: 0000557228f3e3c8
[ 5075.821261] R13: 000055722d0ddb30 R14: 000055722cc7e878 R15: 00000000000001ff

I'm currently on kernel 5.4.65-1-pve, which I realize is a little old. I saw identical behavior on the default kernel, 5.4.73-1-pve. Are there any suggestions on how to proceed?

Alwin · Dec 22, 2020

First thought, check your systems performance and hardware. Either it's overloaded or there is a hardware malfunction.

fabian · Dec 22, 2020

anything special about those nodes? are they identical hardware-wise?

jpnel · Dec 24, 2020

Thanks for the response. There's nothing particularly special about the nodes. They're somewhat low power for experimentation (odroid H2+) and I'm running the root file system off of a USB 3.0 flash drive. I realize that's probably not the most reliable, but things don't appear to be bottlenecked by the USB 3.0 I/O (iostat doesn't look unusual to me, and I'm able to sustain about 110 MB/s for sequential writes).

I blew away the old installation and collected a bit more data. I have 3 nodes, pve{1,2,3}.

After the initial installation, all three nodes are acting reasonably. I can access their consoles and I'm not getting any indications of any blocked services. I create a cluster on pve1 which is uneventful. Likewise, I join pve2 to the cluster, and things remain uneventful. The pve2 node joins the cluster quickly, networking appears undisturbed and the (now two node) cluster appears fine. No services report being blocked.

Things get much more interesting when joining the third node (pve3). In this case, pve3 appears to partially join the cluster and appears with the question mark next to it for 5-10 minutes or so. Meanwhile, networking on pve3, while not down, is unresponsive. I can ping it, but can't log into it. The web interface becomes totally unresponsive as well.

After a few minutes, I see the first blocked task on the third node trying to join the cluster, first with cron:

Code:

[  847.526842] INFO: task cron:932 blocked for more than 120 seconds.

...then followed by pvestatd, several pveproxy workers, pve-ha-lrm, sshd, and pvesr:

Code:

[  847.527011] INFO: task pvestatd:937 blocked for more than 120 seconds.
[  847.527077] INFO: task pveproxy worker:972 blocked for more than 120 seconds.
[  847.527121] INFO: task pveproxy worker:973 blocked for more than 120 seconds.
[  847.527163] INFO: task pveproxy worker:974 blocked for more than 120 seconds.
[  847.527207] INFO: task pve-ha-lrm:979 blocked for more than 120 seconds.
[  847.527267] INFO: task sshd:2694 blocked for more than 120 seconds.
[  847.527322] INFO: task pvesr:2730 blocked for more than 120 seconds.

A short time after, the initial node, pve1 starts to complain about blocked tasks. Throughout all of this, the second node, pve2 remains fine.

Code:

[ 5438.617483] INFO: task pveproxy worker:968 blocked for more than 120 seconds.
[ 5438.617768] INFO: task pvesr:13489 blocked for more than 120 seconds.

Shortly after all of this (around a minute or so later), corosync is now showing 100% CPU utilization on pve1 and 2, but not pve3. Throughout all of this, similar tasks are showing as blocked. This goes on for a couple of minutes but eventually settles down and pve3 now shows as green in the cluster. Do I simply accept that the node was successfully adopted into the cluster and proceed?

fabian · Dec 28, 2020

can you try repeating your experiment but make sure to upgrade to the current package versions on all nodes before setting up the cluster?

jpnel · Dec 29, 2020

Actually, I may have figured out my own problem. To answer your questions, I was already up to date with the latest kernel in my Wednesday response. That said, I noticed surprisingly high ping latencies that were consistently greater than 2ms even on a totally isolated VLAN. With a bit of investigation, I added pcie_aspm.policy=performance to the kernel boot command line and my ping latencies dropped to roughly 0.8 ms consistently. This is with the Realtek RTL8125B (2.5G) IP. Since then, I re-imaged the system and was able to join all three nodes into a cluster without the blocked processes, so I'm cautiously optimistic.

New install, task pve-startall-de:993 blocked for more than 120 seconds.

jpnel

Member

Alwin

Proxmox Retired Staff

fabian

Proxmox Staff Member

jpnel

Member

fabian

Proxmox Staff Member

jpnel

Member

We value your privacy