Since applying the most recent Debian and Proxmox updates, from the official Debian (deb http://ftp.us.debian.org/debian) and Proxmox sources (deb https://enterprise.proxmox.com/debian/pve), I have been seeing hung pvestatd tasks. When this occurs, the immediate symptoms are very long delays or timeouts trying to log in to the Proxmox web interface, and, after ssh-ing to the Proxmox host, many commands hang indefinitely. This may be hung access to the pve file system.
I have seen this on two different Proxmox nodes. It appears to correlate with the migration and startup of a VM running FreeBSD 13 as a guest.
I am writing to:
1) see if anyone else is experiencing this
2) if so, ask whether there are any known workarounds or configuration changes to mitigate it
3) offer access to, or information about, one of these nodes in distress so that this can be diagnosed.
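For anyone comparing symptoms: a quick way to check whether the pve configuration filesystem is what is wedged. This is only a diagnostic sketch, assuming a standard Proxmox VE layout where pmxcfs is the FUSE mount at /etc/pve, provided by the pve-cluster service:

```shell
# Diagnostic sketch (assumption: pmxcfs mounted at /etc/pve, as on a
# standard Proxmox VE install). "timeout" keeps the shell from hanging
# if the FUSE mount is stuck in the kernel.
timeout 5 ls /etc/pve >/dev/null 2>&1 \
  && echo "/etc/pve responsive" \
  || echo "/etc/pve hung or unavailable"

# pve-cluster provides pmxcfs; its status output often shows why the
# mount is misbehaving (quorum loss, daemon restart, etc.).
systemctl status pve-cluster --no-pager 2>/dev/null || true
```

If the first command reports the mount as hung while basic commands elsewhere still work, that points at pmxcfs rather than the host as a whole.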
The stack trace is as follows (from /var/log/kern.log):
Aug 30 10:06:12 n2 kernel: [2246610.864709] INFO: task pvestatd:2284 blocked for more than 120 seconds.
Aug 30 10:06:12 n2 kernel: [2246610.865743] Tainted: P IO 5.4.124-1-pve #1
Aug 30 10:06:12 n2 kernel: [2246610.866652] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 30 10:06:12 n2 kernel: [2246610.867561] pvestatd D 0 2284 1 0x00000004
Aug 30 10:06:12 n2 kernel: [2246610.868469] Call Trace:
Aug 30 10:06:12 n2 kernel: [2246610.869433] __schedule+0x2e6/0x6f0
Aug 30 10:06:12 n2 kernel: [2246610.870316] ? filename_parentat.isra.55.part.56+0xf7/0x180
Aug 30 10:06:12 n2 kernel: [2246610.871202] schedule+0x33/0xa0
Aug 30 10:06:12 n2 kernel: [2246610.872072] rwsem_down_write_slowpath+0x2ed/0x4a0
Aug 30 10:06:12 n2 kernel: [2246610.873025] ? enqueue_hrtimer+0x3c/0x90
Aug 30 10:06:12 n2 kernel: [2246610.873890] down_write+0x3d/0x40
Aug 30 10:06:12 n2 kernel: [2246610.874748] filename_create+0x8e/0x180
Aug 30 10:06:12 n2 kernel: [2246610.875600] do_mkdirat+0x59/0x110
Aug 30 10:06:12 n2 kernel: [2246610.876441] __x64_sys_mkdir+0x1b/0x20
Aug 30 10:06:12 n2 kernel: [2246610.877352] do_syscall_64+0x57/0x190
Aug 30 10:06:12 n2 kernel: [2246610.878179] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Aug 30 10:06:12 n2 kernel: [2246610.879004] RIP: 0033:0x7f8c82e0c0d7
Aug 30 10:06:12 n2 kernel: [2246610.879820] Code: Bad RIP value.
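When a trace like the one above appears, it can help to capture the kernel stacks of every blocked task, not just the one the hung-task detector happened to report. A hedged sketch (requires root, and assumes sysrq is enabled on the host):

```shell
# Ask the kernel to log the stacks of all uninterruptible (D-state)
# tasks; the output lands in dmesg / /var/log/kern.log alongside the
# hung-task trace above.
echo w > /proc/sysrq-trigger 2>/dev/null || echo "sysrq-trigger not writable"

# List current D-state processes; pvestatd showing up here matches the hang.
ps -eo pid,stat,comm | awk '$2 ~ /^D/ {print}'
```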
node n2:
% uname -a
Linux n2 5.4.128-1-pve #1 SMP PVE 5.4.128-1 (Wed, 21 Jul 2021 18:32:02 +0200) x86_64 GNU/Linux
% apt list --installed | grep pve-
libpve-access-control/stable,now 6.4-3 all [installed]
libpve-apiclient-perl/stable,now 3.1-3 all [installed]
libpve-cluster-api-perl/stable,now 6.4-1 all [installed]
libpve-cluster-perl/stable,now 6.4-1 all [installed]
libpve-common-perl/stable,now 6.4-3 all [installed]
libpve-guest-common-perl/stable,now 3.1-5 all [installed]
libpve-http-server-perl/stable,now 3.2-3 all [installed]
libpve-storage-perl/stable,now 6.4-1 all [installed]
libpve-u2f-server-perl/stable,now 1.1-1 amd64 [installed]
pve-cluster/stable,now 6.4-1 amd64 [installed]
pve-container/stable,now 3.3-6 all [installed]
pve-docs/stable,now 6.4-2 all [installed]
pve-edk2-firmware/stable,now 2.20200531-1 all [installed]
pve-firewall/stable,now 4.1-4 amd64 [installed]
pve-firmware/stable,now 3.2-4 all [installed]
pve-ha-manager/stable,now 3.1-1 amd64 [installed]
pve-i18n/stable,now 2.3-1 all [installed]
pve-kernel-5.4.124-1-pve/stable,now 5.4.124-2 amd64 [installed,automatic]
pve-kernel-5.4.128-1-pve/stable,now 5.4.128-1 amd64 [installed,automatic]
pve-kernel-5.4.34-1-pve/stable,now 5.4.34-2 amd64 [installed]
pve-kernel-5.4/stable,now 6.4-5 all [installed]
pve-kernel-helper/stable,now 6.4-5 all [installed]
pve-lxc-syscalld/stable,now 0.9.1-1 amd64 [installed]
pve-manager/stable,now 6.4-13 amd64 [installed]
pve-qemu-kvm/stable,now 5.2.0-6 amd64 [installed]
pve-xtermjs/stable,now 4.7.0-3 amd64 [installed]
Thank you,
-Steven Senator