pvestatd not reaping properly? Process table full, system slowdown

tarball

Member
Hi, it looks like there's a potential issue with pvestatd. On one of our systems we noticed that the process table was full (62K+ defunct pvestatd processes).
The system slows down to a crawl at that point.
After a stop/start of the daemon, everything's fine again.
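
In case it helps anyone else hitting this, the rough commands for counting the zombies and bouncing the daemon are below; the service name assumes the stock sysvinit script shipped with PVE 3.x:

ps auxwww | awk '$8 ~ /^Z/' | grep -c pvestatd   # count defunct pvestatd processes (STAT column starts with Z)
service pvestatd restart                         # stop/start the daemon; the zombies get reaped once the parent is gone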

Running pve-manager/3.4-6/102d4547 (running kernel: 2.6.32-39-pve)


root 977213 0.5 0.2 211732 38648 ? Ss 15:54 0:00 pvestatd
root 977239 0.0 0.0 0 0 ? Z 15:54 0:00 [pvestatd] <defunct>
root 977241 0.0 0.0 0 0 ? Z 15:54 0:00 [pvestatd] <defunct>
root 977270 0.0 0.0 0 0 ? Z 15:54 0:00 [pvestatd] <defunct>
root 977272 0.0 0.0 0 0 ? Z 15:54 0:00 [pvestatd] <defunct>
root 977304 0.0 0.0 0 0 ? Z 15:54 0:00 [pvestatd] <defunct>
root 977306 0.0 0.0 0 0 ? Z 15:54 0:00 [pvestatd] <defunct>
root 977324 0.0 0.0 4052 516 ? S 15:54 0:00 sleep 60
root 977340 0.0 0.0 0 0 ? Z 15:54 0:00 [pvestatd] <defunct>
root 977342 0.0 0.0 0 0 ? Z 15:54 0:00 [pvestatd] <defunct>
root 977372 0.0 0.0 0 0 ? Z 15:55 0:00 [pvestatd] <defunct>
root 977374 0.0 0.0 0 0 ? Z 15:55 0:00 [pvestatd] <defunct>
root 977402 0.0 0.0 0 0 ? Z 15:55 0:00 [pvestatd] <defunct>
root 977404 0.0 0.0 0 0 ? Z 15:55 0:00 [pvestatd] <defunct>
root 977433 0.0 0.0 0 0 ? Z 15:55 0:00 [pvestatd] <defunct>
root 977435 0.0 0.0 0 0 ? Z 15:55 0:00 [pvestatd] <defunct>
root 977462 0.0 0.0 0 0 ? Z 15:55 0:00 [pvestatd] <defunct>
root 977464 0.0 0.0 0 0 ? Z 15:55 0:00 [pvestatd] <defunct>
root 977474 0.0 0.0 16792 1284 pts/0 R+ 15:55 0:00 ps auxwww


├─pvestatd,978703
│ ├─(pvestatd,978740)
│ ├─(pvestatd,978742)
│ ├─(pvestatd,978769)
│ ├─(pvestatd,978771)
│ ├─(pvestatd,978799)
│ ├─(pvestatd,978801)
│ ├─(pvestatd,978835)
│ ├─(pvestatd,978837)
│ ├─(pvestatd,978870)
....
 
What kind of storage do you use? Please check that all storages are online and accessible with:

# pvesm status
 
Hi Dietmar,

pvesm status
zfs error: open3: exec of zpool list -o name -H failed at /usr/share/perl5/PVE/Tools.pm line 328
zfs error: open3: exec of zpool list -o name -H failed at /usr/share/perl5/PVE/Tools.pm line 328
local dir 1 1031992064 272851220 706712044 28.35%
ovz3-bk nfs 1 7546520832 4298153984 3248366848 57.46%
zfs zfspool 0 0 0 0 100.00%
zfs1 zfspool 0 0 0 0 100.00%

root@ovz3:~# lsmod|grep zfs
root@ovz3:~#


root@ovz3:~# zpool list -o name -H
-bash: zpool: command not found
root@ovz3:~# which zpool
root@ovz3:~# whereis zpool
zpool:


We use regular software RAID:
root@ovz3:~# cat /proc/mdstat
Personalities : [raid10] [raid1]
md2 : active raid1 sde2[2] sdc2[3]
9715584 blocks super 1.2 [2/2] [UU]


md1 : active raid1 sde1[2] sdc1[3]
224574272 blocks super 1.2 [2/2] [UU]


md0 : active raid10 sda1[0] sdf1[3] sdd1[2] sdb1[1]
1953258496 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]


root@ovz3:~# pveversion -v
proxmox-ve-2.6.32: 3.4-156 (running kernel: 2.6.32-39-pve)
pve-manager: 3.4-6 (running version: 3.4-6/102d4547)
pve-kernel-2.6.32-32-pve: 2.6.32-136
pve-kernel-2.6.32-39-pve: 2.6.32-156
pve-kernel-2.6.32-29-pve: 2.6.32-126
pve-kernel-2.6.32-26-pve: 2.6.32-114
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.7-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.10-2
pve-cluster: 3.0-17
qemu-server: 3.4-6
pve-firmware: 1.1-4
libpve-common-perl: 3.0-24
libpve-access-control: 3.0-16
libpve-storage-perl: 3.0-33
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-8
vzctl: 4.0-1pve6
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 2.2-10
ksm-control-daemon: 1.1-1
glusterfs-client: 3.5.2-1
 
I think a ZFS pool has been defined at the (non-HA) cluster level, but this specific host (ovz3) does not have a ZFS pool; it doesn't even have the ZFS tools installed, I think.

I just checked, and a few older PVE installs that were upgraded to the latest (or at least to a ZFS-enabled) PVE are exhibiting the same behavior.
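
For what it's worth, a quick way to confirm a host has no ZFS userland at all (assuming the 3.x zfsonlinux packaging, where the utilities come from packages with "zfs" in the name) is:

dpkg -l | grep -i zfs   # no ZFS packages installed on this host
which zpool zfs         # neither binary is in the PATH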
 
I think a ZFS pool has been defined at the (non-HA) cluster level, but this specific host (ovz3) does not have a ZFS pool; it doesn't even have the ZFS tools installed, I think.

You should define the nodes on which a storage is available (if it is not available on all nodes).
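
For example, a zfspool entry in /etc/pve/storage.cfg restricted to the hosts that actually have the pool could look roughly like this (the pool and node names below are placeholders, adjust to your setup):

zfspool: zfs
        pool tank
        content images,rootdir
        nodes node1,node2

With the nodes line in place, hosts not listed there (like ovz3) will no longer try to query that storage, so pvestatd stops calling zpool/zfs on them.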
 
Ok, I see the problem; I will fix it.
 
