VMs and Node Seem Offline but Are Running

It looks like a configured storage is offline; check your storage definitions.
 
Are you using the latest version? Does it help if you restart pvestatd?

# service pvestatd restart
 
I noticed the exact same thing two days ago. It had been almost a month since I had last logged in so I don't know when the issue actually started. I only have local storage configured. I rebooted the server and everything came back properly.
 
Hello

I have the same problem right now. /etc/init.d/pvestatd restart gives me this error:

/etc/init.d/pvestatd restart
Restarting PVE Status Daemon: pvestatdstart-stop-daemon: warning: failed to kill 1700: No such process



Regards,
Rocel
 
I encountered the same problem. Running /etc/init.d/pvestatd restart gives me the error below:

"/etc/init.d/pvestatd restart
Restarting PVE Status Daemon: pvestatdstart-stop-daemon: warning: failed to kill 1700: No such process"
 
#/etc/init.d/pvestatd restart ?

Also update your packages to the latest stable version.

We have a 13-node cluster of Proxmox 2.2 (installed a couple of weeks ago) that was doing this. When we restarted the pvestatd service, we found it was in fact dead. I changed the mirror to use ca (Canada) instead of us (U.S.) and did an apt-get update && apt-get -y upgrade. It did upgrade pve-manager (which owns the pvestatd binary), so maybe it will be stable now. We'll keep watch over it.
 
Then simply try with:

# service pvestatd start

Dietmar, the OP wasn't clear in his post, but the restart does work properly. It just lets you know that it didn't find a running program to stop, then it starts it properly.
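
For anyone who wants to avoid the confusing warning, one option is to check whether the daemon is running first and only then choose between start and restart. This is just a rough sketch: choose_action is a made-up helper name, and it assumes pgrep is installed and that the process shows up under the name pvestatd.

```shell
# Hypothetical helper (not part of Proxmox): pick the init action based on
# whether a process with exactly the given name is currently running.
choose_action() {
    if pgrep -x "$1" >/dev/null 2>&1; then
        echo restart
    else
        echo start
    fi
}

# Print the command instead of running it, so the sketch is safe to try.
echo "service pvestatd $(choose_action pvestatd)"
```

If the daemon is dead, this prints the start variant, which is the same thing the restart ends up doing after its warning.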
 
Just like above, one of my nodes keeps showing red, but the VMs and all features appear to be working (a mix of local and shared storage, and all the VMs stay up). Restarting pvestatd throws an error but actually works and gets the system showing as green again. Any ideas why this is happening and what I should do to fix it?

Two other nodes with identical configurations have never switched to red.
 
I also found that my pvestatd was not running this morning.

After investigating, I saw this in dmesg:

Out of memory: Kill process 3889 (pvestatd) score 406 or sacrifice child
OOM killed process 3889 (pvestatd) vm:28887328kB, rss:28311020kB, swap:357556kB
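
To check other nodes for the same thing, the kernel log can be filtered for OOM kills. A minimal sketch, assuming the message format matches the line quoted above (it differs between kernel versions); oom_victims is a hypothetical helper name.

```shell
# Hypothetical helper: read kernel log text on stdin and print the PID and
# name of every process the OOM killer selected, one "pid name" pair per line.
oom_victims() {
    sed -n 's/.*Out of memory: Kill process \([0-9][0-9]*\) (\([^)]*\)).*/\1 \2/p'
}

# Typical use on a live node (needs permission to read the kernel log):
#   dmesg | oom_victims
# Demonstration with the line quoted above:
echo 'Out of memory: Kill process 3889 (pvestatd) score 406 or sacrifice child' | oom_victims
```

For the quoted line this prints "3889 pvestatd"; lines that do not match the pattern produce no output.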

I cannot tell whether pvestatd used all the memory, but these were the last messages from pvestatd in daemon.log.1, dated 5 December (a week before I noticed the problem):
Dec 5 19:06:08 NODE01 pvestatd[3889]: WARNING: command 'df -P -B 1 /var/lib/vz/' failed: open3: fork failed: Cannot allocate memory at /usr/share/perl5/PVE/Tools.pm line 280
Dec 5 19:06:08 NODE01 snmpd[1854]: Connection from UDP: [127.0.0.1]:34268->[127.0.0.1]
Dec 5 19:06:09 NODE01 pvestatd[3889]: WARNING: command '/sbin/vgs --separator : --noheadings --units b --unbuffered --nosuffix --options vg_name,vg_size,vg_free' failed: open3: fork failed: Cannot allocate memory at /usr/share/perl5/PVE/Tools.pm line 280
Dec 5 19:06:09 NODE01 pvestatd[3889]: WARNING: command '/sbin/vgscan --ignorelockingfailure --mknodes' failed: open3: fork failed: Cannot allocate memory at /usr/share/perl5/PVE/Tools.pm line 280
Dec 5 19:06:09 NODE01 pvestatd[3889]: WARNING: Use of uninitialized value $storeid in hash element at /usr/share/perl5/PVE/Storage.pm line 665.
Dec 5 19:06:09 NODE01 pvestatd[3889]: WARNING: Use of uninitialized value $storeid in hash element at /usr/share/perl5/PVE/Storage.pm line 666.
Dec 5 19:06:09 NODE01 pvestatd[3889]: status update time (5.206 seconds)
Dec 5 19:06:09 NODE01 snmpd[1854]: Connection from UDP: [127.0.0.1]:34268->[127.0.0.1]
Dec 5 19:06:10 NODE01 snmpd[1854]: Connection from UDP: [127.0.0.1]:34268->[127.0.0.1]
Dec 5 19:06:11 NODE01 snmpd[1854]: Connection from UDP: [127.0.0.1]:34268->[127.0.0.1]
Dec 5 19:06:17 NODE01 pvestatd[3889]: WARNING: Use of uninitialized value $vmid in concatenation (.) or string at /usr/share/perl5/PVE/QemuServer.pm line 1387.
Dec 5 19:06:17 NODE01 pvestatd[3889]: qemu status update error: unknown file type 'nodes/NODE01/qemu-server/.conf'
Dec 5 19:06:17 NODE01 pvestatd[3889]: WARNING: command 'df -P -B 1 /var/lib/vz/' failed: open3: fork failed: Cannot allocate memory at /usr/share/perl5/PVE/Tools.pm line 280
Dec 5 19:06:17 NODE01 pvestatd[3889]: WARNING: command '/sbin/vgs --separator : --noheadings --units b --unbuffered --nosuffix --options vg_name,vg_size,vg_free' failed: open3: fork failed: Cannot allocate memory at /usr/share/perl5/PVE/Tools.pm line 280
Dec 5 19:06:18 NODE01 pvestatd[3889]: WARNING: command '/sbin/vgscan --ignorelockingfailure --mknodes' failed: open3: fork failed: Cannot allocate memory at /usr/share/perl5/PVE/Tools.pm line 280
Dec 5 19:06:18 NODE01 pvestatd[3889]: WARNING: Use of uninitialized value $storeid in hash element at /usr/share/perl5/PVE/Storage.pm line 665.
Dec 5 19:06:18 NODE01 pvestatd[3889]: WARNING: Use of uninitialized value $storeid in hash element at /usr/share/perl5/PVE/Storage.pm line 666.

These were the last messages before pvestatd was killed. In my monitoring I can see that this freed up 30GB of memory.
On another cluster node I can see that pvestatd has consumed half of my memory right now, so that one will probably be killed sometime in the coming days.
Almost all nodes have an uptime of 118 days.

Does an upgrade to 2.2 fix this?

I have the logs, so I can post more information if you are interested.
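
To spot a node where pvestatd is growing before the OOM killer steps in, its resident memory can be compared against total RAM. This is only a sketch: rss_percent is a hypothetical helper, the commented commands assume procps ps and a Linux /proc/meminfo, and the 64 GiB total in the demonstration is an assumed figure, not one taken from these nodes.

```shell
# Hypothetical helper: integer percentage of total memory in use,
# given a resident set size and a memory total, both in kB.
rss_percent() {
    echo $(( $1 * 100 / $2 ))
}

# On a live node the inputs could come from ps and /proc/meminfo:
#   rss=$(ps -o rss= -C pvestatd | awk '{s += $1} END {print s + 0}')
#   total=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
#   rss_percent "$rss" "$total"
# Demonstration with the rss value from the dmesg excerpt and an ASSUMED
# 64 GiB (67108864 kB) of RAM:
rss_percent 28311020 67108864
```

Running something like this from cron and restarting pvestatd above a chosen threshold would be a workaround only; it does not address why the daemon's memory grows.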
 