pveproxy stuck

greg

Active Member
Apr 6, 2011
122
1
43
Greetings

In the first days of this new year, my Proxmox cluster is in bad shape...


On one node, "pveproxy" is badly stuck:

Bash:
root     15639  0.0  0.5 295812 89408 pts/26   D     2021   0:00 /usr/bin/perl -T /usr/bin/pvesr status
root     24233  0.0  0.5 283276 83712 ?        Ds    2021   0:00 /usr/bin/perl -T /usr/bin/pveproxy restart
root     23262  0.1  0.5 283252 92200 ?        Ds   15:58   0:00 /usr/bin/perl -T /usr/bin/pveproxy stop

I cannot even force-kill the processes:

Code:
kill -9 15639 24233 23262

has no effect. The service status is odd:

Code:
● pveproxy.service - PVE API Proxy Server
   Loaded: loaded (/lib/systemd/system/pveproxy.service; enabled; vendor preset: enabled)
   Active: failed (Result: timeout) since Thu 2022-01-06 16:05:35 CET; 6min ago
 Main PID: 19825 (code=exited, status=0/SUCCESS)
    Tasks: 2 (limit: 4915)
   Memory: 164.9M
   CGroup: /system.slice/pveproxy.service
           ├─23262 /usr/bin/perl -T /usr/bin/pveproxy stop
           └─24233 /usr/bin/perl -T /usr/bin/pveproxy restart

janv. 06 16:01:04 sysv6 systemd[1]: pveproxy.service: State 'stop-sigterm' timed out. Killing.
janv. 06 16:01:04 sysv6 systemd[1]: pveproxy.service: Killing process 23262 (pveproxy) with signal SIGKILL.
janv. 06 16:01:04 sysv6 systemd[1]: pveproxy.service: Killing process 24233 (pveproxy) with signal SIGKILL.
janv. 06 16:02:35 sysv6 systemd[1]: pveproxy.service: Processes still around after SIGKILL. Ignoring.
janv. 06 16:04:05 sysv6 systemd[1]: pveproxy.service: State 'stop-final-sigterm' timed out. Killing.
janv. 06 16:04:05 sysv6 systemd[1]: pveproxy.service: Killing process 24233 (pveproxy) with signal SIGKILL.
janv. 06 16:04:05 sysv6 systemd[1]: pveproxy.service: Killing process 23262 (pveproxy) with signal SIGKILL.
janv. 06 16:05:35 sysv6 systemd[1]: pveproxy.service: Processes still around after final SIGKILL. Entering failed mode.
janv. 06 16:05:35 sysv6 systemd[1]: pveproxy.service: Failed with result 'timeout'.
janv. 06 16:05:35 sysv6 systemd[1]: Stopped PVE API Proxy Server.

Short of unplugging the server, what can I do?

Thanks in advance

Regards
 

greg

BTW, I tried various commands, such as:
Code:
systemctl restart pveproxy pvedaemon

pvecm updatecerts

They all hang.
 

greg

Also: all PVE-related processes are stuck on half the nodes of the cluster...
 

greg

I had to power-cycle the server. Now it doesn't boot; I guess it's because there are only ZFS partitions and GRUB cannot access any of them.

So basically my server is dead. Happy new year!
 

greg

Good news: with iDRAC I was able to boot it by forcing UEFI. For some reason it wasn't booting that way...
 

greg

So back to the original problem... what can I do when all pve commands hang?
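
A note on the first post: the `D` in the STAT column of that `ps` output means uninterruptible sleep. The process is blocked inside the kernel, and signals (SIGKILL included) only take effect once that wait ends, which would explain why `kill -9` does nothing. As a starting point for diagnosis, here is a minimal sketch that lists every process in that state together with its kernel wait channel. It assumes the procps `ps` shipped with Debian; `filter_d_state` is a hypothetical helper name, not a Proxmox tool.

```shell
#!/bin/sh
# Keep the header line plus every row whose STAT column (field 2) starts
# with "D", i.e. processes in uninterruptible sleep.
# Reads "ps"-style rows on stdin.
filter_d_state() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Show PID, state, kernel wait channel and command line for blocked processes.
# "wchan" names the kernel function the process is sleeping in, when available.
if command -v ps >/dev/null 2>&1; then
    ps -eo pid,stat,wchan:32,args | filter_d_state
fi
```

The WCHAN column may hint at which resource the processes are waiting on (local storage, a network filesystem, a FUSE mount), but it does not identify the cause by itself.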
 
