Failed to start PVE API Proxy Server: can't acquire lock '/var/run/pveproxy/pveproxy.pid.lock'

StephPris

New Member
Jul 4, 2022
Hello,

My server ran for some years without trouble, but now I am facing a problem that I don't know how to resolve.
I first noticed that the backup was broken, with a task that has been stuck for 2 months now.
After I killed it manually through the web interface, I just waited to see whether the server would get it working the next day.
Now the web interface doesn't respond; I get a timeout.
I am afraid to reboot the server because, as I wrote, I haven't had any backup for two months.

The status of pveproxy.service:
pveproxy.service - PVE API Proxy Server
Loaded: loaded (/lib/systemd/system/pveproxy.service; enabled; vendor preset: enabled)
Active: failed (Result: timeout) since Mon 2022-07-04 15:05:08 CEST; 24min ago
Process: 24651 ExecStop=/usr/bin/pveproxy stop (code=exited, status=0/SUCCESS)
Process: 10830 ExecReload=/usr/bin/pveproxy restart (code=exited, status=0/SUCCESS)
Main PID: 1775
Tasks: 8 (limit: 4915)
Memory: 520.5M
CPU: 600ms
CGroup: /system.slice/pveproxy.service
├─ 1090 pveproxy
├─ 1775 pveproxy
├─24898 pveproxy
├─26956 pveproxy worker (shutdown)
├─26957 pveproxy worker (shutdown)
├─26958 pveproxy worker (shutdown)
├─31272 pveproxy
└─32497 pveproxy

Jul 04 15:03:38 chaletprox systemd[1]: pveproxy.service: Killing process 24898 (pveproxy) with signal SIGKILL.
Jul 04 15:03:38 chaletprox systemd[1]: pveproxy.service: Killing process 26956 (pveproxy worker) with signal SIGKILL.
Jul 04 15:03:38 chaletprox systemd[1]: pveproxy.service: Killing process 26957 (pveproxy worker) with signal SIGKILL.
Jul 04 15:03:38 chaletprox systemd[1]: pveproxy.service: Killing process 26958 (pveproxy worker) with signal SIGKILL.
Jul 04 15:03:38 chaletprox systemd[1]: pveproxy.service: Killing process 31272 (pveproxy) with signal SIGKILL.
Jul 04 15:03:38 chaletprox systemd[1]: pveproxy.service: Killing process 32497 (pveproxy) with signal SIGKILL.
Jul 04 15:05:08 chaletprox systemd[1]: pveproxy.service: Processes still around after final SIGKILL. Entering failed mode.
Jul 04 15:05:08 chaletprox systemd[1]: Failed to start PVE API Proxy Server.
Jul 04 15:05:08 chaletprox systemd[1]: pveproxy.service: Unit entered failed state.
Jul 04 15:05:08 chaletprox systemd[1]: pveproxy.service: Failed with result 'timeout'.

The output of journalctl -b -u pveproxy.service:
Jul 04 15:00:37 chaletprox systemd[1]: Starting PVE API Proxy Server...
Jul 04 15:00:43 chaletprox pveproxy[1090]: start failed - can't acquire lock '/var/run/pveproxy/pveproxy.pid.lock' - Resource
Jul 04 15:00:43 chaletprox pveproxy[1090]: start failed - can't acquire lock '/var/run/pveproxy/pveproxy.pid.lock' - Resource
Jul 04 15:02:08 chaletprox systemd[1]: pveproxy.service: Start operation timed out. Terminating.
Jul 04 15:03:38 chaletprox systemd[1]: pveproxy.service: State 'stop-final-sigterm' timed out. Killing.
Jul 04 15:03:38 chaletprox systemd[1]: pveproxy.service: Killing process 1090 (pveproxy) with signal SIGKILL.
Jul 04 15:03:38 chaletprox systemd[1]: pveproxy.service: Killing process 1775 (pveproxy) with signal SIGKILL.
Jul 04 15:03:38 chaletprox systemd[1]: pveproxy.service: Killing process 24898 (pveproxy) with signal SIGKILL.
Jul 04 15:03:38 chaletprox systemd[1]: pveproxy.service: Killing process 26956 (pveproxy worker) with signal SIGKILL.
Jul 04 15:03:38 chaletprox systemd[1]: pveproxy.service: Killing process 26957 (pveproxy worker) with signal SIGKILL.
Jul 04 15:03:38 chaletprox systemd[1]: pveproxy.service: Killing process 26958 (pveproxy worker) with signal SIGKILL.
Jul 04 15:03:38 chaletprox systemd[1]: pveproxy.service: Killing process 31272 (pveproxy) with signal SIGKILL.
Jul 04 15:03:38 chaletprox systemd[1]: pveproxy.service: Killing process 32497 (pveproxy) with signal SIGKILL.
Jul 04 15:05:08 chaletprox systemd[1]: pveproxy.service: Processes still around after final SIGKILL. Entering failed mode.
Jul 04 15:05:08 chaletprox systemd[1]: Failed to start PVE API Proxy Server.
Jul 04 15:05:08 chaletprox systemd[1]: pveproxy.service: Unit entered failed state.
Jul 04 15:05:08 chaletprox systemd[1]: pveproxy.service: Failed with result 'timeout'.

The syslog output:
Jul 4 15:02:08 chaletprox systemd[1]: pveproxy.service: Start operation timed out. Terminating.
Jul 4 15:03:38 chaletprox systemd[1]: pveproxy.service: State 'stop-final-sigterm' timed out. Killing.
Jul 4 15:03:38 chaletprox systemd[1]: pveproxy.service: Killing process 1090 (pveproxy) with signal SIGKILL.
Jul 4 15:03:38 chaletprox systemd[1]: pveproxy.service: Killing process 1775 (pveproxy) with signal SIGKILL.
Jul 4 15:03:38 chaletprox systemd[1]: pveproxy.service: Killing process 24898 (pveproxy) with signal SIGKILL.
Jul 4 15:03:38 chaletprox systemd[1]: pveproxy.service: Killing process 26956 (pveproxy worker) with signal SIGKILL.
Jul 4 15:03:38 chaletprox systemd[1]: pveproxy.service: Killing process 26957 (pveproxy worker) with signal SIGKILL.
Jul 4 15:03:38 chaletprox systemd[1]: pveproxy.service: Killing process 26958 (pveproxy worker) with signal SIGKILL.
Jul 4 15:03:38 chaletprox systemd[1]: pveproxy.service: Killing process 31272 (pveproxy) with signal SIGKILL.
Jul 4 15:03:38 chaletprox systemd[1]: pveproxy.service: Killing process 32497 (pveproxy) with signal SIGKILL.
Jul 4 15:05:08 chaletprox systemd[1]: pveproxy.service: Processes still around after final SIGKILL. Entering failed mode.
Jul 4 15:05:08 chaletprox systemd[1]: Failed to start PVE API Proxy Server.
Jul 4 15:05:08 chaletprox systemd[1]: pveproxy.service: Unit entered failed state.
Jul 4 15:05:08 chaletprox systemd[1]: pveproxy.service: Failed with result 'timeout'.


The pveversion -v output:
pveversion -v
proxmox-ve: 5.3-1 (running kernel: 4.15.18-10-pve)
pve-manager: 5.3-8 (running version: 5.3-8/2929af8e)
pve-kernel-4.15: 5.3-1
pve-kernel-4.15.18-10-pve: 4.15.18-32
corosync: 2.4.4-pve1
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.1-3
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-44
libpve-guest-common-perl: 2.0-19
libpve-http-server-perl: 2.0-11
libpve-storage-perl: 5.0-36
libqb0: 1.0.3-1~bpo9
lvm2: 2.02.168-pve6
lxc-pve: 3.1.0-2
lxcfs: 3.0.2-2
novnc-pve: 1.0.0-2
proxmox-widget-toolkit: 1.0-22
pve-cluster: 5.0-33
pve-container: 2.0-33
pve-docs: 5.3-1
pve-edk2-firmware: 1.20181023-1
pve-firewall: 3.0-17
pve-firmware: 2.0-6
pve-ha-manager: 2.0-6
pve-i18n: 1.0-9
pve-libspice-server1: 0.14.1-1
pve-qemu-kvm: 2.12.1-1
pve-xtermjs: 3.10.1-1
qemu-server: 5.0-45
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.12-pve1~bpo1

ss -antlp | grep 8006 returns:
LISTEN 129 128 0.0.0.0:8006 0.0.0.0:* users:(("pveproxy",pid=1775,fd=6))


The df -h output:
Filesystem Size Used Avail Use% Mounted on
udev 32G 0 32G 0% /dev
tmpfs 6.3G 652M 5.7G 11% /run
/dev/mapper/pve-root 94G 11G 79G 13% /
tmpfs 32G 138M 32G 1% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 32G 0 32G 0% /sys/fs/cgroup
/dev/fuse 30M 16K 30M 1% /etc/pve
192.168.1.220:/volume1/bckpvm 1.8T 942G 843G 53% /mnt/pve/Synology

The hostname --ip-address command returns my server's IP:
192.168.1.200

What I have tried so far:
systemctl reset-failed pveproxy.service
systemctl restart pvedaemon.service
systemctl restart pveproxy.service (it hangs for more than five minutes before returning the failure).

I don't know what to do; I am completely lost...
Thank you in advance for any help you can provide.
PS: My VMs are running well...
 
Hi,
it might be that you have multiple instances of pveproxy running and the new one can't take the lock. Is there still a pveproxy instance running after the service failed?
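
One way to answer that from the shell is sketched below. It lists every pveproxy process together with its state; a STAT value beginning with D (uninterruptible sleep) would explain why SIGKILL had no effect in the log above. The function names show_pveproxy and only_dstate are made up for this example and are not part of Proxmox VE.

```shell
#!/bin/sh
# List all processes whose command line mentions pveproxy, with their state.
# The [p] in the pattern keeps awk from matching its own command line.
show_pveproxy() {
    ps -eo pid,stat,wchan:32,args | awk 'NR == 1 || /[p]veproxy/'
}

# Keep only rows whose STAT column (2nd field) starts with D, i.e. processes
# in uninterruptible sleep. These do not react to signals, including SIGKILL,
# until the kernel call they are blocked in returns.
only_dstate() {
    awk '$2 ~ /^D/'
}

show_pveproxy                 # everything that is still around
show_pveproxy | only_dstate   # only the ones that cannot be killed right now
```

If the second command prints anything, the WCHAN column hints at what the process is waiting on. Whether that is storage I/O in this case is an assumption until the output has been checked.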

Can you run vzdump from the command line or is there an error then? If you can, I suggest you create backups and then reboot.

Two suggestions for the future:
 
Hi, I will follow the instructions. vzdump works; how can I back up to a remote Synology NAS using vzdump?
I have never done it from the CLI.
Thank you in advance :)
 
If you added the storage to Proxmox VE already, you can use
Code:
vzdump <VM ID(s)> --storage <storage ID>
Otherwise, you can mount it and use the dumpdir option, so
Code:
vzdump <VM ID(s)> --dumpdir <path to dir>
See man vzdump for details.
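
As a concrete sketch of the first form: the guest ID 100 is a placeholder, and the storage ID "Synology" is only a guess based on the /mnt/pve/Synology mount point in the df output earlier in the thread, so check both before running anything (pvesm status lists the configured storage IDs). The helper build_vzdump_cmd is hypothetical; it only prints the command so that it can be reviewed first.

```shell
#!/bin/sh
# Assumptions, not taken from the thread: guest ID 100 and storage ID "Synology".
VMID=100
STORAGE=Synology

# Print the vzdump command line for one guest and one storage ID.
# --mode snapshot backs up a running guest; --compress lzo is a fast compressor.
build_vzdump_cmd() {
    printf 'vzdump %s --storage %s --mode snapshot --compress lzo\n' "$1" "$2"
}

build_vzdump_cmd "$VMID" "$STORAGE"
# Once the printed line looks right, run it as root on the node, e.g.:
#   vzdump 100 --storage Synology --mode snapshot --compress lzo
```

Printing the command before running it is just a precaution for a first CLI backup; the printed line can be pasted into the shell as-is.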
 
I have successfully made the backups, but I can't get my server to restart. I have tried reboot 0, systemctl reboot, and shutdown -r now; nothing worked :(
 
That does sound strange. Do you get any errors from the commands, anything in dmesg/syslog?
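
A minimal sketch for collecting that information is shown below. It assumes the usual kernel wording for hung tasks ("blocked for more than N seconds") and for an unreachable NFS server ("not responding"); the helper name hung_task_lines is made up for this example.

```shell
#!/bin/sh
# Keep only lines that look like hung-task or unreachable-NFS-server reports.
hung_task_lines() {
    grep -E 'blocked for more than [0-9]+ seconds|not responding'
}

# Kernel ring buffer: most recent matching messages.
dmesg | hung_task_lines | tail -n 20

# Everything logged at priority "err" or worse since the current boot.
journalctl -b -p err --no-pager | tail -n 50
```

If neither command prints anything relevant, the plain tail of dmesg taken right after a failed reboot attempt is still worth posting.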
 
