[SOLVED] Failed to start pve-cluster.service - /etc/pve/local/pveproxy-ssl.key: failed to load local private key

dima1002

Hello, I tried to rename the Proxmox server, and since then I've been getting this error. I have since given the server its old name again and rebooted it, but the cluster service still won't start. Why?

Code:
root@prx500:/# journalctl -f
Jan 21 21:02:26 prx500 pveproxy[11134]: worker 16061 finished
Jan 21 21:02:26 prx500 pveproxy[11134]: worker 16108 started
Jan 21 21:02:26 prx500 pveproxy[16108]: /etc/pve/local/pveproxy-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 2037.
Jan 21 21:02:26 prx500 pveproxy[16062]: worker exit
Jan 21 21:02:26 prx500 pveproxy[11134]: worker 16062 finished
Jan 21 21:02:26 prx500 pveproxy[11134]: starting 2 worker(s)
Jan 21 21:02:26 prx500 pveproxy[11134]: worker 16109 started
Jan 21 21:02:26 prx500 pveproxy[11134]: worker 16110 started
Jan 21 21:02:26 prx500 pveproxy[16109]: /etc/pve/local/pveproxy-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 2037.
Jan 21 21:02:26 prx500 pveproxy[16110]: /etc/pve/local/pveproxy-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 2037.
Jan 21 21:02:31 prx500 pveproxy[16108]: worker exit
Jan 21 21:02:31 prx500 pveproxy[11134]: worker 16108 finished
Jan 21 21:02:31 prx500 pveproxy[11134]: starting 1 worker(s)
Jan 21 21:02:31 prx500 pveproxy[11134]: worker 16161 started
Jan 21 21:02:31 prx500 pveproxy[16161]: /etc/pve/local/pveproxy-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 2037.
Jan 21 21:02:31 prx500 pveproxy[16109]: worker exit
Jan 21 21:02:31 prx500 pveproxy[16110]: worker exit
Jan 21 21:02:31 prx500 pveproxy[11134]: worker 16109 finished
Jan 21 21:02:31 prx500 pveproxy[11134]: starting 1 worker(s)
Jan 21 21:02:31 prx500 pveproxy[11134]: worker 16110 finished
Jan 21 21:02:31 prx500 pveproxy[11134]: worker 16162 started
Jan 21 21:02:31 prx500 pveproxy[16162]: /etc/pve/local/pveproxy-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 2037.

^C
root@prx500:/# systemctl status pve-cluster
× pve-cluster.service - The Proxmox VE cluster filesystem
     Loaded: loaded (/lib/systemd/system/pve-cluster.service; disabled; preset: enabled)
     Active: failed (Result: exit-code) since Tue 2025-01-21 20:55:26 CET; 7min ago
    Process: 11513 ExecStart=/usr/bin/pmxcfs (code=exited, status=255/EXCEPTION)
        CPU: 8ms

Jan 21 20:55:26 prx500 systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 5.
Jan 21 20:55:26 prx500 systemd[1]: Stopped pve-cluster.service - The Proxmox VE cluster filesystem.
Jan 21 20:55:26 prx500 systemd[1]: pve-cluster.service: Start request repeated too quickly.
Jan 21 20:55:26 prx500 systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Jan 21 20:55:26 prx500 systemd[1]: Failed to start pve-cluster.service - The Proxmox VE cluster filesystem.
root@prx500:/# journalctl -xeu pve-cluster
Jan 21 20:55:26 prx500 systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ The unit pve-cluster.service has entered the 'failed' state with result 'exit-code'.
Jan 21 20:55:26 prx500 systemd[1]: Failed to start pve-cluster.service - The Proxmox VE cluster filesystem.
░░ Subject: A start job for unit pve-cluster.service has failed
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ A start job for unit pve-cluster.service has finished with a failure.
░░
░░ The job identifier is 7832 and the job result is failed.
Jan 21 20:55:26 prx500 systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 5.
░░ Subject: Automatic restarting of a unit has been scheduled
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support

Code:
root@prx500:/etc/pve/local# ls -ahltr
total 78K
drwxr-xr-x 4 root root    4 Jan 21 19:17 ..
drwxr-xr-x 2 root root   11 Jan 21 19:17 lxc
drwx------ 2 root root    2 Jan 21 19:17 priv
drwxr-xr-x 2 root root    2 Jan 21 19:17 openvz
drwxr-xr-x 2 root root   11 Jan 21 19:17 qemu-server
-rw-r----- 1 root root  558 Jan 21 20:37 ssh_known_hosts
-rw-r----- 1 root root   83 Jan 21 20:37 lrm_status
-rw-r----- 1 root root 1.7K Jan 21 20:37 pve-ssl.pem
-rw-r----- 1 root root 1.7K Jan 21 20:37 pve-ssl.key
-rw-r----- 1 root root 3.2K Jan 21 20:37 pveproxy-ssl.key
-rw-r----- 1 root root  503 Jan 21 20:37 host.fw
-rw-r----- 1 root root   82 Jan 21 20:37 config
drwxr-xr-x 6 root root   14 Jan 21 20:37 .
-rw-r----- 1 root root 3.9K Jan 21 20:37 pveproxy-ssl.pem
 
Yes, I changed the hostname in /etc/hostname, /etc/hosts, /etc/mailname and /etc/postfix/main.cf. The IP has not changed, or where should I have changed it? I have also just seen that someone set up Cloudflare with the ACME plugin. Do I have to change anything there too? I don't actually need the plugin, so it can also be deleted.
 
Is this host part of a cluster?

Looks like the cert and the private key don't match for some reason, or they don't match the hostname. Probably caused by that ACME plugin. I would move pveproxy-ssl.key and pveproxy-ssl.pem out of /etc/pve/local and then restart the pve-cluster and pveproxy services. If that still doesn't work, try pvecm updatecerts --force and restart the services again.
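If you want to test the mismatch theory before moving anything, here is a rough sketch that compares the public key embedded in the certificate with the one derived from the private key. The function name cert_matches_key is made up for this post; the openssl subcommands themselves are standard. It returns 2 when either file cannot be loaded at all, which would be closer to what the pveproxy log is complaining about.

```shell
#!/bin/sh
# Sketch only: cert_matches_key is a hypothetical helper, not a Proxmox tool.
# Returns 0 if the certificate's public key equals the key file's public key,
# 1 if they differ, 2 if either file could not be loaded by openssl.
cert_matches_key() {
    # Public key stored in the (first) certificate of the PEM file
    cert_pub=$(openssl x509 -in "$1" -noout -pubkey 2>/dev/null) || return 2
    # Public key derived from the private key (works for RSA and EC keys)
    key_pub=$(openssl pkey -in "$2" -pubout 2>/dev/null) || return 2
    [ -n "$cert_pub" ] && [ "$cert_pub" = "$key_pub" ]
}

# Example call, using the paths from this thread:
#   cert_matches_key /etc/pve/local/pveproxy-ssl.pem /etc/pve/local/pveproxy-ssl.key
#   echo "result: $?"
```

A result of 0 would rule out a cert/key mismatch, so the problem would have to be elsewhere.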
 
No, it's a single node.

I've already moved both certificates and restarted the services, but I always get the same error.

I've tried all of that.

Code:
root@prx500:/etc/pve/local# systemctl restart pve-cluster
Job for pve-cluster.service failed because the control process exited with error code.
See "systemctl status pve-cluster.service" and "journalctl -xeu pve-cluster.service" for details.
root@prx500:/etc/pve/local# pvenode acme cert status
ipcc_send_rec[1] failed: Connection refused
ipcc_send_rec[2] failed: Connection refused
ipcc_send_rec[3] failed: Connection refused
Unable to load access control list: Connection refused
root@prx500:/etc/pve/local# pvenode acme cert order --force
ipcc_send_rec[1] failed: Connection refused
ipcc_send_rec[2] failed: Connection refused
ipcc_send_rec[3] failed: Connection refused
Unable to load access control list: Connection refused
root@prx500:/etc/pve/local# pvecm updatecerts --force
ipcc_send_rec[1] failed: Connection refused
ipcc_send_rec[2] failed: Connection refused
ipcc_send_rec[3] failed: Connection refused
Unable to load access control list: Connection refused
root@prx500:/etc/pve/local# systemctl restart pve-cluster
Job for pve-cluster.service failed because the control process exited with error code.
See "systemctl status pve-cluster.service" and "journalctl -xeu pve-cluster.service" for details.
root@prx500:/etc/pve/local# pveproxy status
running
root@prx500:/etc/pve/local# pveproxy restart
root@prx500:/etc/pve/local# systemctl status  pve-cluster
× pve-cluster.service - The Proxmox VE cluster filesystem
     Loaded: loaded (/lib/systemd/system/pve-cluster.service; disabled; preset: enabled)
     Active: failed (Result: exit-code) since Tue 2025-01-21 21:51:33 CET; 26s ago
    Process: 44566 ExecStart=/usr/bin/pmxcfs (code=exited, status=255/EXCEPTION)
        CPU: 9ms

Jan 21 21:51:33 prx500 systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 5.
Jan 21 21:51:33 prx500 systemd[1]: Stopped pve-cluster.service - The Proxmox VE cluster filesystem.
Jan 21 21:51:33 prx500 systemd[1]: pve-cluster.service: Start request repeated too quickly.
Jan 21 21:51:33 prx500 systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Jan 21 21:51:33 prx500 systemd[1]: Failed to start pve-cluster.service - The Proxmox VE cluster filesystem.
root@prx500:/etc/pve/local# systemctl restart  pve-cluster
Job for pve-cluster.service failed because the control process exited with error code.
See "systemctl status pve-cluster.service" and "journalctl -xeu pve-cluster.service" for details.
root@prx500:/etc/pve/local#
 
Here it is; the IP is correct:

Code:
Jan 21 21:52:06 prx500 systemd[1]: Stopped pve-cluster.service - The Proxmox VE cluster filesystem.
Jan 21 21:52:06 prx500 systemd[1]: Starting pve-cluster.service - The Proxmox VE cluster filesystem...
Jan 21 21:52:06 prx500 pmxcfs[44947]: [main] notice: resolved node name 'prx500' to '4.5.215.29' for default node IP address
Jan 21 21:52:06 prx500 pmxcfs[44947]: [main] notice: resolved node name 'prx500' to '4.5.215.29' for default node IP address
Jan 21 21:52:06 prx500 pmxcfs[44947]: fuse: mountpoint is not empty
Jan 21 21:52:06 prx500 pmxcfs[44947]: fuse: if you are sure this is safe, use the 'nonempty' mount option
Jan 21 21:52:06 prx500 pmxcfs[44947]: [main] crit: fuse_mount error: File exists
Jan 21 21:52:06 prx500 pmxcfs[44947]: [main] crit: fuse_mount error: File exists
Jan 21 21:52:06 prx500 pmxcfs[44947]: [main] notice: exit proxmox configuration filesystem (-1)
Jan 21 21:52:06 prx500 pmxcfs[44947]: [main] notice: exit proxmox configuration filesystem (-1)
Jan 21 21:52:06 prx500 systemd[1]: pve-cluster.service: Control process exited, code=exited, status=255/EXCEPTION
Jan 21 21:52:06 prx500 systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Jan 21 21:52:06 prx500 systemd[1]: Failed to start pve-cluster.service - The Proxmox VE cluster filesystem.
Jan 21 21:52:06 prx500 systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 4.
Jan 21 21:52:06 prx500 systemd[1]: Stopped pve-cluster.service - The Proxmox VE cluster filesystem.
Jan 21 21:52:06 prx500 systemd[1]: Starting pve-cluster.service - The Proxmox VE cluster filesystem...
Jan 21 21:52:06 prx500 pmxcfs[44948]: [main] notice: resolved node name 'prx500' to '4.5.215.29' for default node IP address
Jan 21 21:52:06 prx500 pmxcfs[44948]: [main] notice: resolved node name 'prx500' to '4.5.215.29' for default node IP address
Jan 21 21:52:06 prx500 pmxcfs[44948]: fuse: mountpoint is not empty
Jan 21 21:52:06 prx500 pmxcfs[44948]: fuse: if you are sure this is safe, use the 'nonempty' mount option
Jan 21 21:52:06 prx500 pmxcfs[44948]: [main] crit: fuse_mount error: File exists
Jan 21 21:52:06 prx500 pmxcfs[44948]: [main] crit: fuse_mount error: File exists
Jan 21 21:52:06 prx500 pmxcfs[44948]: [main] notice: exit proxmox configuration filesystem (-1)
Jan 21 21:52:06 prx500 pmxcfs[44948]: [main] notice: exit proxmox configuration filesystem (-1)
Jan 21 21:52:06 prx500 systemd[1]: pve-cluster.service: Control process exited, code=exited, status=255/EXCEPTION
Jan 21 21:52:06 prx500 systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Jan 21 21:52:06 prx500 systemd[1]: Failed to start pve-cluster.service - The Proxmox VE cluster filesystem.
Jan 21 21:52:06 prx500 systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 5.
Jan 21 21:52:06 prx500 systemd[1]: Stopped pve-cluster.service - The Proxmox VE cluster filesystem.
Jan 21 21:52:06 prx500 systemd[1]: pve-cluster.service: Start request repeated too quickly.
Jan 21 21:52:06 prx500 systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Jan 21 21:52:06 prx500 systemd[1]: Failed to start pve-cluster.service - The Proxmox VE cluster filesystem.
 
What exactly do I have to do? Back up /etc/pve, delete it, restart the services, and then restore from the backup?
 
/etc/pve is populated by the pve-cluster service from the SQLite database in /var/lib/pve-cluster. If pve-cluster isn't running, /etc/pve should be empty. If it has contents, someone or something copied data there while the service was stopped, and now pmxcfs can't mount over it (that is the "fuse: mountpoint is not empty" error in your log).
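To see which of those situations you are in, something like the following should do. This is a minimal sketch: pve_dir_state is a name I made up, and it assumes a Linux system where /proc/mounts lists the active mounts.

```shell
#!/bin/sh
# Sketch only: pve_dir_state is a hypothetical helper, not part of Proxmox.
# Prints whether the given directory is an active mount point and, if it is
# not, whether it still contains files.
pve_dir_state() {
    dir=$1
    # The second field of /proc/mounts is the mount point path
    if awk -v d="$dir" '$2 == d { found = 1 } END { exit !found }' /proc/mounts 2>/dev/null; then
        echo "mounted"
    elif [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
        echo "not mounted, but contains files"
    else
        echo "not mounted, empty"
    fi
}

# Example call:
#   pve_dir_state /etc/pve
```

With pve-cluster stopped, "not mounted, but contains files" is the state that makes the FUSE mount fail.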

Umm... What I would do:
  • Triple-check that the pve-cluster service isn't running (there should be no /usr/bin/pmxcfs process).
  • Back up /etc/pve, just in case.
  • Delete /etc/pve/* (do NOT remove /etc/pve itself).
  • Run systemctl start pve-cluster.service.
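
Roughly, the steps above could look like this in a shell. Treat it as a sketch: backup_and_empty is a made-up helper name and the backup path in the example is only a suggestion. It archives the directory first and then deletes its contents (including hidden files) while leaving the directory itself in place.

```shell
#!/bin/sh
# Sketch only: backup_and_empty is a hypothetical helper, not a Proxmox tool.
# Usage: backup_and_empty <directory> <backup.tar.gz>
backup_and_empty() {
    dir=$1
    backup=$2
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    # Archive everything inside the directory; stop if the backup fails
    tar -C "$dir" -czf "$backup" . || return 1
    # Remove the contents only; -mindepth 1 keeps the directory itself
    find "$dir" -mindepth 1 -delete
}

# Example sequence (run as root, only while pve-cluster is stopped):
#   pgrep -x pmxcfs && echo "pmxcfs is still running, stop here"
#   backup_and_empty /etc/pve /root/etc-pve-backup.tar.gz
#   systemctl start pve-cluster.service
```

Keep the backup archive outside /etc/pve so it is not deleted along with the stale files.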