Woke up this AM and my PVE Web GUI was down. It wouldn't start because of a "can't update certs" issue: "no quorum".
Okay, well, I just established a cluster yesterday and then brought a cluster member down, and it's probably not coming back. I didn't see any way to delete the cluster member, so I just left it. Seems maybe that wasn't a good idea. I guess behind the scenes, the certs for the Web GUI are shared between both nodes, and with the cluster member gone, the certs can't be updated, so the Web GUI (on my now lone cluster member) went down? (Seems kinda non-resilient to me, but...)
So now I'm wondering: if I just delete the cluster member, am I going to be good?
The first command below is what service pveproxy restart seems to be failing on, and that's the error (i.e. "no quorum"):
Code:
root@pve:~# pvecm updatecerts
no quorum - unable to update files
root@pve:~# pvecm nodes
Membership information
----------------------
Nodeid Votes Name
1 1 pve (local)
root@pve:~# pvecm delnode construct3
cluster not ready - no quorum?
root@pve:~# cd /etc/pve/nodes/
root@pve:/etc/pve/nodes# ll
total 0
dr-xr-xr-x 2 root www-data 0 May 17 20:25 construct3/
dr-xr-xr-x 2 root www-data 0 Apr 7 23:50 pve/
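To spell out my understanding of why one missing node matters, here is a minimal sketch of the arithmetic. It assumes the usual majority rule (quorum = floor(expected votes / 2) + 1), one vote per node, and that the cluster still expects 2 votes (pve plus construct3). The function names are made up for illustration and are not part of pvecm.

```shell
#!/bin/sh
# Majority quorum: more than half of the expected votes must be present.
# (Assumption: quorum = floor(expected / 2) + 1.)
quorum_needed() {
    expected_votes=$1
    echo $(( expected_votes / 2 + 1 ))
}

# Prints "yes" if the votes present reach the majority, otherwise "no".
is_quorate() {
    present_votes=$1
    expected_votes=$2
    if [ "$present_votes" -ge "$(quorum_needed "$expected_votes")" ]; then
        echo yes
    else
        echo no
    fi
}

# My situation (assumed): 2 expected votes, only pve's single vote present.
echo "needed: $(quorum_needed 2), quorate: $(is_quorate 1 2)"
```

If that's right, a two-node cluster needs both votes, so losing either node leaves the survivor without quorum, which would match the "no quorum" errors above.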
I found these instructions for deleting a cluster member, and I want to be absolutely sure that if I do the last step there, I'm not going to bork something:
https://sysadmin-community.com/remove-node-from-cluster-proxmox/
Basically, as you can see from my output quoted above, cluster member "construct3" is now gone, and I'm thinking that if I remove it permanently, then my service pveproxy restart is going to work?
Here's the service pveproxy restart and systemctl status pveproxy.service output as evidence for my theory. Not sure about the mkdir /etc/pve/ha error, but the rest seems to fit with pvecm updatecerts failing. Starting a cluster and then taking down a cluster member has to be what started this whole chain of events:
Code:
root@pve:/etc/pve/nodes# service pveproxy restart
Job for pveproxy.service failed because the control process exited with error code.
See "systemctl status pveproxy.service" and "journalctl -xe" for details.
root@pve:/etc/pve/nodes# systemctl status pveproxy.service | less
● pveproxy.service - PVE API Proxy Server
Loaded: loaded (/lib/systemd/system/pveproxy.service; enabled; vendor preset: enabled)
Active: activating (start) since Tue 2020-05-19 12:20:30 CDT; 613ms ago
Process: 21210 ExecStartPre=/usr/bin/pvecm updatecerts --silent (code=exited, status=1/FAILURE)
Cntrl PID: 21215 (pveproxy)
Tasks: 1 (limit: 4915)
Memory: 48.4M
CGroup: /system.slice/pveproxy.service
└─21215 /usr/bin/perl -T /usr/bin/pveproxy start
May 19 12:20:30 pve systemd[1]: Starting PVE API Proxy Server...
May 19 12:20:31 pve pvecm[21210]: mkdir /etc/pve/ha: Permission denied at /usr/share/perl5/PVE/Cluster.pm line 88.
So: (1) permanently delete the cluster member (using the command-line approach) and (2) the Web GUI should come back up?
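For concreteness, this is roughly the sequence I have in mind, written as a dry run so it only prints the commands unless DRY_RUN=0 is set. It's a sketch, not something I've run: the pvecm expected 1 step is my assumption about how to get the lone node quorate before delnode, and the run and remove_dead_node helpers are just illustrative names.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) only prints each command; DRY_RUN=0 executes them.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

remove_dead_node() {
    node=$1
    # Assumption: lowering expected votes to 1 lets the lone node become quorate.
    run pvecm expected 1
    # The step that currently fails with "cluster not ready - no quorum?".
    run pvecm delnode "$node"
    # Restart the Web GUI proxy once pvecm updatecerts can succeed again.
    run systemctl restart pveproxy
}

remove_dead_node construct3
```

The dry-run default is deliberate, since I don't want to touch the cluster config until someone confirms the order is sane.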