Cluster down while adding node again

maddy_in65

New Member
Mar 27, 2024
Recently one of the nodes had a problem with its disk, so I reinstalled Proxmox on it with a new disk.
To add the node back to the cluster, I followed some guides available online and on this forum, but it seems I made a major blunder while troubleshooting and I lost the Corosync configuration.

Now I have lost my cluster, and I am also unable to access the nodes: it seems 2FA was also synced to the other nodes, so I cannot log in to the WebUI to join the cluster.
I have messed up everything, and now I cannot even access the master node :(

Please help with troubleshooting.
 
I was able to get the cluster created on the main node; however, another node is having issues joining the cluster.
When I try to log in to the WebUI, I get the 2FA prompt. I tried to delete the tfa.cfg file, but I get a "Permission denied" error.

Code:
root@PvE02Ser06:/etc/pve/priv# rm tfa.cfg
rm: cannot remove 'tfa.cfg': Permission denied

I have tried to join via the CLI, but it seems the old cluster config is still there. I deleted these folders, but it seems I have now messed up the Corosync config. Please help.
Code:
root@PvE02Ser06:/# pvecm add 192.168.5.15
Please enter superuser (root) password for '192.168.50.15': *************
detected the following error(s):
* authentication key '/etc/corosync/authkey' already exists
* cluster config '/etc/pve/corosync.conf' already exists
* this host already contains virtual guests
* corosync is already running, is this node already in a cluster?!
Check if node may join a cluster failed!

systemctl status corosync.service
× corosync.service - Corosync Cluster Engine
     Loaded: loaded (/lib/systemd/system/corosync.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Sun 2024-05-05 14:15:17 IST; 21s ago
   Duration: 15h 26min 52.789s
       Docs: man:corosync
             man:corosync.conf
             man:corosync_overview
    Process: 163903 ExecStart=/usr/sbin/corosync -f $COROSYNC_OPTIONS (code=exited, status=8)
   Main PID: 163903 (code=exited, status=8)
        CPU: 15ms

May 05 14:15:17 PvE02Ser06 systemd[1]: Starting corosync.service - Corosync Cluster Engine...
May 05 14:15:17 PvE02Ser06 corosync[163903]:   [MAIN  ] Corosync Cluster Engine  starting up
May 05 14:15:17 PvE02Ser06 corosync[163903]:   [MAIN  ] Corosync built-in features: dbus monitoring watchdog systemd xmlconf vqsim nozzle snmp pie relro bindnow
May 05 14:15:17 PvE02Ser06 corosync[163903]:   [MAIN  ] Could not open /etc/corosync/authkey: No such file or directory
May 05 14:15:17 PvE02Ser06 corosync[163903]:   [MAIN  ] Corosync Cluster Engine exiting with status 8 at main.c:1417.
May 05 14:15:17 PvE02Ser06 systemd[1]: corosync.service: Main process exited, code=exited, status=8/n/a
May 05 14:15:17 PvE02Ser06 systemd[1]: corosync.service: Failed with result 'exit-code'.
May 05 14:15:17 PvE02Ser06 systemd[1]: Failed to start corosync.service - Corosync Cluster Engine.
root@PvE02Ser06:/etc/pve# pvecm add 192.168.5.15
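
The two outputs above disagree about '/etc/corosync/authkey': pvecm reports that it already exists, while the corosync log says it cannot be opened. To see which of the files named in the pvecm errors are actually present right now, a small check like the one below might help. This is only a sketch: check_cluster_leftovers is a hypothetical helper name (not a Proxmox command), and it tests only for the two paths that appear in the pvecm output. It does not change anything on the node.

```shell
#!/bin/sh
# Hypothetical helper, not part of Proxmox: report whether the two files
# named in the 'pvecm add' errors exist under a given root directory.
# With no argument it checks the real paths under /.
check_cluster_leftovers() {
    root="${1:-}"
    for f in /etc/corosync/authkey /etc/pve/corosync.conf; do
        if [ -e "${root}${f}" ]; then
            echo "present: ${f}"
        else
            echo "missing: ${f}"
        fi
    done
}
```

Calling check_cluster_leftovers with no argument on the affected node would list both paths as present or missing, which can be posted alongside the logs.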