Cluster down while adding node again

maddy_in65

New Member
Mar 27, 2024
Recently, one of the nodes had disk problems, so I reinstalled Proxmox on it with a new disk.
To add the node back to the cluster, I followed some guides from this forum and elsewhere online, but it seems I made a major blunder while troubleshooting, and the node lost its Corosync setup.

Now I have lost my cluster, and I also cannot access the nodes: 2FA seems to have synced to the other nodes, so I cannot log in to the WebUI to join the cluster.
I have messed everything up, and now I cannot even access the master node :(

Please help with troubleshooting.
 
I was able to create the cluster on the main node; however, another node is giving errors while joining the cluster.
When I try to log in to the WebUI, I get the 2FA prompt. I tried to delete the tfa.cfg file, but I get a permission error.

Code:
root@PvE02Ser06:/etc/pve/priv# rm tfa.cfg
rm: cannot remove 'tfa.cfg': Permission denied

I tried to join via the CLI, but it seems the old cluster config is still there. I deleted these folders, but it seems I have now messed up the corosync config. Please help.
Code:
root@PvE02Ser06:/# pvecm add 192.168.5.15
Please enter superuser (root) password for '192.168.50.15': *************
detected the following error(s):
* authentication key '/etc/corosync/authkey' already exists
* cluster config '/etc/pve/corosync.conf' already exists
* this host already contains virtual guests
* corosync is already running, is this node already in a cluster?!
Check if node may join a cluster failed!

systemctl status corosync.service
× corosync.service - Corosync Cluster Engine
     Loaded: loaded (/lib/systemd/system/corosync.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Sun 2024-05-05 14:15:17 IST; 21s ago
   Duration: 15h 26min 52.789s
       Docs: man:corosync
             man:corosync.conf
             man:corosync_overview
    Process: 163903 ExecStart=/usr/sbin/corosync -f $COROSYNC_OPTIONS (code=exited, status=8)
   Main PID: 163903 (code=exited, status=8)
        CPU: 15ms

May 05 14:15:17 PvE02Ser06 systemd[1]: Starting corosync.service - Corosync Cluster Engine...
May 05 14:15:17 PvE02Ser06 corosync[163903]:   [MAIN  ] Corosync Cluster Engine  starting up
May 05 14:15:17 PvE02Ser06 corosync[163903]:   [MAIN  ] Corosync built-in features: dbus monitoring watchdog systemd xmlconf vqsim nozzle snmp pie relro bindnow
May 05 14:15:17 PvE02Ser06 corosync[163903]:   [MAIN  ] Could not open /etc/corosync/authkey: No such file or directory
May 05 14:15:17 PvE02Ser06 corosync[163903]:   [MAIN  ] Corosync Cluster Engine exiting with status 8 at main.c:1417.
May 05 14:15:17 PvE02Ser06 systemd[1]: corosync.service: Main process exited, code=exited, status=8/n/a
May 05 14:15:17 PvE02Ser06 systemd[1]: corosync.service: Failed with result 'exit-code'.
May 05 14:15:17 PvE02Ser06 systemd[1]: Failed to start corosync.service - Corosync Cluster Engine.
root@PvE02Ser06:/etc/pve# pvecm add 192.168.5.15
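
One possible way out of this state, as a minimal sketch rather than a verified fix: the Proxmox VE admin guide describes separating a node from a cluster without reinstalling it, by stopping the cluster services, starting pmxcfs in local mode, and deleting the stale Corosync files. The script below only prints those steps unless `DRY_RUN=0` is set. The helper names `run` and `reset_cluster_state` and the `DRY_RUN` variable are hypothetical, and the sketch assumes this node should go back to a standalone state before `pvecm add` is retried. It is also an assumption that the `Permission denied` on `tfa.cfg` comes from /etc/pve being read-only without quorum, which local mode would lift.

```shell
#!/bin/sh
# Sketch: clear stale cluster state on a node so it can be rejoined later.
# DRY_RUN=1 (the default) only prints each command; DRY_RUN=0 executes it (as root).
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical wrapper around the documented "separate a node" steps.
reset_cluster_state() {
    run systemctl stop pve-cluster corosync    # stop the cluster filesystem and corosync
    run pmxcfs -l                              # restart the cluster filesystem in local mode
    run rm -f /etc/pve/corosync.conf           # drop the stale cluster config
    run sh -c 'rm -rf /etc/corosync/*'         # drop leftover corosync files, including any authkey
    run killall pmxcfs                         # leave local mode
    run systemctl start pve-cluster            # start the cluster filesystem normally
}

reset_cluster_state
```

This does not address the `this host already contains virtual guests` check, which `pvecm add` reports independently of the Corosync files. Back up /etc/pve before running anything with `DRY_RUN=0`.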