Cluster member server died

badgerfruit

Nov 25, 2020
Hello
As per the title, we have a setup consisting of 3 "node" servers (call them node1, node2 and node3). Node1 suffered a hardware failure which resulted in me needing to re-install Proxmox from the ground up. I have done this, reconfigured it and I can access the web GUI etc. On the Datacentre > Summary page, it shows a nice big green tick and says "Standalone node - no cluster defined". Nodes: Online 1, Offline 0.

However, switch over to node2 or node3's web GUI and check out Datacentre > Summary, and it shows another big green tick in Status but reports "Cluster: MyCluster, Quorate: Yes", Nodes online 2, offline 1. The tree at the left side mirrors this, as I can see "Node1" with a little red/white X (as is to be expected, I presume, since the thing blew up).

My question (eventually, sorry!) is how do I get Node1 back as part of the cluster? If I click on Datacentre > Cluster from Node2 or Node3, it says "Standalone mode - no cluster defined" but has a list of Cluster Nodes under it which show Node1 (the original one I suspect), Node2 and Node3. I have an option to Create or Join a cluster but the button for "Join information" is disabled.

I'm afraid I am a total nooboid with Proxmox - if it helps, Node1 is 6.2-4 but Node2 and Node3 are 6.1-7. I am comfortable with shell commands and messing with config files if that is what needs to be done, just point me in the right direction and I'll give it a go!

Thank you :)
 
First off, thanks for this, I had a look and tried the first command, expecting to see all 3 nodes listed but instead, I saw this:

Code:
root@Node2-Supermicro:~# pvecm nodes

Membership information
----------------------
    Nodeid      Votes Name
         2          1 Node2-Supermicro (local)
         3          1 Node3-DellR710

So straight off the bat, there's no "node1" listed to be able to delete :\

I browsed that page you sent and tried to add it to the cluster using pvecm add:

Code:
root@Node2-Supermicro:~# pvecm add 192.168.0.6
Please enter superuser (root) password for '192.168.0.6': ************
detected the following error(s):
* authentication key '/etc/corosync/authkey' already exists
* cluster config '/etc/pve/corosync.conf' already exists
* this host already contains virtual guests
* corosync is already running, is this node already in a cluster?!
Check if node may join a cluster failed!
 
Hmm, okay, sounds like you need to clean that up manually. Check the /etc/pve/corosync.conf file and remove the section for node1. Make sure that there is no directory /etc/pve/nodes/node1
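
Before removing anything, it can help to confirm what is actually left over. The following is a minimal, read-only sketch of that check, assuming the dead node's cluster name is literally node1 (your real hostname may differ) and the default paths mentioned above. The function name check_stale_node is hypothetical and not part of Proxmox; it only reports what it finds and changes nothing.

```shell
#!/bin/sh
# check_stale_node NODE [CONF] [NODES_DIR]   (hypothetical helper, not a Proxmox tool)
# Reports whether NODE is still listed in the corosync config and whether
# a leftover directory for it exists. Read-only: nothing is modified.
check_stale_node() {
    node="$1"
    conf="${2:-/etc/pve/corosync.conf}"
    nodes_dir="${3:-/etc/pve/nodes}"

    # Each cluster member appears as a "name: <nodename>" line inside a node { } block.
    # NODE is used inside a regex, so this assumes a plain hostname without regex metacharacters.
    if grep -Eq "^[[:space:]]*name:[[:space:]]*${node}[[:space:]]*$" "$conf" 2>/dev/null; then
        echo "corosync.conf: entry for ${node} present"
    else
        echo "corosync.conf: no entry for ${node}"
    fi

    # A leftover per-node directory also needs to be gone before re-joining.
    if [ -d "${nodes_dir}/${node}" ]; then
        echo "nodes dir: ${nodes_dir}/${node} exists"
    else
        echo "nodes dir: ${nodes_dir}/${node} absent"
    fi
}

# Example (run on a surviving cluster member):
#   check_stale_node node1
```

The two optional arguments exist only so the check can be pointed at copies of the files instead of the live ones.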
 
Hmm, okay, sounds like you need to clean that up manually. Check the /etc/pve/corosync.conf file and remove the section for node1. Make sure that there is no directory /etc/pve/nodes/node1

... and if there is, am I safe to just rm -r it?
 
Okay, so I went ahead and deleted it (the node wasn't working anyway, so it wasn't going to make things any worse!)
I then logged into node2's web GUI and the cluster "Join information" was now clickable.
Clicked it, copied the data, switched to Node1's web GUI Datacentre > Cluster > Join, pasted the info in but got the error:

permission denied - invalid PVE ticket (401)

HOWEVER, clicked close and refreshed my page (as it started showing all weird browser type errors), the Node has joined and from node1, I can now see all three nodes in the tree to the left - and likewise, on Node2 (and 3).

Great success by the looks so many thanks for pointing me in that direction, it's most appreciated :)
 
permission denied - invalid PVE ticket (401)

HOWEVER, clicked close and refreshed my page (as it started showing all weird browser type errors), the Node has joined and from node1, I can now see all three nodes in the tree to the left - and likewise, on Node2 (and 3).
That is normal behavior. Once a node joins a cluster, its local self-signed certificate is replaced by the certificate used across the whole cluster, which is usually also self-signed. The browser does not trust the new certificate and therefore blocks any further communication. Because the calls to the server are made by the JS-based GUI, you also don't get to see the warning about the self-signed certificate until you refresh the page.
 
