Main cluster failure, what to do?

sunset

Member
Apr 9, 2021
Hello to all,

I am French, please excuse my vocabulary in advance.

I very rarely post on the forums, because we can generally find what we need already.

However, for a few days now a question about a failing cluster has been bothering me, so I'll try to be as clear as possible.

I have set up a three-node cluster as a lab, and it has been working perfectly well for several days.

My question is the following: if the main node crashes completely, what do we do?

I have managed to detach the secondary nodes, and then the main node, but that only covers the best case.

If the main node crashes, it will be difficult to attach the secondary nodes to a new main node.

Thanks in advance for your thoughts or feedback.
 
Do you mean the master node of the ha-manager by "main node"? Because generally Proxmox VE clusters are multi-master clusters, where each node can do all management tasks.

Maybe the HA simulator would be interesting for you?
 
Thanks for your feedback,

My question is about Proxmox VE clusters in general. Here is my reasoning:

1) I create a cluster from pve-1
2) I join the cluster initially created on pve-1 with pve-2
3) I join the cluster initially created on pve-1 with pve-3
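
The three steps above can be sketched as a small Python helper that only lists the commands, without running anything. `pvecm create` and `pvecm add` are the usual Proxmox VE cluster commands; the cluster name `lab` and the address `192.0.2.11` are placeholders, not values from my setup.

```python
def cluster_setup_commands(cluster_name, first_node, first_node_ip, joining_nodes):
    """Return (node, command) pairs in the order they would be run.

    Nothing is executed here; the function only describes the plan.
    """
    # Step 1: the cluster is created on the first node.
    plan = [(first_node, f"pvecm create {cluster_name}")]
    # Steps 2 and 3: every other node joins by pointing at an existing member.
    for node in joining_nodes:
        plan.append((node, f"pvecm add {first_node_ip}"))
    return plan


if __name__ == "__main__":
    # Hypothetical name and address, for illustration only.
    for node, cmd in cluster_setup_commands(
        "lab", "pve-1", "192.0.2.11", ["pve-2", "pve-3"]
    ):
        print(f"[{node}] {cmd}")
```

Each printed line names the node on which the command would be typed.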

This gives me three PVE servers in a multi-master cluster, so management can be done from any of the three. Up to that point, I follow you.

However, suppose the pve-1 server fails and is irrecoverable (damaged disks, fire, theft).

What about pve-2 and pve-3?

Should I recreate the cluster from scratch and restore the backups, or is there a method to detach pve-2 and pve-3 from the crashed pve-1 and then join them again to a reinstalled pve-1?

Obviously, the problem grows with the number of nodes.

I hope this makes my question clearer.
 
It doesn't matter which node you detach, since there is no leader node.
The important directory /etc/pve is shared among all members.
You can install another node, pve-4 maybe, and add it to the cluster. One thing to keep in mind, though: if the new node has the same hostname as the failed node, the cluster will complain about mismatched SSH keys, which will require more intervention.
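
A rough sketch of that replacement flow, as a Python helper that only builds the list of commands and does not execute them. It assumes the failed node is powered off for good and that the surviving nodes still form a working cluster. `pvecm status`, `pvecm delnode` and `pvecm add` are existing `pvecm` subcommands; the IP address is a placeholder.

```python
def replacement_plan(failed_node, surviving_node, surviving_ip, new_node):
    """Return (plan, warnings) for swapping a dead node for a fresh one.

    plan is a list of (node, command) pairs; nothing is executed.
    """
    plan = [
        # Check the state of the remaining cluster first.
        (surviving_node, "pvecm status"),
        # Remove the dead node from the cluster configuration.
        (surviving_node, f"pvecm delnode {failed_node}"),
        # Join the freshly installed node via any surviving member.
        (new_node, f"pvecm add {surviving_ip}"),
    ]
    warnings = []
    if new_node == failed_node:
        # Reusing the old hostname is the case mentioned above.
        warnings.append(
            f"{new_node} reuses the failed node's hostname: "
            "expect SSH key mismatches that need manual cleanup"
        )
    return plan, warnings


if __name__ == "__main__":
    # Hypothetical address, for illustration only.
    plan, warnings = replacement_plan("pve-1", "pve-2", "192.0.2.12", "pve-4")
    for node, cmd in plan:
        print(f"[{node}] {cmd}")
    for w in warnings:
        print("WARNING:", w)
```

Picking a new hostname such as pve-4 keeps the warning list empty, which is why it is the simpler route.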
 