4-node cluster fails after 2 nodes are offline

bzb-rs

This happened to me today when 50% of our 4-node cluster was taken down for maintenance, and HA worked as expected. However, after a couple of minutes, logging in to the web UI was no longer possible, both remaining nodes went into read-only mode, and all VMs went dead,
presumably because the nodes were waiting for quorum.
Hence I aborted the maintenance and brought the offline nodes back online. (All nodes are on the latest PVE.)

Coming to the issue: when only 2 of my 4 nodes are online, is there a way I can make sure the cluster automatically migrates my VMs to the available node(s), even if only the last one standing is online, provided it has enough resources? Right now it does not make sense for my services (VMs) to go AWOL when at least 1 PVE node is running and I am able to log in over SSH.

If I apply pvecm expected 1, will the last PVE node standing take care of the cluster and the services, even if I am left with only 1 node of 4?

Should I also run pvecm expected 4 once all nodes in my cluster are back online, or is this automated?
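
For reference, a minimal sketch of the commands in question (run on the last surviving node; this is the dangerous manual override discussed below, not something to automate):

# Temporarily lower the expected vote count so the lone node regains
# quorum and /etc/pve becomes writable again (split-brain risk!):
pvecm expected 1

# Verify the quorum state afterwards:
pvecm status

As far as I know, once the other nodes rejoin, corosync raises the expected votes back up on its own, so the override does not persist.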
 
If I apply pvecm expected 1, will the last PVE node standing take care of the cluster and the services, even if I am left with only 1 node of 4?
Yes, but that is very dangerous. Imagine you unplug all network cables except the storage ones: all VMs would start on that one node while still running on the others. This is why quorum exists, and you explained it correctly.

If you always want to be able to take 2 nodes down during an upgrade, just buy a 5th node so that "half" of the cluster cannot go offline and quorum is never a problem. Alternatively, update just one node per run and bulk-migrate all the VMs away from the node that is about to be updated.
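
As a rough sketch of that per-node approach (node names pve1 and pve2 are placeholders, and the web UI's bulk-migrate action does the same thing):

# On the node about to be updated (here: pve1), migrate every local VM
# to another node; --online live-migrates running VMs (drop it for
# stopped ones):
for vmid in $(qm list | awk 'NR>1 {print $1}'); do
    qm migrate "$vmid" pve2 --online
done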
 
So the only option recommended by the team is to have a minimum of 3 nodes online at any given time for the cluster to function "normally"?
The reason I assumed 2 nodes down is the setup I have, with 2 nodes in my rack and 2 in another section of the DC. So if either section goes down, my cluster is up in the air.

Yes, I am thinking in the direction of a QDevice as a 5th vote, because I trust no one, lol. My point, though, was that the idea of quorum defeats the idea of having 4 nodes?

But thanks much for input, appreciate it.
 
Yes, I am thinking in the direction of a QDevice as a 5th vote, because I trust no one, lol. My point, though, was that the idea of quorum defeats the idea of having 4 nodes?
Oh yes, I forgot about that. With 2 nodes in DC A and 2 nodes in DC B, a QDevice in DC C is the way to go. It is also the only option for 2-node clusters, which are very similar to your situation.

What do you use for HA storage?
 
Oh yes, I forgot about that. With 2 nodes in DC A and 2 nodes in DC B, a QDevice in DC C is the way to go. It is also the only option for 2-node clusters, which are very similar to your situation.

What do you use for HA storage?
I was thinking of a free-tier micro VM from GCP, but is it possible to add a QDevice over the WAN while my cluster is behind a NAT'd firewall network? Perhaps with 2 QDevices I could then use the last standing node to run on its own?

I currently utilize ZFS replication for HA, but I am thinking of acquiring a SAN for storage, among other features that could be added.
Ceph also interests me, half and half, as I have never set up or worked with this type of system.
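
For context, the ZFS-replication side of this is driven by pvesr jobs; a minimal sketch (VM ID 100, target node pve2, and the 15-minute schedule are assumptions):

# Replicate VM 100 to pve2 every 15 minutes, so an HA recovery loses
# at most that window of data:
pvesr create-local-job 100-0 pve2 --schedule "*/15"

# List the configured replication jobs:
pvesr list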

To avoid split-brain issues in the future, the number of nodes needs to be odd.

You can always set up a quorum device on an RPi or on a VM on a non-cluster host: https://pve.proxmox.com/wiki/Cluster_Manager#_corosync_external_vote_support
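
A minimal sketch of the setup from that wiki page (the qnetd host address 192.0.2.10 is a placeholder; the cluster nodes connect outbound to the qnetd daemon on TCP port 5403, which is also why this can work from behind a NAT'd firewall as long as outbound traffic is allowed):

# On the external quorum host (RPi, VM on a non-cluster machine, ...):
apt install corosync-qnetd

# On every cluster node:
apt install corosync-qdevice

# On one cluster node, register the QDevice:
pvecm qdevice setup 192.0.2.10

# Check that the extra vote shows up:
pvecm status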
Does it always have to be an odd number? Currently I have 4 nodes and am planning to add 2 QDevices, making it 6, so that there is a minimum of 3 votes in any event?
 
Does it always have to be an odd number? Currently I have 4 nodes and am planning to add 2 QDevices, making it 6, so that there is a minimum of 3 votes in any event?
Of course odd; if it were even, you would run into the same problem you are currently running into.
 
That makes total sense and is also a bummer, as we have to plan everything in odd numbers. It would have been a nice feature if Proxmox could remove the vote of the last node, reducing the cluster to an odd N-1, whenever the number of nodes is even and above 3 at any given time?

Maybe I am stretching this a bit too much? o_O
 
It would have been a nice feature if Proxmox could remove the vote of the last node, reducing the cluster to an odd N-1, whenever the number of nodes is even and above 3 at any given time?
That's what expected votes is for. You can do whatever you want with that number.
 
But again, that is not recommended.

I get the point now; it looks like adding a QDevice for an odd vote count is the way to go.
 
I was thinking of a free-tier micro VM from GCP, but is it possible to add a QDevice over the WAN while my cluster is behind a NAT'd firewall network? Perhaps with 2 QDevices I could then use the last standing node to run on its own?

I currently utilize ZFS replication for HA, but I am thinking of acquiring a SAN for storage, among other features that could be added.
Ceph also interests me, half and half, as I have never set up or worked with this type of system.


Does it always have to be an odd number? Currently I have 4 nodes and am planning to add 2 QDevices, making it 6, so that there is a minimum of 3 votes in any event?
In split-brain situations, each half votes for itself, hence you get a deadlock. In a 2-node cluster, a QDevice casts a tie-breaking vote for one of the two partitions, breaking the deadlock.
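
To make the arithmetic behind this concrete (a sketch of the majority rule corosync applies):

# quorum = floor(total_votes / 2) + 1
#
# 4 nodes, no QDevice:  4 votes -> quorum = 3
#   a 2/2 split leaves both halves at 2 votes: neither side is quorate
# 4 nodes + 1 QDevice:  5 votes -> quorum = 3
#   in a 2/2 split, the half the QDevice votes for reaches 3: quorate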
 
