[SOLVED] Quorum requirements question - minimum number

Jun 9, 2025
Hi,
I have a maintenance event coming up (talked about it previously) that will take down one of my server rooms. I currently run an 8-node cluster and 4 of the nodes reside in the affected server room, so I've migrated all of the would-be-affected guests to nodes in the other server room. In reviewing the output of "pvecm status" I'm seeing that the "Quorum" value is 5. Does this mean that if I have fewer than 5 nodes online the cluster will not function as expected? What are the ramifications in that scenario, and what can I do to change that value (if that's even a good idea)? The maintenance event is planned for 3-5 days, and the affected nodes will be offline for that entire time.

Sorry if this is already explained somewhere, I did a cursory search of the forums before posting but have a lot of irons in the fire at the moment so couldn't go too deep, thanks.
 
Yes, if you lose 4 out of 8 nodes you have an issue. You need to have an odd number of nodes.

If you have your cluster split in two rooms, you need the deciding vote (could be a q-device) located in a 3rd room/location. It should be equally reachable from either of the primary rooms.
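The arithmetic behind this can be sketched in a few lines of shell. This is only an illustration, assuming every node carries exactly one vote and quorum is a simple majority of the expected votes; the function names quorum and is_quorate are made up for this example and are not part of any Proxmox tool.

```shell
#!/bin/sh
# Majority needed for quorum, assuming one vote per node (assumption).
quorum() {
    expected_votes=$1
    echo $(( expected_votes / 2 + 1 ))
}

# Succeeds (exit status 0) when the online votes still reach the majority.
is_quorate() {
    online_votes=$1
    expected_votes=$2
    [ "$online_votes" -ge "$(quorum "$expected_votes")" ]
}

quorum 8    # 8 nodes: prints 5, matching the "Quorum" value from pvecm status
quorum 9    # 8 nodes plus one deciding vote in a third room: also prints 5

if is_quorate 4 8; then echo "4 of 8: quorate"; else echo "4 of 8: not quorate"; fi
if is_quorate 5 9; then echo "5 of 9: quorate"; else echo "5 of 9: not quorate"; fi
```

With 8 votes, the 4 surviving nodes fall one vote short. With a ninth vote in a third location, either room plus that vote reaches 5.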


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
OK, so reading through the Cluster_Manager wiki it looks like a Q-Device is just a standalone/external Debian/Ubuntu server running corosync. You set up the daemon on each node and then run a setup command on one of the nodes to get the Q-Device talking to the cluster...that sound about right?
Thanks again!
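For reference, the setup described in the Cluster Manager documentation could be laid out roughly as below. This is a dry-run sketch that only prints the steps. The address 192.0.2.10 and the function name qdevice_plan are placeholders invented for this example. The package names corosync-qnetd (external host) and corosync-qdevice (cluster nodes) are the ones I recall from the docs, so verify them against the documentation for your PVE version before running anything.

```shell
#!/bin/sh
# Dry-run sketch: prints the QDevice setup steps instead of executing them.
# QNETD_IP is a hypothetical address for the external quorum host.
QNETD_IP="${QNETD_IP:-192.0.2.10}"

qdevice_plan() {
    echo "on the external host:   apt install corosync-qnetd"
    echo "on every cluster node:  apt install corosync-qdevice"
    echo "on one cluster node:    pvecm qdevice setup $QNETD_IP"
    echo "afterwards, to verify:  pvecm status"
}

qdevice_plan
```

Note that the external host runs the qnetd daemon rather than being a full corosync cluster member, and the setup command is run from one cluster node only.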
 
Integrating an independent Quorum Device is the correct solution for "normal use".

You are going into planned maintenance, and I expect you to keep an eye on the remaining cluster during that time. In this case you can circumvent the Quorum problem by telling the machines in the room that stays powered on that, temporarily, there are not 8 nodes but a lower number.

Please read man pvecm - search for "expected".

When you begin to shut down the first room: do it slowly. Set maintenance mode for the first node to be shut down (ha-manager crm-command node-maintenance enable pvenodeX), evacuate it and shut it down.

Now 7 nodes are up, right? pvecm expected 7 will work only then, not before that first one is down.

Repeat for the other three. At the end you have four nodes, four nodes "expected" and a Quorum requirement of "3".

The "expected" value will rise automatically when you turn on the four machines in that room.
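The sequence above could be written down as a dry-run plan like the following. It only prints the commands and runs nothing. The node names pvenode5 to pvenode8 and the function name staged_shutdown_plan are hypothetical, and the quorum figure in the last line assumes one vote per node.

```shell
#!/bin/sh
# Dry-run sketch of the staged shutdown: prints each step, executes nothing.
# Usage: staged_shutdown_plan <total-nodes> <node-to-shut-down>...
staged_shutdown_plan() {
    remaining=$1
    shift
    for node in "$@"; do
        echo "ha-manager crm-command node-maintenance enable $node"
        echo "# wait until $node is evacuated, then shut it down"
        remaining=$(( remaining - 1 ))
        echo "pvecm expected $remaining   # on a remaining node, only once $node is down"
    done
    echo "# final state: $remaining nodes up, expected $remaining, quorum $(( remaining / 2 + 1 ))"
}

# Hypothetical names for the four nodes in the affected room.
staged_shutdown_plan 8 pvenode5 pvenode6 pvenode7 pvenode8
```

The plan lowers "expected" one step at a time (7, 6, 5, 4), each time only after the corresponding node is actually down, and ends at four nodes with a quorum of 3.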


Disclaimer: you should test this procedure in a test cluster. In my Homelab I have a virtual PVE cluster for experiments with "dangerous" things...
 
Solid advice - I'll look into this option as well, thanks.
@UdoB is correct for a one-time procedure. It may be a good time to think about what would happen in a non-orderly shutdown (loss of the room), and to implement good practices.

Cheers
Yup, that scenario has not escaped my thought process - plan on doing the Q-Device for as long as I have an even number of nodes but might try and experiment with the "maintenance-mode" option for my own edification.