Proxmox QDevice with a 16 node cluster

EnhancedC

Oct 4, 2023
Just want some advice on the best way to keep quorum in my scenario.
What we have:
16 PowerEdge M640 blades in one cluster, split across 2 server rooms
A 3rd room with around 3 Proxmox servers (1 standalone, 2 in a separate cluster)

So:
Room1: 8 PowerEdge M640 Blades
Room2: 8 PowerEdge M640 Blades
Room3: 3 Other Proxmox Servers
I know 1 QDevice in room 3 will allow for a full room to fail, but what I'm looking to do is allow a room + 1 (9 nodes) to fail while keeping quorum. Is this possible, and what would I need to make it possible?
 
The best thing is to have 3 replicas in 3 locations. I'd advise buying 8 more M640s for the 3rd room ;-)
Another idea: keep 1 Proxmox server as a spare and split the other 18 into 3x6 (6 per room), all in one cluster.
Is that feasible?
 
You can assign 2 or more votes to the QDevice. This could help by providing more votes.
E.g.: each PVE node has one vote and the QDevice has 3. Total votes are 19, majority is 10. 9 PVE hosts can fail and there is still a majority of votes available (7 nodes + 3 QDevice votes = 10).
But if the QDevice itself fails, only 6 PVE nodes may fail before quorum is lost.
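
The vote arithmetic above can be checked with a few lines of Python. This is only a sketch of the counting, not of how corosync works internally: it assumes every node has exactly one vote and that quorum is a strict majority of all configured votes. The function names are made up for illustration.

```python
def quorum(total_votes: int) -> int:
    """Votes needed for a strict majority of all configured votes."""
    return total_votes // 2 + 1


def max_node_failures(nodes: int, qdevice_votes: int, qdevice_up: bool) -> int:
    """How many one-vote nodes may fail while the survivors stay quorate."""
    total = nodes + qdevice_votes
    # Votes the QDevice contributes right now (none if it is down).
    extra = qdevice_votes if qdevice_up else 0
    # Surviving nodes plus QDevice votes must reach quorum(total).
    needed_nodes = max(quorum(total) - extra, 0)
    return nodes - needed_nodes


for votes in (1, 3):
    print(
        f"QDevice votes={votes}: "
        f"tolerates {max_node_failures(16, votes, True)} node failures with QDevice up, "
        f"{max_node_failures(16, votes, False)} with QDevice down"
    )
```

The comparison between 1 and 3 QDevice votes shows the trade-off: more tolerance while the QDevice is reachable, less once it is not.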
 
I see a problem here, and so I usually discourage changing the vote weight of devices: the setup becomes inhomogeneous and thus unpredictable, exactly for the reason you showed.
 
My own largest cluster has 10 nodes. I can lose four nodes and still have 6 votes.

Does it really make sense to add a QDevice with such a high number of nodes? With 11 votes I could lose five nodes instead of the mentioned four. But really, I believe that this event can't happen in my lifetime (with a reasonable probability).

The failure orgy would need to happen in sequence: a first node fails. While I'm thinking about a replacement, a second one fails. While I'm investigating one and two, a third one fails. And so on...

If this really happens, I would expect that a systematic error is propagating, which is nothing I could stop with one more vote :-(

Let's say one node will fail per year (which is a very bad assumption) and I need just a single day to take countermeasures (be it replacing the node or just "pvecm expected 9"). The chance of that single failure on any given day is p = 1/365 = 0.0027, or 0.27 %. The chance that this happens for four nodes is something like p^4, right? p^4 = (1/365)^4 ≈ 56 E-12. In my world this means zero. (Yes, I know there is something like the birthday paradox, but hey..., I am not a statistician and I don't want to search for it now.)
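
For anyone who wants to redo that estimate, here is a rough sketch. It assumes (and this is a strong assumption, not something measured) that each of the 10 nodes is independently down on a given day with probability p = 1/365. Besides p^4 for four specific nodes, it also sums the binomial terms for "at least 5 of 10 nodes down on the same day", which is the point where a 10-node cluster without a QDevice loses quorum.

```python
from math import comb

# Assumed: chance that one given node is down on one given day.
p = 1 / 365


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(at least k of n independent nodes are down), binomial model."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Four specific nodes all down on the same day.
print(f"p^4 = {p**4:.3e}")

# Any 5 or more of 10 nodes down on the same day (quorum of 6 is lost).
print(f"P(>=5 of 10 down) = {prob_at_least(5, 10, p):.3e}")
```

Counting all the combinations of nodes does not change the picture here: the result stays in the same tiny order of magnitude, as long as failures really are independent. A correlated failure (power, switch, bad update) breaks that assumption.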

Any counterarguments? What aspect did I miss?
 
E.g.: each PVE node has one vote and the Qdevice has 3. Total votes are 19, majority is 10. 9 PVE hosts can fail and there is still a majority of votes available.
But if the Qdevice fails only 6 other PVE nodes may fail.
Nodes failing isn't the issue. The quorum device is meant as a defense against a connectivity "tie" between silos. Anything short of a room losing connectivity would be handled normally without it.
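
To make the "tie" concrete, here is a small sketch for the 8 + 8 layout from the first post. It assumes one vote per node, a single-vote QDevice in room 3, and strict-majority quorum. The helper name is hypothetical and the code only counts votes.

```python
def quorate(partition_votes: int, total_votes: int) -> bool:
    """A partition is quorate if it holds a strict majority of all votes."""
    return partition_votes > total_votes // 2


# Rooms 1 and 2 lose sight of each other, no QDevice: 8 vs 8 of 16 votes.
print(quorate(8, 16), quorate(8, 16))  # neither side has a majority

# Same split with a 1-vote QDevice that sides with room 2: 8 vs 9 of 17 votes.
print(quorate(8, 17), quorate(8 + 1, 17))

# Room 1 gone plus one more node down in room 2: 7 nodes + QDevice of 17 votes.
print(quorate(7 + 1, 17))
```

The last line is the "room + 1" case from the original question: with a single QDevice vote, the 8 remaining votes fall one short of the 9 needed.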
 
I would read the "Supported Setups" section on QDevices in this wiki article very carefully if I were you: https://pve.proxmox.com/wiki/Cluster_Manager#_corosync_external_vote_support

Yes, the odds of enough nodes failing to even need the QDevice in the first place are very low. But, crucially, adding a QDevice to any large cluster can have unintended consequences (a potential SPOF), as explained in the article I linked.

Hope that helps.
 
The best thing is to have 3 replicas in 3 locations. I'd advise buying 8 more M640s for the 3rd room ;-)
Another idea: keep 1 Proxmox server as a spare and split the other 18 into 3x6 (6 per room), all in one cluster.
Is that feasible?
Unfortunately, we don't have the budget to buy any new kit. But thanks for the suggestion.