Quorum question - 5 Nodes over 2 data centers

elnino54

New Member
Oct 29, 2024
Hi all,
We are in the process of validating Proxmox for our production environment, which is currently built as a single cluster.

DC1 will have 3 PVE hosts, DC2 will have 2 PVE hosts.

DC's are connected with dual 10Gbit fibre

Each DC also has dual internet connections, and SDWAN connections to each, so connectivity is very well maintained.

What would be the best way to ensure an appropriate quorum be retained in the event of an entire DC outage? Have a qdevice at a 3rd site with more than one vote?
 
You need the same number of nodes (or votes) in both DC. Then you add a third location to host the Quorum Device. That's the only way I know to achieve HA without manual intervention in case of a disaster.
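For reference, registering an external QDevice follows the pattern from the Cluster Manager documentation. A sketch, assuming a small Debian host at the third site (the IP is hypothetical; these commands require a live cluster):

```
# On the third-site host: run the qnetd vote server
apt install corosync-qnetd

# On every PVE cluster node: install the qdevice client
apt install corosync-qdevice

# On one PVE node: register the QDevice with the cluster
pvecm qdevice setup 203.0.113.10
```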

Disclaimer: I do not own such a setup.
 
That's not really relevant for my case - as I said, the DCs are connected via dual 10Gbit (dark) fibre, so it's <1ms between DCs.

According to the documentation, a qdevice CAN be over a slower connection, so should be fine on SDWAN.
What will be the storage of the cluster?
Each site also has its own HA-mirrored storage (Pure) with real-time replication.
Yes, implementing a quorum device at a third DC with, say, 3 votes could be effective.
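For illustration only (the official docs quoted further down in this thread discourage vote tweaking), the QDevice's weight lives in the quorum/device section of corosync.conf, which `pvecm qdevice setup` normally writes for you. A hypothetical hand-edited fragment might look like:

```
quorum {
  provider: corosync_votequorum
  device {
    model: net
    votes: 3              # illustrative weight; PVE manages this itself
    net {
      tls: on
      host: 203.0.113.10  # hypothetical third-site qnetd host
      algorithm: lms      # "last man standing"; PVE selects the algorithm
    }
  }
}
```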
That's what I was thinking might be the best option, but I wasn't sure whether there was anything I'm not allowing for.
 
If the cluster consists of an odd number of nodes, a QDevice should be avoided:

We support QDevices for clusters with an even number of nodes and recommend it for 2 node clusters, if they should provide higher availability. For clusters with an odd node count, we currently discourage the use of QDevices. The reason for this is the difference in the votes which the QDevice provides for each cluster type. Even numbered clusters get a single additional vote, which only increases availability, because if the QDevice itself fails, you are in the same position as with no QDevice at all.

On the other hand, with an odd numbered cluster size, the QDevice provides (N-1) votes — where N corresponds to the cluster node count. This alternative behavior makes sense; if it had only one additional vote, the cluster could get into a split-brain situation. This algorithm allows for all nodes but one (and naturally the QDevice itself) to fail. However, there are two drawbacks to this:

If the QNet daemon itself fails, no other node may fail or the cluster immediately loses quorum. For example, in a cluster with 15 nodes, 7 could fail before the cluster becomes inquorate. But, if a QDevice is configured here and it itself fails, no single node of the 15 may fail. The QDevice acts almost as a single point of failure in this case.

The fact that all but one node plus QDevice may fail sounds promising at first, but this may result in a mass recovery of HA services, which could overload the single remaining node. Furthermore, a Ceph server will stop providing services if only ((N-1)/2) nodes or less remain online.

If you understand the drawbacks and implications, you can decide yourself if you want to use this technology in an odd numbered cluster setup. https://pve.proxmox.com/wiki/Cluster_Manager#_corosync_external_vote_support
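The vote arithmetic from that quote can be sketched numerically for the 5-node cluster discussed here (an odd cluster's QDevice contributes N-1 votes per the docs above):

```shell
# Vote math for an odd-sized cluster with a QDevice (per the quoted docs).
nodes=5                           # the 3+2 cluster in this thread
qdevice_votes=$((nodes - 1))      # odd cluster: QDevice gets N-1 = 4 votes
total=$((nodes + qdevice_votes))  # 5 + 4 = 9 votes in play
quorum=$((total / 2 + 1))         # majority threshold: 5 votes
echo "total=$total quorum=$quorum"
# → total=9 quorum=5
# QDevice alive: a single surviving node (1 + 4 = 5) keeps quorum.
# QDevice dead: all 5 node votes are needed, so no node may fail.
```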

I would also avoid messing around with changing votes. I see these possibilities:
- Add another node to the two-node DC so you have a six-node cluster, with the qdevice as the seventh vote
- Split the five-node cluster into a three-node and a two-node cluster, and add the qdevice only to the two-node cluster. Use the Proxmox Datacenter Manager for cross-cluster migration.

Since both options probably don't fit your requirements (otherwise you would have already chosen one of them, wouldn't you?), I hope others have better ideas, sorry.
Edit: Added reference to proxmox datacenter manager
 
Since both options probably don't fit your requirements (otherwise you would have already chosen one of them, wouldn't you?), I hope others have better ideas, sorry.
I'm not ruling out splitting it into two clusters, but I don't know what that looks like in reality, or what the consequences would be. It certainly crossed my mind, but it seemed overcomplicated for what I'm trying to achieve.

I'm aware that a qdevice isn't recommended for odd-numbered configurations, but this did seem like a plausible solution if the qdevice has more weight.

I still might be overcomplicating everything, as the likelihood of a failure that would cause a loss of quorum is extremely low. I just don't want a situation where a site isolates itself unnecessarily. It might be a case of trial-and-error DR scenario testing.
 
There is nothing inherently wrong with stretching your cluster in this way, latency permitting.

Take this wiki page as an example (even though it is focused on Ceph storage, the point still stands): https://pve.proxmox.com/wiki/Stretch_Cluster

I would strongly encourage you to have an even number of PVE nodes at both sites to eliminate the need to tweak the number of votes provided by the qdevice.

If you want seamless failover, you need to keep all of these PVE instances in the same cluster (Datacenter).

PDM remote migrations or other "hackier" methods like syncing VM configuration files between clusters would "work", but would surely add unnecessary complexity and extra manual upkeep and/or scripting to even come close to seamless failover territory.

Hope that helps.
 
Thanks, that's a great explanation and matches our system well. We would still have seamless failover, because we have replicated storage at each DC (Pure Storage).

Would a potential solution to this be to install an extra node in the DC with 2 servers, but effectively excluded from the compute pool, and then retain the qdevice at a third site?

I'm thinking something like a NUC or other non-server type PC, just for the vote.
 
Would a potential solution to this be to install an extra node in the DC with 2 servers, but effectively excluded from the compute pool, and then retain the qdevice at a third site?
This would work. It's not an ideal solution, but if you can ensure that no workloads ever get scheduled on it (e.g. by using something like prox-lb, or the affinity rules of Proxmox VE's native ha-manager, see https://pve.proxmox.com/pve-docs/chapter-ha-manager.html), it should be "good enough".
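On the ha-manager side, one classic way to keep the vote-only node empty is a restricted HA group pinned to the real compute nodes (newer PVE releases express the same idea as node affinity rules). A sketch with made-up node and group names, to be run on the cluster:

```
# Create an HA group limited to the compute nodes; restricted=1 means
# HA resources in this group may ONLY run on the listed nodes.
ha-manager groupadd compute-only --nodes "pve1,pve2,pve3,pve4,pve5" --restricted 1

# Pin an HA-managed VM to that group, so it never lands on the vote-only node.
ha-manager add vm:100 --group compute-only
```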