Node of last resort Quorum

ben90818532

Member
Sep 20, 2017
I have a 3-node Proxmox cluster and want to add a fourth node in a different location in the building, with its own UPS, that can act as a "node of last resort".

E.g., the main server rack fails (UPS, switch, or extended power outage), but this node has its own UPS that can run for a very long time, so it will be the "last survivor".

My question is how would a Proxmox cluster react to this?

There are 4 nodes in the cluster, one being the last-resort node. Three nodes suddenly disappear, and the last-resort node has several VMs replicated to it that are configured for HA.

Seeing itself as the last survivor, would the fourth node proceed to fail over with HA? Or would it panic because it cannot determine quorum?

Interested to hear opinions.
 

t.lamprecht

Proxmox Staff Member
Jul 28, 2015
South Tyrol/Italy
With four nodes the majority is 3 nodes, so yes, if only that one node is left then it is not quorate and will do exactly nothing, by design.
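
To make the arithmetic concrete, here is a minimal Python sketch of the majority rule. It is only an illustration and not corosync code; the function names are made up, and it assumes every node carries exactly one vote.

```python
def quorum_threshold(total_votes: int) -> int:
    """Smallest vote count that is a strict majority of total_votes."""
    return total_votes // 2 + 1


def is_quorate(visible_votes: int, total_votes: int) -> bool:
    """True if the votes a partition can see reach the majority threshold."""
    return visible_votes >= quorum_threshold(total_votes)


if __name__ == "__main__":
    total = 4  # three rack nodes plus the "last resort" node, one vote each
    print(quorum_threshold(total))  # 3
    print(is_quorate(1, total))     # False: the lone survivor does nothing
    print(is_quorate(3, total))     # True: the three rack nodes keep running
```

So the last-resort node, seeing only its own vote, stays below the threshold of 3.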

I mean, if it just happily started VMs and CTs because it does not see the other nodes, you would soon be in the land of split brains, multiple services running with the same IP, and so on. This can happen if just the network between the primary 3-node set and the secondary 1-node set is lost: the VMs on the primary set still work, but the secondary partition wouldn't (or, better said, cannot) know that and would start the VMs. Now the same VM runs twice, which is never good.

If you really want such automatic off-site recovery, the following strategy could be better:
* two nodes on site A
* two nodes on site B
* a third party, a QDevice (not a PVE node, can be a simple VM, ...), which is on neither site A nor site B, but ideally at the place from which the services the cluster hosts are actually used, e.g., the WAN (better known as the internet).

See https://pve.proxmox.com/pve-docs/chapter-pvecm.html#_corosync_external_vote_support for some docs (this is for 6.0; if you still run 5.x, use the local "Documentation" button in the top right corner of the PVE web UI to get a local version for 5.x).

This way, if any single node fails but the link between site A and B is still working, all is good and the cluster continues to operate.
If the link breaks but both node sets are operational, the outside QDevice will see this, choose one set only, and give it quorum. That set can continue to operate; the other set is inquorate and does nothing (not really nothing: it fences itself if HA is on).
If two nodes are down (but the link between site A and B still works), the remaining two nodes can also continue to operate.
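
The three scenarios above can be played through with a small Python model. This is a sketch under assumptions, not how corosync-qdevice is implemented: node ids 1-2 on site A and 3-4 on site B are hypothetical, and the rule by which the QDevice picks a partition (largest first, then lowest node id) is a simplification chosen for illustration.

```python
from typing import List, Set

# Hypothetical layout: node ids 1-2 on site A, 3-4 on site B, plus one QDevice vote.
TOTAL_VOTES = 5
QUORUM = TOTAL_VOTES // 2 + 1  # 3


def quorate_flags(partitions: List[Set[int]], qdevice_reaches: Set[int]) -> List[bool]:
    """For each partition of live nodes, report whether it ends up quorate.

    The QDevice gives its single vote to exactly one partition it can reach.
    Assumed selection rule: the largest partition, ties broken by lowest node id.
    """
    reachable = [p for p in partitions if p & qdevice_reaches]
    chosen = max(reachable, key=lambda p: (len(p), -min(p)), default=None)
    return [len(p) + (1 if p is chosen else 0) >= QUORUM for p in partitions]


if __name__ == "__main__":
    everyone = {1, 2, 3, 4}
    # One node down, link between the sites intact.
    print(quorate_flags([{1, 2, 3}], everyone))       # [True]
    # Link between the sites broken, all nodes alive.
    print(quorate_flags([{1, 2}, {3, 4}], everyone))  # [True, False]
    # Two nodes down, link intact.
    print(quorate_flags([{1, 3}], everyone))          # [True]
```

In the broken-link case exactly one side gets the extra vote, so the same VM can never be started on both sides.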

You see: better results, and you can even cope with one more node outage. With 3 + 1, only one could die and after that it was all over; with 2 + 2 + QDevice you can lose any two of them and still continue to operate. And the QDevice does not need to be a big or bare-metal server.
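
The comparison of the two layouts is plain vote counting. A short Python sketch, assuming one vote per node and one vote for the QDevice (the helper name is hypothetical):

```python
def max_losses(total_votes: int) -> int:
    """How many votes may disappear while the rest still hold a strict majority."""
    quorum = total_votes // 2 + 1
    return total_votes - quorum


if __name__ == "__main__":
    layouts = {
        "3 + 1 nodes": 4,            # four node votes
        "2 + 2 nodes + QDevice": 5,  # four node votes plus the QDevice vote
    }
    for name, votes in layouts.items():
        print(f"{name}: can lose {max_losses(votes)} vote(s)")
```

With four votes the cluster survives the loss of one; with five votes it survives the loss of two.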
 
Jan 21, 2017
Berlin
Only use this if you know what you're doing. This is dangerous and can destroy data and make server communication unreliable!

Simple solution to set quorate for a 2 node cluster with corosync3. I do not recommend this for HA features - dangerous!

I just activated this for some two-node clusters and they're working fine. I have no time to write a wiki article. Maybe someone else wants to invest some time. The following article is quite outdated: https://pve.proxmox.com/wiki/Two-Node_High_Availability_Cluster

References: https://www.systutorials.com/docs/linux/man/5-votequorum/

Code:
root@pve1 ~ # cat /etc/pve/corosync.conf
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: pve1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.0.0.11
  }
  node {
    name: pve2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 10.0.0.12
  }
}

quorum {
  provider: corosync_votequorum
  expected_votes: 1
  two_node: 1
  wait_for_all: 0
}

totem {
  cluster_name: dc1
  config_version: 3
  interface {
    linknumber: 0
  }
  ip_version: ipv4-6
  secauth: on
  version: 2
}
 

t.lamprecht

Quote: "Simple solution to set quorate for a 2 node cluster with corosync3. I wouldn't really recommend this for HA features but it might as well work fine for simple 2 node setups."
This is dangerous! It sets quorum to "1", which has the same split-brain potential I explained in my initial reply. As long as you do not use HA and do not share resources between the two nodes you may be fine. But it won't work with HA, and we definitely do not and cannot support it.
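
Why a quorum of "1" invites split brain can be seen in a tiny Python model. It is an illustration only, with a made-up helper, and assumes two nodes with one vote each that have lost the link between them:

```python
from typing import List


def quorate_sides(votes_per_side: List[int], quorum: int) -> List[bool]:
    """Report, per isolated side, whether it believes it has quorum."""
    return [votes >= quorum for votes in votes_per_side]


if __name__ == "__main__":
    isolated = [1, 1]  # each node only sees its own vote

    # Normal majority rule for two votes: quorum is 2, nobody runs.
    print(quorate_sides(isolated, quorum=2))  # [False, False]

    # Quorum forced down to 1: both sides think they are in charge.
    print(quorate_sides(isolated, quorum=1))  # [True, True]
```

Two sides that are quorate at the same time is exactly the split-brain situation.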

Also, the thread starter already had three nodes, so I don't think that suggestion applies here...
If really neither a (small) third node nor a QDevice is possible and some automatic failover is desired, then use "auto_tie_breaker": there, a predetermined node is chosen to live (by default the one with the lowest node ID). You only have a 50% chance of continuing to operate, but at least this works...
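
The tie-breaker idea can be sketched in a few lines of Python. This is a simplified model, not the votequorum implementation; the function and parameter names are hypothetical, and it assumes one vote per node.

```python
from typing import List, Optional, Set


def surviving_partition(
    configured: Set[int],
    partitions: List[Set[int]],
    tie_breaker: Optional[int] = None,
) -> Optional[Set[int]]:
    """Return the partition that stays quorate, or None if none does.

    A strict majority always wins. On an exact 50/50 split, the side that
    contains the tie-breaker node wins (default: the lowest node id).
    """
    breaker = min(configured) if tie_breaker is None else tie_breaker
    for part in partitions:
        if 2 * len(part) > len(configured):
            return part
        if 2 * len(part) == len(configured) and breaker in part:
            return part
    return None


if __name__ == "__main__":
    nodes = {1, 2}
    print(surviving_partition(nodes, [{1}, {2}]))  # {1}: link lost, node 1 wins
    print(surviving_partition(nodes, [{1}]))       # {1}: node 2 died, node 1 carries on
    print(surviving_partition(nodes, [{2}]))       # None: node 1 died, node 2 stops
```

The last two calls show the "50% chance": the cluster only keeps running if the failed node is not the tie-breaker.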
 

t.lamprecht

Quote: "expected_votes: 1 two_node: 1 wait_for_all: 0"
All three combined make this even more dangerous; I really hope that this is not an HA or similar setup.

I mean, that can work for managing, over one interface, two nodes which would otherwise have been standalone and which use strictly separated resources, because that's the single use case where this could maybe be valid.

With my official Proxmox hat on, for anybody reading this: if you're not 1000% sure what you're doing, or if you want to provide some redundancy or HA, then do not enable two_node, especially with wait_for_all disabled and expected_votes 1. Please, just set up a QDevice; even if it's on a Raspberry Pi it will bring much more use and reliability. With a QDevice (or a third node) you can only win, with two_node only lose.
 
Jan 21, 2017
Berlin
I totally agree. This should only be used in a two-node cluster where one wants to be able to make config changes without the whole "cluster" being online.

Don't worry. Local resources only. The cluster is only built for easier management and to be able to migrate with the new "local storage live migration" feature. Hetzner supports vswitch subnets (private VLANs). I guess it's some sort of VXLAN. This allows having nodes in different datacenters and still managing them in a two-node setup.

Having a third "virtual" node is more complex and not really possible if servers are rented. Renting a third server for the pure joy of a quorum is too much for some smaller projects where two nodes are enough. :)

I updated my post. Better now? :p
 
