[SOLVED] Workaround for cluster with 2 bare metal nodes

EuroDomenii

Renowned Member
Sep 30, 2016
The problem

Problems with "two_node" clusters: This corosync option depends on a fence race condition, and only works using reliable HW fence devices. The 'self fencing' algorithm above does not work if you use this option! Via https://git.proxmox.com/?p=pve-ha-m...177f16e81fba2df819a4e469cd03323ac89a4;hb=HEAD

The workaround

Let’s say we have 2 bare metal servers (server1 + server2).
The suggested setup is a 4-node cluster:

Node1 - Proxmox bare metal server1
Node2 - Nested Proxmox on KVM on server1
Node3 - Proxmox bare metal server2
Node4 - Nested Proxmox on KVM on server2

The nested nodes will only be “dummy” nodes, there just to satisfy the “fencing algorithm”; no VMs will run on them.

Questions:
1. Is there a flaw in this design?

2. If server1 goes down, only 2 of the initial 4 nodes remain active (node3, node4).

Is the 3-node minimum recommendation about the total number of nodes in the cluster, so that you are fine if at a certain moment only 2 are active? Or must at least 3 be active and online at all times? In my tests, the cluster keeps working with only 2 nodes online after the third node fails.

See also:
For HA, 3 nodes were always needed and recommended. https://forum.proxmox.com/threads/proxmox-4-1-2-nodes.26172/#post-131335

If you are interested in High Availability too, for reliable quorum you must have at least 3 active nodes at all times (all nodes should have the same version). https://pve.proxmox.com/wiki/Proxmox_VE_4.x_Cluster#Requirements

Thx!
 
This setup will never work, because quorum is a majority vote and a 4-node cluster can only lose one node. In your setup, a physical server failure always takes down at least two nodes, which causes the remaining nodes to switch to read-only, effectively meaning a broken cluster. Cluster algorithm: floor(nodes / 2) + 1 = required online nodes; for nodes > 2.
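To make the arithmetic concrete, here is a minimal Python sketch of the majority formula quoted in this reply, applied to the proposed 4-node layout. It is only an illustration of the formula, not corosync or Proxmox code; the function names and the node-to-server mapping are hypothetical and assume one vote per node.

```python
# Illustration of floor(nodes / 2) + 1 from the reply; not taken from corosync.
def required_votes(total_votes: int) -> int:
    """Votes needed for a majority."""
    return total_votes // 2 + 1


def is_quorate(online_votes: int, total_votes: int) -> bool:
    """True if the online part of the cluster still holds a majority."""
    return online_votes >= required_votes(total_votes)


# Hypothetical mapping for the proposed layout: two nodes per physical server.
HOST_OF = {
    "node1": "server1",
    "node2": "server1",  # nested on server1
    "node3": "server2",
    "node4": "server2",  # nested on server2
}


def quorate_after_server_failure(failed_server: str) -> bool:
    """Check quorum after one physical server, and every node on it, goes down."""
    online = [node for node, host in HOST_OF.items() if host != failed_server]
    return is_quorate(len(online), len(HOST_OF))


if __name__ == "__main__":
    print("votes required with 4 nodes:", required_votes(4))
    for server in ("server1", "server2"):
        print(server, "fails -> quorate:", quorate_after_server_failure(server))
```

Under these assumptions, losing either physical server leaves 2 of 4 votes, which is below the 3 required, so the nested nodes do not help.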
 
Thanks for the explanation. Can you point me to the line of Proxmox code that contains the cluster algorithm formula? I didn't find the floor keyword in the code.

There is no such single line. We use corosync as part of our cluster stack, which includes the whole voting-on-quorum part. You can give one of your two nodes 2 votes instead of 1, moving your odds of surviving a one-node failure from 0 to 50%, but that is still far away from a reliable setup. Set up three nodes. The third one can be very small, but it needs to be a separate entity, otherwise you gain (almost) nothing.
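The 0% versus 50% figures can be checked with a small Python sketch. It assumes the same simple majority rule as the formula earlier in the thread and is not corosync code; the node names and the helper functions are hypothetical.

```python
from typing import Dict


def required_votes(total_votes: int) -> int:
    """Simple majority: floor(total / 2) + 1 (assumed rule, as quoted in the thread)."""
    return total_votes // 2 + 1


def survives_single_failure(votes: Dict[str, int]) -> Dict[str, bool]:
    """For each node, report whether the rest of the cluster stays quorate if it fails."""
    total = sum(votes.values())
    needed = required_votes(total)
    return {failed: (total - v) >= needed for failed, v in votes.items()}


def survival_rate(votes: Dict[str, int]) -> float:
    """Fraction of single-node failures that the cluster survives."""
    outcome = survives_single_failure(votes)
    return sum(outcome.values()) / len(outcome)


if __name__ == "__main__":
    layouts = {
        "two nodes, equal votes": {"a": 1, "b": 1},
        "two nodes, one with 2 votes": {"a": 2, "b": 1},
        "three separate nodes": {"a": 1, "b": 1, "c": 1},
    }
    for name, votes in layouts.items():
        print(name, "->", survival_rate(votes))
```

With two equal nodes no single failure is survivable; with a 2:1 weighting only the failure of the 1-vote node is; with three separate nodes any single failure is.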
 