Heuristic Assistance/Suggestions

adamb

Famous Member
Mar 1, 2012
Hey all!

I am trying to pin down some solid heuristics for my two-node clusters. Ultimately I would like a node fenced if its DRBD/cluster network goes down along with the local LAN. The local LAN should be easy to cover with some pings. The DRBD/cluster network is the one I am having issues with.

<heuristic interval="3" program="ping 10.80.1.8 -c1 -w1" score="1" tko="3"/>
<heuristic interval="3" program="ping 10.80.1.1 -c1 -w1" score="1" tko="3"/>
<heuristic interval="3" program="ip addr | grep eth0 | grep -q UP" score="2" tko="3"/>

eth0 is my DRBD/cluster network. It is a dedicated link between the two nodes, so if eth0 is down on node1 it is also down on node2, because there is no switch. Any other ideas or suggestions? I don't think having the nodes ping each other will work, because if one node is down, the pings will also fail on the node that is still up.
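
One caveat with the third heuristic: in iproute2 output the UP flag reflects the administrative state of the interface, so it can still be present when the link has no carrier (the flags then also show NO-CARRIER). A variant that tests for LOWER_UP instead might look like the sketch below. It assumes the usual iproute2 flag output, and the interval, score and tko values are simply carried over from the lines above. On a direct link both nodes lose carrier at the same time, so this still would not break the tie on its own.

```xml
<!-- Sketch only: succeeds while eth0 reports carrier (LOWER_UP), fails when the link is down -->
<heuristic interval="3" program="ip link show eth0 | grep -q LOWER_UP" score="2" tko="3"/>
```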
 
This is my other option. Not too sure about this one. Has anyone else used it?

master_wins="0"
If set to 1 (on), only the qdiskd master will advertise its votes to CMAN. In a network partition, only the qdisk master will provide votes to CMAN. Consequently, that node will automatically "win" in a fence race.
This option requires careful tuning of the CMAN timeout, the qdiskd timeout, and CMAN's quorum_dev_poll value. As a rule of thumb, CMAN's quorum_dev_poll value should be equal to Totem's token timeout and qdiskd's timeout (interval*tko) should be less than half of Totem's token timeout. See section 3.3.1 for more information.
This option only takes effect if there are no heuristics configured. Usage of this option in configurations with more than two cluster nodes is undefined and should not be done.
In a two-node cluster with no heuristics and no defined vote count (see above), this mode is turned on by default. If enabled in this way at startup and a node is later added to the cluster configuration or the vote count is set to a value other than 1, this mode will be disabled.
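
To make the rule of thumb above concrete, here is a rough cluster.conf sketch. The numbers are illustrative assumptions, not tested values, and the label "qdisk" is a placeholder: with interval="1" and tko="10" the qdiskd timeout is 10 seconds, which is less than half of a 30000 ms Totem token timeout, and quorum_dev_poll is set equal to the token timeout. No heuristic elements are present, since master_wins only takes effect without them.

```xml
<!-- Illustrative values only; adjust to your own cluster -->
<cman expected_votes="3" two_node="0" quorum_dev_poll="30000"/>
<totem token="30000"/>
<!-- qdiskd timeout = interval * tko = 1 * 10 = 10 s, under half of the 30 s token timeout -->
<quorumd interval="1" tko="10" votes="1" label="qdisk" master_wins="1"/>
```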
 
master_wins seems to function quite well. It's a bummer to lose the ability to use heuristics, but at least the cluster is now fencing off the problem node properly.
 
