[SOLVED] How to test fencing/watchdog device is working

I have not tested this in a while, but you can simulate "pulling" a cable by setting the link down:
* `ip link set <iface> down`
(bring it back up with `ip link set <iface> up`)
Replace `<iface>` with the physical NIC carrying the cluster traffic.

Make sure you have access to the console by other means (IPMI, iDRAC, ...)!
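
If you want the link to come back on its own, the two commands above can be wrapped in a small helper. This is only a rough sketch, not something I have run on a cluster: the function names, the `DRY_RUN` switch and the 180-second default are my own placeholders, so adjust them to your setup.

```shell
#!/bin/sh
# Hypothetical helper: take a link down, wait, then bring it back up.
# Nothing here is cluster-specific; it only wraps `ip link set`.
# Set DRY_RUN=1 to print the commands instead of executing them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

simulate_link_failure() {
    iface="${1:-}"          # physical NIC carrying the cluster traffic
    downtime="${2:-180}"    # seconds to keep the link down (assumed default)
    if [ -z "$iface" ]; then
        echo "usage: simulate_link_failure <iface> [seconds]" >&2
        return 1
    fi
    run ip link set "$iface" down
    run sleep "$downtime"
    run ip link set "$iface" up
}

# Example (as root, from the console rather than an SSH session on that NIC):
#   simulate_link_failure <iface> 180
```

If the node is actually reset while the link is down, the final `ip link set <iface> up` never runs, which is expected.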

I hope this helps!
 
Sorry, got sidetracked with other projects...
Yes, that does help. However, we didn't see the behavior we expected. With the cluster network link down, the two "up" nodes and the "down" node could not communicate, but the "down" node did not get fenced. We now understand why this is the case. So a node will only get fenced by the watchdog if the node locks up (à la kernel panic, etc.). But what about other cases, for example when the corosync stack has issues?

Is there a way to have the nodes that have quorum fence the "down" node?

Thanks!
 
Please share the logs (the journal from the disconnected node and from one of the nodes that formed quorum) for the timeframe in which you set the link down.

* You need to have HA active (meaning at least one guest running under HA); otherwise no fencing takes place (because there is no risk of split-brain).
* The fencing interval is roughly 2 minutes, so you should wait at least 3 minutes.
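
To grab the journal for the test window (from when the link went down until at least 3 minutes later), something like the following should do. It is a minimal sketch: the function name and the `JOURNALCTL` override are hypothetical, and the timestamps in the example are placeholders for your own test times.

```shell
#!/bin/sh
# Hypothetical helper: print the journal between two timestamps.
# JOURNALCTL can be overridden (e.g. for a dry run); it defaults to journalctl.

collect_test_journal() {
    since="${1:-}"   # when the link was set down, e.g. "2024-05-01 14:00:00"
    end="${2:-}"     # at least 3 minutes later,   e.g. "2024-05-01 14:05:00"
    if [ -z "$since" ] || [ -z "$end" ]; then
        echo "usage: collect_test_journal <since> <until>" >&2
        return 1
    fi
    "${JOURNALCTL:-journalctl}" --no-pager --since "$since" --until "$end"
}

# Example (timestamps are placeholders), run on each node:
#   collect_test_journal "2024-05-01 14:00:00" "2024-05-01 14:05:00" > "$(hostname)-fence-test.log"
```

Run it on the disconnected node and on one of the nodes that kept quorum, then attach both files.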

I hope this helps!
 