My nodes are Supermicro dual-node servers (two servers in a 1U case with a shared PSU). Because the two nodes share a PSU, it isn't practical to use a switched PDU for fencing, since cutting the outlet would power off both nodes. That leaves IPMI for fencing, which will clearly fail if a node loses power, because the BMC goes down with it. I also suspect that IPMI is sometimes simply too slow and the fence operation times out. I've managed to reproduce the problem just by issuing "shutdown" from the command line. The symptom seems to be that the cluster sometimes loses track of a CT, which then can't be operated on in any way.
I was considering adding the manual fencing agent as a secondary fencing device with the following added to /etc/rc.local on each node:
Code:
hostname -s > /var/run/cluster/fenced_override
Since the node has clearly been rebooted, this seems safe enough and will hopefully restore sanity to the cluster.
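For reference, a slightly more defensive version of that rc.local line might look like the sketch below. It assumes the override path is a named pipe that fenced reads from, which is my understanding but worth verifying on your version. The function name `ack_own_fence` and the 5-second limit are made up for illustration; the idea is only to avoid creating a stray regular file and to avoid hanging the boot if nothing is reading the pipe.

Code:
```shell
#!/bin/sh
# Sketch only: guarded version of the rc.local override write.
# The path is the one from the post; the function name and the
# 5-second limit are assumptions chosen for illustration.
OVERRIDE_FIFO="${OVERRIDE_FIFO:-/var/run/cluster/fenced_override}"

ack_own_fence() {
    fifo="$1"
    # Write only if the path is a named pipe, so a regular file is
    # never created or overwritten by accident.
    [ -p "$fifo" ] || return 1
    # Opening a FIFO for writing blocks until a reader opens it, so
    # bound the wait to keep the boot from hanging.
    timeout 5 sh -c 'hostname -s > "$1"' sh "$fifo"
}

# Never let a failed acknowledgement abort rc.local.
ack_own_fence "$OVERRIDE_FIFO" || true
```

If the pipe does not exist on the node that has just booted, this does nothing, which is itself a hint about whether the write reaches the fenced instance that is actually waiting.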
Anyone see any caveats with doing this?