I'm putting together a cluster and confused about fencing. It looks like the recommended way is to use a watchdog timer to reboot a host that is hung. In my previous experience with clusters, the remaining hosts would "STONITH" the non-responsive host. Is there a reason that this method isn't recommended?
If the host is hung, the watchdog is fine. But if there is a cluster network failure and the VMs are restarted on new hosts, you could end up with a split-brain situation, because the disconnected host may still be running its copies of those VMs. Using IPMI to power cycle the disconnected server would be my choice.
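For what it's worth, here is a rough sketch of what the IPMI power cycle could look like if you scripted it yourself around `ipmitool`. It is only an illustration, not how any particular cluster stack does its fencing. The BMC address, user name, and password file path are made up, and it assumes the BMC is reachable over a network that does not depend on the failed cluster network.

```python
import subprocess

# Power actions accepted by "ipmitool chassis power" that this sketch allows.
ALLOWED_ACTIONS = ("status", "off", "cycle", "reset")


def build_fence_command(bmc_host, user, password_file, action="cycle"):
    """Build the ipmitool command line that power-fences a node via its BMC."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported power action: {action}")
    return [
        "ipmitool",
        "-I", "lanplus",        # IPMI v2.0 over LAN
        "-H", bmc_host,         # BMC address (hypothetical value below)
        "-U", user,             # BMC user (hypothetical value below)
        "-f", password_file,    # read the password from a file, not the command line
        "chassis", "power", action,
    ]


def fence_node(bmc_host, user, password_file, action="cycle", dry_run=True):
    """Return the command in dry-run mode, otherwise run it and raise on failure."""
    cmd = build_fence_command(bmc_host, user, password_file, action)
    if dry_run:
        return cmd
    # check=True raises CalledProcessError if ipmitool reports an error,
    # so the caller never assumes the node is fenced when it is not.
    subprocess.run(cmd, check=True, timeout=30)
    return cmd


if __name__ == "__main__":
    # Hypothetical BMC address, user, and password file; dry run only prints the command.
    print(" ".join(fence_node("10.0.0.5", "fenceuser", "/etc/cluster/bmc.pass")))
```

The important part is that the VMs should only be restarted elsewhere after the power action has actually succeeded, which is why the non-dry-run path raises on any error instead of carrying on.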
Thanks