fenced (and rgmanager) not *always* automatically started after boot

wosp

Apr 18, 2015
Good Day!

We have a 3-node cluster that works fine, but after a node reboots, fenced (and therefore rgmanager) is not always started automatically. Sometimes it works and sometimes it doesn't. All 3 nodes show this problem at random, and I can't find any logical reason for it. The nodes do see each other when this occurs:

Code:
root@host03:~# clustat
Cluster Status for HA-cluster @ Sat Apr 18 18:43:09 2015
Member Status: Quorate

 Member Name                                                     ID   Status
 ------ ----                                                     ---- ------
 host01                                                              1 Online
 host02                                                              2 Online
 host03                                                              3 Online, Local

Code:
# cat /var/log/cluster/rgmanager.log
Apr 18 18:41:33 rgmanager Waiting for quorum to form
Apr 18 18:41:58 rgmanager Quorum formed

Code:
root@host03:~# service cman status
fenced is stopped

Manually starting rgmanager returns 'failed' (probably because fenced isn't running). After a manual restart of cman I can start rgmanager and everything works as it should. Any ideas?
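For reference, the manual workaround can be sketched as a small script. This is only a stopgap under stated assumptions, not a fix for the underlying cause: it assumes the `cman` and `rgmanager` init scripts shown in this thread, and that `service cman status` prints "fenced is stopped" as in the output above. The function names and the `run` argument are hypothetical.

```shell
#!/bin/sh
# Sketch only: recover a node whose fenced did not come up after boot.
# Assumes the "cman" and "rgmanager" init scripts used in this thread.

# Succeeds (exit 0) if the status text on stdin reports fenced as stopped.
fenced_is_stopped() {
    grep -q 'fenced is stopped'
}

# Restart cman and then start rgmanager, but only when fenced is down.
recover_node() {
    if service cman status 2>&1 | fenced_is_stopped; then
        echo "fenced is not running, restarting cman"
        service cman restart && service rgmanager start
    else
        echo "fenced is not reported as stopped, nothing to do"
    fi
}

# Act only when explicitly asked to, e.g.: sh recover.sh run
# (the file name is hypothetical)
if [ "${1:-}" = "run" ]; then
    recover_node
fi
```

Running it without the `run` argument only defines the functions, so nothing gets restarted by accident.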
 
Hi,

Check on each node:
tail -f /var/log/cluster/*.log

You should see one node trying to fence another, which means your fencing is not working correctly.
Correct your cluster.conf and everything should start correctly again.
 
The problem seems to be 'solved'. I think it was caused by a multicast router that was not working properly. Since we fixed that, the problem has not returned.
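For anyone who suspects the same cause, multicast between the nodes can be checked with `omping`. This is a rough sketch, not something that was run in this thread: it assumes `omping` is installed on every node, and the script has to be started on all nodes at roughly the same time. The node names come from the clustat output above; the function name and the `run` argument are hypothetical.

```shell
#!/bin/sh
# Sketch only: check multicast between the cluster nodes with omping.
# Assumption: omping is installed on every node (not mentioned in this thread).

NODES="host01 host02 host03"   # node names from the clustat output above

# Print the omping command for the given nodes without running it.
# -c 600: send 600 probes, -i 1: one probe per second, -q: summary only.
omping_cmd() {
    echo "omping -c 600 -i 1 -q $*"
}

# Run the test only when explicitly asked to, e.g.: sh mcast-check.sh run
# (the file name is hypothetical)
if [ "${1:-}" = "run" ]; then
    if command -v omping >/dev/null 2>&1; then
        # Unquoted on purpose so the printed command is split into words.
        $(omping_cmd $NODES)
    else
        echo "omping not found, install it first" >&2
        exit 1
    fi
fi
```

Noticeable multicast loss in the summary, while unicast is fine, would point at the switch or router rather than at cluster.conf.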