RGManager doesn't start

KDE

Dec 16, 2014
Hello,


We have a problem with our new 4-node HA cluster.


We installed the current VE 3.3 release and enabled fencing on all nodes.
Then we updated all nodes, created the cluster, and added all nodes to it.


pvecm nodes (same on all nodes):
Node  Sts  Inc  Joined               Name
   1  M    124  2014-12-16 11:23:42  kde-node01
   2  M    136  2014-12-16 11:26:18  kde-node02
   3  M    144  2014-12-16 11:28:47  kde-node03
   4  M    156  2014-12-16 11:32:28  kde-node04


pvecm status node 1:
Version: 6.2.0
Config Version: 4
Cluster Name: kde-cluster
Cluster Id: 9140
Cluster Member: Yes
Cluster Generation: 156
Membership state: Cluster-Member
Nodes: 4
Expected votes: 4
Total votes: 4
Node votes: 1
Quorum: 3
Active subsystems: 5
Flags:
Ports Bound: 0
Node name: kde-node01
Node ID: 1


pvecm status node 2:
Version: 6.2.0
Config Version: 4
Cluster Name: kde-cluster
Cluster Id: 9140
Cluster Member: Yes
Cluster Generation: 156
Membership state: Cluster-Member
Nodes: 4
Expected votes: 4
Total votes: 4
Node votes: 1
Quorum: 3
Active subsystems: 5
Flags:
Ports Bound: 0
Node name: kde-node02
Node ID: 2


pvecm status node 3:
Version: 6.2.0
Config Version: 4
Cluster Name: kde-cluster
Cluster Id: 9140
Cluster Member: Yes
Cluster Generation: 156
Membership state: Cluster-Member
Nodes: 4
Expected votes: 4
Total votes: 4
Node votes: 1
Quorum: 3
Active subsystems: 5
Flags:
Ports Bound: 0
Node name: kde-node03
Node ID: 3


pvecm status node 4:
Version: 6.2.0
Config Version: 4
Cluster Name: kde-cluster
Cluster Id: 9140
Cluster Member: Yes
Cluster Generation: 156
Membership state: Cluster-Member
Nodes: 4
Expected votes: 4
Total votes: 4
Node votes: 1
Quorum: 3
Active subsystems: 5
Flags:
Ports Bound: 0
Node name: kde-node04
Node ID: 4
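
The "Quorum: 3" line in the outputs above follows from the vote count. Here is a minimal Python sketch of that arithmetic, assuming the usual majority rule (half of the expected votes, rounded down, plus one). The function names are illustrative and not part of any Proxmox tool:

```python
def parse_status(text):
    """Turn 'Key: value' lines from a pvecm status dump into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def quorum_for(expected_votes):
    """Majority rule (assumed): more than half of the expected votes."""
    return expected_votes // 2 + 1


def tolerated_failures(total_votes, quorum):
    """Number of votes that can be lost while staying quorate."""
    return total_votes - quorum


# Excerpt of the status output posted above.
status = parse_status("""\
Nodes: 4
Expected votes: 4
Total votes: 4
Node votes: 1
Quorum: 3
""")

expected = int(status["Expected votes"])
total = int(status["Total votes"])
print(quorum_for(expected))                             # 3
print(tolerated_failures(total, quorum_for(expected)))  # 1
```

With four single-vote nodes this gives a quorum of 3, so the cluster stays quorate with one node down but not with two.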


/etc/cluster/cluster.conf:
<?xml version="1.0"?>
<cluster name="kde-cluster" config_version="4">


<cman keyfile="/var/lib/pve-cluster/corosync.authkey">
</cman>


<clusternodes>
<clusternode name="kde-node01" votes="1" nodeid="1"/>
<clusternode name="kde-node02" votes="1" nodeid="2"/>
<clusternode name="kde-node03" votes="1" nodeid="3"/>
<clusternode name="kde-node04" votes="1" nodeid="4"/>
</clusternodes>


</cluster>

fence_tool ls (same on all nodes):
fence domain
member count 4
victim count 0
victim now 0
master nodeid 1
wait state none
members 1 2 3 4


Now we would like to start RGManager, but it does not come up and the start command prints nothing on the CLI:




root@kde-node01:~# service rgmanager status
rgmanager is stopped
root@kde-node01:~# service rgmanager start
root@kde-node01:~# service rgmanager status
rgmanager is stopped


Next, we joined all nodes to the fence domain again and restarted cman on all nodes. Finally, we restarted the pve-cluster service on all nodes,
but the problem is the same as before.


Any ideas?
 
I do not see any fencing devices configured in your cluster.conf.

Without fencing, it cannot work.
 
We have HP iLO servers. Is this correct?

<?xml version="1.0"?>
<cluster name="kde-cluster" config_version="4">


<cman keyfile="/var/lib/pve-cluster/corosync.authkey">
</cman>




<clusternodes>
<clusternode name="kde-node01" votes="1" nodeid="1">
<fence>
<method name="1">
<device name="ilo-kde-node01" action="reboot"/>
</method>
</fence>
</clusternode>
<clusternode name="kde-node02" votes="1" nodeid="2">
<fence>
<method name="1">
<device name="ilo-kde-node02" action="reboot"/>
</method>
</fence>
</clusternode>
<clusternode name="kde-node03" votes="1" nodeid="3">
<fence>
<method name="1">
<device name="ilo-kde-node03" action="reboot"/>
</method>
</fence>
</clusternode>
<clusternode name="kde-node04" votes="1" nodeid="4">
<fence>
<method name="1">
<device name="ilo-kde-node04" action="reboot"/>
</method>
</fence>
</clusternode>
</clusternodes>


<fencedevices>
<fencedevice agent="fence_ilo" ipaddr="xxx" login="xxx" name="ilo-kde-node01" passwd="xxx"/>
<fencedevice agent="fence_ilo" ipaddr="xxx" login="xxx" name="ilo-kde-node02" passwd="xxx"/>
<fencedevice agent="fence_ilo" ipaddr="xxx" login="xxx" name="ilo-kde-node03" passwd="xxx"/>
<fencedevice agent="fence_ilo" ipaddr="xxx" login="xxx" name="ilo-kde-node04" passwd="xxx"/>
</fencedevices>




</cluster>
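
One thing worth checking in a config like this is that every <device name="..."> inside a <clusternode> matches a <fencedevice name="..."> entry. Below is a small Python sketch of such a check. The function name and the shortened two-node sample (with a deliberately misspelled device name) are made up for illustration; it is not a replacement for ccs_config_validate or fence_check:

```python
import xml.etree.ElementTree as ET


def unresolved_fence_devices(conf_xml):
    """Return (node, device) pairs whose device has no <fencedevice> entry."""
    root = ET.fromstring(conf_xml)
    defined = {fd.get("name") for fd in root.iter("fencedevice")}
    missing = []
    for node in root.iter("clusternode"):
        for dev in node.iter("device"):
            if dev.get("name") not in defined:
                missing.append((node.get("name"), dev.get("name")))
    return missing


# Hypothetical sample: the second fencedevice name is misspelled on purpose.
SAMPLE = """<?xml version="1.0"?>
<cluster name="kde-cluster" config_version="4">
  <clusternodes>
    <clusternode name="kde-node01" votes="1" nodeid="1">
      <fence><method name="1"><device name="ilo-kde-node01" action="reboot"/></method></fence>
    </clusternode>
    <clusternode name="kde-node02" votes="1" nodeid="2">
      <fence><method name="1"><device name="ilo-kde-node02" action="reboot"/></method></fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_ilo" ipaddr="xxx" login="xxx" name="ilo-kde-node01" passwd="xxx"/>
    <fencedevice agent="fence_ilo" ipaddr="xxx" login="xxx" name="ilo-node02" passwd="xxx"/>
  </fencedevices>
</cluster>"""

print(unresolved_fence_devices(SAMPLE))  # [('kde-node02', 'ilo-kde-node02')]
```

For the full four-node config posted above, all device names match their fencedevice entries, so the list would be empty.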
 
Okay, fencing seems to work.

Output of fence_check -vv on node01:
fence_check run at Tue Dec 16 15:47:59 CET 2014 pid: 12973
Checking if cman is running: running
Checking if node is quorate: quorate
Checking if node is in fence domain: yes
Checking if node is fence master: this node is fence master
Checking if real fencing is in progress: no fencing in progress
Get node list: kde-node01 kde-node02 kde-node03 kde-node04
Testing kde-node01 fencing
Checking if cman is running: running
Checking if node is quorate: quorate
Checking if node is in fence domain: yes
Checking if node is fence master: this node is fence master
Checking if real fencing is in progress: no fencing in progress
Checking how many fencing methods are configured for node kde-node01
Found 1 method(s) to test for node kde-node01
Testing kde-node01 method 1 status
Testing kde-node01 method 1: success
Testing kde-node02 fencing
Checking if cman is running: running
Checking if node is quorate: quorate
Checking if node is in fence domain: yes
Checking if node is fence master: this node is fence master
Checking if real fencing is in progress: no fencing in progress
Checking how many fencing methods are configured for node kde-node02
Found 1 method(s) to test for node kde-node02
Testing kde-node02 method 1 status
Testing kde-node02 method 1: success
Testing kde-node03 fencing
Checking if cman is running: running
Checking if node is quorate: quorate
Checking if node is in fence domain: yes
Checking if node is fence master: this node is fence master
Checking if real fencing is in progress: no fencing in progress
Checking how many fencing methods are configured for node kde-node03
Found 1 method(s) to test for node kde-node03
Testing kde-node03 method 1 status
Testing kde-node03 method 1: success
Testing kde-node04 fencing
Checking if cman is running: running
Checking if node is quorate: quorate
Checking if node is in fence domain: yes
Checking if node is fence master: this node is fence master
Checking if real fencing is in progress: no fencing in progress
Checking how many fencing methods are configured for node kde-node04
Found 1 method(s) to test for node kde-node04
Testing kde-node04 method 1 status
Testing kde-node04 method 1: success
cleanup: 0

fence_ilo seems to work too:
root@kde-node01:~# fence_node kde-node02 -vv
fence kde-node02 dev 0.0 agent fence_ilo result: success
agent args: action=reboot nodename=kde-node02 agent=fence_ilo ipaddr=xxx login=xxx passwd=xxx
fence kde-node02 success


Now that fencing works, we restarted the cman and pve-cluster services, but we still have the same problem with rgmanager...

Does anyone have an idea?
 
RGManager won't start unless you have something marked for high-availability. We usually put a VIP entry in the cluster.conf as it is useful, and always causes RGManager to start even if we haven't marked any VMs for high availability.

Here's what we typically put in there (after the </clusternodes>):
Code:
<rm>
  <service autostart="1" exclusive="0" name="cluster-ip" recovery="relocate">
    <ip address="10.50.30.40" monitor_link="on" sleeptime="10"/>
  </service>
</rm>
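
To see whether a given cluster.conf already contains anything for RGManager to manage, the <rm> section can be inspected. This Python sketch only mirrors the rule of thumb from this reply (no <rm> entries, no running RGManager); it does not reproduce rgmanager's actual start-up logic, and the function name is illustrative:

```python
import xml.etree.ElementTree as ET


def ha_resources(conf_xml):
    """List the tag names of entries inside <rm>, or [] if there are none."""
    root = ET.fromstring(conf_xml)
    rm = root.find("rm")
    if rm is None:
        return []
    return [child.tag for child in rm]


# Shortened samples for illustration.
WITHOUT_RM = """<cluster name="kde-cluster" config_version="4">
  <clusternodes/>
</cluster>"""

WITH_RM = """<cluster name="kde-cluster" config_version="5">
  <clusternodes/>
  <rm>
    <service autostart="1" exclusive="0" name="cluster-ip" recovery="relocate">
      <ip address="10.50.30.40" monitor_link="on" sleeptime="10"/>
    </service>
  </rm>
</cluster>"""

print(ha_resources(WITHOUT_RM))  # []
print(ha_resources(WITH_RM))     # ['service']
```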
 
After that, validate the new configuration:

# ccs_config_validate -v -f /etc/pve/cluster.conf.new

Then go to the HA tab in the GUI and activate the change.

Reboot the nodes.
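
One step the thread does not mention: cman-based clusters normally expect config_version to be increased whenever cluster.conf changes, so treat the following as an assumption to verify against your setup. A minimal Python sketch that bumps the number in the text of /etc/pve/cluster.conf.new before validation (the function name is illustrative):

```python
import re


def bump_config_version(conf_xml):
    """Return conf_xml with the first config_version value raised by one."""
    def repl(match):
        return f"{match.group(1)}{int(match.group(2)) + 1}{match.group(3)}"

    new_xml, count = re.subn(r'(config_version=")(\d+)(")', repl, conf_xml, count=1)
    if count == 0:
        raise ValueError("no config_version attribute found")
    return new_xml


line = '<cluster name="kde-cluster" config_version="4">'
print(bump_config_version(line))
# <cluster name="kde-cluster" config_version="5">
```

The result would still need to be written back to the .new file and checked with ccs_config_validate as described above.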

Thanks.
 
