adding node to cluster failed, now quorum says activity blocked

stefws
Trying to add another node to a 4-node test cluster, I ran into an issue like this:

root@node6:~# pvecm add node1
The authenticity of host 'node1 (xx.xx.xx.xx)' can't be established.
ECDSA key fingerprint is da:a7:df:e7:ff:8f:0f:1a:82:82:1b:e1:e6:49:3d:30.
Are you sure you want to continue connecting (yes/no)? yes
copy corosync auth key
stopping pve-cluster service
Stopping pve cluster filesystem: pve-cluster.
backup old database
Starting pve cluster filesystem : pve-cluster.
Starting cluster:
Checking if cluster has been disabled at boot... [ OK ]
Checking Network Manager... [ OK ]
Global setup... [ OK ]
Loading kernel modules... [ OK ]
Mounting configfs... [ OK ]
Starting cman... [ OK ]
Waiting for quorum... Timed-out waiting for cluster
[FAILED]
waiting for quorum...
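"Timed-out waiting for cluster" on a cman-based cluster usually means the new node never saw the existing members. One standard check, a sketch assuming the omping package is installed on the nodes, is to run omping on two or more nodes at the same time and confirm multicast traffic flows both ways:

# run simultaneously on node1 and node6;
# each should report replies from the other, with no multicast loss
omping node1 node6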

node address seems to say localhost...

root@node6:~# pvecm status
Version: 6.2.0
Config Version: 8
Cluster Name: sprawlcl
Cluster Id: 28778
Cluster Member: Yes
Cluster Generation: 12
Membership state: Cluster-Member
Nodes: 1
Expected votes: 5
Total votes: 1
Node votes: 1
Quorum: 3 Activity blocked
Active subsystems: 2
Flags:
Ports Bound: 0
Node name: node6
Node ID: 5
Multicast addresses: 239.192.112.218
Node addresses: 127.0.0.1
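"Node addresses: 127.0.0.1" is the giveaway here: cman has bound to the loopback address instead of the LAN. A minimal check, using nothing beyond standard Debian tools:

# what does this node resolve its own hostname to?
getent hosts "$(hostname)"
# should print the node's real LAN address, not 127.0.0.1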


Afterwards, I had no issue adding another node, node7, which now says:

root@node7:/# pvecm status
Version: 6.2.0
Config Version: 11
Cluster Name: sprawlcl
Cluster Id: 28778
Cluster Member: Yes
Cluster Generation: 236
Membership state: Cluster-Member
Nodes: 5
Expected votes: 5
Total votes: 5
Node votes: 1
Quorum: 3
Active subsystems: 5
Flags:
Ports Bound: 0
Node name: node7
Node ID: 6
Multicast addresses: 239.192.112.218
Node addresses: xx.xx.xx.xx



Any hints on how to resolve this are appreciated (other than just reinstalling the whole server :)

TIA!
 
Can you please post your /etc/hosts file? Your hostname should be resolvable via that file to
a real network address (not 127.0.0.1).
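For example, a minimal /etc/hosts for node6 could look like this (the address and domain are placeholders, not taken from this thread):

127.0.0.1      localhost
192.168.1.16   node6.example.com node6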
 
Right, thanks, found a typo in its IP address in /etc/hosts :)

Now I've tried to remove this node from the cluster again (from another node), but pvecm nodes still reports it, even though it really is gone from the /etc/pve/cluster.conf file:

root@node4:~# pvecm nodes
Node Sts Inc Joined Name
1 M 232 2015-02-03 15:50:53 node3
2 M 220 2015-02-03 15:37:29 node2
3 M 220 2015-02-03 15:37:29 node1
4 M 216 2015-02-03 15:37:15 node4
5 M 268 2015-02-04 09:39:43 node6
6 M 248 2015-02-03 21:03:32 node7
7 M 264 2015-02-04 09:37:18 node5
root@node4:~# pvecm delnode node6
root@node4:~# pvecm nodes
Node Sts Inc Joined Name
1 M 232 2015-02-03 15:50:53 node3
2 M 220 2015-02-03 15:37:29 node2
3 M 220 2015-02-03 15:37:29 node1
4 M 216 2015-02-03 15:37:15 node4
5 M 268 2015-02-04 09:39:43 node6
6 M 248 2015-02-03 21:03:32 node7
7 M 264 2015-02-04 09:37:18 node5

root@node4:~# cat /etc/pve/cluster.conf
<?xml version="1.0"?>
<cluster name="sprawlcl" config_version="16">
  <cman keyfile="/var/lib/pve-cluster/corosync.authkey">
  </cman>
  <clusternodes>
    <clusternode name="node4" votes="1" nodeid="4"/>
    <clusternode name="node3" votes="1" nodeid="1"/>
    <clusternode name="node2" votes="1" nodeid="2"/>
    <clusternode name="node1" votes="1" nodeid="3"/>
    <clusternode name="node7" votes="1" nodeid="6"/>
    <clusternode name="node5" votes="1" nodeid="7"/>
  </clusternodes>
</cluster>
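For what it's worth, the delnode did take effect cluster-wide: config_version was bumped (to 16 here), and every healthy member should report that same value, e.g.:

# run on each remaining node; all should agree on the version
pvecm status | grep 'Config Version'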

Should I just reinstall node6 from scratch now, or could I somehow mend the cluster brain on node6 and re-add it?


node6 still thinks differently, and not all subsystems are running:

root@node6:~# pvecm nodes
Node Sts Inc Joined Name
1 X 0 node3
2 X 0 node2
3 X 0 node1
4 X 0 node4
5 M 4 2015-02-04 09:39:17 node6
6 X 0 node7
7 X 0 node5
root@node6:~# pvecm stat
Version: 6.2.0
Config Version: 16
Cluster Name: sprawlcl
Cluster Id: 28778
Cluster Member: Yes
Cluster Generation: 268
Membership state: Cluster-Member
Nodes: 1
Expected votes: 5
Total votes: 1
Node votes: 1
Quorum: 3 Activity blocked
Active subsystems: 1
Flags:
Ports Bound: 0
Node name: node6
Node ID: 5
Multicast addresses: 239.192.112.218
Node addresses: xx.xx.xx.xx
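If a node is wedged like this, without quorum and with "Activity blocked", there is a documented escape hatch: lowering the expected votes on that node makes /etc/pve writable again so it can be cleaned up. A sketch, to be used with care and only on the isolated node:

# tell cman to expect a single vote, restoring quorum locally
pvecm expected 1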
 
Just did a reinstall and then successfully added this node, thanks!

root@node6:~# pvecm stat
Version: 6.2.0
Config Version: 17
Cluster Name: sprawlcl
Cluster Id: 28778
Cluster Member: Yes
Cluster Generation: 288
Membership state: Cluster-Member
Nodes: 7
Expected votes: 7
Total votes: 7
Node votes: 1
Quorum: 4
Active subsystems: 5
Flags:
Ports Bound: 0
Node name: node6
Node ID: 5
Multicast addresses: 239.192.112.218
Node addresses: xx.xx.xx.xx