3-node cluster: problem after upgrade to 2.1

jmbeuken

Hi,

I have had a 3-node cluster running under PVE 2.0 for a month.
Today I stopped all VMs and upgraded all server nodes ( aptitude update && aptitude full-upgrade ),
but now:

-------
Waiting for quorum... Timed-out waiting for cluster
[FAILED]
------

on all 3 nodes, /etc/pve/cluster.conf is the same:

<?xml version="1.0"?>
<cluster name="ETSF" config_version="11">
  <cman keyfile="/var/lib/pve-cluster/corosync.authkey"></cman>
  <clusternodes>
    <clusternode name="openvz" votes="1" nodeid="3"/>
    <clusternode name="abinit6" votes="1" nodeid="1"/>
    <clusternode name="abinit5" votes="1" nodeid="4"/>
  </clusternodes>
</cluster>
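For reference, the total vote count the cluster should expect can be read straight out of this file; a quick sketch (assuming the XML above, each node contributing votes="1"):

```python
# Sum the per-node votes declared in cluster.conf (the XML pasted above).
# With three clusternode entries at votes="1" each, the total is 3.
import xml.etree.ElementTree as ET

conf = """<?xml version="1.0"?>
<cluster name="ETSF" config_version="11">
  <cman keyfile="/var/lib/pve-cluster/corosync.authkey"></cman>
  <clusternodes>
    <clusternode name="openvz" votes="1" nodeid="3"/>
    <clusternode name="abinit6" votes="1" nodeid="1"/>
    <clusternode name="abinit5" votes="1" nodeid="4"/>
  </clusternodes>
</cluster>"""

root = ET.fromstring(conf)
total_votes = sum(int(n.get("votes")) for n in root.iter("clusternode"))
print(total_votes)  # 3
```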
pveversion --verbose ( same on the 3 nodes )


pve-manager: 2.1-1 (pve-manager/2.1/f9b0f63a)
running kernel: 2.6.32-11-pve
proxmox-ve-2.6.32: 2.0-66
pve-kernel-2.6.32-11-pve: 2.6.32-66
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.3-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.8-3
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.7-2
pve-cluster: 1.0-26
qemu-server: 2.0-39
pve-firmware: 1.0-15
libpve-common-perl: 1.0-27
libpve-access-control: 1.0-21
libpve-storage-perl: 2.0-18
vncterm: 1.0-2
vzctl: 3.0.30-2pve5
vzprocps: 2.0.11-2
vzquota: 3.0.12-3
pve-qemu-kvm: 1.0-9
ksm-control-daemon: 1.1-1


Here is the output of "pvecm nodes" on each of the 3 nodes:


Node Sts Inc Joined Name
1 X 0 abinit6
3 X 0 openvz
4 M 140 2012-05-07 12:33:48 abinit5

Node Sts Inc Joined Name
1 M 68 2012-05-06 21:17:11 abinit6
3 M 52 2012-05-06 21:04:54 openvz
4 X 52 abinit5

Node Sts Inc Joined Name
1 M 60 2012-05-06 21:17:11 abinit6
3 M 68 2012-05-06 21:17:11 openvz
4 X 0 abinit5



on each node, the status shows "Quorum: X Activity blocked"

pvecm status on node abinit5, for example:


Version: 6.2.0
Config Version: 11
Cluster Name: ETSF
Cluster Id: 1124
Cluster Member: Yes
Cluster Generation: 220
Membership state: Cluster-Member
Nodes: 1
Expected votes: 3
Total votes: 1
Node votes: 1
Quorum: 2 Activity blocked
Active subsystems: 1
Flags:
Ports Bound: 0
Node name: abinit5
Node ID: 4
Multicast addresses: 239.192.4.104
Node addresses: xxx.xxx.xxx.138


or on openvz :

Version: 6.2.0
Config Version: 11
Cluster Name: ETSF
Cluster Id: 1124
Cluster Member: Yes
Cluster Generation: 68
Membership state: Cluster-Member
Nodes: 2
Expected votes: 4
Total votes: 2
Node votes: 1
Quorum: 3 Activity blocked
Active subsystems: 5
Flags:
Ports Bound: 0
Node name: openvz
Node ID: 3
Multicast addresses: 239.192.4.104
Node addresses: xxx.xxx.xxx.248




the "corosync" service runs on all nodes...
I get the same error message as mentioned in thread #1

I don't know why, but "Expected votes" is 4 instead of 3 on 2 of the nodes ( openvz, abinit6 )
A month ago I removed a node, so the cluster used to have 4 nodes
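For what it's worth, cman's quorum threshold is a simple majority of the expected votes, i.e. floor(expected / 2) + 1, which matches the numbers in the outputs above; a minimal sketch of the arithmetic:

```python
# cman quorum threshold = simple majority of expected votes:
# floor(expected / 2) + 1.
def quorum(expected_votes: int) -> int:
    return expected_votes // 2 + 1

# With a stale expected value of 4, two live nodes (2 total votes)
# fall short of the required 3, so activity stays blocked.
print(quorum(3))  # 2 -> matches abinit5's "Quorum: 2"
print(quorum(4))  # 3 -> matches openvz's "Quorum: 3"
```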

could that be the problem?

thanks for your help

jmb
 
Dear jmb,

yes, I had the same problem, and the biggest problem is that you can't remove a node from the cluster manually - we had to completely reinstall one server - and there is nothing to be found on Google!

DB
 
Hi,

I don't know why, but after rebooting only one server/node ( abinit6 ),

the cluster works again... :cool:

root@abinit6:~# pvecm status
Version: 6.2.0
Config Version: 12
Cluster Name: ETSF
Cluster Id: 1124
Cluster Member: Yes
Cluster Generation: 224
Membership state: Cluster-Member
Nodes: 3
Expected votes: 3
Total votes: 3
Node votes: 1
Quorum: 2
Active subsystems: 5
Flags:
Ports Bound: 0
Node name: abinit6
Node ID: 1

@udo, as you can see, a 3-node cluster expects 3 votes...


jmb