Weird cluster behavior - No quorum?

Nils

New Member
Jan 19, 2014
G'day everyone,

I am seeing strange issues on a two-node cluster.
The nodes are connected via a direct 10G Ethernet fibre-optic link and can ping each other.
I started the configuration from scratch.
I use unicast communication.
Time is synced via NTP.
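
For completeness, the link and the time sync can be verified with nothing more than standard tools; the peer IP below is taken from the /etc/hosts entries further down, everything else is just a generic check:
Code:
# run on vnode-1; adjust the peer address when checking from vnode-2
ping -c 3 172.22.0.2   # reachability over the direct 10G link
ntpq -p                # NTP peers; the offset should be a few milliseconds at most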

/etc/pve/cluster.conf
Code:
<?xml version="1.0"?>
<cluster name="CLUSTER1" config_version="3">

  <cman keyfile="/var/lib/pve-cluster/corosync.authkey" transport="udpu">
  </cman>

  <clusternodes>
  <clusternode name="vnode-1" votes="1" nodeid="1"/>
  <clusternode name="vnode-CLUSTER1" votes="1" nodeid="2"/></clusternodes>

</cluster>
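
For reference, the config can also be sanity-checked with the cman tooling; this is only a generic sketch, nothing Proxmox-specific:
Code:
ccs_config_validate    # validates the active cluster.conf against the schema
cman_tool version      # shows the config version cman is actually running with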

The hostnames are known to each node. /etc/hosts on vnode-1:
Code:
172.22.0.1 vnode-1.domain.local vnode-1 pvelocalhost
172.22.0.2 vnode-2.domain.local vnode-2
and on vnode-2:
Code:
172.22.0.2 vnode-2.domain.local vnode-2 pvelocalhost
172.22.0.1 vnode-1.domain.local vnode-1
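
Name resolution can be double-checked on both nodes with standard tools; the node name used in cluster.conf has to resolve to the cluster address (names taken from the files above):
Code:
hostname                       # should match the clusternode name in cluster.conf
getent hosts vnode-1 vnode-2   # should return the 172.22.0.x addresses from /etc/hosts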

With the command-line tools everything looks fine. On vnode-2:
Code:
root@vnode-2:~# clustat 
Cluster Status for CLUSTER1 @ Tue Mar 24 08:47:06 2015
Member Status: Quorate

 Member Name                                                     ID   Status
 ------ ----                                                     ---- ------
 vnode-1                                                          1 Online
 vnode-2                                                           2 Online, Local


root@vnode-2:~# pvecm nodes
Node  Sts   Inc   Joined               Name
   1   M    992   2015-03-24 08:25:40  vnode-1
   2   M     20   2015-03-21 20:10:09  vnode-2


root@vnode-2:~# pvecm status
Version: 6.2.0
Config Version: 3
Cluster Name: CLUSTER1
Cluster Id: 495
Cluster Member: Yes
Cluster Generation: 992
Membership state: Cluster-Member
Nodes: 2
Expected votes: 2
Total votes: 2
Node votes: 1
Quorum: 2  
Active subsystems: 2
Flags: 
Ports Bound: 0  
Node name: vnode-2
Node ID: 2
Multicast addresses: 255.255.255.255 
Node addresses: 172.22.0.2
and on vnode-1:
Code:
root@vnode-1:~# clustat 
Cluster Status for CLUSTER1 @ Tue Mar 24 08:48:50 2015
Member Status: Quorate

 Member Name                                                     ID   Status
 ------ ----                                                     ---- ------
 vnode-1                                                          1 Online, Local
 vnode-2                                                           2 Online

root@vnode-1:~# pvecm nodes
Node  Sts   Inc   Joined               Name
   1   M    620   2015-03-21 20:21:01  vnode-1
   2   M    992   2015-03-24 08:25:23  vnode-2


root@vnode-1:~# pvecm status
Version: 6.2.0
Config Version: 3
Cluster Name: CLUSTER1
Cluster Id: 495
Cluster Member: Yes
Cluster Generation: 992
Membership state: Cluster-Member
Nodes: 2
Expected votes: 2
Total votes: 2
Node votes: 1
Quorum: 2  
Active subsystems: 5
Flags: 
Ports Bound: 0  
Node name: vnode-1
Node ID: 1
Multicast addresses: 255.255.255.255 
Node addresses: 172.22.0.1
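
The services behind this output can also be checked directly; a generic sketch using the standard init scripts on PVE 3.x (service names as I know them, so treat this as an assumption):
Code:
service cman status          # corosync/cman membership layer
service pve-cluster status   # pmxcfs, which provides /etc/pve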

When I look at the GUI, the nodes in the resource tree on the left show a red status, and the view differs depending on which node I open the GUI on; please see the attached screenshot.

Whenever I try to do an action in the GUI, I get the message: "cluster not ready - no quorum? (500)"
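
As a side note, pmxcfs makes /etc/pve read-only as soon as the local node loses quorum, so a simple write test shows whether the node serving the GUI has quorum at that moment (the file name is just a placeholder):
Code:
touch /etc/pve/quorum-test && rm /etc/pve/quorum-test   # fails with a read-only error when there is no quorum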

I am a bit clueless, since I have similar clusters set up like this (with additional quorum disks, fencing, etc.) running without problems.
I would be very grateful for any advice on where to look or what to change.

/var/log/daemon.log:
Code:
Mar 24 08:38:50 vnode-1 pmxcfs[274965]: [status] crit: cpg_send_message failed: 9
Mar 24 08:38:50 vnode-1 pmxcfs[274965]: [status] crit: cpg_send_message failed: 9
Mar 24 08:38:50 vnode-1 pmxcfs[274965]: [status] crit: cpg_send_message failed: 9
Mar 24 08:38:50 vnode-1 pmxcfs[274965]: [status] crit: cpg_send_message failed: 9
Mar 24 08:38:50 vnode-1 pmxcfs[274965]: [status] crit: cpg_send_message failed: 9
Mar 24 08:38:51 vnode-1 pmxcfs[274965]: [dcdb] notice: cpg_join retry 1900
Mar 24 08:38:52 vnode-1 pmxcfs[274965]: [dcdb] notice: cpg_join retry 1910
Mar 24 08:38:53 vnode-1 pmxcfs[274965]: [dcdb] notice: cpg_join retry 1920
Mar 24 08:38:54 vnode-1 pmxcfs[274965]: [dcdb] notice: cpg_join retry 1930
Mar 24 08:38:55 vnode-1 pmxcfs[274965]: [dcdb] notice: cpg_join retry 1940
Mar 24 08:38:56 vnode-1 pmxcfs[274965]: [dcdb] notice: cpg_join retry 1950
Mar 24 08:38:57 vnode-1 pmxcfs[274965]: [dcdb] notice: cpg_join retry 1960
Mar 24 08:38:58 vnode-1 pmxcfs[274965]: [dcdb] notice: cpg_join retry 1970
Mar 24 08:38:59 vnode-1 pmxcfs[274965]: [dcdb] notice: cpg_join retry 1980
Mar 24 08:39:00 vnode-1 pmxcfs[274965]: [dcdb] notice: cpg_join retry 1990
Mar 24 08:39:00 vnode-1 pmxcfs[274965]: [status] crit: cpg_send_message failed: 9
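
For completeness, this is roughly how the relevant messages can be pulled out of the logs (plain grep, nothing PVE-specific):
Code:
grep -Ei 'corosync|pmxcfs|quorum|totem' /var/log/daemon.log | tail -n 50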

PS: Yes, I do have a community subscription; it's just not currently applied because I have been reinstalling the nodes several times ;-)
 

Attachments

  • GUI_View.jpg (144.7 KB)
Hello Nils,

Nils said:
"Whenever I try to do an action in the GUI, I get the message: 'cluster not ready - no quorum? (500)'"

I have sometimes seen this phenomenon too, when I had problems accessing NFS storage.

I cannot say whether it is the same in your case, but it is at least worth a try (e.g. remove the NFS storage temporarily).
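
If you want to test that quickly, the storage can be disabled instead of removed; assuming your NFS storage is named e.g. "nfs1" (just a placeholder), something along these lines should work. Note that changing the config only works while /etc/pve is writable, i.e. while the node has quorum:
Code:
pvesm status                 # list configured storages and their state
pvesm set nfs1 --disable 1   # temporarily disable the (hypothetical) NFS storage
pvesm set nfs1 --disable 0   # re-enable it afterwards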

Kind regards

Mr.Holmes
 
