Weird cluster behavior - No quorum?

Nils

New Member
Jan 19, 2014
G'day everyone,

I am seeing strange issues on a two-node cluster.
The nodes are connected via a direct 10G Ethernet fibre-optic link and can ping each other.
I started the configuration from scratch.
I use unicast communication.
Time is synced via NTP.
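
For completeness, the link and the time sync can be verified with nothing more than standard tools; the peer IP below is taken from the /etc/hosts entries further down, everything else is just a generic check:
Code:
# run on vnode-1; adjust the peer address when checking from vnode-2
ping -c 3 172.22.0.2   # reachability over the direct 10G link
ntpq -p                # NTP peers; the offset should be a few milliseconds at most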

/etc/pve/cluster.conf
Code:
<?xml version="1.0"?>
<cluster name="CLUSTER1" config_version="3">

  <cman keyfile="/var/lib/pve-cluster/corosync.authkey" transport="udpu">
  </cman>

  <clusternodes>
  <clusternode name="vnode-1" votes="1" nodeid="1"/>
  <clusternode name="vnode-CLUSTER1" votes="1" nodeid="2"/></clusternodes>

</cluster>
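
For reference, the config can also be sanity-checked with the cman tooling; this is only a generic sketch, nothing Proxmox-specific:
Code:
ccs_config_validate    # validates the active cluster.conf against the schema
cman_tool version      # shows the config version cman is actually running with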

The hostnames are known to each node. /etc/hosts on vnode-1:
Code:
172.22.0.1 vnode-1.domain.local vnode-1 pvelocalhost
172.22.0.2 vnode-2.domain.local vnode-2
and on vnode-2:
Code:
172.22.0.2 vnode-2.domain.local vnode-2 pvelocalhost
172.22.0.1 vnode-1.domain.local vnode-1
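
Name resolution can be double-checked on both nodes with standard tools; the node name used in cluster.conf has to resolve to the cluster address (names taken from the files above):
Code:
hostname                       # should match the clusternode name in cluster.conf
getent hosts vnode-1 vnode-2   # should return the 172.22.0.x addresses from /etc/hosts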

With the command-line tools everything looks fine. On vnode-2:
Code:
root@vnode-2:~# clustat 
Cluster Status for CLUSTER1 @ Tue Mar 24 08:47:06 2015
Member Status: Quorate

 Member Name                                                     ID   Status
 ------ ----                                                     ---- ------
 vnode-1                                                          1 Online
 vnode-2                                                           2 Online, Local


root@vnode-2:~# pvecm nodes
Node  Sts   Inc   Joined               Name
   1   M    992   2015-03-24 08:25:40  vnode-1
   2   M     20   2015-03-21 20:10:09  vnode-2


root@vnode-2:~# pvecm status
Version: 6.2.0
Config Version: 3
Cluster Name: CLUSTER1
Cluster Id: 495
Cluster Member: Yes
Cluster Generation: 992
Membership state: Cluster-Member
Nodes: 2
Expected votes: 2
Total votes: 2
Node votes: 1
Quorum: 2  
Active subsystems: 2
Flags: 
Ports Bound: 0  
Node name: vnode-2
Node ID: 2
Multicast addresses: 255.255.255.255 
Node addresses: 172.22.0.2
and on vnode-1:
Code:
root@vnode-1:~# clustat 
Cluster Status for CLUSTER1 @ Tue Mar 24 08:48:50 2015
Member Status: Quorate

 Member Name                                                     ID   Status
 ------ ----                                                     ---- ------
 vnode-1                                                          1 Online, Local
 vnode-2                                                           2 Online

root@vnode-1:~# pvecm nodes
Node  Sts   Inc   Joined               Name
   1   M    620   2015-03-21 20:21:01  vnode-1
   2   M    992   2015-03-24 08:25:23  vnode-2


root@vnode-1:~# pvecm status
Version: 6.2.0
Config Version: 3
Cluster Name: CLUSTER1
Cluster Id: 495
Cluster Member: Yes
Cluster Generation: 992
Membership state: Cluster-Member
Nodes: 2
Expected votes: 2
Total votes: 2
Node votes: 1
Quorum: 2  
Active subsystems: 5
Flags: 
Ports Bound: 0  
Node name: vnode-1
Node ID: 1
Multicast addresses: 255.255.255.255 
Node addresses: 172.22.0.1
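
The services behind this output can also be checked directly; a generic sketch using the standard init scripts on PVE 3.x (service names as I know them, so treat this as an assumption):
Code:
service cman status          # corosync/cman membership layer
service pve-cluster status   # pmxcfs, which provides /etc/pve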

When I look at the GUI, the nodes in the resource tree on the left show a red status, and the view differs depending on which node I open the GUI on; please see the attached screenshot.

Whenever I try to do an action in the GUI, I get the message: "cluster not ready - no quorum? (500)"
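
As a side note, pmxcfs makes /etc/pve read-only as soon as the local node loses quorum, so a simple write test shows whether the node serving the GUI has quorum at that moment (the file name is just a placeholder):
Code:
touch /etc/pve/quorum-test && rm /etc/pve/quorum-test   # fails with a read-only error when there is no quorum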

I am a bit clueless, since I have similar clusters set up like this (with additional quorum disks, fencing, etc.) running without problems.
I would be very grateful for any advice on where to look or what to change.

/var/log/daemon.log:
Code:
Mar 24 08:38:50 vnode-1 pmxcfs[274965]: [status] crit: cpg_send_message failed: 9
Mar 24 08:38:50 vnode-1 pmxcfs[274965]: [status] crit: cpg_send_message failed: 9
Mar 24 08:38:50 vnode-1 pmxcfs[274965]: [status] crit: cpg_send_message failed: 9
Mar 24 08:38:50 vnode-1 pmxcfs[274965]: [status] crit: cpg_send_message failed: 9
Mar 24 08:38:50 vnode-1 pmxcfs[274965]: [status] crit: cpg_send_message failed: 9
Mar 24 08:38:51 vnode-1 pmxcfs[274965]: [dcdb] notice: cpg_join retry 1900
Mar 24 08:38:52 vnode-1 pmxcfs[274965]: [dcdb] notice: cpg_join retry 1910
Mar 24 08:38:53 vnode-1 pmxcfs[274965]: [dcdb] notice: cpg_join retry 1920
Mar 24 08:38:54 vnode-1 pmxcfs[274965]: [dcdb] notice: cpg_join retry 1930
Mar 24 08:38:55 vnode-1 pmxcfs[274965]: [dcdb] notice: cpg_join retry 1940
Mar 24 08:38:56 vnode-1 pmxcfs[274965]: [dcdb] notice: cpg_join retry 1950
Mar 24 08:38:57 vnode-1 pmxcfs[274965]: [dcdb] notice: cpg_join retry 1960
Mar 24 08:38:58 vnode-1 pmxcfs[274965]: [dcdb] notice: cpg_join retry 1970
Mar 24 08:38:59 vnode-1 pmxcfs[274965]: [dcdb] notice: cpg_join retry 1980
Mar 24 08:39:00 vnode-1 pmxcfs[274965]: [dcdb] notice: cpg_join retry 1990
Mar 24 08:39:00 vnode-1 pmxcfs[274965]: [status] crit: cpg_send_message failed: 9
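
For completeness, this is roughly how the relevant messages can be pulled out of the logs (plain grep, nothing PVE-specific):
Code:
grep -Ei 'corosync|pmxcfs|quorum|totem' /var/log/daemon.log | tail -n 50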

PS: Yes, I do have a community subscription; it's just not currently applied because I have been reinstalling the nodes several times ;-)
 

Attachments

  • GUI_View.jpg (144.7 KB)
Hello Nils,

Nils said:
"Whenever I try to do an action in the GUI, I get the message: 'cluster not ready - no quorum? (500)'"

I have sometimes seen this phenomenon too, when I had problems accessing NFS storage.

I cannot say whether it is the same in your case, but it is at least worth a try (e.g. remove the NFS storage temporarily).
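
If you want to test that quickly, the storage can be disabled instead of removed; assuming your NFS storage is named e.g. "nfs1" (just a placeholder), something along these lines should work. Note that changing the config only works while /etc/pve is writable, i.e. while the node has quorum:
Code:
pvesm status                 # list configured storages and their state
pvesm set nfs1 --disable 1   # temporarily disable the (hypothetical) NFS storage
pvesm set nfs1 --disable 0   # re-enable it afterwards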

Kind regards

Mr.Holmes
 
