G'day everyone,
I am seeing strange issues on a two node cluster.
They are connected via a direct 10 GbE fibre-optic link and can ping each other.
I started the configuration from scratch.
I use unicast communication.
Time is synced via NTP.
/etc/pve/cluster.conf
Code:
<?xml version="1.0"?>
<cluster name="CLUSTER1" config_version="3">
<cman keyfile="/var/lib/pve-cluster/corosync.authkey" transport="udpu">
</cman>
<clusternodes>
<clusternode name="vnode-1" votes="1" nodeid="1"/>
<clusternode name="vnode-2" votes="1" nodeid="2"/>
</clusternodes>
</cluster>
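One thing I sanity-checked: every clusternode name in the config must match a hosts entry, since a name that does not resolve is enough to wedge pmxcfs. A quick sketch of the check (the config and hosts data are inlined here for illustration; on a node you would read `/etc/pve/cluster.conf` and `/etc/hosts` instead):

```shell
# Cross-check the clusternode names from cluster.conf against the
# hosts entries and flag any name that has no matching hosts line.
conf='<clusternode name="vnode-1" votes="1" nodeid="1"/>
<clusternode name="vnode-2" votes="1" nodeid="2"/>'
hosts='172.22.0.1 vnode-1.domain.local vnode-1 pvelocalhost
172.22.0.2 vnode-2.domain.local vnode-2'

echo "$conf" | grep -o 'clusternode name="[^"]*"' | cut -d'"' -f2 |
while read -r node; do
    if echo "$hosts" | grep -qw "$node"; then
        echo "$node: present in hosts"
    else
        echo "$node: MISSING from hosts"
    fi
done
```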
Hostnames are known to each other:
Code:
172.22.0.1 vnode-1.domain.local vnode-1 pvelocalhost
172.22.0.2 vnode-2.domain.local vnode-2
Code:
172.22.0.2 vnode-2.domain.local vnode-2 pvelocalhost
172.22.0.1 vnode-1.domain.local vnode-1
With the command-line tools, everything seems fine:
Code:
root@vnode-2:~# clustat
Cluster Status for CLUSTER1 @ Tue Mar 24 08:47:06 2015
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
vnode-1 1 Online
vnode-2 2 Online, Local
root@vnode-2:~# pvecm nodes
Node Sts Inc Joined Name
1 M 992 2015-03-24 08:25:40 vnode-1
2 M 20 2015-03-21 20:10:09 vnode-2
root@vnode-2:~# pvecm status
Version: 6.2.0
Config Version: 3
Cluster Name: CLUSTER1
Cluster Id: 495
Cluster Member: Yes
Cluster Generation: 992
Membership state: Cluster-Member
Nodes: 2
Expected votes: 2
Total votes: 2
Node votes: 1
Quorum: 2
Active subsystems: 2
Flags:
Ports Bound: 0
Node name: vnode-2
Node ID: 2
Multicast addresses: 255.255.255.255
Node addresses: 172.22.0.2
Code:
root@vnode-1:~# clustat
Cluster Status for CLUSTER1 @ Tue Mar 24 08:48:50 2015
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
vnode-1 1 Online, Local
vnode-2 2 Online
root@vnode-1:~# pvecm nodes
Node Sts Inc Joined Name
1 M 620 2015-03-21 20:21:01 vnode-1
2 M 992 2015-03-24 08:25:23 vnode-2
root@vnode-1:~# pvecm status
Version: 6.2.0
Config Version: 3
Cluster Name: CLUSTER1
Cluster Id: 495
Cluster Member: Yes
Cluster Generation: 992
Membership state: Cluster-Member
Nodes: 2
Expected votes: 2
Total votes: 2
Node votes: 1
Quorum: 2
Active subsystems: 5
Flags:
Ports Bound: 0
Node name: vnode-1
Node ID: 1
Multicast addresses: 255.255.255.255
Node addresses: 172.22.0.1
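For what it's worth, the quorum figures in both outputs are internally consistent: with vote-based quorum (floor of total votes over two, plus one), a two-node cluster needs both nodes up. A quick sketch of the arithmetic:

```shell
# Vote-based quorum: floor(total_votes / 2) + 1.
# With two one-vote nodes that is 2, matching "Quorum: 2" above,
# so losing either node (or the link) drops the cluster out of quorum.
total_votes=2
quorum=$(( total_votes / 2 + 1 ))
echo "total_votes=$total_votes quorum=$quorum"
# prints: total_votes=2 quorum=2
```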
When I look at the GUI, the nodes in the left inventory show a red status, and the view differs depending on which node's GUI I open; please see the attached screenshots.
Whenever I try to do an action in the GUI, I get the message: "cluster not ready - no quorum? (500)"
I am a bit clueless, since I have similar clusters set up like this (with additional quorum disks, fencing, etc.) running without problems.
I would be grateful for any advice on where to look or what to change.
/var/log/daemon.log:
Code:
Mar 24 08:38:50 vnode-1 pmxcfs[274965]: [status] crit: cpg_send_message failed: 9
Mar 24 08:38:50 vnode-1 pmxcfs[274965]: [status] crit: cpg_send_message failed: 9
Mar 24 08:38:50 vnode-1 pmxcfs[274965]: [status] crit: cpg_send_message failed: 9
Mar 24 08:38:50 vnode-1 pmxcfs[274965]: [status] crit: cpg_send_message failed: 9
Mar 24 08:38:50 vnode-1 pmxcfs[274965]: [status] crit: cpg_send_message failed: 9
Mar 24 08:38:51 vnode-1 pmxcfs[274965]: [dcdb] notice: cpg_join retry 1900
Mar 24 08:38:52 vnode-1 pmxcfs[274965]: [dcdb] notice: cpg_join retry 1910
Mar 24 08:38:53 vnode-1 pmxcfs[274965]: [dcdb] notice: cpg_join retry 1920
Mar 24 08:38:54 vnode-1 pmxcfs[274965]: [dcdb] notice: cpg_join retry 1930
Mar 24 08:38:55 vnode-1 pmxcfs[274965]: [dcdb] notice: cpg_join retry 1940
Mar 24 08:38:56 vnode-1 pmxcfs[274965]: [dcdb] notice: cpg_join retry 1950
Mar 24 08:38:57 vnode-1 pmxcfs[274965]: [dcdb] notice: cpg_join retry 1960
Mar 24 08:38:58 vnode-1 pmxcfs[274965]: [dcdb] notice: cpg_join retry 1970
Mar 24 08:38:59 vnode-1 pmxcfs[274965]: [dcdb] notice: cpg_join retry 1980
Mar 24 08:39:00 vnode-1 pmxcfs[274965]: [dcdb] notice: cpg_join retry 1990
Mar 24 08:39:00 vnode-1 pmxcfs[274965]: [status] crit: cpg_send_message failed: 9
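In case it helps anyone reading along: cpg_send_message returns a corosync cs_error_t, and if I read corosync's corotypes.h correctly, 9 is CS_ERR_BAD_HANDLE, i.e. pmxcfs has lost its CPG connection to corosync. A sketch of the decode (my reading of the header, abridged):

```shell
# Decode the cs_error_t value from the pmxcfs log lines above.
# Values taken from corosync's corotypes.h (abridged, my assumption);
# 9 means the CPG handle is stale, i.e. the connection to corosync died.
code=9
case "$code" in
    6) echo "CS_ERR_TRY_AGAIN: cluster busy or not yet quorate" ;;
    9) echo "CS_ERR_BAD_HANDLE: stale CPG handle, connection to corosync lost" ;;
    *) echo "code $code: see corotypes.h" ;;
esac
```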
PS: Yes, I do have a community subscription; it's just not currently applied because I reinstalled the nodes several times ;-)