PVE 5.0 (3-node cluster quorum issue)

devinacosta

Aug 3, 2017
I am trying to create a 3-node cluster with the latest Proxmox 5.0-30. I created the cluster on the first node, then added the 2nd and 3rd nodes, and had full quorum for about 3 minutes before 2 machines dropped out. See below; this is a newly installed cluster, so this isn't making much sense.

1st node commands:

pvecm create mycluster -bindnet0_addr 10.69.69.31 -ring0_addr 10.69.69.31
pvecm add pve01-int -ring0_addr 10.69.69.32

2nd node commands:
pvecm add pve01-int -ring0_addr 10.69.69.32

3rd node commands:
pvecm add pve01-int -ring0_addr 10.69.69.33

The cluster had quorum for about 3 minutes and then fell apart. Right now each host only sees itself, and 'pvecm nodes' returns ONLY the local node, even though /etc/pve/corosync.conf looks correct:

root@pve02:~# cat /etc/pve/corosync.conf
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: pve03
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 10.69.69.33
  }

  node {
    name: pve01
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.69.69.31
  }

  node {
    name: pve02
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 10.69.69.32
  }

}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: mycluster
  config_version: 3
  ip_version: ipv4
  secauth: on
  version: 2
  interface {
    bindnetaddr: 10.69.69.31
    ringnumber: 0
  }

}
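
For reference, a small sketch of the quorum arithmetic this config implies: votequorum needs a strict majority of the expected votes, so with three nodes at one vote each, a node must see at least two votes to stay in the primary component. The function names total_votes and quorum_needed are made up for illustration and are not part of pvecm or corosync.

```shell
# Sum quorum_votes over all node entries of a corosync.conf-style file.
# (Hypothetical helper; assumes one "quorum_votes: N" line per node.)
total_votes() {
    awk '$1 == "quorum_votes:" { sum += $2 } END { print sum + 0 }' "$1"
}

# Strict majority of the given vote total: floor(total / 2) + 1.
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}
```

Running quorum_needed "$(total_votes /etc/pve/corosync.conf)" against the config above should give 2, which is why a node that only sees itself reports that it is in the non-primary component.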

Corosync log from the 1st node:

Aug 22 21:16:51 pve01 corosync[1927]: [TOTEM ] A new membership (10.69.69.31:4) was formed. Members joined: 1
Aug 22 21:16:51 pve01 corosync[1927]: [QUORUM] Members[1]: 1
Aug 22 21:16:51 pve01 corosync[1927]: [MAIN ] Completed service synchronization, ready to provide service.
Aug 22 21:18:53 pve01 corosync[1927]: notice [CFG ] Config reload requested by node 1
Aug 22 21:18:53 pve01 corosync[1927]: [CFG ] Config reload requested by node 1
Aug 22 21:18:53 pve01 corosync[1927]: notice [QUORUM] This node is within the non-primary component and will NOT provide any services.
Aug 22 21:18:53 pve01 corosync[1927]: notice [QUORUM] Members[1]: 1
Aug 22 21:18:53 pve01 corosync[1927]: [QUORUM] This node is within the non-primary component and will NOT provide any services.
Aug 22 21:18:53 pve01 corosync[1927]: [QUORUM] Members[1]: 1
Aug 22 21:18:56 pve01 corosync[1927]: notice [TOTEM ] A new membership (10.69.69.31:8) was formed. Members joined: 2
Aug 22 21:18:56 pve01 corosync[1927]: [TOTEM ] A new membership (10.69.69.31:8) was formed. Members joined: 2
Aug 22 21:18:56 pve01 corosync[1927]: notice [QUORUM] This node is within the primary component and will provide service.
Aug 22 21:18:56 pve01 corosync[1927]: notice [QUORUM] Members[2]: 1 2
Aug 22 21:18:56 pve01 corosync[1927]: notice [MAIN ] Completed service synchronization, ready to provide service.
Aug 22 21:18:56 pve01 corosync[1927]: [QUORUM] This node is within the primary component and will provide service.
Aug 22 21:18:56 pve01 corosync[1927]: [QUORUM] Members[2]: 1 2
Aug 22 21:18:56 pve01 corosync[1927]: [MAIN ] Completed service synchronization, ready to provide service.
Aug 22 21:19:29 pve01 corosync[1927]: notice [CFG ] Config reload requested by node 1
Aug 22 21:19:29 pve01 corosync[1927]: [CFG ] Config reload requested by node 1
Aug 22 21:19:33 pve01 corosync[1927]: notice [TOTEM ] A new membership (10.69.69.31:12) was formed. Members joined: 3
Aug 22 21:19:33 pve01 corosync[1927]: [TOTEM ] A new membership (10.69.69.31:12) was formed. Members joined: 3
Aug 22 21:19:33 pve01 corosync[1927]: notice [QUORUM] Members[3]: 1 2 3
Aug 22 21:19:33 pve01 corosync[1927]: notice [MAIN ] Completed service synchronization, ready to provide service.
Aug 22 21:19:33 pve01 corosync[1927]: [QUORUM] Members[3]: 1 2 3
Aug 22 21:19:33 pve01 corosync[1927]: [MAIN ] Completed service synchronization, ready to provide service.
Aug 22 21:21:24 pve01 corosync[1927]: error [TOTEM ] FAILED TO RECEIVE
Aug 22 21:21:24 pve01 corosync[1927]: [TOTEM ] FAILED TO RECEIVE
Aug 22 21:21:26 pve01 corosync[1927]: notice [TOTEM ] A new membership (10.69.69.31:16) was formed. Members left: 2 3
Aug 22 21:21:26 pve01 corosync[1927]: notice [TOTEM ] Failed to receive the leave message. failed: 2 3
Aug 22 21:21:26 pve01 corosync[1927]: [TOTEM ] A new membership (10.69.69.31:16) was formed. Members left: 2 3
Aug 22 21:21:26 pve01 corosync[1927]: notice [QUORUM] This node is within the non-primary component and will NOT provide any services.
Aug 22 21:21:26 pve01 corosync[1927]: notice [QUORUM] Members[1]: 1
Aug 22 21:21:26 pve01 corosync[1927]: notice [MAIN ] Completed service synchronization, ready to provide service.
Aug 22 21:21:26 pve01 corosync[1927]: [TOTEM ] Failed to receive the leave message. failed: 2 3
Aug 22 21:21:26 pve01 corosync[1927]: [QUORUM] This node is within the non-primary component and will NOT provide any services.
Aug 22 21:21:26 pve01 corosync[1927]: [QUORUM] Members[1]: 1
Aug 22 21:21:26 pve01 corosync[1927]: [MAIN ] Completed service synchronization, ready to provide service.
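
To make the membership changes in a log like this easier to follow, the QUORUM lines can be reduced to a short timeline. This is only a sketch: membership_timeline is a made-up helper name, and it assumes syslog-style lines in the format shown above.

```shell
# Print "time member-count member-ids" for each QUORUM Members[...] line.
# Reads corosync syslog lines on stdin; uniq collapses the adjacent
# duplicates that arise because each message is logged twice.
# (Hypothetical helper, assuming "Mon DD HH:MM:SS host corosync[pid]: ..." lines.)
membership_timeline() {
    sed -n 's/^[A-Za-z]* *[0-9]* \([0-9:]*\) .*\[QUORUM\] Members\[\([0-9]*\)\]: \(.*\)$/\1 \2 \3/p' | uniq
}
```

It could be fed from a saved log file or, on a systemd host, from something like journalctl -u corosync | membership_timeline. Applied to the log above, it shows all three members present from 21:19:33 until nodes 2 and 3 drop out at 21:21:26.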
 
Hi,

This sounds like a switch problem.
Please check your switch.
Multicast has to be enabled, and if the switch has any multicast control settings (for example IGMP snooping), check those too.
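
One way to check this, assuming the omping package is installed on every node, is a pair of omping runs started on all three nodes at roughly the same time. The wrapper name check_multicast is made up for this sketch; the omping options are the ones commonly suggested for multicast testing on Proxmox clusters. The second, slower run lasts about ten minutes so that it outlasts typical IGMP snooping timeouts.

```shell
# Hypothetical wrapper around omping; pass the addresses of all cluster nodes.
# Must be started on every node at about the same time to be meaningful.
check_multicast() {
    if ! command -v omping >/dev/null 2>&1; then
        echo "omping not found; install it first (e.g. apt-get install omping)" >&2
        return 127
    fi
    if [ "$#" -lt 2 ]; then
        echo "usage: check_multicast NODE1 NODE2 [NODE3 ...]" >&2
        return 2
    fi
    # Short burst: 10000 packets at 1 ms intervals to spot immediate loss.
    omping -c 10000 -i 0.001 -F -q "$@" || return 1
    # Long run: one packet per second for about 10 minutes, to catch
    # multicast that works at first and is then dropped by the switch.
    omping -c 600 -i 1 -q "$@"
}
```

For the cluster in this thread the call would be check_multicast 10.69.69.31 10.69.69.32 10.69.69.33 on each node. If unicast works but multicast loss is high, or multicast loss only appears in the long run, that points to the switch's multicast handling.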