Hello,
I currently have a 4-node cluster and am trying to add a 5th node. The add command goes through fine and the node connects to the cluster, but it then sits at "waiting for quorum".
Checking the pve-cluster service status on another node, the log is full of:
Oct 31 17:04:14 prox pmxcfs[3807]: [dcdb] crit: cpg_send_message failed: 6
Oct 31 17:04:14 prox pmxcfs[3807]: [status] notice: cpg_send_message retry 10
Oct 31 17:04:15 prox pmxcfs[3807]: [dcdb] notice: cpg_send_message retry 10
Oct 31 17:04:15 prox pmxcfs[3807]: [status] notice: cpg_send_message retry 20
Oct 31 17:04:16 prox pmxcfs[3807]: [dcdb] notice: cpg_send_message retry 20
Oct 31 17:04:16 prox pmxcfs[3807]: [status] notice: cpg_send_message retry 30
Oct 31 17:04:17 prox pmxcfs[3807]: [dcdb] notice: cpg_send_message retry 30
Oct 31 17:04:17 prox pmxcfs[3807]: [status] notice: cpg_send_message retry 40
Oct 31 17:04:18 prox pmxcfs[3807]: [dcdb] notice: cpg_send_message retry 40
The whole cluster & GUI lose connectivity; the only way I can recover is to run service corosync stop on the new node.
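For reference, these are roughly the commands involved (all standard Proxmox/systemd tooling, nothing custom):

systemctl status pve-cluster    # where the pmxcfs retry messages above show up
journalctl -u pve-cluster -f    # follow the same log live
pvecm status                    # expected vs. total votes and whether we have quorum

And the recovery step, run on the new node sn2:

service corosync stop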
Attached is the corosync.conf file:
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: n1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: n1-corosync
  }
  node {
    name: prox
    nodeid: 3
    quorum_votes: 1
    ring0_addr: prox-corosync
  }
  node {
    name: n2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: n2-corosync
  }
  node {
    name: sn1
    nodeid: 4
    quorum_votes: 1
    ring0_addr: sn1-corosync
  }
  node {
    name: sn2
    nodeid: 5
    quorum_votes: 1
    ring0_addr: sn2-corosync
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: clustername
  config_version: 7
  ip_version: ipv4
  secauth: on
  transport: udpu
  version: 2
  interface {
    bindnetaddr: 172.16.1.1
    ringnumber: 0
  }
}
The new node and all the other nodes have updated hosts files, and they can all ping each other with no issue. We are running on udpu, so I know this is not a multicast issue. It seems that when the new node joins, the cluster goes into a deadlock until corosync is shut down on the new node.
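For completeness, the name resolution and connectivity checks look like this. The IP addresses below are placeholders on the 172.16.1.0/24 corosync network, not our real ones:

/etc/hosts excerpt (placeholder addresses):
172.16.1.11  n1-corosync
172.16.1.12  n2-corosync
172.16.1.13  prox-corosync
172.16.1.14  sn1-corosync
172.16.1.15  sn2-corosync

ping -c 3 sn2-corosync    # run from each existing node, and the reverse from sn2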
The log from the new node also shows:
Oct 31 17:04:19 sn2 pmxcfs[2933]: [dcdb] notice: cpg_join retry 620
Oct 31 17:04:20 sn2 pmxcfs[2933]: [dcdb] notice: cpg_join retry 630
Oct 31 17:04:21 sn2 pmxcfs[2933]: [dcdb] notice: cpg_join retry 640
Oct 31 17:04:22 sn2 pmxcfs[2933]: [dcdb] notice: cpg_join retry 650
Oct 31 17:04:23 sn2 pmxcfs[2933]: [dcdb] notice: cpg_join retry 660
Oct 31 17:04:24 sn2 pmxcfs[2933]: [dcdb] notice: cpg_join retry 670
Oct 31 17:04:25 sn2 pmxcfs[2933]: [dcdb] notice: cpg_join retry 680
Oct 31 17:04:26 sn2 pmxcfs[2933]: [dcdb] notice: cpg_join retry 690
Oct 31 17:04:27 sn2 pmxcfs[2933]: [dcdb] notice: cpg_join retry 700
Oct 31 17:04:28 sn2 pmxcfs[2933]: [dcdb] notice: cpg_join retry 710
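While it sits at "waiting for quorum", I can watch the membership from the new node with either of these (both ship with the stock corosync/Proxmox packages):

corosync-quorumtool -s    # votequorum view: nodes seen, votes, quorate flag
pvecm status              # Proxmox's summary of the same information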
Thanks,
Ashley