Hi,
I was wondering if someone could shed some light on an issue I'm having. I currently have a 7-node cluster, and one of the nodes seems to have lost corosync.
I ran the following:
Code:
root@prometheus5:~# pvecm status
Cannot initialize CMAP service
Then I checked the most recent logs:
Code:
Nov 11 09:21:28 prometheus5 corosync[2249]: notice [TOTEM ] A new membership (192.168.3.99:7260) was formed. Members joined: 2 1 5 left: 2 1 5
Nov 11 09:21:28 prometheus5 corosync[2249]: notice [TOTEM ] Failed to receive the leave message. failed: 2 1 5
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] A new membership (192.168.3.99:7260) was formed. Members joined: 2 1 5 left: 2 1 5
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Failed to receive the leave message. failed: 2 1 5
Nov 11 09:21:28 prometheus5 corosync[2249]: notice [TOTEM ] A new membership (192.168.3.99:7268) was formed. Members joined: 2 1 left: 2 1
Nov 11 09:21:28 prometheus5 corosync[2249]: notice [TOTEM ] Failed to receive the leave message. failed: 2 1
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] A new membership (192.168.3.99:7268) was formed. Members joined: 2 1 left: 2 1
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Failed to receive the leave message. failed: 2 1
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=2
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=2
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=1
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=4
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=6
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=1
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=3
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=6
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=4
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=5
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=2
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=3
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=6
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=4
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=1
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=3
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=5
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=3
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=6
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=4
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=6
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=7
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=4
Nov 11 09:21:28 prometheus5 corosync[2249]: notice [TOTEM ] A new membership (192.168.3.99:7276) was formed. Members joined: 2 left: 2
Nov 11 09:21:28 prometheus5 corosync[2249]: notice [TOTEM ] Failed to receive the leave message. failed: 2
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=6
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=1
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=3
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=3
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=6
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=4
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=5
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=2
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=3
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=6
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=4
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=3
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=5
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=3
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=6
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=2
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=3
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=6
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=1
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=4
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=1
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=3
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=4
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=6
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=4
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=3
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=2
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=5
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=2
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=5
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=7
Nov 11 09:21:28 prometheus5 corosync[2249]: corosync: totemsrp.c:2871: orf_token_rtr: Assertion `range < QUEUE_RTR_ITEMS_SIZE_MAX' failed.
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=4
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=6
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=7
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=7
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] A new membership (192.168.3.99:7276) was formed. Members joined: 2 left: 2
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Failed to receive the leave message. failed: 2
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=2
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=3
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=6
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=1
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=4
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=1
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=3
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=6
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=4
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=3
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=2
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=5
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=2
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=5
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=7
Nov 11 09:21:28 prometheus5 corosync[2249]: [TOTEM ] Discarding JOIN message during flush, nodeid=7
Nov 11 09:21:28 prometheus5 systemd[1]: corosync.service: Main process exited, code=killed, status=6/ABRT
Nov 11 09:21:28 prometheus5 systemd[1]: corosync.service: Unit entered failed state.
Nov 11 09:21:28 prometheus5 systemd[1]: corosync.service: Failed with result 'signal'.
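Most of the log above is repeated TOTEM notices, so the lines that explain why the service stopped are easy to miss. A minimal sketch for pulling them out, assuming the log has been saved as plain text; the function name `find_fatal_lines` and the pattern list are hypothetical and chosen only to match the messages shown in this post:

```python
import re

# Patterns for lines that usually matter more than the repeated TOTEM notices.
# This list is an assumption for illustration, not an exhaustive catalogue.
FATAL_PATTERNS = [
    re.compile(r"Assertion .* failed"),
    re.compile(r"Main process exited"),
    re.compile(r"entered failed state"),
    re.compile(r"Failed with result"),
]


def find_fatal_lines(log_text):
    """Return the log lines that match any of FATAL_PATTERNS, in order."""
    return [
        line
        for line in log_text.splitlines()
        if any(p.search(line) for p in FATAL_PATTERNS)
    ]


# Three lines copied from the log in this post.
sample = """\
Nov 11 09:21:28 prometheus5 corosync[2249]: warning [TOTEM ] Discarding JOIN message during flush, nodeid=7
Nov 11 09:21:28 prometheus5 corosync[2249]: corosync: totemsrp.c:2871: orf_token_rtr: Assertion `range < QUEUE_RTR_ITEMS_SIZE_MAX' failed.
Nov 11 09:21:28 prometheus5 systemd[1]: corosync.service: Main process exited, code=killed, status=6/ABRT
"""

for line in find_fatal_lines(sample):
    print(line)
```

On the sample this keeps the assertion failure in `totemsrp.c` and the systemd line reporting the ABRT exit, and drops the "Discarding JOIN" notice.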
This is the corosync.conf of the bad node:
Code:
root@prometheus5:~# cat /etc/corosync/corosync.conf
logging {
  debug: off
  to_syslog: yes
}
nodelist {
  node {
    name: prometheus
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.168.3.150
  }
  node {
    name: prometheus11
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 192.168.3.216
  }
  node {
    name: prometheus12
    nodeid: 6
    quorum_votes: 1
    ring0_addr: 192.168.3.186
  }
  node {
    name: prometheus2
    nodeid: 5
    quorum_votes: 1
    ring0_addr: 192.168.3.152
  }
  node {
    name: prometheus4
    nodeid: 4
    quorum_votes: 1
    ring0_addr: 192.168.3.187
  }
  node {
    name: prometheus5
    nodeid: 7
    quorum_votes: 1
    ring0_addr: 192.168.3.197
  }
  node {
    name: prometheus6
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 192.168.3.99
  }
}
quorum {
  provider: corosync_votequorum
}
totem {
  cluster_name: troy
  config_version: 7
  interface {
    bindnetaddr: 192.168.3.150
    ringnumber: 0
  }
  ip_version: ipv4
  secauth: on
  version: 2
}
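The log lines refer to nodes only by nodeid (for example "Members joined: 2 1 5") and to the membership by a ring address (192.168.3.99), while the names live in the nodelist. A small sketch for building that lookup from the config text; `parse_nodelist` is a hypothetical helper that assumes the simple `node { key: value }` layout shown here, not a full corosync.conf parser:

```python
def parse_nodelist(conf_text):
    """Map nodeid -> (name, ring0_addr) from the node { ... } blocks."""
    nodes = {}
    current = None  # dict for the node block being read, or None outside one
    for raw in conf_text.splitlines():
        line = raw.strip()
        if line == "node {":
            current = {}
        elif line == "}" and current is not None:
            # End of a node block: store it under its numeric nodeid.
            nodes[int(current["nodeid"])] = (current["name"], current["ring0_addr"])
            current = None
        elif current is not None and ":" in line:
            key, value = line.split(":", 1)
            current[key.strip()] = value.strip()
    return nodes


# Two node entries copied from the config in this post.
sample = """\
nodelist {
node {
name: prometheus5
nodeid: 7
quorum_votes: 1
ring0_addr: 192.168.3.197
}
node {
name: prometheus6
nodeid: 2
quorum_votes: 1
ring0_addr: 192.168.3.99
}
}
"""

nodes = parse_nodelist(sample)
print(nodes[2])
print(nodes[7])
```

With the full nodelist loaded, nodeid 2 resolves to prometheus6 at 192.168.3.99, which is the address that appears in the "A new membership" lines.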
And this is the corosync.conf of a good node:
Code:
root@prometheus6:~# cat /etc/corosync/corosync.conf
logging {
  debug: off
  to_syslog: yes
}
nodelist {
  node {
    name: prometheus
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.168.3.150
  }
  node {
    name: prometheus11
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 192.168.3.216
  }
  node {
    name: prometheus12
    nodeid: 6
    quorum_votes: 1
    ring0_addr: 192.168.3.186
  }
  node {
    name: prometheus2
    nodeid: 5
    quorum_votes: 1
    ring0_addr: 192.168.3.152
  }
  node {
    name: prometheus4
    nodeid: 4
    quorum_votes: 1
    ring0_addr: 192.168.3.187
  }
  node {
    name: prometheus5
    nodeid: 7
    quorum_votes: 1
    ring0_addr: 192.168.3.197
  }
  node {
    name: prometheus6
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 192.168.3.99
  }
}
quorum {
  provider: corosync_votequorum
}
totem {
  cluster_name: troy
  config_version: 7
  interface {
    bindnetaddr: 192.168.3.150
    ringnumber: 0
  }
  ip_version: ipv4
  secauth: on
  version: 2
}
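Comparing two 60-line configs by eye is error-prone, so a scripted check may be easier. A rough sketch using Python's standard `difflib`, assuming both files have been copied to one machine as text; `config_diff` is a hypothetical helper, and the two short strings below only stand in for the real files:

```python
import difflib


def config_diff(text_a, text_b, name_a="node_a", name_b="node_b"):
    """Return unified-diff lines between two configs, ignoring indentation and blank lines."""
    def normalise(text):
        return [line.strip() for line in text.splitlines() if line.strip()]

    return list(
        difflib.unified_diff(
            normalise(text_a), normalise(text_b),
            fromfile=name_a, tofile=name_b, lineterm="",
        )
    )


# Stand-ins for the two corosync.conf files; only indentation differs.
bad = "totem {\n  cluster_name: troy\n  config_version: 7\n}\n"
good = "totem {\ncluster_name: troy\nconfig_version: 7\n}\n"

diff = config_diff(bad, good, "prometheus5", "prometheus6")
print("identical" if not diff else "\n".join(diff))
```

An empty result means the two files differ at most in whitespace, which would point away from a config mismatch as the cause.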
Thank you