pxo

Cluster, Corosync Problem: JOIN or LEAVE message was thrown away during flush operation

Hello,

I caused the problem myself by running pvecm delnode px2 and pvecm delnode px3.
Nodes px1 and px2 are back online.

Code:
root@px1 ~ > pvecm status
Quorum information
------------------
Date:             Tue Dec  1 12:23:13 2015
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          0x00000001
Ring ID:          1888
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      2
Quorum:           2
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 192.168.0.7 (local)
0x00000002          1 192.168.0.8
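
For reference, the "Quorum: 2" value in the status output follows from "Expected votes: 3". A minimal sketch of that arithmetic, assuming the plain votequorum majority rule (half of the expected votes, rounded down, plus one) with no special options such as two_node in effect. The function name quorum_for is made up for illustration.

```shell
#!/bin/sh
# Majority quorum for a given number of expected votes:
# floor(expected / 2) + 1. Shell arithmetic uses integer division.
# quorum_for is a hypothetical helper, not part of pvecm or corosync.
quorum_for() {
    expected=$1
    echo $(( expected / 2 + 1 ))
}

# With the value from the status output above (Expected votes: 3):
quorum_for 3   # prints 2
```

With two of three votes present, the cluster of px1 and px2 therefore stays quorate even while px3 is missing.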

The corosync config on nodes px1 and px2:
Code:
root@px1 /etc/pve > cat corosync.conf
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: px3
    nodeid: 3
    quorum_votes: 1
    ring0_addr: px3
  }

  node {
    name: px2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: px2
  }

  node {
    name: px1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: px1
  }

}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: Domainname
  config_version: 7
  ip_version: ipv4
  secauth: on
  version: 2
  interface {
    bindnetaddr: 192.168.0.7
    ringnumber: 0
  }

}


Node px3 does not come back. The log:
Code:
root@px3 ~ > grep corosync /var/log/syslog
Dec  1 10:27:45 px3 corosync[32014]: Starting Corosync Cluster Engine (corosync): [FAILED]
Dec  1 10:27:45 px3 systemd[1]: corosync.service: control process exited, code=exited status=1
Dec  1 10:27:45 px3 systemd[1]: Unit corosync.service entered failed state.
Dec  1 10:36:38 px3 pvedaemon[11768]: <root@pam> starting task UPID:px3:000018DF:0033D07C:565D6A26:srvstart:corosync:root@pam:
Dec  1 10:36:38 px3 pvedaemon[6367]: starting service corosync: UPID:px3:000018DF:0033D07C:565D6A26:srvstart:corosync:root@pam:
Dec  1 10:36:38 px3 corosync[6374]:  [MAIN  ] Corosync Cluster Engine ('2.3.5'): started and ready to provide service.
Dec  1 10:36:38 px3 corosync[6374]:  [MAIN  ] Corosync built-in features: augeas systemd pie relro bindnow
Dec  1 10:36:38 px3 corosync[6375]:  [TOTEM ] Initializing transport (UDP/IP Multicast).
Dec  1 10:36:38 px3 corosync[6375]:  [TOTEM ] Initializing transmit/receive security (NSS) crypto: aes256 hash: sha1
Dec  1 10:36:38 px3 corosync[6375]:  [TOTEM ] The network interface [192.168.0.9] is now up.
Dec  1 10:36:38 px3 corosync[6375]:  [SERV  ] Service engine loaded: corosync configuration map access [0]
Dec  1 10:36:38 px3 corosync[6375]:  [QB    ] server name: cmap
Dec  1 10:36:38 px3 corosync[6375]:  [SERV  ] Service engine loaded: corosync configuration service [1]
Dec  1 10:36:38 px3 corosync[6375]:  [QB    ] server name: cfg
Dec  1 10:36:38 px3 corosync[6375]:  [SERV  ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
Dec  1 10:36:38 px3 corosync[6375]:  [QB    ] server name: cpg
Dec  1 10:36:38 px3 corosync[6375]:  [SERV  ] Service engine loaded: corosync profile loading service [4]
Dec  1 10:36:38 px3 corosync[6375]:  [QUORUM] Using quorum provider corosync_votequorum
Dec  1 10:36:38 px3 corosync[6375]:  [SERV  ] Service engine loaded: corosync vote quorum service v1.0 [5]
Dec  1 10:36:38 px3 corosync[6375]:  [QB    ] server name: votequorum
Dec  1 10:36:38 px3 corosync[6375]:  [SERV  ] Service engine loaded: corosync cluster quorum service v0.1 [3]
Dec  1 10:36:38 px3 corosync[6375]:  [QB    ] server name: quorum
Dec  1 10:36:38 px3 corosync[6375]:  [TOTEM ] JOIN or LEAVE message was thrown away during flush operation.
Dec  1 10:36:38 px3 corosync[6375]:  [TOTEM ] JOIN or LEAVE message was thrown away during flush operation.
Dec  1 10:36:38 px3 corosync[6375]:  [TOTEM ] A new membership (192.168.0.9:1880) was formed. Members joined: 3
Dec  1 10:36:38 px3 corosync[6375]:  [QUORUM] Members[1]: 3
Dec  1 10:36:39 px3 corosync[6375]:  [MAIN  ] Completed service synchronization, ready to provide service.
Dec  1 10:36:39 px3 corosync[6375]:  [TOTEM ] A new membership (192.168.0.7:1884) was formed. Members joined: 1 2
Dec  1 10:36:39 px3 corosync[6375]:  [CMAP  ] Received config version (6) is different than my config version (5)! Exiting
Dec  1 10:36:39 px3 corosync[6375]:  [SERV  ] Unloading all Corosync service engines.
Dec  1 10:36:39 px3 corosync[6375]:  [QB    ] withdrawing server sockets
Dec  1 10:36:39 px3 corosync[6375]:  [SERV  ] Service engine unloaded: corosync vote quorum service v1.0
Dec  1 10:36:39 px3 corosync[6375]:  [QB    ] withdrawing server sockets
Dec  1 10:36:39 px3 corosync[6375]:  [SERV  ] Service engine unloaded: corosync configuration map access
Dec  1 10:36:39 px3 corosync[6375]:  [QB    ] withdrawing server sockets
Dec  1 10:36:39 px3 corosync[6375]:  [SERV  ] Service engine unloaded: corosync configuration service
Dec  1 10:36:39 px3 corosync[6375]:  [QB    ] withdrawing server sockets
Dec  1 10:36:39 px3 corosync[6375]:  [SERV  ] Service engine unloaded: corosync cluster closed process group service v1.01
Dec  1 10:36:39 px3 corosync[6375]:  [QB    ] withdrawing server sockets
Dec  1 10:36:39 px3 corosync[6375]:  [SERV  ] Service engine unloaded: corosync cluster quorum service v0.1
Dec  1 10:36:39 px3 corosync[6375]:  [SERV  ] Service engine unloaded: corosync profile loading service
Dec  1 10:36:39 px3 corosync[6375]:  [MAIN  ] Corosync Cluster Engine exiting normally
Dec  1 10:37:39 px3 corosync[6369]: Starting Corosync Cluster Engine (corosync): [FAILED]
Dec  1 10:37:39 px3 systemd[1]: corosync.service: control process exited, code=exited status=1
Dec  1 10:37:39 px3 systemd[1]: Unit corosync.service entered failed state.
Dec  1 10:37:39 px3 pvedaemon[6367]: command 'systemctl start corosync' failed: exit code 1
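
The log above shows corosync on px3 exiting because its own config version (5) differs from the one it received from the cluster (6). A minimal sketch for reading the config_version value out of a corosync.conf file, so that the copies on two nodes can be compared by hand. The function name get_config_version is made up for illustration, and /etc/corosync/corosync.conf as the node-local copy is an assumption; only /etc/pve/corosync.conf appears in this thread.

```shell
#!/bin/sh
# Print the config_version value from a corosync.conf-style file.
# get_config_version is a hypothetical helper, not a corosync tool.
get_config_version() {
    # Match the line whose first field is exactly "config_version:"
    # and print the second field (the number), then stop.
    awk '$1 == "config_version:" { print $2; exit }' "$1"
}

# Example, to be run on each node and compared by eye:
#   get_config_version /etc/pve/corosync.conf
#   get_config_version /etc/corosync/corosync.conf   # assumed local copy
```

If the numbers differ between a node and the rest of the cluster, that node will refuse to join, as in the log above.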

The path /etc/pve on node px3 is read-only.
Can I repair this without reinstalling node px3?
 
What is meant by "clear"?
I tried a purge and reinstall on node px3:
Code:
apt-get purge proxmox-ve
apt-get autoremove
apt-get purge `dpkg -l | grep ^rc | awk '{print $2}'`
apt-get install proxmox-ve

and then:
Code:
root@px1 /etc/pve > pvecm add 192.168.0.9
authentication key already exists

What must I clean up manually?
 
OK, sorry, in other words:
I would like to reinstall the Proxmox node without reinstalling the Debian Jessie base.
 
Thanks, dietmar.
I typed the wrong command (delnode) too quickly, and I had not read the wiki, so I did not shut down the node before running delnode :rolleyes:

I found another way that avoids reinstalling node px3:
Code:
root@px3 / > service pve-cluster stop
root@px3 / > rm /var/lib/pve-cluster/config.db*
root@px3 / > scp root@px1:/var/lib/pve-cluster/config.db /var/lib/pve-cluster/config.db
root@px3 / > chmod 600 /var/lib/pve-cluster/config.db
root@px3 / > reboot
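
The steps above delete config.db* on px3 before the database is copied over from px1. A more cautious variant, sketched here under assumptions, would copy the old files aside first so they can be restored if the copy from px1 fails. Both the function name backup_cluster_db and the destination /root/pve-cluster-backup are hypothetical.

```shell
#!/bin/sh
# Copy every file matching config.db* from a source directory into a
# backup directory. Returns non-zero if nothing was copied or a copy fails.
# backup_cluster_db is a hypothetical helper, not a Proxmox tool.
backup_cluster_db() {
    src_dir=$1
    dest_dir=$2
    copied=0

    mkdir -p "$dest_dir" || return 1

    for f in "$src_dir"/config.db*; do
        # Skip the unexpanded pattern when no file matches.
        [ -e "$f" ] || continue
        # -p keeps mode and timestamps of the original files.
        cp -p "$f" "$dest_dir"/ || return 1
        copied=$((copied + 1))
    done

    [ "$copied" -gt 0 ]
}

# Example (on px3, after "service pve-cluster stop" and before the rm):
#   backup_cluster_db /var/lib/pve-cluster /root/pve-cluster-backup
```

The function only copies files and never deletes anything, so it can be run before the rm step without changing the procedure itself.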

Code:
Dec  1 18:38:33 px3 corosync[1087]:  [MAIN  ] Corosync Cluster Engine ('2.3.5'): started and ready to provide service.
Dec  1 18:38:33 px3 corosync[1087]:  [MAIN  ] Corosync built-in features: augeas systemd pie relro bindnow
Dec  1 18:38:33 px3 corosync[1088]:  [TOTEM ] Initializing transport (UDP/IP Multicast).
Dec  1 18:38:33 px3 corosync[1088]:  [TOTEM ] Initializing transmit/receive security (NSS) crypto: aes256 hash: sha1
Dec  1 18:38:33 px3 corosync[1088]:  [TOTEM ] The network interface [192.168.0.9] is now up.
Dec  1 18:38:33 px3 corosync[1088]:  [SERV  ] Service engine loaded: corosync configuration map access [0]
Dec  1 18:38:33 px3 corosync[1088]:  [QB    ] server name: cmap
Dec  1 18:38:33 px3 corosync[1088]:  [SERV  ] Service engine loaded: corosync configuration service [1]
Dec  1 18:38:33 px3 corosync[1088]:  [QB    ] server name: cfg
Dec  1 18:38:33 px3 corosync[1088]:  [SERV  ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
Dec  1 18:38:33 px3 corosync[1088]:  [QB    ] server name: cpg
Dec  1 18:38:33 px3 corosync[1088]:  [SERV  ] Service engine loaded: corosync profile loading service [4]
Dec  1 18:38:33 px3 corosync[1088]:  [QUORUM] Using quorum provider corosync_votequorum
Dec  1 18:38:33 px3 corosync[1088]:  [SERV  ] Service engine loaded: corosync vote quorum service v1.0 [5]
Dec  1 18:38:33 px3 corosync[1088]:  [QB    ] server name: votequorum
Dec  1 18:38:33 px3 corosync[1088]:  [SERV  ] Service engine loaded: corosync cluster quorum service v0.1 [3]
Dec  1 18:38:33 px3 corosync[1088]:  [QB    ] server name: quorum
Dec  1 18:38:33 px3 corosync[1088]:  [TOTEM ] JOIN or LEAVE message was thrown away during flush operation.
Dec  1 18:38:33 px3 corosync[1088]:  [TOTEM ] A new membership (192.168.0.9:1928) was formed. Members joined: 3
Dec  1 18:38:33 px3 corosync[1088]:  [QUORUM] Members[1]: 3
Dec  1 18:38:33 px3 corosync[1088]:  [MAIN  ] Completed service synchronization, ready to provide service.
Dec  1 18:38:33 px3 corosync[1088]:  [TOTEM ] A new membership (192.168.0.7:1932) was formed. Members joined: 1 2
Dec  1 18:38:33 px3 corosync[1088]:  [QUORUM] This node is within the primary component and will provide service.
Dec  1 18:38:33 px3 corosync[1088]:  [QUORUM] Members[3]: 1 2 3
Dec  1 18:38:33 px3 corosync[1088]:  [MAIN  ] Completed service synchronization, ready to provide service.
Dec  1 18:38:34 px3 corosync[1081]: Starting Corosync Cluster Engine (corosync): [  OK  ]

I'll never do it again :)
 
