Removing cluster nodes: Cannot initialize CMAP service

Bowedfloor

May 27, 2016
Hi,

I have a Proxmox 4.2 box that was part of a 3-node cluster, but 2 of the nodes went offline.

For a long time, whenever I had to reboot my VMs, I just ran pvecm e 1 to fudge quorum.
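For reference, here is the workaround spelled out as a rough sketch. I understand `pvecm e 1` to be the abbreviated form of `pvecm expected 1`; the `force_quorum` wrapper is just a hypothetical helper name for illustration, not something Proxmox ships.

```shell
# Hypothetical helper (not part of Proxmox): lower the expected vote count
# to 1 so a single surviving node can regain quorum, then show the result.
force_quorum() {
    if ! command -v pvecm >/dev/null 2>&1; then
        echo "pvecm not found; run this on a Proxmox VE node" >&2
        return 127
    fi
    # Long form of "pvecm e 1"
    pvecm expected 1 && pvecm status
}
```

Calling `force_quorum` on the node only helps while corosync is running, which is exactly what no longer holds here.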

Now I get:
Cannot initialize CMAP service

Also corosync refuses to start. I turned on debugging but I haven't seen anything useful come out of it.

/var/log/messages is getting spammed with:

May 26 17:22:45 pve kernel: [ 255.118804] vmbr0: port 1(eth3) received tcn bpdu
May 26 17:22:45 pve kernel: [ 255.118812] vmbr0: topology change detected, propagating
May 26 17:22:46 pve kernel: [ 256.118761] vmbr0: port 1(eth3) received tcn bpdu
May 26 17:22:46 pve kernel: [ 256.118772] vmbr0: topology change detected, propagating
May 26 17:22:47 pve kernel: [ 257.118888] vmbr0: port 1(eth3) received tcn bpdu

Over and over again. Is this related?
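To put a number on how often this fires, a rough count can be taken with grep. The sketch below runs against a sample copied from the lines above rather than the real file; `sample_log` and `tcn_count` are just illustrative names.

```shell
# Sample lines copied from the excerpt above (stand-in for /var/log/messages).
sample_log='May 26 17:22:45 pve kernel: [ 255.118804] vmbr0: port 1(eth3) received tcn bpdu
May 26 17:22:45 pve kernel: [ 255.118812] vmbr0: topology change detected, propagating
May 26 17:22:46 pve kernel: [ 256.118761] vmbr0: port 1(eth3) received tcn bpdu
May 26 17:22:46 pve kernel: [ 256.118772] vmbr0: topology change detected, propagating
May 26 17:22:47 pve kernel: [ 257.118888] vmbr0: port 1(eth3) received tcn bpdu'

# Count the lines that report a received TCN BPDU.
tcn_count=$(printf '%s\n' "$sample_log" | grep -c 'received tcn bpdu')
echo "$tcn_count"
```

On the actual box the same pattern could be pointed at the log directly, e.g. `grep -c 'received tcn bpdu' /var/log/messages`.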

So I cannot start any VMs at this point. I get error 500 no quorum.
 
● corosync.service - Corosync Cluster Engine
Loaded: loaded (/lib/systemd/system/corosync.service; enabled)
Active: failed (Result: exit-code) since Thu 2016-05-26 20:51:26 MDT; 2min 48s ago
Process: 15155 ExecStart=/usr/share/corosync/corosync start (code=exited, status=1/FAILURE)

May 26 20:50:25 pve corosync[15162]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
May 26 20:50:25 pve corosync[15162]: [MAIN ] Initializing IPC on cpg [2]
May 26 20:50:25 pve corosync[15162]: [MAIN ] No configured qb.ipc_type. Using native ipc
May 26 20:50:25 pve corosync[15162]: [QB ] server name: cpg
May 26 20:50:25 pve corosync[15162]: [SERV ] Service engine loaded: corosync profile loading service [4]
May 26 20:50:25 pve corosync[15162]: [MAIN ] NOT Initializing IPC on pload [4]
May 26 20:51:26 pve corosync[15155]: Starting Corosync Cluster Engine (corosync): [FAILED]
May 26 20:51:26 pve systemd[1]: corosync.service: control process exited, code=exited status=1
May 26 20:51:26 pve systemd[1]: Failed to start Corosync Cluster Engine.
May 26 20:51:26 pve systemd[1]: Unit corosync.service entered failed state.

Then

root@pve:~# pvecm e 1
Cannot initialize CMAP service

I'm not sure how to debug this further...