cman keeps crashing

Do you have a spare machine to set up a third cluster? I wonder whether the same happens on a newly installed system.
 
Here are the logs:

Apr 11 12:42:52 terrance corosync[5298]: [TOTEM ] FAILED TO RECEIVE
Apr 11 12:42:54 terrance pmxcfs[5214]: [quorum] crit: quorum_dispatch failed: 2
Apr 11 12:42:54 terrance fenced[5352]: cluster is down, exiting
Apr 11 12:42:54 terrance fenced[5352]: daemon cpg_dispatch error 2
Apr 11 12:42:54 terrance dlm_controld[5366]: cluster is down, exiting
Apr 11 12:42:54 terrance dlm_controld[5366]: daemon cpg_dispatch error 2
Apr 11 12:42:54 terrance pmxcfs[5214]: [libqb] warning: epoll_ctl(del): Bad file descriptor (9)
Apr 11 12:42:54 terrance pmxcfs[5214]: [confdb] crit: confdb_dispatch failed: 2
Apr 11 12:42:56 terrance pmxcfs[5214]: [libqb] warning: epoll_ctl(del): Bad file descriptor (9)
Apr 11 12:42:56 terrance pmxcfs[5214]: [dcdb] crit: cpg_dispatch failed: 2
Apr 11 12:42:56 terrance kernel: dlm: closing connection to node 2
Apr 11 12:42:56 terrance kernel: dlm: closing connection to node 1
Apr 11 12:42:58 terrance pmxcfs[5214]: [dcdb] crit: cpg_leave failed: 2
Apr 11 12:43:00 terrance pmxcfs[5214]: [libqb] warning: epoll_ctl(del): Bad file descriptor (9)
Apr 11 12:43:00 terrance pmxcfs[5214]: [dcdb] crit: cpg_dispatch failed: 2
Apr 11 12:43:02 terrance pmxcfs[5214]: [dcdb] crit: cpg_leave failed: 2
Apr 11 12:43:04 terrance pmxcfs[5214]: [libqb] warning: epoll_ctl(del): Bad file descriptor (9)
Apr 11 12:43:04 terrance pmxcfs[5214]: [quorum] crit: quorum_initialize failed: 6
Apr 11 12:43:04 terrance pmxcfs[5214]: [quorum] crit: can't initialize service
Apr 11 12:43:04 terrance pmxcfs[5214]: [confdb] crit: confdb_initialize failed: 6
Apr 11 12:43:04 terrance pmxcfs[5214]: [quorum] crit: can't initialize service
Apr 11 12:43:04 terrance pmxcfs[5214]: [dcdb] notice: start cluster connection
Apr 11 12:43:04 terrance pmxcfs[5214]: [dcdb] crit: cpg_initialize failed: 6
Apr 11 12:43:04 terrance pmxcfs[5214]: [quorum] crit: can't initialize service
Apr 11 12:43:06 terrance pmxcfs[5214]: [status] crit: cpg_send_message failed: 2
Apr 11 12:43:06 terrance pmxcfs[5214]: [status] crit: cpg_send_message failed: 2
Apr 11 12:43:06 terrance pmxcfs[5214]: [dcdb] notice: start cluster connection
Apr 11 12:43:06 terrance pmxcfs[5214]: [dcdb] crit: cpg_initialize failed: 6
Apr 11 12:43:06 terrance pmxcfs[5214]: [quorum] crit: can't initialize service
Apr 11 12:43:06 terrance pmxcfs[5214]: [status] crit: cpg_send_message failed: 9
Apr 11 12:43:06 terrance pmxcfs[5214]: [status] crit: cpg_send_message failed: 9
Apr 11 12:43:06 terrance pmxcfs[5214]: [status] crit: cpg_send_message failed: 9
Apr 11 12:43:06 terrance pmxcfs[5214]: [status] crit: cpg_send_message failed: 9
Apr 11 12:43:12 terrance pmxcfs[5214]: [status] crit: cpg_send_message failed: 9
Apr 11 12:43:12 terrance pmxcfs[5214]: [status] crit: cpg_send_message failed: 9
Apr 11 12:43:12 terrance pmxcfs[5214]: [status] crit: cpg_send_message failed: 9
Apr 11 12:43:12 terrance pmxcfs[5214]: [status] crit: cpg_send_message failed: 9
Apr 11 12:43:12 terrance pmxcfs[5214]: [status] crit: cpg_send_message failed: 9
Apr 11 12:43:12 terrance pmxcfs[5214]: [status] crit: cpg_send_message failed: 9
Apr 11 12:43:22 terrance pmxcfs[5214]: [status] crit: cpg_send_message failed: 9
Apr 11 12:43:22 terrance pmxcfs[5214]: [status] crit: cpg_send_message failed: 9
Apr 11 12:43:22 terrance pmxcfs[5214]: [status] crit: cpg_send_message failed: 9
Apr 11 12:43:22 terrance pmxcfs[5214]: [status] crit: cpg_send_message failed: 9
Apr 11 12:43:22 terrance pmxcfs[5214]: [status] crit: cpg_send_message failed: 9
Apr 11 12:43:22 terrance pmxcfs[5214]: [status] crit: cpg_send_message failed: 9
Apr 11 12:43:32 terrance pmxcfs[5214]: [status] crit: cpg_send_message failed: 9
Apr 11 12:43:32 terrance pmxcfs[5214]: [status] crit: cpg_send_message failed: 9
Apr 11 12:43:32 terrance pmxcfs[5214]: [status] crit: cpg_send_message failed: 9
Apr 11 12:43:32 terrance pmxcfs[5214]: [status] crit: cpg_send_message failed: 9
Apr 11 12:43:32 terrance pmxcfs[5214]: [status] crit: cpg_send_message failed: 9
Apr 11 12:43:32 terrance pmxcfs[5214]: [status] crit: cpg_send_message failed: 9
 
OK, got it: if I disable iptables, it seems to work as expected.

But I can't find working rules for iptables. Here is what I have:

iptables -P INPUT DROP
iptables -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
# Corosync
iptables -A INPUT -p udp --dport 5404 -j ACCEPT
iptables -A INPUT -p udp --dport 5405 -j ACCEPT
iptables -A INPUT -p udp --dst 239.192.12.206 --dport 5405 -j ACCEPT

Am I missing something?
 
>iptables -A INPUT -p udp --dst 239.192.12.206 --dport 5405 -j ACCEPT

OK, it works with the line above; I was wrong :)
 
Mar 25 00:22:00 novaprospekt corosync[210647]: [TOTEM ] FAILED TO RECEIVE

mdevilz, I read in a corosync mailing-list post that this line implies a problem with multicast on your network. For me, it was iptables.
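To see how often and when this message appears, the log can be filtered for it. The following is only a minimal sketch: the helper name `totem_failures` is made up for illustration, and it assumes syslog-formatted lines like the ones quoted in this thread.

```shell
#!/bin/sh
# Hypothetical helper: list when corosync reported a TOTEM receive failure.
# Reads syslog-formatted lines on stdin and prints "Mon DD HH:MM:SS host"
# for every line containing the message.
totem_failures() {
    grep -F '[TOTEM ] FAILED TO RECEIVE' | awk '{print $1, $2, $3, $4}'
}

# Self-contained demo using two lines quoted earlier in this thread;
# only the first one contains the TOTEM message.
totem_failures <<'EOF'
Apr 11 12:42:52 terrance corosync[5298]: [TOTEM ] FAILED TO RECEIVE
Apr 11 12:42:54 terrance fenced[5352]: cluster is down, exiting
EOF
```

On a real node the function would be fed the system log instead, for example `totem_failures < /var/log/syslog` (the log path is an assumption and may differ on your system). Comparing the timestamps with the times at which firewall rules were loaded may help to tell a firewall problem from a network one.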
 
OK, it failed again this morning. I have just disabled iptables to see whether it still fails without it.

Here are my rules:

#####################################################################
# Reset
iptables -F
iptables -t nat -F


iptables -P INPUT DROP
iptables -P FORWARD ACCEPT
iptables -P OUTPUT ACCEPT


iptables -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
iptables -A FORWARD -m state --state RELATED,ESTABLISHED -j ACCEPT
iptables -A OUTPUT -m state --state RELATED,ESTABLISHED -j ACCEPT


iptables -A INPUT -i lo -j ACCEPT
iptables -A INPUT -p icmp -j ACCEPT

# ACCEPT connections between cluster nodes
iptables -A INPUT -s 194.253.148.3 -j ACCEPT
iptables -A INPUT -s 194.253.148.17 -j ACCEPT
iptables -A INPUT -s 194.253.148.36 -j ACCEPT

# proxmox admin web
iptables -A INPUT -p tcp -s 194.253.148.0/24 --dport 443 -j ACCEPT
iptables -A INPUT -p tcp -s 194.253.148.0/24 --dport 80 -j ACCEPT
iptables -A INPUT -p tcp -s 194.253.148.0/24 --dport 8006 -j ACCEPT
# ssh
iptables -A INPUT -p tcp -s 194.253.148.0/24 --dport 22 -j ACCEPT
# Corosync
iptables -A INPUT -p udp --dst 239.192.12.206 --dport 5405 -j ACCEPT

#####################################################################

Am I missing something?
 
Did you test with the iptables command I suggested?

iptables -I INPUT -p udp -m state --state NEW -m multiport --dports 5404,5405 -j ACCEPT

is, I think, already covered by:

# ACCEPT connections between cluster nodes
iptables -A INPUT -s 194.253.148.3 -j ACCEPT
iptables -A INPUT -s 194.253.148.17 -j ACCEPT
iptables -A INPUT -s 194.253.148.36 -j ACCEPT

I'm looking at this line:
iptables -A INPUT -p udp --dst 239.192.12.206 --dport 5405 -j ACCEPT

I think it is wrong.

I have replaced it with:
iptables -A INPUT -m pkttype --pkt-type multicast -j ACCEPT

No problems so far; I will post a comment if it really fixes my iptables issue.
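One possible explanation, which is not confirmed anywhere in this thread: with `iptables -P INPUT DROP`, the ruleset posted earlier accepts ICMP but has no rule for IGMP. Membership queries from an IGMP-snooping switch or querier would then be dropped, and the switch may stop forwarding the corosync group to the node after a while. The broad `pkttype multicast` rule accepts those queries as a side effect. A narrower variant could look like the sketch below. The function name `corosync_rules` is hypothetical, ports 5404 and 5405 are the ones used in this thread, and the script only prints the rules instead of applying them.

```shell
#!/bin/sh
# Hypothetical helper: print (do not apply) narrower INPUT rules for corosync.
# Usage: corosync_rules NODE_IP...
corosync_rules() {
    # Accept IGMP so the kernel can see and answer membership queries
    # (assumption: an IGMP-snooping switch or querier is on the network).
    echo "iptables -A INPUT -p igmp -j ACCEPT"
    for node in "$@"; do
        # Accept corosync UDP traffic from each cluster node; this matches
        # both unicast and multicast packets sent by that node.
        echo "iptables -A INPUT -s $node -p udp -m multiport --dports 5404,5405 -j ACCEPT"
    done
}

# Demo with the node addresses from the ruleset posted above.
corosync_rules 194.253.148.3 194.253.148.17 194.253.148.36
```

After reviewing the printed rules, they could be applied as root with `corosync_rules 194.253.148.3 194.253.148.17 194.253.148.36 | sh`. Printing first keeps the sketch safe to run on a machine where a wrong rule might lock you out.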
 
