Building cluster fails (quorum)

Patschi

Member
Jul 23, 2013
Austria
pkern.at
Hello,

I've been trying for hours to get a Proxmox cluster working, but every time I try to add my second node to the cluster, it fails while waiting for quorum. Because my provider does not allow multicast traffic, I had to set up a VPN to get this working; I also tested multicast over it successfully with "ssmping". For the VPN tunnel I'm using "tinc", a small and lightweight VPN daemon, following this tutorial: http://forum.ovh.co.uk/showthread.php?7071-Poor-man-s-Proxmox-cluster-with-NAT. All internal IPs of the VPN tunnel are reachable.
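For reference, the usual way to verify multicast across the tunnel on PVE 3.x is the ssmping package: run the responder on one node and probe it from the other (a two-host transcript, not runnable standalone; the group address 224.0.2.1 is the conventional test group, not something specific to my setup):

```
# on node1 (192.168.99.1): start the test responder
ssmpingd

# on node2: send multicast probes towards node1
# replies tagged "multicast" mean multicast is actually passing through the tunnel;
# if only "unicast" replies come back, multicast is being dropped
asmping 224.0.2.1 192.168.99.1
```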

Setup
  • 2x servers (identical hardware, same provider; minimal Debian Wheezy with Proxmox VE installed manually on top, Community subscription)
  • VPN tunnel with tinc (works: ping, ssh, multicast, ...)
  • node1 ip: 192.168.99.1 (master)
  • node2 ip: 192.168.99.2 (should be added to cluster)
  • my cluster name: MainCluster
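One tinc detail that is easy to miss: tinc only forwards multicast frames when it runs in switch mode; the default router mode handles unicast IPv4/IPv6 only. A minimal sketch of the relevant tinc.conf setting, assuming a netname of "vpn" (the netname and Name values are placeholders for whatever the tutorial used):

```ini
# /etc/tinc/vpn/tinc.conf  (netname "vpn" is an assumption)
Name = node2
ConnectTo = node1
# switch mode bridges whole Ethernet frames, so multicast and broadcast pass through
Mode = switch
```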

Already tested
  • cleared iptables rules
  • testing around with different /etc/hosts entries
  • ...and much other things I forgot...

pvecm command output
When I try to add a node to my cluster, I get the following output:
Code:
# pvecm add node1
copy corosync auth key
stopping pve-cluster service
[ ok ] Stopping pve cluster filesystem: pve-cluster.
backup old database
[ ok ] Starting pve cluster filesystem : pve-cluster.
Starting cluster:
   Checking if cluster has been disabled at boot... [  OK  ]
   Checking Network Manager... [  OK  ]
   Global setup... [  OK  ]
   Loading kernel modules... [  OK  ]
   Mounting configfs... [  OK  ]
   Starting cman... [  OK  ]
   Waiting for quorum... Timed-out waiting for cluster
[FAILED]
waiting for quorum...^C

Log (syslog):
Code:
Dec 15 19:00:10 node2 pmxcfs[15746]: [main] notice: teardown filesystem
Dec 15 19:00:11 node2 pmxcfs[15746]: [main] notice: exit proxmox configuration filesystem (0)
Dec 15 19:00:12 node2 pmxcfs[15860]: [quorum] crit: quorum_initialize failed: 6
Dec 15 19:00:12 node2 pmxcfs[15860]: [quorum] crit: can't initialize service
Dec 15 19:00:12 node2 pmxcfs[15860]: [confdb] crit: confdb_initialize failed: 6
Dec 15 19:00:12 node2 pmxcfs[15860]: [quorum] crit: can't initialize service
Dec 15 19:00:12 node2 pmxcfs[15860]: [dcdb] crit: cpg_initialize failed: 6
Dec 15 19:00:12 node2 pmxcfs[15860]: [quorum] crit: can't initialize service
Dec 15 19:00:12 node2 pmxcfs[15860]: [dcdb] crit: cpg_initialize failed: 6
Dec 15 19:00:12 node2 pmxcfs[15860]: [quorum] crit: can't initialize service
Dec 15 19:00:12 node2 kernel: DLM (built Oct 14 2013 08:10:28) installed
Dec 15 19:00:12 node2 corosync[15954]:   [MAIN  ] Corosync Cluster Engine ('1.4.5'): started and ready to provide service.
Dec 15 19:00:12 node2 corosync[15954]:   [MAIN  ] Corosync built-in features: nss
Dec 15 19:00:12 node2 corosync[15954]:   [MAIN  ] Successfully read config from /etc/cluster/cluster.conf
Dec 15 19:00:12 node2 corosync[15954]:   [MAIN  ] Successfully parsed cman config
Dec 15 19:00:12 node2 corosync[15954]:   [MAIN  ] Successfully configured openais services to load
Dec 15 19:00:12 node2 corosync[15954]:   [TOTEM ] Initializing transport (UDP/IP Multicast).
Dec 15 19:00:12 node2 corosync[15954]:   [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Dec 15 19:00:12 node2 corosync[15954]:   [TOTEM ] The network interface [192.168.99.2] is now up.
Dec 15 19:00:12 node2 corosync[15954]:   [QUORUM] Using quorum provider quorum_cman
Dec 15 19:00:12 node2 corosync[15954]:   [SERV  ] Service engine loaded: corosync cluster quorum service v0.1
Dec 15 19:00:12 node2 corosync[15954]:   [CMAN  ] CMAN 1364188437 (built Mar 25 2013 06:14:01) started
Dec 15 19:00:12 node2 corosync[15954]:   [SERV  ] Service engine loaded: corosync CMAN membership service 2.90
Dec 15 19:00:12 node2 corosync[15954]:   [SERV  ] Service engine loaded: openais cluster membership service B.01.01
Dec 15 19:00:12 node2 corosync[15954]:   [SERV  ] Service engine loaded: openais event service B.01.01
Dec 15 19:00:12 node2 corosync[15954]:   [SERV  ] Service engine loaded: openais checkpoint service B.01.01
Dec 15 19:00:12 node2 corosync[15954]:   [SERV  ] Service engine loaded: openais message service B.03.01
Dec 15 19:00:12 node2 corosync[15954]:   [SERV  ] Service engine loaded: openais distributed locking service B.03.01
Dec 15 19:00:12 node2 corosync[15954]:   [SERV  ] Service engine loaded: openais timer service A.01.01
Dec 15 19:00:12 node2 corosync[15954]:   [SERV  ] Service engine loaded: corosync extended virtual synchrony service
Dec 15 19:00:12 node2 corosync[15954]:   [SERV  ] Service engine loaded: corosync configuration service
Dec 15 19:00:12 node2 corosync[15954]:   [SERV  ] Service engine loaded: corosync cluster closed process group service v1.01
Dec 15 19:00:12 node2 corosync[15954]:   [SERV  ] Service engine loaded: corosync cluster config database access v1.01
Dec 15 19:00:12 node2 corosync[15954]:   [SERV  ] Service engine loaded: corosync profile loading service
Dec 15 19:00:12 node2 corosync[15954]:   [QUORUM] Using quorum provider quorum_cman
Dec 15 19:00:12 node2 corosync[15954]:   [SERV  ] Service engine loaded: corosync cluster quorum service v0.1
Dec 15 19:00:12 node2 corosync[15954]:   [MAIN  ] Compatibility mode set to whitetank.  Using V1 and V2 of the synchronization engine.
Dec 15 19:00:12 node2 corosync[15954]:   [CLM   ] CLM CONFIGURATION CHANGE
Dec 15 19:00:12 node2 corosync[15954]:   [CLM   ] New Configuration:
Dec 15 19:00:12 node2 corosync[15954]:   [CLM   ] Members Left:
Dec 15 19:00:12 node2 corosync[15954]:   [CLM   ] Members Joined:
Dec 15 19:00:12 node2 corosync[15954]:   [CLM   ] CLM CONFIGURATION CHANGE
Dec 15 19:00:12 node2 corosync[15954]:   [CLM   ] New Configuration:
Dec 15 19:00:12 node2 corosync[15954]:   [CLM   ] #011r(0) ip(192.168.99.2)
Dec 15 19:00:12 node2 corosync[15954]:   [CLM   ] Members Left:
Dec 15 19:00:12 node2 corosync[15954]:   [CLM   ] Members Joined:
Dec 15 19:00:12 node2 corosync[15954]:   [CLM   ] #011r(0) ip(192.168.99.2)
Dec 15 19:00:12 node2 corosync[15954]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.
Dec 15 19:00:12 node2 corosync[15954]:   [QUORUM] Members[1]: 2
Dec 15 19:00:12 node2 corosync[15954]:   [QUORUM] Members[1]: 2
Dec 15 19:00:12 node2 corosync[15954]:   [CPG   ] chosen downlist: sender r(0) ip(192.168.99.2) ; members(old:0 left:0)
Dec 15 19:00:12 node2 corosync[15954]:   [MAIN  ] Completed service synchronization, ready to provide service.
Dec 15 19:00:18 node2 pmxcfs[15860]: [status] notice: update cluster info (cluster name  MainCluster, version = 23)
Dec 15 19:00:18 node2 pmxcfs[15860]: [dcdb] notice: members: 2/15860
Dec 15 19:00:18 node2 pmxcfs[15860]: [dcdb] notice: all data is up to date
Dec 15 19:00:18 node2 pmxcfs[15860]: [dcdb] notice: members: 2/15860
Dec 15 19:00:18 node2 pmxcfs[15860]: [dcdb] notice: all data is up to date

/etc/hosts
On both nodes the /etc/hosts file is identical:
Code:
127.0.0.1 localhost

# Proxmox nodes
192.168.99.1 node1
192.168.99.2 node2

# IPv6
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

Output of pveversion -v:
Code:
proxmox-ve-2.6.32: 3.1-114 (running kernel: 2.6.32-26-pve)
pve-manager: 3.1-24 (running version: 3.1-24/060bd5a6)
pve-kernel-2.6.32-24-pve: 2.6.32-111
pve-kernel-2.6.32-25-pve: 2.6.32-113
pve-kernel-2.6.32-22-pve: 2.6.32-107
pve-kernel-2.6.32-26-pve: 2.6.32-114
pve-kernel-2.6.32-23-pve: 2.6.32-109
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.0-2
pve-cluster: 3.0-8
qemu-server: 3.1-8
pve-firmware: 1.0-23
libpve-common-perl: 3.0-9
libpve-access-control: 3.0-8
libpve-storage-perl: 3.0-18
pve-libspice-server1: 0.12.4-2
vncterm: 1.1-6
vzctl: 4.0-1pve4
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.4-17
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.1-1

I'm running out of ideas. Thanks for any suggestions!
 
Hello mir,

Thanks for your response. The content of cluster.conf is:
Code:
<?xml version="1.0"?>
<cluster name="MainCluster" config_version="23">
  <cman keyfile="/var/lib/pve-cluster/corosync.authkey">
  </cman>
  <clusternodes>
    <clusternode name="node1" votes="1" nodeid="1"/>
    <clusternode name="node2" votes="1" nodeid="2"/>
  </clusternodes>
</cluster>
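If multicast over the tunnel stays unreliable, cman can be told to use unicast UDP instead, which sidesteps multicast entirely. A sketch of the modified cluster.conf, assuming only the transport attribute is added (config_version must be bumped and cman restarted on both nodes for it to take effect):

```xml
<?xml version="1.0"?>
<cluster name="MainCluster" config_version="24">
  <!-- transport="udpu" switches corosync/cman from multicast to unicast UDP -->
  <cman keyfile="/var/lib/pve-cluster/corosync.authkey" transport="udpu">
  </cman>
  <clusternodes>
    <clusternode name="node1" votes="1" nodeid="1"/>
    <clusternode name="node2" votes="1" nodeid="2"/>
  </clusternodes>
</cluster>
```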
 
