Creating a cluster of 2 nodes doesn't work

reinsle

Aug 2, 2012
Hi Forum,

I am trying to create a cluster of 2 nodes, based on Debian Squeeze with the current Proxmox 2.1 installed (pveversion shows pve-manager/2.1/bdd3663d on each node).
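For reference, this is roughly the sequence I ran (the cluster name here is only a placeholder, not the real name I used):

--- cut ---
# on the first node (olaf)
root@olaf ~ # pvecm create <clustername>

# on the second node (egon)
root@egon ~ # pvecm add olaf
--- cut ---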

pvecm create on the first node works. But while adding the second node to the cluster, I receive:

--- cut ---
root@egon ~ # pvecm add olaf
copy corosync auth key
stopping pve-cluster service
Stopping pve cluster filesystem: pve-cluster.
backup old database
Starting pve cluster filesystem : pve-cluster.
Starting cluster:
Checking if cluster has been disabled at boot... [ OK ]
Checking Network Manager... [ OK ]
Global setup... [ OK ]
Loading kernel modules... [ OK ]
Mounting configfs... [ OK ]
Starting cman... [ OK ]
Waiting for quorum... Timed-out waiting for cluster
[FAILED]
Stopping cluster:
Stopping dlm_controld... [ OK ]
Stopping fenced... [ OK ]
Stopping cman... [ OK ]
Waiting for corosync to shutdown:[ OK ]
Unloading kernel modules... [ OK ]
Unmounting configfs... [ OK ]
waiting for quorum...
--- cut ---
(The output here is from after removing /etc/cluster and /var/lib/pve-cluster and restarting, but a freshly installed system shows the same.)

and after that I wait for hours :)

Searching the forum I found information about multicast not working, but the two servers are connected to a single unmanaged switch. The firewall (iptables) is not active.
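Just to document how I checked the firewall (a quick sketch of the check, done on both nodes):

--- cut ---
root@egon ~ # iptables -L -n        # all chains empty, policy ACCEPT
root@egon ~ # iptables -t nat -L -n # likewise empty
--- cut ---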

Does anybody have an idea how to resolve the problem?

Thanks a lot

Robert
 
Hi Tom,

using the IP address gives nearly the same result:

--- cut ---
root@egon ~ # pvecm add 192.168.99.11
node egon already defined
copy corosync auth key
stopping pve-cluster service
Stopping pve cluster filesystem: pve-cluster.
backup old database
Starting pve cluster filesystem : pve-cluster.
Starting cluster:
Checking if cluster has been disabled at boot... [ OK ]
Checking Network Manager... [ OK ]
Global setup... [ OK ]
Loading kernel modules... [ OK ]
Mounting configfs... [ OK ]
Starting cman... [ OK ]
Waiting for quorum... Timed-out waiting for cluster
[FAILED]
Stopping cluster:
Stopping dlm_controld... [ OK ]
Stopping fenced... [ OK ]
Stopping cman... [ OK ]
Waiting for corosync to shutdown:[ OK ]
Unloading kernel modules... [ OK ]
Unmounting configfs... [ OK ]
waiting for quorum...
--- cut ---
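(I notice the "node egon already defined" line from the earlier attempt; if it helps I can also post the cluster configuration, which as far as I understand lives here:)

--- cut ---
root@egon ~ # cat /etc/pve/cluster.conf
--- cut ---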


Robert
 
Hi Tom,

On the node running asmping (egon):

--- cut ---
root@egon ~ # asmping 224.0.2.1 192.168.99.11
asmping joined (S,G) = (*,224.0.2.234)
pinging 192.168.99.11 from 192.168.99.13
unicast from 192.168.99.11, seq=1 dist=0 time=0.291 ms
unicast from 192.168.99.11, seq=2 dist=0 time=0.291 ms
multicast from 192.168.99.11, seq=2 dist=0 time=0.332 ms
unicast from 192.168.99.11, seq=3 dist=0 time=0.178 ms
multicast from 192.168.99.11, seq=3 dist=0 time=0.202 ms
unicast from 192.168.99.11, seq=4 dist=0 time=0.201 ms
multicast from 192.168.99.11, seq=4 dist=0 time=0.228 ms
unicast from 192.168.99.11, seq=5 dist=0 time=0.211 ms
multicast from 192.168.99.11, seq=5 dist=0 time=0.238 ms
unicast from 192.168.99.11, seq=6 dist=0 time=0.219 ms
multicast from 192.168.99.11, seq=6 dist=0 time=0.266 ms
unicast from 192.168.99.11, seq=7 dist=0 time=0.186 ms
multicast from 192.168.99.11, seq=7 dist=0 time=0.226 ms
unicast from 192.168.99.11, seq=8 dist=0 time=0.206 ms
multicast from 192.168.99.11, seq=8 dist=0 time=0.237 ms
^C
--- 192.168.99.11 statistics ---
8 packets transmitted, time 7508 ms
unicast:
8 packets received, 0% packet loss
rtt min/avg/max/std-dev = 0.178/0.222/0.291/0.045 ms
multicast:
7 packets received, 0% packet loss since first mc packet (seq 2) recvd
rtt min/avg/max/std-dev = 0.202/0.247/0.332/0.038 ms
root@egon ~ #
--- cut ---

On the node running ssmpingd (olaf):

--- cut ---
root@olaf ~ # ssmpingd
received request from 192.168.99.13
received request from 192.168.99.13
received request from 192.168.99.13
received request from 192.168.99.13
received request from 192.168.99.13
received request from 192.168.99.13
received request from 192.168.99.13
received request from 192.168.99.13
^C
root@olaf ~ #
--- cut ---

Robert

P.S. Thanks a lot for your help :)
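P.P.S. I can repeat the test in the opposite direction as well if that helps, roughly like this:

--- cut ---
# ssmpingd on egon, asmping on olaf
root@egon ~ # ssmpingd
root@olaf ~ # asmping 224.0.2.1 192.168.99.13
--- cut ---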
 
Hi Dietmar,

--- cut ---
root@olaf ~ # /etc/init.d/cman status
cluster is running.
--- cut ---

so I think cman is running.
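If it helps, I can also post the quorum/membership details; roughly these commands (output omitted here):

--- cut ---
root@olaf ~ # pvecm status   # cluster name, node count, quorum state
root@olaf ~ # pvecm nodes    # membership list
--- cut ---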

Robert
 
Hi,

oh, I hadn't checked that :))

--- cut ---
Aug 3 10:47:01 egon pmxcfs[2105]: [status] crit: cpg_send_message failed: 9
Aug 3 10:47:01 egon pmxcfs[2105]: [status] crit: cpg_send_message failed: 9
Aug 3 10:47:01 egon pmxcfs[2105]: [status] crit: cpg_send_message failed: 9
Aug 3 10:47:01 egon pmxcfs[2105]: [status] crit: cpg_send_message failed: 9
Aug 3 10:47:01 egon pmxcfs[2105]: [status] crit: cpg_send_message failed: 9
Aug 3 10:47:01 egon pmxcfs[2105]: [status] crit: cpg_send_message failed: 9
Aug 3 10:47:11 egon pmxcfs[2105]: [status] crit: cpg_send_message failed: 9
--- cut ---

This is from the node that I am trying to join to the other one.

On the other node (the one where pvecm create was run) there is nothing special in the log.
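(For completeness, this is roughly how I looked through the logs on both nodes:)

--- cut ---
root@olaf ~ # grep -E 'pmxcfs|corosync|cman' /var/log/syslog | tail -n 50
--- cut ---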

Do you know what this tells me?

Thanks a lot

Robert
 
No pastebin please. If you want to provide logs, attach them as a zip.
 
