Proxmox cluster node won't join cluster after reboot

philister

Member
Jun 18, 2012
Hello all,

I have two servers in my test lab running Proxmox VE 2.1. I configured them as a two-node cluster:

node1: pvecm create <cluster name>
node2: pvecm add <ip-of-node-1>

So far so good, everything worked as expected. But then, I rebooted node2 (which wasn't running any VMs), and now it can't join the cluster. During bootup it says "timed out waiting for quorum", the same message I get when trying to start cman (/etc/init.d/cman start). The whole cluster was inoperable at this point. I couldn't create a new VM on node1 (Message: Cluster not ready - no quorum?). I couldn't remove node2 from the cluster on the CLI (Message: Cluster not ready - no quorum?).

I then issued the following commands on both nodes:

pvecm expected 1
pvecm delnode <other node>

That at least made the separate nodes functional again, but they're not clustered, of course.

It would be great if anyone could

1. help me to get the two nodes back in the cluster
2. tell me if this is expected behaviour of a cluster: reboot one server and lose the cluster functionality altogether

Or maybe I am missing something?


Thanks a lot for any advice.
Phil

PS.
I do not have a "quorum disk".
 
Hi,
perhaps trouble with multicast? Search for postings around multicast/unicast.

Udo
 

Hi Udo,

Thanks for the hint. I tested it and this is the output:

Code:
root@pmx1:~# asmping 224.0.2.1 192.168.0.76
asmping joined (S,G) = (*,224.0.2.234)
pinging 192.168.0.76 from 192.168.0.77
  unicast from 192.168.0.76, seq=1 dist=0 time=0.272 ms
  unicast from 192.168.0.76, seq=2 dist=0 time=0.319 ms
multicast from 192.168.0.76, seq=2 dist=0 time=0.329 ms
  unicast from 192.168.0.76, seq=3 dist=0 time=0.274 ms
multicast from 192.168.0.76, seq=3 dist=0 time=0.284 ms
  unicast from 192.168.0.76, seq=4 dist=0 time=0.238 ms
multicast from 192.168.0.76, seq=4 dist=0 time=0.248 ms
  unicast from 192.168.0.76, seq=5 dist=0 time=0.178 ms
multicast from 192.168.0.76, seq=5 dist=0 time=0.187 ms
  unicast from 192.168.0.76, seq=6 dist=0 time=0.294 ms
multicast from 192.168.0.76, seq=6 dist=0 time=0.304 ms
  unicast from 192.168.0.76, seq=7 dist=0 time=0.237 ms
multicast from 192.168.0.76, seq=7 dist=0 time=0.246 ms
  unicast from 192.168.0.76, seq=8 dist=0 time=0.288 ms
multicast from 192.168.0.76, seq=8 dist=0 time=0.299 ms
^C
--- 192.168.0.76 statistics ---
8 packets transmitted, time 7605 ms
unicast:
   8 packets received, 0% packet loss
   rtt min/avg/max/std-dev = 0.178/0.262/0.319/0.044 ms
multicast:
   7 packets received, 0% packet loss since first mc packet (seq 2) recvd
   rtt min/avg/max/std-dev = 0.187/0.271/0.329/0.044 ms

Looks OK to me. Any other ideas how I can rejoin the two nodes? Thanks in advance.

Phil
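
For anyone repeating the multicast test, here is a minimal sketch of pulling the multicast reply count out of the asmping summary rather than reading it by eye. The function name multicast_received is made up for illustration, and the parsing assumes the summary layout shown in the output above (a "multicast:" line followed by "N packets received, ..."); other asmping versions may print it differently.

```shell
#!/bin/sh
# Hypothetical helper (not part of asmping or Proxmox): reads asmping output
# on stdin and prints how many multicast replies the summary reports.
# Assumes the summary layout from the post above; prints 0 if none is found.
multicast_received() {
    awk '
        /^multicast:/ { in_mc = 1; next }            # start of the multicast summary
        in_mc && /packets received/ { print $1; found = 1; exit }
        END { if (!found) print 0 }
    '
}

# Usage with the summary lines quoted in this thread.
count=$(multicast_received <<'EOF'
unicast:
   8 packets received, 0% packet loss
multicast:
   7 packets received, 0% packet loss since first mc packet (seq 2) recvd
EOF
)
echo "multicast replies: $count"
```

A count of 0 would point at the multicast problem Udo suggested; a non-zero count, as here, only says multicast worked between the two hosts while the test was running.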
 
Can someone tell me how to remove the cluster config altogether? I can only find information on this regarding pve 1.x.

It would be nice if it was as easy to destroy a cluster as it is to create one. I would simply like to tell my pve node: "You're alone my friend, there is no cluster mate you have to wait for / care about." How can this be accomplished? It can't be that hard, can it?

Thank you very much.
 
Does anybody know how to recreate a cluster under Proxmox VE 2.1? I'd really appreciate a hint here.

And apart from that: Is that the way it's supposed to work? Reboot one node in a cluster and screw the whole setup? Would the behaviour have been any different if I'd had three cluster nodes (which is my plan for production environment)?
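
On the two-node versus three-node question, the vote arithmetic behind the "no quorum" message can be sketched roughly as follows. This assumes one vote per node and the usual majority rule (quorum = expected votes / 2 + 1, rounded down), with no quorum disk and no special two-node mode. The function names quorum_needed and is_quorate are invented for this example and are not pvecm commands.

```shell
#!/bin/sh
# Majority quorum for a given number of expected votes.
# Assumption: every node carries exactly one vote.
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

# Succeeds (exit status 0) if the votes still online reach quorum.
# $1 = expected votes, $2 = votes currently online
is_quorate() {
    [ "$2" -ge "$(quorum_needed "$1")" ]
}

# Compare a two-node and a three-node cluster with one node rebooting.
for nodes in 2 3; do
    online=$(( nodes - 1 ))
    if is_quorate "$nodes" "$online"; then
        echo "$nodes nodes, $online online: quorate"
    else
        echo "$nodes nodes, $online online: no quorum"
    fi
done
```

Under these assumptions a two-node cluster needs both votes, so a single surviving node is blocked, while a three-node cluster needs two votes and keeps running with one node down. It also shows why "pvecm expected 1" unblocked each node: with one expected vote, the quorum drops to 1.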
 
