Lost node, now cannot re-add it

rugby

I've followed the steps in the wiki and made sure multicast is enabled on my switch, but I cannot add my replaced node back into the cluster. It hangs at "Waiting for quorum". Here's the output from the new replacement node:

root@proxmox03:~# pvecm add X.X.X.X
The authenticity of host 'X.X.X.X (X.X.X.X)' can't be established.
RSA key fingerprint is 7d:35:d2:e1:f1:d4:24:51:92:97:7e:7e:b4:66:57:38.
Are you sure you want to continue connecting (yes/no)? yes
root@X.X.X.X's password:
copy corosync auth key
stopping pve-cluster service
Stopping pve cluster filesystem: pve-cluster.
backup old database
Starting pve cluster filesystem : pve-cluster.
Starting cluster:
Checking if cluster has been disabled at boot... [ OK ]
Checking Network Manager... [ OK ]
Global setup... [ OK ]
Loading kernel modules... [ OK ]
Mounting configfs... [ OK ]
Starting cman... [ OK ]
Waiting for quorum... Timed-out waiting for cluster
[FAILED]
waiting for quorum...

Here's the output on one of the cluster members:


root@proxmox01:~# pvecm nodes
Node  Sts   Inc   Joined               Name
   1   M    516   2013-11-16 08:37:47  proxmox01
   2   M    564   2013-12-23 12:35:59  proxmox02
   3   X      0                        proxmox03
   4   M    564   2013-12-23 12:35:59  proxmox04

Ideas?
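
For anyone reading along, the listing above can be filtered down to just the nodes that are not joined. This is only a rough sketch, assuming the column layout shown in this thread (Node, Sts, Inc, Joined, Name, with Joined left empty for a node that never joined). The function name non_member_nodes is made up for illustration.

```shell
#!/bin/sh
# Sketch: read `pvecm nodes` output on stdin and print the names of nodes
# whose status column (Sts) is anything other than "M" (member).
# Assumption: rows start with a numeric node ID and the node name is the
# last field, as in the output pasted in this thread.
non_member_nodes() {
    awk '$1 ~ /^[0-9]+$/ && $2 != "M" { print $NF }'
}

# Hypothetical usage on a cluster member:
#   pvecm nodes | non_member_nodes
```

With the output above, this would print only proxmox03.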
 
Try

#clustat

and see if "Member Status: Quorate" is there.

Otherwise, try

#pvecm e 1

to temporarily make the cluster quorate by hand (this sets the expected votes to 1).

Then you should be able to join the new node, and the quorum should be reset automatically.

Marco
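
The check described above could be scripted roughly as follows. This is a minimal sketch, assuming clustat prints a "Member Status:" line in the form shown in this thread. The function name quorum_state is hypothetical.

```shell
#!/bin/sh
# Sketch: read clustat output on stdin and print the quorum state
# ("Quorate" or "Inquorate") taken from the "Member Status:" line.
quorum_state() {
    sed -n 's/^Member Status:[[:space:]]*//p'
}

# Hypothetical usage on a node (the override is temporary, as described above):
#   if [ "$(clustat | quorum_state)" != "Quorate" ]; then
#       pvecm e 1
#   fi
```

Keeping the parsing in its own function means it can be tried on pasted output before it is wired to anything that changes cluster state.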
 
root@proxmox03:~# clustat
Cluster Status for Clusterfish @ Mon Dec 30 10:53:22 2013
Member Status: Inquorate

I ran pvecm e 1 and the status changed to:


root@proxmox03:~# clustat
Cluster Status for Clusterfish @ Mon Dec 30 10:57:13 2013
Member Status: Quorate


Member Name   ID   Status
------ ----   ---- ------
proxmox01        1 Offline
proxmox02        2 Offline
proxmox03        3 Online, Local
proxmox04        4 Offline

But it doesn't join, and when I try the add again it says "authentication key already exists."
 
OK, new development: I restarted proxmox04 and now it sees proxmox03, but not proxmox02 or proxmox01.

root@proxmox04:~# clustat
Cluster Status for Clusterfish @ Mon Dec 30 11:14:17 2013
Member Status: Inquorate


Member Name   ID   Status
------ ----   ---- ------
proxmox01        1 Offline
proxmox02        2 Offline
proxmox03        3 Online
proxmox04        4 Online, Local


root@proxmox04:~#

Looks like restarting the other 2 nodes should bring it all back together.

Update: I rebooted the other two nodes and all four show up now. I think somewhere along the line somebody here had updated some of the hosts, and that caused the problem.
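
To confirm from any one node that everything really is back, something like the following could list whichever members clustat still does not show as Online. It is a sketch only, assuming the member table format pasted earlier in this thread (name, numeric ID, status). The function name offline_members is made up for illustration.

```shell
#!/bin/sh
# Sketch: read clustat output on stdin and print the name of every member
# whose status does not start with "Online".
# Assumption: member rows have a numeric ID in the second column, which
# also filters out the header, separator, and status lines.
offline_members() {
    awk '$2 ~ /^[0-9]+$/ && $3 !~ /^Online/ { print $1 }'
}

# Hypothetical usage:
#   clustat | offline_members
```

An empty result would mean every member is reported as Online from that node's point of view.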
 
