Cluster node lost after reboot

samontetro

I'm setting up a new cluster with Proxmox 4.4-5 on 3 identical nodes. The cluster was first set up in unicast (as multicast was not working), then moved successfully to multicast now that the network configuration allows it.
I'm at the step of installing Ceph.

Yesterday I rebooted the third node, and since this reboot it can no longer join the cluster.
In the web interface connected to node1 or node2, I see node3 but with a red cross. I can still check information on it (available storage, resources, etc.).
In the web interface connected to node3, node1 and node2 have the red cross, but their information is also available (storage, etc.).

The ring ID on node3 differs from the one on node1, which is the same as on node2.
Is this the problem? How can I get back to a 3-node cluster?
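
As an illustration of the ring ID comparison described above, here is a minimal shell sketch. It assumes the "Ring ID" line looks like the one in the pvecm output quoted in this post; the helper name ring_id is hypothetical and not part of Proxmox. A differing ring ID would usually suggest that the nodes sit in separate corosync memberships, but treat that as a hint, not a diagnosis.

```shell
#!/bin/sh
# Hypothetical helper: extract the "Ring ID" value from `pvecm status`
# output read on stdin (format assumed from the output quoted in this post).
ring_id() {
    awk -F': *' '/^Ring ID:/ { print $2; exit }'
}

# Sample lines copied from the outputs in this post, standing in for
# `pvecm status | ring_id` run on each node.
node2_ring=$(printf 'Node ID: 0x00000002\nRing ID: 1/2484\n' | ring_id)
node3_ring=$(printf 'Node ID: 0x00000003\nRing ID: 3/126120\n' | ring_id)

if [ "$node2_ring" = "$node3_ring" ]; then
    echo "same ring: $node2_ring"
else
    echo "different rings: $node2_ring vs $node3_ring"
fi
```

On a live node, the same helper could be fed directly with: pvecm status | ring_id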

Thanks

Patrick

On node2:
root@proxmost2:/etc/network# pvecm status
Quorum information
------------------
Date: Fri Mar 10 11:17:17 2017
Quorum provider: corosync_votequorum
Nodes: 2
Node ID: 0x00000002
Ring ID: 1/2484
Quorate: Yes

Votequorum information
----------------------
Expected votes: 3
Highest expected: 3
Total votes: 2
Quorum: 2
Flags: Quorate

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 192.168.22.225
0x00000002 1 192.168.22.226 (local)


On node3:
root@proxmost3:~# pvecm status
Quorum information
------------------
Date: Fri Mar 10 11:19:37 2017
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000003
Ring ID: 3/126120
Quorate: No

Votequorum information
----------------------
Expected votes: 3
Highest expected: 3
Total votes: 1
Quorum: 2 Activity blocked
Flags:

Membership information
----------------------
Nodeid Votes Name
0x00000003 1 192.168.22.227 (local)
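
The two outputs above are consistent with a simple majority rule: with 3 expected votes, the quorum threshold is 2, so the node1 + node2 partition (2 votes) stays quorate while node3 alone (1 vote) is blocked. A small sketch of that arithmetic, assuming the default majority calculation with no special votequorum options; the function names are made up for this example:

```shell
#!/bin/sh
# Majority threshold for a given number of expected votes: floor(n/2) + 1.
# Assumes default majority behaviour (no two_node or similar options).
quorum_threshold() {
    echo $(( $1 / 2 + 1 ))
}

# Prints "yes" when the total votes reach the threshold, "no" otherwise.
# Arguments: expected votes, total votes currently present.
is_quorate() {
    if [ "$2" -ge "$(quorum_threshold "$1")" ]; then
        echo yes
    else
        echo no
    fi
}

echo "threshold for 3 expected votes: $(quorum_threshold 3)"
echo "node1 + node2 (2 votes): $(is_quorate 3 2)"
echo "node3 alone (1 vote): $(is_quorate 3 1)"
```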

 
I tried updating all the nodes. The situation got worse on node3: I ran into "crit: cpg_initialize failed: 2" messages returned by "systemctl status pve-cluster", and "Cannot initialize CMAP service" messages returned by the "pvecm status" command.

So (I'm not in production) I decided to try removing and re-adding node3. On node3 I followed the instructions provided in:
http://blog.sjas.de/posts/proxmox-delete-and-recreate-cluster.html

systemctl stop pvestatd.service
systemctl stop pvedaemon.service
systemctl stop pve-cluster.service
systemctl stop corosync
systemctl stop pve-cluster
pmxcfs -l
rm /etc/pve/corosync.conf
rm /etc/corosync/*
rm /var/lib/corosync/*
rm -rf /etc/pve/nodes/*
sqlite3 /var/lib/pve-cluster/config.db "select * from tree where name='corosync.conf'"
sqlite3 /var/lib/pve-cluster/config.db "delete from tree where name='corosync.conf'"
sqlite3 /var/lib/pve-cluster/config.db "select * from tree where name='corosync.conf'"
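
Since the commands above delete cluster state that cannot be recovered afterwards, it may be worth copying files such as /var/lib/pve-cluster/config.db aside first. A minimal sketch; backup_file is a hypothetical helper, not a Proxmox tool, and it should only be run on the database while pve-cluster is stopped:

```shell
#!/bin/sh
# Hypothetical helper: copy a file next to itself with a timestamp suffix,
# preserving mode and mtime, and print the path of the copy.
backup_file() {
    src=$1
    dest="$src.bak.$(date +%Y%m%d%H%M%S)"
    cp -p "$src" "$dest" && echo "$dest"
}

# Example (run before the rm and sqlite3 delete commands):
#   backup_file /var/lib/pve-cluster/config.db
```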


But I think the database was corrupted, as "select * from tree where name='corosync.conf'" returned nothing even before I ran the delete.

Then I rebooted node3 successfully and re-added it to the cluster with "pvecm add proxmost1.hmg.priv".
All seems to work fine now:

root@proxmost3:~# pvecm status
Quorum information
------------------
Date: Fri Mar 10 15:36:58 2017
Quorum provider: corosync_votequorum
Nodes: 3
Node ID: 0x00000003
Ring ID: 1/145852
Quorate: Yes

Votequorum information
----------------------
Expected votes: 3
Highest expected: 3
Total votes: 3
Quorum: 2
Flags: Quorate

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 192.168.22.225
0x00000002 1 192.168.22.226
0x00000003 1 192.168.22.227 (local)


The problem seems solved, but not understanding why I got this failure is quite disappointing for me. :(
 
