Cluster suddenly stops existing

AngryAdm

Sep 5, 2020
I had a 3-node cluster set up. When I added a 4th node, the 3rd node suddenly dropped out of the cluster, and the cluster no longer seems to exist.

I reinstalled all 4 nodes from scratch and joined them to a new cluster, and now I am in the same situation.

What is going on here? The "join information" is also greyed out...
1604269579216.png
 

Did you reinstall from the ISO, including wiping the /etc/pve/ directory?
 
Can you please post the output of
Code:
pvecm status
from the bad node and at least one other node?
 
The cluster seems to have started existing again magically while I was sleeping... Join information is no longer greyed out.

root@pve01:~# pvecm status
Cluster information
-------------------
Name: ALMA-MATER
Config Version: 4
Transport: knet
Secure auth: on

Quorum information
------------------
Date: Mon Nov 2 09:58:04 2020
Quorum provider: corosync_votequorum
Nodes: 4
Node ID: 0x00000001
Ring ID: 1.b6
Quorate: Yes

Votequorum information
----------------------
Expected votes: 4
Highest expected: 4
Total votes: 4
Quorum: 3
Flags: Quorate

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 10.0.0.5 (local)
0x00000002 1 10.0.0.6
0x00000003 1 10.0.0.7
0x00000004 1 10.0.0.8

---------------------------------------------------------

root@pve04:~# pvecm status
Cluster information
-------------------
Name: ALMA-MATER
Config Version: 4
Transport: knet
Secure auth: on

Quorum information
------------------
Date: Mon Nov 2 10:03:57 2020
Quorum provider: corosync_votequorum
Nodes: 4
Node ID: 0x00000004
Ring ID: 1.b6
Quorate: Yes

Votequorum information
----------------------
Expected votes: 4
Highest expected: 4
Total votes: 4
Quorum: 3
Flags: Quorate

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 10.0.0.5
0x00000002 1 10.0.0.6
0x00000003 1 10.0.0.7
0x00000004 1 10.0.0.8 (local)
root@pve04:~#
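
For anyone comparing the two outputs above: both nodes report 4 expected votes and a quorum of 3, which matches the usual majority rule (expected votes divided by two, rounded down, plus one). Below is a minimal sketch that pulls those numbers and the member list out of `pvecm status` text and checks them against that rule. The function names `parse_pvecm_status` and `majority_quorum` are made up for this illustration and are not part of any Proxmox tool; the sample text is an excerpt of the pve01 output above.

```python
import re

# Excerpt of the pve01 output posted above (single-space separated as in the post).
SAMPLE = """\
Votequorum information
----------------------
Expected votes: 4
Highest expected: 4
Total votes: 4
Quorum: 3
Flags: Quorate

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 10.0.0.5 (local)
0x00000002 1 10.0.0.6
0x00000003 1 10.0.0.7
0x00000004 1 10.0.0.8
"""


def majority_quorum(expected_votes):
    """Simple majority: more than half of the expected votes."""
    return expected_votes // 2 + 1


def parse_pvecm_status(text):
    """Hypothetical helper: extract vote counts and member addresses."""
    info = {"members": []}
    for raw in text.splitlines():
        line = raw.strip()
        # Lines such as "Expected votes: 4" or "Quorum: 3"
        m = re.match(r"(Expected votes|Total votes|Quorum):\s+(\d+)", line)
        if m:
            info[m.group(1)] = int(m.group(2))
            continue
        # Membership rows such as "0x00000001 1 10.0.0.5 (local)"
        m = re.match(r"(0x[0-9a-fA-F]+)\s+(\d+)\s+(\S+)", line)
        if m:
            info["members"].append(m.group(3))
    return info


info = parse_pvecm_status(SAMPLE)
quorum_matches_rule = info["Quorum"] == majority_quorum(info["Expected votes"])
```

With 4 votes the quorum is 3, so this cluster stays quorate with one node missing but not with two. Running the same parse on the output from each node and comparing the `members` lists is a quick way to see whether every node has the same view of the cluster.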
 
What I noticed before reinstalling yesterday was that /etc/pve/nodes on nodes 1, 2, and 4 did not contain information for node 3, which was the node that dropped out of the cluster after I added pve04. These files are now present on all nodes.
On node 3 itself, the files for all 4 nodes were present, but logging in directly to node 3 showed nodes 1, 2, and 4 as offline.
Logging in to pve01 showed nodes 1, 2, and 4 as online and node 3 as offline.
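
In case it helps someone hitting the same thing, here is a rough sketch of how the /etc/pve/nodes check described above could be scripted instead of eyeballed. It assumes each cluster member shows up as a subdirectory of /etc/pve/nodes, and it assumes the nodes are named pve01 to pve04 (only pve01 and pve04 appear in this thread; pve02 and pve03 are guesses). The helper names are hypothetical.

```python
import os

# Assumed node names: only pve01 and pve04 are confirmed in the thread.
EXPECTED_NODES = ["pve01", "pve02", "pve03", "pve04"]


def list_node_dirs(nodes_dir="/etc/pve/nodes"):
    """Return the names of the subdirectories found under nodes_dir."""
    return [
        entry
        for entry in os.listdir(nodes_dir)
        if os.path.isdir(os.path.join(nodes_dir, entry))
    ]


def missing_nodes(present, expected):
    """Return the expected node names that have no entry, sorted."""
    return sorted(set(expected) - set(present))


# Example with the situation described above as seen from nodes 1, 2, and 4:
# the entry for the third node is absent.
example = missing_nodes(["pve01", "pve02", "pve04"], EXPECTED_NODES)
```

On a real node you would call `missing_nodes(list_node_dirs(), EXPECTED_NODES)` and run it on every member; an empty list everywhere means all nodes have entries for each other.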
 