Broken Cluster on New Install

godber

New Member
Apr 24, 2023
Hi All,

I was setting up a new Proxmox cluster with three hosts, A, B, and C. I made the mistake of turning B off when I added C (I was worried I was going to pop a circuit breaker). Now that I have all three turned on, A and C show up in the cluster, but B is not joined. It shows up under "Datacenter" with a red X. Can I do something to manually join B back into the cluster? Is this a case where something like the process shown in the following thread would work?

https://forum.proxmox.com/threads/how-to-totally-destroy-a-cluster-then-re-create-it.99123/

Thanks
 
I made the mistake of turning B off when I added C
Been there, done that.

The nodes which were online now have a newer version of /etc/pve/corosync.conf than the node which was offline. My expectation was that the node turned on later would fetch the updated version automatically. It didn't.

My solution was to manually copy that newer content to the temporarily-down node. Be careful: that file is crucial, so always make a backup copy before tinkering with these files.

And read up on the relationship to /etc/corosync/corosync.conf first.
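A rough sketch of that approach, assuming the newer config is pulled from a node that is still in quorum (the 10.18.0.10 address below is just a placeholder for node A) and that root SSH between the nodes works:

Code:
# on node B: back up both copies of the config before touching anything
cp /etc/pve/corosync.conf /root/corosync.conf.pve.bak
cp /etc/corosync/corosync.conf /root/corosync.conf.local.bak

# fetch the newer config from an in-quorum node into /tmp for later use
scp root@10.18.0.10:/etc/pve/corosync.conf /tmp/corosync.conf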


Good luck.
 
Thanks for the response, UdoB.

Inspecting the /etc/pve/corosync.conf and /etc/corosync/corosync.conf files, I can see they are the same, but the one in the /etc/pve filesystem is not writable. I imagine this means something; I'll try digging more later in the day. I'm not finding how to make that file editable ... I've read the following:

https://pve.proxmox.com/pve-docs/pve-admin-guide.html#_corosync_configuration

But this only seems to discuss the case where things are working as they should. I've seen references to running pmxcfs -l to start the cluster filesystem in local mode. I'll dig into this possibility and read more later today.
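For what it's worth, a quick way to double-check that the two copies really match and to see which config version a node is holding is just plain diff/grep (nothing Proxmox-specific):

Code:
diff /etc/pve/corosync.conf /etc/corosync/corosync.conf
grep config_version /etc/pve/corosync.conf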

Node B has the following status after I set the expected votes to 1:
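For reference, lowering the expected votes like that is done with pvecm on the isolated node; presumably something like:

Code:
pvecm expected 1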

Code:
pvecm status
Cluster information
-------------------
Name:             pve-B
Config Version:   2
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Wed May  3 15:44:08 2023
Quorum provider:  corosync_votequorum
Nodes:            1
Node ID:          0x00000002
Ring ID:          2.10db
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   1
Highest expected: 1
Total votes:      1
Quorum:           1
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000002          1 10.18.0.11 (local)

-Austin
 
I'm pretty sure this worked right. Here's what I did:

Code:
systemctl stop pve-cluster                    # stop the cluster filesystem service
systemctl stop corosync                       # stop corosync so it doesn't interfere with the manual edit
pmxcfs -l                                     # start the cluster filesystem in local mode, which makes /etc/pve writable
cp /tmp/corosync.conf /etc/pve/corosync.conf  # install the newer config that was staged in /tmp

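The last step isn't shown above, but presumably the locally-started pmxcfs was stopped and the normal services brought back afterwards, roughly:

Code:
killall pmxcfs
systemctl start pve-cluster
systemctl start corosync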
Now the status looks good on node B and /etc/pve/priv/known_hosts is in sync. B shows up green in the UI too.
 
