Cluster Issues

Oct 12, 2025
I'm trying to get a new cluster set up with 4 identical nodes. I currently have 3 of the nodes set up and working in a cluster. When I add the 4th node, it never fully connects and eventually causes problems, such as all four nodes losing connection. They go grey with a question mark or show a red X. As soon as I power off the 4th node, the other 3 start working correctly again.

When I run pvecm status, the 1st node's Ring ID changes. I've removed the 4th node, wiped it clean and re-added it with the same results.

I'm at a loss, any ideas?
 
Clue there ... you wiped it clean. And then you probably rejoined it with the same name ...

Did you delete /etc/pve/nodes/OLD-NODE-YOU-NUKED before rejoining the rebuilt machine?
Did you comment out the old ssh key in /etc/pve/priv/authorized_keys before rejoining the rebuilt machine?
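If you want to see whether a stale key is still sitting in that file before you rejoin, something like this rough sketch can help. It only reads text and reports matching lines; it does not edit anything. It assumes the key comment at the end of each line has the form root@NODENAME, which you should confirm against your own file. The function name and the sample keys are made up for illustration.

```python
def stale_key_lines(authorized_keys_text, node_name):
    """Return (line_number, line) pairs whose key comment is root@<node_name>.

    Assumption: each active entry ends with a comment field such as root@pve4.
    Blank lines and lines already commented out with '#' are skipped.
    """
    hits = []
    for number, line in enumerate(authorized_keys_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        # The last whitespace-separated field of a key line is its comment.
        if stripped.split()[-1] == f"root@{node_name}":
            hits.append((number, line))
    return hits


if __name__ == "__main__":
    # Hypothetical file contents; on a real node you would read the text
    # from /etc/pve/priv/authorized_keys instead.
    sample = (
        "ssh-rsa AAAA1 root@pve1\n"
        "ssh-rsa AAAA4 root@pve4\n"
        "ssh-rsa AAAA2 root@pve2\n"
    )
    for number, line in stale_key_lines(sample, "pve4"):
        print(f"line {number}: {line}")
```

Any line it reports for the rebuilt node's name is a candidate for commenting out before the rejoin.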

Also, are all the nodes running corosync on the same subnet?
I've had issues like that when I selected the wrong subnet for the cluster join.
 
Hi, stale node entries or mismatched SSH keys can definitely cause cluster sync chaos.

In addition, make sure the new node’s ring0_addr is in the same subnet as the existing nodes in /etc/pve/corosync.conf, and that /etc/hosts on every node correctly maps each node’s name to its cluster IP. A mismatch in either place can break quorum when the 4th node joins.
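A quick way to sanity-check the ring0_addr part is to pull the addresses out of the nodelist and test them against the cluster network. The sketch below is a minimal illustration, not a full corosync.conf parser: it assumes each node is a flat node { ... } block with name and ring0_addr fields, and the node names, addresses and the 10.10.10.0/24 network are invented for the example.

```python
import ipaddress
import re


def ring0_addresses(conf_text):
    """Return {node_name: ring0_addr} from the node blocks of a corosync.conf text."""
    nodes = {}
    # Matches flat "node { ... }" blocks; "nodelist {" is not matched because
    # it contains nested braces and "node" is not followed directly by "{".
    for block in re.findall(r"\bnode\s*\{([^{}]*)\}", conf_text):
        fields = dict(re.findall(r"^\s*(\w+)\s*:\s*(\S+)\s*$", block, re.MULTILINE))
        if "name" in fields and "ring0_addr" in fields:
            nodes[fields["name"]] = fields["ring0_addr"]
    return nodes


def nodes_outside(conf_text, cluster_network):
    """Return {node_name: ring0_addr} for nodes not inside cluster_network."""
    net = ipaddress.ip_network(cluster_network)
    bad = {}
    for name, addr in ring0_addresses(conf_text).items():
        try:
            if ipaddress.ip_address(addr) not in net:
                bad[name] = addr
        except ValueError:
            # ring0_addr is a hostname rather than a literal IP; flag it so it
            # can be checked against /etc/hosts by hand.
            bad[name] = addr
    return bad


if __name__ == "__main__":
    # Hypothetical nodelist; on a real node, read /etc/pve/corosync.conf.
    sample = """
    nodelist {
      node {
        name: pve1
        nodeid: 1
        quorum_votes: 1
        ring0_addr: 10.10.10.11
      }
      node {
        name: pve4
        nodeid: 4
        quorum_votes: 1
        ring0_addr: 192.168.1.14
      }
    }
    """
    print(nodes_outside(sample, "10.10.10.0/24"))
```

Anything it returns is either on the wrong subnet or given as a hostname, and in the hostname case the /etc/hosts entries on every node are the next thing to compare.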
 