Joining new PVE server to an existing cluster

Boss
Apr 22, 2024
Hi, I've created a new PVE server and joined it to an existing cluster (4 nodes) using Node-1.
It appears to have joined the cluster successfully; however, Node-1, the node I used to join it, shows the new node as unknown (the icons are question marks).

I've run the web manager on all the other nodes and they show it as connected.
Any tips on what I need to check to make the new node show as available on Node-1?
Thanks.... Fred
 
Hi,
This is normal behaviour for several minutes. After the cluster election and a few more minutes, the status should turn green. Please double-check, and if the node is still in an unknown state, verify that the network setup is consistent. Do all the nodes have a proper IP configuration on the corosync subnet?
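To verify membership and the corosync links, a few standard Proxmox commands can be run on each node; this is a generic checklist, not specific to the setup described above:

```shell
# Overall cluster membership and quorum as seen by this node
pvecm status

# Per-link corosync connectivity to the other nodes
corosync-cfgtool -s

# Recent corosync log entries, useful if a node is flapping
journalctl -u corosync -n 50 --no-pager
```

If `pvecm status` disagrees between nodes, or a link shows as disconnected, that points at a network or corosync configuration problem rather than a GUI issue.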
 
Thanks for your prompt response - it was still showing as unknown after a couple of hours on that first node, but OK on all the other nodes, so it seems to be something specific to that first node.
Thinking it through further, the new node has the hostname of an old PVE server (with a different IP address) that was removed from the cluster some time ago. Maybe something was left behind when it was removed?

Not sure what you mean by "corosync subnet"? All the nodes in the cluster are on the same subnet.

I might remove the new node from the cluster, ensure that there is no reference to it left on the other nodes, and then try joining it again.
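If you do remove the node again, the usual cleanup from a remaining cluster member looks something like the following; `pve-new` is a placeholder for the new node's hostname, and the node being removed should be powered off first:

```shell
# Run on one of the remaining cluster nodes (e.g. Node-1),
# with the node to be removed already shut down:
pvecm delnode pve-new

# Then remove its leftover configuration from the cluster
# filesystem so no stale reference remains:
rm -rf /etc/pve/nodes/pve-new
```

A stale directory under /etc/pve/nodes from the old server with the same hostname could plausibly explain the behaviour you're seeing.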
 
Please check the /etc/hosts file on all of the servers. Also check /root/.ssh on Node-1 - it might contain entries for the old removed node; that is, Node-1 may still have the removed node's ssh key in known_hosts or authorized_keys. The known_hosts format is the hostname or IP followed by the ssh key. Please verify nothing is left there.
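One way to check for stale entries on Node-1, assuming the removed node was called `pve-old` and had the IP 192.0.2.10 (both placeholders):

```shell
# Look for references to the removed node's hostname in /etc/hosts
grep -n 'pve-old' /etc/hosts

# Look for its host key in the known_hosts files, including the
# cluster-wide one Proxmox keeps in /etc/pve/priv
grep -n 'pve-old' /root/.ssh/known_hosts /etc/pve/priv/known_hosts 2>/dev/null

# Remove stale host-key entries by hostname and by its old IP
ssh-keygen -R pve-old -f /root/.ssh/known_hosts
ssh-keygen -R 192.0.2.10 -f /root/.ssh/known_hosts
```

`ssh-keygen -R` keeps a backup of the file as known_hosts.old, so the removal is easy to undo if needed.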
 
Thanks - I will double-check that as well. One of the first things I did was ssh as root from the new server to all the nodes in the cluster. It warned that the host keys were wrong and offered to delete them; this appeared to work, as I then got a login prompt on the remote server. Conversely, I ran ssh as root from each of the existing servers to the new node. Because it worked, I assumed the old entries had been deleted. I haven't checked the hosts file, so I will do that as well. Thanks for the help - much appreciated!
 
I ran "ssh-copy-id -f root@PVE-host" from the new PVE server to all the existing servers, and then from each of them to the new server, in an effort to clean up any existing entries.
I still have the same issue: one server in the cluster shows the new one as unknown. I did also manage to migrate a VM from an existing node to the new one, so there is still something not right with that one server. I checked the hosts file and no entries pointing to the new server exist.
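The "unknown" question-mark icon comes from the node's status daemon, so if name resolution and keys check out, restarting the status and proxy services on the node that shows the wrong state sometimes clears it; refreshing the cluster certificates and keys is also worth a try. A sketch of the usual commands:

```shell
# On the node that shows the others (or is shown) as unknown:
systemctl restart pvestatd    # daemon that collects and reports node status
systemctl restart pveproxy    # web interface / API proxy

# Re-distribute SSH keys and certificates across the cluster
pvecm updatecerts
```

If the icon stays grey after this, the pvestatd journal on that node (`journalctl -u pvestatd`) may show which host it is failing to reach.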