Reinstalling a node within a cluster

John Allison

Feb 1, 2018
I have a node that keeps rebooting at random: sometimes it runs fine for weeks, then suddenly reboots. It's a Dell PowerEdge 630. Dell have checked the hardware and say they cannot find anything wrong, so I'm in the horrid position of having to perform a clean re-install. I'm not convinced that it isn't hardware, but I can't think of any other way to prove this.

As this node is part of a 3 node cluster, is removing it, re-installing it, and then re-adding it to the cluster an easy process? Any tips?
 
If you delete the previous node as described in the documentation ('pvecm delnode <node>'), adding a node with the same name and IP again should work fine.
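
For anyone following along, the remove and re-add sequence can be sketched roughly as below. This is a dry run that only prints the commands instead of executing them. The node name pve3 and the address 192.0.2.10 are made-up placeholders, and the function name plan_node_swap is just for illustration. The note about powering the old node off first is my reading of the cluster documentation, so check it against your version.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands, does not execute any of them.
# plan_node_swap, pve3 and 192.0.2.10 are hypothetical placeholders.

plan_node_swap() {
    node="$1"        # name of the node being reinstalled
    cluster_ip="$2"  # IP of a node that stays in the cluster

    echo "# On a remaining node, with ${node} powered off:"
    echo "pvecm nodes"              # confirm the node name as the cluster sees it
    echo "pvecm delnode ${node}"    # remove the node from the cluster
    echo "# On ${node}, after the clean install:"
    echo "pvecm add ${cluster_ip}"  # join the existing cluster again
    echo "pvecm status"             # check membership and quorum afterwards
}

plan_node_swap pve3 192.0.2.10
```

Printing the plan first makes it easy to review the node name before anything destructive is run; the commands can then be pasted one at a time on the right machine.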
 
I followed the instructions without trouble, but the removed node was still showing up in the web interface, so I deleted its directory under /etc/pve/nodes/ from a remaining node.
Then, after re-installing, I also had to reissue the subscription key to get updates working again.
Initially, the web interface showed a load of ???? symbols for one of the old nodes, but that seemed to clear up after a reboot.
However, when trying to access the newly re-installed node from one of the other nodes, I got errors relating to SSH keys,
so I had to run 'ssh-keygen -f "/etc/ssh/ssh_known_hosts" -R xxxxxxxx' a few times, which fixed that problem.
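
The cleanup steps above could be collected into something like the following sketch. It is again a dry run that only prints the commands. The function name plan_cleanup, the node name pve3 and the address 192.0.2.10 are hypothetical, and it is an assumption on my part that the stale known_hosts entries are the node's hostname and its IP address.

```shell
#!/bin/sh
# Dry-run sketch: prints the cleanup commands, does not execute them.
# plan_cleanup, pve3 and 192.0.2.10 are hypothetical placeholders.

plan_cleanup() {
    node="$1"  # name of the node that was removed and reinstalled
    ip="$2"    # its IP address (assumed to have a known_hosts entry too)

    echo "# On a remaining node, after 'pvecm delnode ${node}':"
    # Guest configs live under this directory, so move anything you
    # still need elsewhere before deleting it.
    echo "rm -r /etc/pve/nodes/${node}"
    echo "# After the reinstall, drop the stale SSH host keys:"
    echo "ssh-keygen -f /etc/ssh/ssh_known_hosts -R ${node}"
    echo "ssh-keygen -f /etc/ssh/ssh_known_hosts -R ${ip}"
    echo "# Finally, reissue the subscription key on ${node} so updates work."
}

plan_cleanup pve3 192.0.2.10
```

The subscription step is left as a comment because how the key gets reissued depends on your account, not on anything that can be scripted here.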

So there's a bit of cleanup missing from the documentation. Could this be added, or was I just unlucky with my experience?