Proxmox node HA failed - bring back up

MisterDeeds

Nov 11, 2021
Hello everyone,

I am running a 4-node Proxmox cluster with HA configured. After one node failed, that node now only shows "Idle".
[Screenshot: 1.PNG]

Furthermore, VMs can no longer be migrated to it. Only the following message appears:
[Screenshot: 2.PNG]

What needs to be done after such a failure so that the host is correctly reintegrated into the HA group?

Thank you and best regards
 
Thank you very much for the hint. Unfortunately, I had no success with it. Currently the host does not accept any connections at all, although ping still works:

[Screenshot: Unbenannt.PNG]

I will reinstall the node; that should make it work.

Thanks anyway and best regards
 
I think we would need a bit more information on the current status of the cluster. Are you able to log in to the faulty node? Is the cluster still healthy? Was any upgrade performed?
Also, was the hostname changed while the node was in the cluster? I remember having a similar issue when I tried to change the hostname: the other nodes did not recognize the change and would not perform any cluster operations. Only manual SSH was possible at that point. We had to manually revert the change and reboot to make the cluster work again.
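To gather that information in one go, something like the following could be run on any node that is still reachable. It is only a sketch: the function name collect_cluster_status is made up for illustration, and it simply wraps the standard pvecm and ha-manager status commands, printing a fallback line if one of them fails or is not installed.

```shell
#!/bin/sh
# Sketch: print cluster membership/quorum and HA state in one report.
# collect_cluster_status is a hypothetical helper name, not a Proxmox tool.
collect_cluster_status() {
    for cmd in "pvecm status" "pvecm nodes" "ha-manager status"; do
        echo "== ${cmd} =="
        # $cmd is intentionally unquoted so it splits into command + argument.
        $cmd 2>&1 || echo "(command failed or not available)"
    done
}

# Usage (run as root on a cluster node):
# collect_cluster_status
```

Posting that output from a healthy node and, if possible, from the faulty one makes it easier to see whether the node has quorum and what state the HA services are in.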
 
Hi bzb-rs, thanks for the feedback. The problem only concerned the SSH connections between the nodes: no SSH connection could be established between node 4 and the others, in either direction. Connecting directly to node 4 (e.g. from my PC) via SSH worked. Since the first 3 nodes could connect to each other, but not to the 4th, I assume the problem was on that node. I have now reinstalled it and everything works again.

Thanks for your time and best regards
 
Just for completeness' sake: if you have a situation where the nodes cannot SSH to each other and you get an error about a changed key or something similar, try running the following command on that node: pvecm updatecerts. One thing it does is place the current host keys in the known_hosts file, which is shared among the cluster nodes via Corosync and the pmxcfs. In many situations this should solve such a problem.
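As a rough sketch of how to verify the result afterwards, the helper below tries a non-interactive SSH login from the current node to each peer and reports which ones fail. The function name check_peer_ssh and the node names pve1, pve2, pve3 are placeholders, and it assumes root logins between the nodes, as is usual inside a Proxmox VE cluster.

```shell
#!/bin/sh
# Sketch: test non-interactive SSH from this node to each peer node.
# check_peer_ssh is a hypothetical helper; node names are passed as arguments.
check_peer_ssh() {
    failed=0
    for node in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password
        # or host-key confirmation; ConnectTimeout bounds the wait per node.
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "root@${node}" true; then
            echo "OK   ${node}"
        else
            echo "FAIL ${node}"
            failed=1
        fi
    done
    # Returns 1 if at least one peer could not be reached via SSH.
    return "$failed"
}

# Usage on the affected node (pve1..pve3 are placeholder names):
# pvecm updatecerts
# check_peer_ssh pve1 pve2 pve3
```

If a peer still shows FAIL after pvecm updatecerts, running the same ssh command by hand without BatchMode should show the actual error message.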
 
