How does PVE handle host-related failures with HA?


May 25, 2019
Hi Community

We run a VMware environment and are seeing some issues with HA failing to detect a host problem and move VMs to functioning hosts.

This has prompted my question today about how PVE manages hosts that have partially failed and are not 100% responsive, but are still responsive enough to answer a ping or accept an SSH login.

What actually triggers HA to start and automatically move a VM to a new host?

What experiences have others had with this and how have the issues been handled?

"that have partially failed and are not 100% responsive"
What exactly does this mean for you? A high load is not really an error condition.

Our stack relies on corosync for quorum. As soon as corosync loses its connection to a node and a specified timeout of 60 seconds has elapsed, the node fences itself and the HA stack can take over the VMs.

This is better described in our reference documentation.
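To make the sequence above concrete, here is a rough sketch in Python. It is a toy model only, not Proxmox or corosync code: the names `has_quorum`, `NodeFenceTimer` and `observe` are hypothetical, and the strict-majority rule is an assumption about how quorum is counted. The 60 second value is the timeout quoted above.

```python
from dataclasses import dataclass
from typing import Optional

FENCE_TIMEOUT_S = 60.0  # timeout quoted in the reply above


def has_quorum(reachable_votes: int, total_votes: int) -> bool:
    """Assumed rule: a strict majority of votes, counting the node's own vote."""
    return reachable_votes > total_votes // 2


@dataclass
class NodeFenceTimer:
    """Hypothetical per-node model: self-fence after quorum has been lost for timeout_s."""
    total_votes: int
    timeout_s: float = FENCE_TIMEOUT_S
    quorum_lost_at: Optional[float] = None
    fenced: bool = False

    def observe(self, now: float, reachable_votes: int) -> bool:
        """Feed one membership observation; return True once the node should be fenced."""
        if self.fenced:
            return True  # fencing is final in this model
        if has_quorum(reachable_votes, self.total_votes):
            self.quorum_lost_at = None  # quorum regained, reset the timer
            return False
        if self.quorum_lost_at is None:
            self.quorum_lost_at = now  # first observation without quorum
        if now - self.quorum_lost_at >= self.timeout_s:
            self.fenced = True
        return self.fenced
```

In this model the decision depends only on cluster membership as seen by the node, not on whether the host still answers ping or SSH. For example, in a three-node cluster a node that sees only its own vote from t=10 onward would be fenced at t=70, unless it regains a majority before then.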
Hi dcspak

Thanks for replying.

I’m not concerned with high load; that’s a different issue and was not mentioned in my post.

I’m not 100% sure what VMware vSphere uses for its HA apart from using data stores as a witness.

The issue we are facing at the moment with vSphere is why I’m asking this question: I want to learn more about how PVE detects and manages failures.

Can you please explain in more detail how corosync works with HA? What does it actually do, and what methods does it use to monitor a connection?


