Shutdown Policy: Failover

proxmox_larry · Dec 19, 2019

Hey guys,
I'm currently running a 4-node HA cluster with Ceph. I really appreciate the failover policy, but
it takes a long time (about 3 minutes) until the VM is automatically restarted on another node!

Is it possible to speed up this process?
After a poweroff of the node, it would be nice to have the VM immediately running on another node...
Thanks!

adamb · Dec 19, 2019

proxmox_larry said:
Hey guys,
I'm currently running a 4-node HA cluster with Ceph. I really appreciate the failover policy, but
it takes a long time (about 3 minutes) until the VM is automatically restarted on another node!

Is it possible to speed up this process?
After a poweroff of the node, it would be nice to have the VM immediately running on another node...
Thanks!

If you move to Proxmox VE 6.1, you could start using the new 'migrate' policy instead. From the release notes:

The HA stack has been improved and comes with a new 'migrate' shutdown policy, migrating running services to another node on shutdown.

t.lamprecht · Dec 19, 2019

And it will even migrate it back once the original node comes up again, at least if the HA service did not get moved another time (e.g., manually) in between.

proxmox_larry · Dec 19, 2019

This only works for a controlled shutdown, but when you perform an immediate poweroff of the node (e.g. pull the power plug), the VM won't be migrated...

adamb · Dec 19, 2019

proxmox_larry said:
This only works for a controlled shutdown, but when you perform an immediate poweroff of the node (e.g. pull the power plug), the VM won't be migrated...

Are you just testing failure scenarios? I don't think messing with the timeouts is something you would want to do. If I had to guess, there are reasons why they aren't super aggressive.

proxmox_larry · Dec 19, 2019

But waiting 3 minutes for a VM to come up after an immediate poweroff is not acceptable in an HA environment!

adamb · Dec 19, 2019

proxmox_larry said:
But waiting 3 minutes for a VM to come up after an immediate poweroff is not acceptable in an HA environment!

That is subjective. It's perfectly acceptable in my environment, where node failures are extremely rare. They will probably make this better in the future, but at this point it's just how it is.

They are also always open to patches.

t.lamprecht · Dec 19, 2019

proxmox_larry said:
This only works for a controlled shutdown, but when you perform an immediate poweroff of the node (e.g. pull the power plug), the VM won't be migrated...

Yeah, sure, that's by design - then the node must be fenced! The policies are for triggered shutdowns/reboots, as the name suggests.

proxmox_larry said:
But waiting 3 minutes for a VM to come up after an immediate poweroff is not acceptable in an HA environment!

First, we try to fence sooner than 3 minutes: fencing starts 60 seconds after the last heartbeat of a node, and completing the fencing can require another 60 seconds before a service can be recovered. So it normally takes around 120 seconds until an HA service gets started again on another node.
That's still over 99.999% availability ( https://pve.proxmox.com/pve-docs/chapter-ha-manager.html ) if there are one to two outages per year. Active fencing (patches are on the list) could bring you down to 60 seconds, provided that the fence device actually works (some are quite a mess), but that doesn't even gain you another of the famous "nines" of availability, while it adds cost and complexity.
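The arithmetic behind these figures can be checked with a short sketch. It assumes a 365-day year and treats each node failure as costing exactly the recovery time quoted above (120 seconds with the default fencing, 60 seconds with active fencing). The function names are made up for illustration and are not part of any Proxmox tooling.

```python
# Availability arithmetic for the recovery times mentioned in the thread.
# Assumption: a 365-day year, leap years ignored.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def availability(outages_per_year: int, seconds_per_outage: float) -> float:
    """Fraction of the year a service is up, given its yearly outages."""
    downtime = outages_per_year * seconds_per_outage
    return 1.0 - downtime / SECONDS_PER_YEAR


def downtime_budget(target: float) -> float:
    """Seconds of downtime per year that an availability target allows."""
    return (1.0 - target) * SECONDS_PER_YEAR


if __name__ == "__main__":
    for label, recovery in (("default fencing", 120), ("active fencing", 60)):
        for outages in (1, 2):
            value = availability(outages, recovery)
            print(f"{label}: {outages} outage(s) of {recovery}s -> {value:.5%}")
    for target in (0.99999, 0.999999):
        print(f"budget for {target:.4%}: {downtime_budget(target):.1f}s per year")
```

Five nines leave a budget of roughly 315 seconds per year, so two 120-second recoveries fit. Six nines leave roughly 31.5 seconds, so even a single 60-second recovery with active fencing does not reach the next "nine".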

If you want real HA, and 2 minutes of downtime in the worst-case scenario that a cluster can still survive while staying quorate is too much for your environment, then adding a VM to the Proxmox VE HA stack is just one part of a much bigger design that you need to plan carefully.

Actually, we talk about this a bit in our docs, so it may be worth reading them: https://pve.proxmox.com/pve-docs/chapter-ha-manager.html

If you really require higher availability, or run in very unstable environments, you need to start making your service redundant, and that on the service level, not in the stack below. Such setups can handle failover maybe even in milliseconds, depending on the service type, e.g., by using a load balancer.
Or host the VM you want to be really highly available two or three times, spread over the cluster using HA groups; then the HA stack just saves you from avalanche failures in medium to bigger clusters (>10 nodes) where more than three nodes can fail. You need to understand that HA managers so deep down in the stack, like the PVE HA manager, the discontinued rgmanager, or Pacemaker, are rather a safety net for bigger (mass) failures. They work as a general HA solution when 99.999% can be enough, but for more than that, all of them need application-level redundancy. You do not want to fence (maybe even actively) just because a node was unresponsive for 5 or 10 seconds. This often does more harm than good: other nodes need to take over the load, and if all of them are already highly loaded (which is why one node responded a bit slower in the first place), you make the situation even worse.
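To illustrate why running two or three copies of a service helps more than tuning fencing timeouts, here is a rough sketch of the usual redundancy formula. It assumes that replica failures are independent and that a load balancer fails over instantly. Neither assumption holds exactly in practice, and the 99% figure in the test is purely illustrative, not a measured value.

```python
def combined_availability(single: float, replicas: int) -> float:
    """Availability of a service that is up while at least one replica is up.

    Assumption: replicas fail independently of each other, so the service
    is only down when all of them are down at the same time.
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be a fraction between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return 1.0 - (1.0 - single) ** replicas
```

Under these assumptions, each additional replica multiplies the unavailability by the unavailability of a single replica, which is how service-level redundancy can add whole "nines" where a shorter fencing timeout cannot.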

We do not plan to accept patches regarding this timeout. It was carefully chosen and is deeply integrated into a lot of things, like cluster locks. Further, it's simply the wrong knob to turn when one wants higher availability.
