Hi,
I've just added a third node to my Proxmox 6 cluster to have HA enabled for my critical VMs, so that I can do maintenance on nodes without having to worry about downtime. Or rather, that is what I thought I would get ...
The current HA implementation is not HA: it's quick failover. There is downtime, and I think it's a huge problem.
I don't think I'm the only one. I've seen posts from 2017 already requesting the exact same thing.
https://forum.proxmox.com/threads/node-restart-and-automatically-migrate-vms-ha.35882/
I've seen the feature request
https://bugzilla.proxmox.com/show_bug.cgi?id=1378
which led to the implementation of the shutdown_policy switch.
But this is still not HA at all.
I chose shutdown_policy = failover
I intentionally reboot a node for maintenance, to upgrade packages on it.
All VMs and containers on that host are switched off, so they are all OFFLINE. At this point it's already game over.
Some seconds later, the ones I declared as HA resources are started on another node.
This is NOT what someone would set up a cluster for, IMO.
I (of course!) want all the HA resources on that node to be migrated to any other node BEFORE all the other (non-HA) resources on that node are switched off.
But don't switch off all of them first!
If I take the time to declare some resources as HA, it's because I mean it: I don't want them to go down, as far as that can be avoided. (If the node crashes, of course, that's another story.)
I read in those old posts that users should bulk migrate all the resources on the node if they want to achieve that result.
What's the point of HA then, if the user has to do some manual actions?
And I don't want to migrate ALL resources. The other nodes might not have the capacity to take them all.
But the resources I explicitly declared as HA, the ones which are the most important to me, those I of course want online all the time.
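To make it concrete what "migrate only the HA resources" would mean if done by hand today, here is a rough sketch. It only builds the `ha-manager migrate` commands for the HA services currently started on one node; it does not run anything. The function name, the node names, and the sample status text are made up for illustration, and I am assuming that `ha-manager status` prints service lines of the form `service vm:100 (node1, started)`, so check that against your own output first.

```python
import re

# ASSUMPTION: service lines in `ha-manager status` look like
#   service vm:100 (node1, started)
SERVICE_RE = re.compile(r"^service\s+(\S+)\s+\(([^,]+),\s*([^)]+)\)")


def plan_ha_migrations(status_text, source_node, target_node):
    """Build migrate commands for HA services started on source_node.

    Hypothetical helper: it only returns the command lists, it never
    executes them.
    """
    commands = []
    for line in status_text.splitlines():
        match = SERVICE_RE.match(line.strip())
        if not match:
            continue  # quorum/master/lrm lines and anything unexpected
        sid, node, state = (part.strip() for part in match.groups())
        if node == source_node and state == "started":
            commands.append(["ha-manager", "migrate", sid, target_node])
    return commands


# Hypothetical sample text, not real output from my cluster.
EXAMPLE_STATUS = """quorum OK
service ct:101 (node2, started)
service vm:100 (node1, started)
service vm:102 (node1, stopped)
"""

if __name__ == "__main__":
    for cmd in plan_ha_migrations(EXAMPLE_STATUS, "node1", "node3"):
        print(" ".join(cmd))
```

This is still a manual step I would have to remember to run before every reboot, and it does nothing to check whether the target node has the capacity, which is exactly why I'd like it to be built in.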
The ability to do maintenance on nodes is one of the top reasons why one would deploy a cluster of hypervisor nodes.
Can we have an option to "live migrate HA resources before node reboot/shutdown" please?
Thanks in advance.