Enabling 'ha-rebalance-on-start' reboots running VM after adding it to HA-manager

I recently updated our clusters to 7.4 and played around with CRS and ha-rebalance-on-start. I noticed that when I add a running VM to the HA manager with ha-rebalance-on-start enabled and the VM is then selected for migration to another node, the VM is shut down before the migration and started again after the migration has finished. The scheduler is set to static-load, but I haven't yet tested whether setting it to basic makes any difference.

Steps to reproduce:
1. Have a working cluster with ha-rebalance-on-start enabled.
2. Move a VM that is not yet added to the HA manager to a node with a higher load / more VMs (so that it would be migrated to another node after adding it to the HA manager).
3. Add the running VM to the HA manager.

Result:
The VM is shut down, migrated and then started again by the HA manager.

Since I cannot find any mention of this behaviour in the documentation, is this expected?
 
Hi,
I can reproduce the issue. Yes, this is indeed unexpected and should be improved. I'd say, it should be one of:
  • VM should be online migrated.
  • VM should not be migrated at all.
Both can make sense. The question is just whether adding a VM with state started to HA should be considered a start for ha-rebalance-on-start even if the VM is already running.
 
I guess it comes down to what's actually meant by started. Does it mean:

1. the HA manager is started for the VM
2. the VM itself is in state started (although the terminology in PVE is running in general)

My interpretation and expectation would be 1. with the VM then being online migrated as a consequence.
 
I guess it comes down to what's actually meant by started. Does it mean:

1. the HA manager is started for the VM
I get what you mean, but it sounds strange to me to use the term "started" here: The HA manager is already running, and nothing new is started when you add the VM to HA. The VM is just registered as a managed service for HA.

2. the VM itself is in state started (although the terminology in PVE is running in general)

My interpretation and expectation would be 1. with the VM then being online migrated as a consequence.
I discussed this with a colleague, and the intention behind ha-rebalance-on-start is to rebalance when an HA-managed service is started. Adding it to HA while it's already running does not start the service, so we'll opt for 2. It also fits the documentation, which explicitly talks about the stopped → start transition:
HA service stopped → start transition (opt-in). Requesting that a stopped service should be started is a good opportunity to check for the best suited node as per the CRS algorithm, as moving stopped services is cheaper to do than moving them started, especially if their disk volumes reside on shared storage. You can enable this by setting the ha-rebalance-on-start CRS option in the datacenter config. You can change that option also in the Web UI, under Datacenter → Options → Cluster Resource Scheduling.
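To make the intended rule concrete, here is a minimal Python sketch of the decision described above: rebalance only when a stopped service is asked to start, and never when an already running VM is merely added to HA. This is an illustrative model, not the actual pve-ha-manager code; the function name `should_rebalance` and its parameters are hypothetical.

```python
def should_rebalance(is_running, requested_state, rebalance_on_start):
    """Hypothetical model of the intended ha-rebalance-on-start rule.

    is_running:         whether the guest is currently running
    requested_state:    the state requested for the HA service ("started" or "stopped")
    rebalance_on_start: whether the ha-rebalance-on-start option is enabled
    """
    if not rebalance_on_start:
        return False
    # Only a real stopped -> started transition counts as a "start".
    return requested_state == "started" and not is_running


# A stopped service that is requested to start may be moved first.
print(should_rebalance(False, "started", True))
# A running VM that is just added to HA with state "started" stays put.
print(should_rebalance(True, "started", True))
```

Under this model, the behaviour reported in the first post (shutdown, migration, restart of a running VM) would not occur, because the second case returns False.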
 
Hi, I would just like to chime in here and ask for status. Have you been able to discuss and find a solution/patch? I discovered this issue today when enabling HA for a VM and it very unexpectedly (to me, at least) moved it to another node, shut it down and started it again. I am currently on pve 7.4-3.
BR
Bjørn
 
Hi,
a fix is in pve-ha-manager >= 4.0.1 but since it's not a critical/security issue, it didn't get backported to Proxmox VE 7.
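To check whether an installed version meets the `>= 4.0.1` threshold mentioned above, a simplified comparison could look like the following sketch. It assumes a plain dotted numeric version string such as the one shown for pve-ha-manager in the output of `pveversion -v`; real Debian package versions can carry epochs or suffixes, which this sketch does not handle. The helper names are hypothetical, and the `3.6.1` value is only an example input.

```python
FIXED_IN = (4, 0, 1)  # first pve-ha-manager version with the fix, per the reply above


def parse_version(text):
    """Turn a plain dotted version like '4.0.1' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def has_rebalance_fix(installed_version):
    """Return True if the given version is at least 4.0.1."""
    return parse_version(installed_version) >= FIXED_IN


print(has_rebalance_fix("4.0.1"))
print(has_rebalance_fix("3.6.1"))  # example value only
```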
 
To complete the trail for others... where the heck is ha-rebalance-on-start located?

"You can enable this by setting the ha-rebalance-on-start CRS option in the datacenter config. You can change that option also in the Web UI, under Datacenter → Options → Cluster Resource Scheduling." per https://pve.proxmox.com/wiki/High_Availability

Unfortunately, when I changed this setting, it went from migrating every time I boot to failing to boot, after I added my FUSED LXC Ubuntu riding on a Ceph RBD to HA. It keeps on pushing to my other node after I tell it to go back to the node I want it on and start :(.

EDIT: I solved my problem by going to Datacenter → HA → Groups and removing the priority from each node. Although the node I wanted was set to 1, it was still diverting to the priority 2 node for some reason?
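For anyone who prefers to look at the config file rather than the Web UI: the datacenter config lives at /etc/pve/datacenter.cfg, and the CRS options sit on a single `crs:` line as comma-separated key=value pairs. The Python sketch below parses such a line from a text sample. The sample content is an assumption for illustration (your file will differ), and `parse_crs_line` is a hypothetical helper, not part of any Proxmox tooling.

```python
def parse_crs_line(config_text):
    """Return the key/value pairs of the first 'crs:' line, or {} if there is none."""
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("crs:"):
            options = {}
            for pair in line[len("crs:"):].split(","):
                if "=" in pair:
                    key, value = pair.split("=", 1)
                    options[key.strip()] = value.strip()
            return options
    return {}


# Assumed example content of a datacenter.cfg with the option enabled.
SAMPLE = """keyboard: en-us
crs: ha=static,ha-rebalance-on-start=1
"""

crs = parse_crs_line(SAMPLE)
print(crs.get("ha-rebalance-on-start") == "1")
```

On a real node you would read the file contents instead of using `SAMPLE`. Changing the option through the Web UI is the safer route.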
 
Hi,
a higher value means higher priority, so the node with priority 2 is preferred.
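The priority rule can be illustrated with a small Python sketch: among the nodes in a group, the ones with the highest priority value are preferred. This only models the ordering described in the reply above, not the HA manager's full node selection; the node names and the `preferred_nodes` helper are made up for the example.

```python
def preferred_nodes(group):
    """Return the node names that share the highest priority value in the group."""
    top = max(group.values())
    return sorted(name for name, priority in group.items() if priority == top)


# The setup from the post above: the wanted node has priority 1, the other has 2.
print(preferred_nodes({"node1": 1, "node2": 2}))  # ['node2']

# With equal priorities, neither node is preferred over the other.
print(preferred_nodes({"node1": 0, "node2": 0}))  # ['node1', 'node2']
```

So to keep a service on a particular node, give that node the larger number, or give all nodes the same priority as in the EDIT above.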
 
