Negative affinity moved both VMs

SteveITS

Renowned Member
Feb 6, 2025
I realized after upgrading to PVE 9 that we'd somehow ended up with two VMs on the same node that usually are, and probably should be, kept separate (though it's not critical). That seemed like a great use of the new HA negative affinity feature, so I created a rule for the two VMs. To my mild surprise, both VMs moved to a new host. Is that expected? I would have thought only one needed to migrate. If nothing else, it would be slightly less resource-intensive to move only one upon rule creation.

Or, the obvious workaround is to move one first, then create the rule.
 
Hi!

Yes, this is currently the expected behavior: the HA Manager detects that each HA resource is on the same node as an HA resource it must not share a node with, and schedules both to move to other nodes.

Negative HA resource affinity rules are relatively strict at the moment, in the sense of 'these HA resources have a hard constraint to never run on the same node' (for example, each needs a physical GPU passed through, and only one is available per node). Creating such a rule for HA resources already running on the same node therefore resolves the conflict correctly, but not yet in the most efficient way.
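For reference, a negative resource affinity rule like the one described ends up in the HA rules configuration. A rough sketch (the rule name and VMIDs here are placeholders, and the exact section layout is documented in the HA manager chapter of the docs):

```
# /etc/pve/ha/rules.cfg (sketch; rule name and VMIDs are examples)
resource-affinity: keep-apart
	resources vm:101,vm:102
	affinity negative
```

Rules can also be managed from the GUI under Datacenter -> HA, which avoids editing the file by hand.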

This can certainly be resolved more efficiently by leaving at least one of these HA resources in place. I created a Bugzilla entry for this [0], thanks for the input!

[0] https://bugzilla.proxmox.com/show_bug.cgi?id=7475
 
Thanks. I suppose the corollary, which I did not try to test, is: when both move, does HA know to move them to different nodes? Or does it currently try a "random other host"? For instance, that would be an issue in a 2-node cluster with a QDevice: both could keep moving to the other node. (Hypothetically, the same applies to any cluster where "N" VMs with negative affinity share a cluster of the same "N" nodes.)
 
Good question! But yes, in the proposed 2-node cluster setup with a QDevice, the HA Manager will only move one of the HA resources to another node and keep the other one as-is, because it knows there are no other viable nodes to migrate to.

In fact, it is the same behavior as putting a strict node affinity rule pinning one of these HA resources to its current node, while the other HA resource has no node affinity constraint. In that situation, the HA Manager will only move the freely movable HA resource, while the HA resource with the node affinity constraint stays where it is.

Of course, this cannot work for negative resource affinity rules that are impossible to satisfy, such as 3 HA resources in a 2-node cluster. Since such a rule can never be satisfied, it will be rejected on creation, or disabled if it got into the HA rules config in any other way. The same happens in more intricate cases, for example when node affinity rules constrain the HA resources in a negative resource affinity rule so much that the negative rule cannot be satisfied. These cases are summarized in the docs [0], if you're interested.
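The feasibility condition above boils down to a simple counting check: a negative resource affinity rule needs one distinct node per resource. A minimal illustration (the function name and inputs are made up for this sketch, not the HA Manager's actual code):

```python
def negative_rule_feasible(num_resources, eligible_nodes):
    """A negative resource affinity rule places each HA resource on its own
    node, so it is satisfiable only if there are at least as many eligible
    nodes (after applying any node affinity constraints) as resources."""
    return len(eligible_nodes) >= num_resources

# 3 HA resources in a 2-node cluster: the rule would be rejected/disabled.
print(negative_rule_feasible(3, ["node1", "node2"]))  # False

# 2 HA resources in a 2-node cluster: satisfiable.
print(negative_rule_feasible(2, ["node1", "node2"]))  # True
```

Node affinity rules tighten this check by shrinking the set of eligible nodes, which is how the "more intricate cases" above become infeasible.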

Hope this clarifies the behavior of the HA affinity rules a little more!

[0] https://pve.proxmox.com/pve-docs/chapter-ha-manager.html#ha_manager_rule_conflicts