Questions about the dynamic CRS

Cookiefamily

Hello,

I noticed that a dynamic mode was introduced to the CRS (yay! Been waiting for this for so long and it really comes in handy now with us planning our migration away from VMware, thank you so much Team!!!).
I enabled it on all my testing environments and it seemed to work pretty well.

There are two modes the CRS can run in, TOPSIS and "Brute Force", with the latter being the default. What are the differences between the two modes in practice? Are there scenarios where you should choose one over the other?

In my small test clusters it distributed the load really well, but they are under almost no CPU and memory load, so I couldn't yet try the scenarios where ProxLB from credativ fails in our production environment.
The issue there was VMs with big "imbalances": a lot of RAM and little CPU, or vice versa.
What metrics does the CRS take into account? Both memory and CPU? How does it weight them (or does it do any weighting at all)?
 
Hi!

Thanks for the feedback!

The issue there was VMs with big "imbalances": a lot of RAM and little CPU, or vice versa.
What metrics does the CRS take into account? Both memory and CPU? How does it weight them (or does it do any weighting at all)?
The load balancer takes both memory and CPU into account. As for the weighting, see the next paragraphs.

There are two modes the CRS can run in, TOPSIS and "Brute Force", with the latter being the default. What are the differences between the two modes in practice? Are there scenarios where you should choose one over the other?
The load balancer can score the balancing migrations with either of these methods.

The brute-force method (as in 'greedily find the best balancing migration') currently weights average CPU load and memory usage equally. The weighting might change in the future, but equal weights are a well-balanced starting point, as both resources (CPU and memory) can cause pressure and therefore degrade resource utilization over time.
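
As a rough illustration of that greedy idea (a minimal sketch only, not the actual pve-ha-manager code; the node/VM data shapes are made up for the example):

Code:
# Minimal sketch of "greedily pick the best balancing migration".
# Not the actual pve-ha-manager code -- just the idea described above,
# with CPU and memory imbalance weighted equally.
from statistics import pstdev

def imbalance(nodes):
    # equal weighting: spread of CPU load plus spread of memory usage
    return (pstdev(n["cpu"] for n in nodes.values())
            + pstdev(n["mem"] for n in nodes.values()))

def best_migration(nodes, vms):
    """Try every (vm, target) pair; return the one that reduces imbalance most."""
    best, best_score = None, imbalance(nodes)
    for vm_id, vm in vms.items():
        src = vm["node"]
        for dst in nodes:
            if dst == src:
                continue
            # simulate the migration on a copy of the node loads
            trial = {name: dict(load) for name, load in nodes.items()}
            trial[src]["cpu"] -= vm["cpu"]; trial[src]["mem"] -= vm["mem"]
            trial[dst]["cpu"] += vm["cpu"]; trial[dst]["mem"] += vm["mem"]
            score = imbalance(trial)
            if score < best_score:
                best, best_score = (vm_id, src, dst), score
    return best  # None means no single migration improves the balance

nodes = {"pve1": {"cpu": 0.9, "mem": 0.7}, "pve2": {"cpu": 0.2, "mem": 0.3}}
vms = {"vm100": {"node": "pve1", "cpu": 0.3, "mem": 0.2}}
print(best_migration(nodes, vms))  # ('vm100', 'pve1', 'pve2')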

The TOPSIS method weights memory as more important than the CPU load: a 5:1 ratio for average CPU/memory usage and a 10:5 ratio for CPU/memory peaks, to reflect that memory is a truly limited resource while high CPU pressure 'only' degrades processing time. This is also the method that was already used for scoring nodes when starting new HA resources (if rebalance-on-start is enabled).
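
For the curious, this is what TOPSIS scoring looks like in general (a generic sketch of the method, not the Proxmox implementation; reading the ratios above as memory:CPU weights is my assumption):

Code:
# Generic TOPSIS sketch for ranking candidate nodes by load -- an
# illustration of the method, NOT the pve-ha-manager implementation.
# Assumption: reading the ratios above as memory:CPU gives the weights below.
import math

# criteria per node: (avg_cpu, avg_mem, peak_cpu, peak_mem); all are "cost"
# criteria, i.e. lower load makes a node a better migration target
WEIGHTS = [1.0, 5.0, 5.0, 10.0]

def topsis_rank(nodes):
    names = list(nodes)
    matrix = [nodes[n] for n in names]
    cols = range(len(WEIGHTS))

    # 1. vector-normalize each criterion column, then apply the weights
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0 for j in cols]
    weighted = [[WEIGHTS[j] * row[j] / norms[j] for j in cols] for row in matrix]

    # 2. ideal best/worst per column (cost criteria: best = min, worst = max)
    best = [min(col) for col in zip(*weighted)]
    worst = [max(col) for col in zip(*weighted)]

    # 3. closeness to the ideal solution (higher = better target)
    scores = []
    for name, row in zip(names, weighted):
        d_best, d_worst = math.dist(row, best), math.dist(row, worst)
        scores.append((name, d_worst / ((d_best + d_worst) or 1.0)))
    return sorted(scores, key=lambda s: s[1], reverse=True)

# Example: loads in percent; the memory-pressured node ranks last even
# though its average CPU load is the lowest
print(topsis_rank({
    "pve1": [20, 80, 40, 95],
    "pve2": [60, 30, 90, 40],
    "pve3": [40, 40, 60, 60],
}))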

The TOPSIS method might be helpful for more memory-bound workloads. However, since an equal balance of both resources is useful for many applications, and CPU pressure is often the more common problem, the brute-force method is the current default.

Hope this helps!

PS: There is a patch series in review which overhauls the CRS section itself and adds documentation for the new load balancing system [0]. External feedback on these patches is also very welcome, in case things could be made clearer or certain points should be elaborated on more!

[0] https://lore.proxmox.com/pve-devel/20260415091635.162224-20-d.kral@proxmox.com/
 
Using the GUI to set the HA scheduling to "dynamic load" and checking "Automatically rebalance HA resources" leads to this error:

Code:
crs: invalid format - format error
crs.ha: value 'dynamic' does not have a value in the enumeration 'basic, static'
crs.ha-auto-rebalance: property is not defined in schema and the schema does not allow additional properties

This is on a PVE 9.1.9 (enterprise repo) server, but `pve-ha-manager` is only installed as version 5.1.3? The patch series is for 5.2.0+.
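
For reference, the property names in the error suggest the GUI tried to write something like this into /etc/pve/datacenter.cfg (the line below is inferred from the error message, not taken from released docs):

Code:
crs: ha=dynamic,ha-auto-rebalance=1

Reverting that line to an accepted value (e.g. `crs: ha=static`) clears the error until a pve-ha-manager version that knows the new options is installed.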
 
Also: the docs seem to be missing the difference between the dynamic and the static load scheduler, and the differences between the "brute force" and TOPSIS methods. Google AI gave me an answer, but I don't know if it's correct.

There are more patches in the patch series that explain these. Sorry!
 
Thanks for the insights! Have you thought about an exclude option for containers, as these get automatically moved, which causes downtime?
 
@dakralex Thank you very much for the answer! That clears things up a lot.

As for the modes, I think we will just need to try them both and see what happens. TOPSIS sounds best for our production clusters, as they are usually memory limited.

One piece of feedback: live migrations of VMs with vGPU resources are way more "expensive", since a migration halts them for extended periods of time (8 GB takes ~6 s for us, 24 GB ~20 s, etc.). So ideally those would be moved last, as long as there are better options for shuffling VMs around.
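Back-of-the-envelope from those figures, the pause scales roughly linearly with the vGPU memory size, around 1.2-1.3 GB/s in our setup. A tiny sketch of that estimate (the rate is specific to our hardware, not a general constant):

Code:
# pause estimate derived from our own figures (8 GB ~ 6 s, 24 GB ~ 20 s);
# the ~1.2 GB/s transfer rate is specific to our setup
def vgpu_pause_seconds(vram_gb: float, gb_per_s: float = 1.2) -> float:
    return vram_gb / gb_per_s

print(vgpu_pause_seconds(48))  # ~40 s expected for a 48 GB profile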
For now I guess I can create an affinity group to "pin" them to one host with higher priority, so they only get rebalanced in the case of host failures.