We've got a 6-node cluster with CRS enabled and configured to use the static load scheduler, and a couple of HA groups.
One of them has two nodes with priority 30 and one with priority 20. Max parallel migration jobs is set to 6.
Code:
group: ha_group_nonprod
        comment nonprod nodes
        nodes node1:30,node2:30,node3:20
        nofailback 0
        restricted 0

group: ha_group_prod
        comment prod nodes
        nodes node1:10,node2:10,node6:30,node5:30,node3:30,node4:30
        nofailback 0
        restricted 0
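
For reference, this is roughly what the relevant part of our datacenter.cfg looks like (typed from memory, so the exact option names and values may be slightly off):

Code:
crs: ha=static
max_workers: 6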
Recently we had maintenance planned on node2 and needed to migrate all HA resources off it, so I ran:
Code:
ha-manager crm-command node-maintenance enable node2
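
(For completeness, as far as I know the counterpart to take the node out of maintenance again afterwards is:)

Code:
ha-manager crm-command node-maintenance disable node2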
And here we get to the problem.
Even though ha_group_nonprod has 3 nodes, all resources from node2 were scheduled to migrate to node1. In the middle of the migration node1 ran out of RAM and rebooted.
Node3 had plenty of RAM available. So I'm wondering how this algorithm chooses the target node: does it consider lower-priority nodes at all? Why doesn't it migrate any VM to node3? Does it count VMs that are still migrating?
I've read the documentation section on this scheduler a couple of times and I think it should work, but maybe it only looks at one migration job at a time and doesn't take the other migrations already in flight into account?
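
To make the question clearer, here is a toy sketch of what I mean (Python, just my mental model, definitely not the actual pve-ha-manager code, and it ignores the group priority weights entirely; all node names and RAM figures are made up):

Code:
# Toy model: pick the node with the lowest projected memory usage.
def pick_target(vm_mem_mib, nodes, count_inflight):
    def projected(stats):
        used = stats["used"]
        if count_inflight:
            used += stats["inflight"]  # reserve RAM of migrations already scheduled
        return used
    return min(nodes, key=lambda n: projected(nodes[n]))

nodes = {
    "node1": {"used": 100_000, "inflight": 0},  # MiB, hypothetical numbers
    "node3": {"used": 160_000, "inflight": 0},
}

# Six 32 GiB VMs leaving node2 at the same time.
for i in range(6):
    # With count_inflight=False every single VM picks node1,
    # which is exactly the behaviour we saw.
    target = pick_target(32_000, nodes, count_inflight=False)
    nodes[target]["inflight"] += 32_000
    print(f"vm{i} -> {target}")

With count_inflight=True the VMs get spread across node1 and node3 instead, which is what I expected to happen on our cluster.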
The main goal of this configuration was to be able to take one node into maintenance and have its VMs migrate to the remaining nodes in the HA group.
Has anyone faced a similar issue? Any suggestions on how to prevent this scenario in the future?