disarm-ha and arm-ha commands

RobFantini

Hello
See the following on pve-ha-manager changes.

How can I use disarm-ha when an entire cluster needs to be shut down?

pve-ha-manager (5.1.3) trixie; urgency=medium

* fix #2751: add disarm-ha and arm-ha commands for safe cluster-wide
maintenance, allowing the admin to temporarily disable automatic fencing
and recovery. Two resource modes are available: 'freeze' locks all
services in place, 'ignore' suspends HA tracking so services can be
managed manually. All HA service watchdogs are released when fully
disarmed, the underlying watchdog-mux must still keep the /dev/watchdog
device open as not all watchdog types support a graceful deactivate.
 
Impact - thank you for the fast response.

from the document:
Resource Modes
When disarming HA, you must choose a resource mode that controls how HA managed resources are handled while disarmed. The current state of resources is not affected.

freeze
New commands and state changes are not applied. Services stay in their current state, but the HA stack does not react to failures or process new requests. This is the safest choice when you expect all nodes to remain running.

ignore
Resources are suspended from HA tracking and can be managed as if they were not HA managed. This allows you to manually start, stop, or migrate services while HA is disarmed. Use this when you need to manually relocate services during maintenance. When re-arming, the CRM rechecks service locations against the configuration to pick up any manual migrations.


So I assume if we had to turn off the entire cluster we'd use:
Code:
ha-manager crm-command disarm-ha ignore

Do you see that as correct?
 
I'm not sure. I don't have a cluster right now but in the past I just powered each node off normally and that worked okay.
 
How can I use disarm-ha when an entire cluster needs to be shut down?
The disarm-ha and arm-ha commands are mainly intended for specific maintenance tasks where the whole cluster communication stack is temporarily unavailable, or for other situations where one wants to prevent the HA stack from fencing a node.

The HA Manager should be able to handle complete cluster shutdowns, see this section in the docs [0], but the disarm-ha and arm-ha commands can make this safer to do, especially if the startup time of the nodes varies heavily.

So I assume if we had to turn off the entire cluster we'd use:
ha-manager crm-command disarm-ha ignore
Do you see that as correct?
If the cluster is expected to be shut down, then ignore would be the more appropriate mode.

[0] https://pve.proxmox.com/pve-docs/pve-admin-guide.html#pveceph_shutdown
 
I have not tried the following, but am posting it for future reference. From Grok:
https://grok.com/share/bGVnYWN5LWNvcHk_d940cfad-264d-46e6-a144-3e3e02625ae5

Recommended command for full cluster shutdown

1 Run this on any one node (it affects the whole cluster):

ha-manager crm-command disarm-ha ignore

2 Verify status. Look for the fencing/CRM line showing "disarmed".
ha-manager status

3 Gracefully shut down your HA VMs/CTs (optional but cleanest — you can do this manually now that HA won't interfere).

4 Shut down the nodes (in any order; HA won't try to migrate things).

5 Power everything back on and wait for the cluster to become quorate.

6 Re-arm HA:

ha-manager crm-command arm-ha
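
For future reference, the disarm and re-arm steps above could be wrapped roughly as follows. This is only a minimal sketch: the function names and the DRY_RUN switch are hypothetical helpers made up for illustration, and the only real commands used are the three ha-manager calls already quoted in this thread. It has not been run against a real cluster.

```shell
#!/bin/sh
# Sketch only: wraps the ha-manager calls quoted in this thread.
# DRY_RUN, run, disarm_cluster and rearm_cluster are hypothetical names,
# not part of Proxmox VE.

run() {
    # Print the command; execute it only when DRY_RUN is not set to 1.
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

disarm_cluster() {
    # $1 is the resource mode: 'freeze' or 'ignore' (defaults to ignore,
    # the mode suggested above for a full cluster shutdown).
    mode="${1:-ignore}"
    case "$mode" in
        freeze|ignore) ;;
        *) echo "unknown resource mode: $mode" >&2; return 1 ;;
    esac
    run ha-manager crm-command disarm-ha "$mode"
    # Check the status output by eye before shutting anything down.
    run ha-manager status
}

rearm_cluster() {
    # Call only after all nodes are back up and the cluster is quorate.
    run ha-manager crm-command arm-ha
    run ha-manager status
}
```

With DRY_RUN=1 set, calling disarm_cluster ignore only prints the two commands it would run, which is a cheap way to check the sequence before touching a production cluster.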


also see https://github.com/proxmox/pve-docs/blob/master/ha-manager.adoc#disarming-ha-for-cluster-maintenance
 
The disarm-ha and arm-ha commands are mainly intended for specific maintenance tasks where the whole cluster communication stack is temporarily unavailable, or for other situations where one wants to prevent the HA stack from fencing a node.
As these commands have only been available since PVE 9.x,
how does one temporarily disable and re-enable the HA stack for the whole cluster on PVE 8.3 (pve-ha-manager: 4.0.6)?
Is there a safe procedure?

We have maintenance coming up during which the cluster networks could be unavailable for some time, and we do not want to risk a reset of all the nodes.

Thanks!
 
How does one temporarily disable and re-enable the HA stack for the whole cluster on PVE 8.3 (pve-ha-manager: 4.0.6)?
Is there a safe procedure?
Before the disarm-ha/arm-ha commands were introduced, the safest procedure to disarm the HA stack was to stop the pve-ha-lrm service on each node individually. Once ha-manager status confirms that all LRMs are in 'restart mode' and all HA resources are in 'freeze', the pve-ha-crm service should be stopped on each node as well.
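
The two phases described above could be sketched per node roughly like this. It is only an illustration: the function names and the DRY_RUN switch are hypothetical, and the sketch assumes the services are stopped with systemctl on each node. The check between the phases is left as a manual read of the ha-manager status output, since the exact output format is not shown here.

```shell
#!/bin/sh
# Sketch only, for clusters without disarm-ha (e.g. PVE 8.x).
# DRY_RUN and all function names are hypothetical helpers.

run() {
    # Print the command; execute it only when DRY_RUN is not set to 1.
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

# Phase 1: run on every node, one node at a time.
stop_local_lrm() {
    run systemctl stop pve-ha-lrm
}

# Between the phases: read the output and confirm that every LRM is in
# 'restart mode' and every HA resource is in 'freeze'.
show_ha_status() {
    run ha-manager status
}

# Phase 2: run on every node, only after phase 1 is confirmed everywhere.
stop_local_crm() {
    run systemctl stop pve-ha-crm
}
```

Keeping the two phases as separate functions makes it harder to stop the CRM on a node before the LRMs on all nodes have been confirmed as stopped.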

See this post [0] for more information.

[0] https://forum.proxmox.com/threads/c...soon-as-i-join-a-new-node.116804/#post-510633