disarm-ha and arm-ha commands

RobFantini

Famous Member
May 24, 2012
Boston,Mass
Hello
See the following on pve-ha-manager changes.

How can I use disarm-ha when an entire cluster needs to be shut down?

pve-ha-manager (5.1.3) trixie; urgency=medium

* fix #2751: add disarm-ha and arm-ha commands for safe cluster-wide
maintenance, allowing the admin to temporarily disable automatic fencing
and recovery. Two resource modes are available: 'freeze' locks all
services in place, 'ignore' suspends HA tracking so services can be
managed manually. All HA service watchdogs are released when fully
disarmed, the underlying watchdog-mux must still keep the /dev/watchdog
device open as not all watchdog types support a graceful deactivate.
 
Impact - thank you for the fast response.

from the document:
Resource Modes
When disarming HA, you must choose a resource mode that controls how HA managed resources are handled while disarmed. The current state of resources is not affected.

freeze
New commands and state changes are not applied. Services stay in their current state, but the HA stack does not react to failures or process new requests. This is the safest choice when you expect all nodes to remain running.

ignore
Resources are suspended from HA tracking and can be managed as if they were not HA managed. This allows you to manually start, stop, or migrate services while HA is disarmed. Use this when you need to manually relocate services during maintenance. When re-arming, the CRM rechecks service locations against the configuration to pick up any manual migrations.
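
To make the choice between the two modes explicit, here is a minimal sketch of a wrapper that only accepts the mode names from the docs above. The function name `disarm_ha` and the `HA_MANAGER` override variable are made up for illustration; the `crm-command disarm-ha <mode>` syntax is taken from this thread, not verified against a cluster.

```shell
#!/bin/sh
# Hypothetical helper, not part of pve-ha-manager.
# HA_MANAGER can be overridden (e.g. HA_MANAGER=echo) for a dry run.
HA_MANAGER="${HA_MANAGER:-ha-manager}"

disarm_ha() {
    mode="$1"
    case "$mode" in
        freeze|ignore)
            # freeze: services stay put, HA does not react to failures
            # ignore: services can be started/stopped/migrated manually
            $HA_MANAGER crm-command disarm-ha "$mode"
            ;;
        *)
            echo "unknown resource mode: $mode (expected freeze or ignore)" >&2
            return 1
            ;;
    esac
}
```

With `HA_MANAGER=echo` set, calling `disarm_ha freeze` only prints the command it would run, which is a cheap way to check it before touching a real cluster.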


So I assume if we had to turn off the entire cluster we'd use:
Code:
ha-manager crm-command disarm-ha ignore

Do you see that as correct?
 
I'm not sure. I don't have a cluster right now, but in the past I just powered each node off normally and that worked okay.
 
How can I use disarm-ha when an entire cluster needs to be shut down?
The disarm-ha and arm-ha commands are mainly intended for specific maintenance tasks where the whole cluster communication stack is temporarily unavailable, or for other situations where one wants to prevent the HA stack from fencing a node.

The HA Manager should be able to handle complete cluster shutdowns, see this section in the docs [0], but the disarm-ha and arm-ha commands can make this safer to do, especially if the startup time of the nodes varies heavily.

So I assume if we had to turn off the entire cluster we'd use:
ha-manager crm-command disarm-ha ignore
Do you see that as correct?
If the cluster is expected to be shut down, then ignore would be the more appropriate mode.

[0] https://pve.proxmox.com/pve-docs/pve-admin-guide.html#pveceph_shutdown
 
I have not tried the following, but I am posting it for future reference. From Grok:
https://grok.com/share/bGVnYWN5LWNvcHk_d940cfad-264d-46e6-a144-3e3e02625ae5

Recommended command for full cluster shutdown

1. Run this on any one node (it affects the whole cluster):

ha-manager crm-command disarm-ha ignore

2. Verify the status. Look for the fencing/CRM line showing "disarmed":

ha-manager status

3. Gracefully shut down your HA VMs/CTs (optional but cleanest; you can do this manually now that HA won't interfere).

4. Shut down the nodes (in any order; HA won't try to migrate things).

5. Power everything back on and wait for the cluster to become quorate.

6. Re-arm HA:

ha-manager crm-command arm-ha
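
The steps above could be wrapped roughly like this. It is only a sketch and, like the list, untested on a real cluster. The function names and the `HA_MANAGER`, `PVECM`, and `POLL_INTERVAL` variables are made up for illustration. The quorum check assumes `pvecm status` prints a `Quorate: Yes` line. Shutting down the guests and nodes (steps 3 and 4) is left to the admin.

```shell
#!/bin/sh
# Hypothetical wrapper around the steps above; not an official tool.
# Both commands can be overridden for a dry run, e.g. HA_MANAGER=echo.
HA_MANAGER="${HA_MANAGER:-ha-manager}"
PVECM="${PVECM:-pvecm}"

# Steps 1-2: disarm HA in 'ignore' mode, then print the status so the
# operator can confirm by eye that HA is disarmed.
before_shutdown() {
    $HA_MANAGER crm-command disarm-ha ignore || return 1
    $HA_MANAGER status
}

# Step 5: poll until the cluster reports quorum or the tries run out.
# Assumption: 'pvecm status' output contains a "Quorate: Yes" line.
wait_for_quorum() {
    tries="${1:-30}"
    while [ "$tries" -gt 0 ]; do
        if $PVECM status 2>/dev/null | grep -q 'Quorate:[[:space:]]*Yes'; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "${POLL_INTERVAL:-10}"
    done
    return 1
}

# Step 6: re-arm HA only once the cluster is quorate again.
after_power_on() {
    if ! wait_for_quorum "$@"; then
        echo "cluster not quorate; leaving HA disarmed" >&2
        return 1
    fi
    $HA_MANAGER crm-command arm-ha
}
```

Source the file on one node, run `before_shutdown` before powering down, and run `after_power_on` once the nodes are back up. The optional argument to `after_power_on` is the number of polls before giving up.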


Also see https://github.com/proxmox/pve-docs/blob/master/ha-manager.adoc#disarming-ha-for-cluster-maintenance
 