disarm-ha and arm-ha commands

RobFantini

Hello
See the following on pve-ha-manager changes.

How can I use disarm-ha when an entire cluster needs to be shut down?

pve-ha-manager (5.1.3) trixie; urgency=medium

* fix #2751: add disarm-ha and arm-ha commands for safe cluster-wide
maintenance, allowing the admin to temporarily disable automatic fencing
and recovery. Two resource modes are available: 'freeze' locks all
services in place, 'ignore' suspends HA tracking so services can be
managed manually. All HA service watchdogs are released when fully
disarmed, the underlying watchdog-mux must still keep the /dev/watchdog
device open as not all watchdog types support a graceful deactivate.
 
Impact - thank you for the fast response.

From the documentation:
Resource Modes
When disarming HA, you must choose a resource mode that controls how HA managed resources are handled while disarmed. The current state of resources is not affected.

freeze
New commands and state changes are not applied. Services stay in their current state, but the HA stack does not react to failures or process new requests. This is the safest choice when you expect all nodes to remain running.

ignore
Resources are suspended from HA tracking and can be managed as if they were not HA managed. This allows you to manually start, stop, or migrate services while HA is disarmed. Use this when you need to manually relocate services during maintenance. When re-arming, the CRM rechecks service locations against the configuration to pick up any manual migrations.


So I assume if we had to turn off the entire cluster we'd use:
Code:
ha-manager crm-command disarm-ha ignore

Do you see that as correct?
 
I'm not sure. I don't have a cluster right now but in the past I just powered each node off normally and that worked okay.
 
How can I use disarm-ha when an entire cluster needs to be shut down?
The disarm-ha and arm-ha commands are mainly intended for specific maintenance tasks where the whole cluster communication stack is temporarily unavailable, or for other situations where one wants to prevent the HA stack from fencing a node.

The HA Manager should be able to handle complete cluster shutdowns (see this section in the docs [0]), but the disarm-ha and arm-ha commands can make this safer, especially if the startup times of the nodes vary widely.

So I assume if we had to turn off the entire cluster we'd use:
ha-manager crm-command disarm-ha ignore
Do you see that as correct?
If the cluster is expected to be shut down, then ignore would be the more appropriate mode.
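
For anyone who wants to script this, here is a rough sketch of how the disarm and re-arm steps could be wrapped. It is not an official procedure. The disarm command is the one quoted above. The arm-ha invocation (same crm-command form, no arguments) is my assumption based on the changelog wording, and the helper names and the DRY_RUN switch are made up for illustration. By default it only prints the commands so you can check them first.

```shell
#!/bin/sh
# Sketch only: helper names and DRY_RUN are hypothetical, not part of ha-manager.
# With DRY_RUN=1 (the default) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Disarm HA cluster-wide with one of the two documented resource modes.
disarm_cluster() {
    mode="$1"
    case "$mode" in
        freeze|ignore) ;;
        *) echo "unknown resource mode: $mode" >&2; return 1 ;;
    esac
    run ha-manager crm-command disarm-ha "$mode"
}

# Re-arm HA once the nodes are back up.
# Assumption: arm-ha uses the same crm-command form and takes no arguments.
rearm_cluster() {
    run ha-manager crm-command arm-ha
}
```

For a full cluster shutdown the idea would be to call `disarm_cluster ignore` before powering the nodes off, and `rearm_cluster` after all nodes are back up and the cluster is quorate again. Check the output of `ha-manager status` before and after, and verify the exact arm-ha syntax against the documentation for your version before setting DRY_RUN=0.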

[0] https://pve.proxmox.com/pve-docs/pve-admin-guide.html#pveceph_shutdown