Configure HA for short, scheduled "no failover" outages

tannebil

New Member
Jun 26, 2023
I have a homelab 2-node+qdisk HA cluster using ZFS shared storage across the two nodes. One of the nodes is actually a Proxmox VM running in Parallels on a beefy 2018 Mac Mini (which seemed like a terrible idea at first but has worked very well in practice).

My question is about scheduled backups. Every morning at 3am, I shut down all my Parallels VMs to do complete, clean backups. It only takes a few minutes (I make the backup copies on the same APFS volume, immediately restart the VMs, and then back up the copies to TrueNAS), so I'd prefer not to have HA fail everything over for a planned 5-minute outage. Is there an easy way to configure HA so that the node doesn't fail over on a regular, planned outage, or do I need to knock something up with CLI scripts and a coordinated scheduler on the node?

Thanks!
 
Hello.

Do note that if you stop or restart a VM from the web UI, it won't be restored on another node. When you start the VM it might be reallocated to another node if you have the "Rebalance on Start" option enabled (Datacenter -> Options -> Cluster Resource Scheduling in the web UI), but this is not a "failover" as you put it.

Another option is to disable HA for a guest before doing the operation. If you want to disable HA for one guest from the CLI, you can run

Code:
ha-manager status

to see the currently running HA resources, and then

Code:
ha-manager set RESOURCE --state ignored

to disable HA for one resource, for example `vm:101`. You can set the state back to `started` to resume the HA service.
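
For a scheduled outage, the two commands above could be wrapped in a small helper, roughly like the sketch below. This is only an illustration: the `HA_MANAGER` variable and the `set_ha_state` function are hypothetical names, and `vm:101`/`vm:102` stand in for whatever `ha-manager status` lists on your cluster.

```shell
#!/bin/sh
# Sketch: switch the HA state of several resources around a planned outage.
# HA_MANAGER is a hypothetical override (not part of Proxmox) so the command
# can be wrapped, e.g. in an ssh call; it defaults to the ha-manager CLI.
HA_MANAGER="${HA_MANAGER:-ha-manager}"

# set_ha_state STATE SID...
# Runs "ha-manager set SID --state STATE" for each SID and stops at the
# first failure.
set_ha_state() {
    state="$1"
    shift
    for sid in "$@"; do
        # Left unquoted on purpose so a multi-word override is split into words.
        $HA_MANAGER set "$sid" --state "$state" || return 1
    done
}

# Before the outage (example resource IDs):
#   set_ha_state ignored vm:101 vm:102
# After the VMs are back up:
#   set_ha_state started vm:101 vm:102
```

Running it from another machine would mean setting `HA_MANAGER` to an ssh invocation that ends in `ha-manager`; the host and user for that are yours to fill in.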
 
The VM being stopped is a Parallels VM running on a Mac Mini, and that VM is a Proxmox node in my cluster. It's one of five Parallels VMs that get shut down by a macOS bash script before the backup runs on macOS. My Linux systems management skills are close to non-existent, but it looks like I should be able to run those commands remotely using ssh, so I'll give that a try. Can I create a user account with just the necessary privileges to run the "ha-manager set resource" commands?

Thanks for the pointer!
 
AFAIK that is not configurable; it just follows the regular HA state cycle (at most 60s before a node is considered fenced, which triggers failover in that case). You probably want "freeze" as the policy, which allows you to manually trigger recovery in case the maintenance takes longer than expected.
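
If the policy meant here is the cluster-wide HA shutdown policy (an assumption, since the reply does not name the setting), it lives in the datacenter configuration. A minimal fragment might look like this:

```
# Assumed file: /etc/pve/datacenter.cfg
ha: shutdown_policy=freeze
```

As far as I know this only takes effect when the node itself is shut down cleanly, so it is worth checking against the HA documentation before relying on it.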
 
