We are currently running tests on a new three-node HA cluster configured with 2x10G networking and shared (Ceph) storage. We are using the latest stable packages for Proxmox 5.3.
The behaviour around server reboots is nowhere near as smooth as we would like. These are the issues we see:
1) Doing a controlled shutdown of a node (from the web interface) seems to shut down the HA-managed containers and VMs fine, but leaves them in a 'started' state, meaning we have to wait for fencing to kick in before they are migrated away and restarted. This causes unnecessary downtime unless we manually migrate all services away before rebooting the node.
2) When shutting down a node, the corresponding node lock in /etc/pve/priv/lock/ is (at first) released as part of the shutdown procedure, but as soon as the node is fenced (see above), the lock is re-acquired by one of the other nodes. Then, when the original node comes back up, HA resources with a preference for that node (via HA groups) are immediately migrated back. In practice this means the services in question are stopped on the failover nodes, and the returning node spends two minutes waiting for a lock timeout until it logs 'successfully acquired lock' and the services are started again.
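In case the configuration matters: our groups.cfg and resources.cfg under /etc/pve/ha/ look roughly like the sketch below (node names, priorities and the VMID are placeholders, but the structure is the same):

group: prefer-node1
        nodes node1:2,node2:1,node3:1
        nofailback 0

vm: 100
        group prefer-node1
        state started

So failback to the preferred node is intentional; it is the stop / two-minute lock wait / start cycle on failback that we did not expect.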
Any idea why this is happening? Any logs or observations that can offer some insight?
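If it helps, we can pull the HA manager logs from all three nodes around the time of the reboot and post the relevant output here, e.g.:

journalctl -u pve-ha-crm -u pve-ha-lrm
ha-manager status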
Any help is appreciated.