HA delays and locking issues

Dec 18, 2018
We are currently running tests on a new three node HA cluster configured with 2x10G networking and shared (Ceph) storage. We are using the latest stable packages for Proxmox 5.3.

The behaviour around reboots of servers is nowhere near as smooth as we would like. These are the issues that we see:

1) Doing a controlled shutdown of a node (from the web interface) seems to shut down (HA) containers and VMs properly, but leaves them in a 'started' state, meaning that we have to wait for fencing to kick in before they are migrated away and restarted. This causes unnecessary downtime unless we manually migrate all services away before rebooting the node.

2) When shutting down a node, the corresponding node lock in /etc/pve/priv/lock/ is at first removed as part of the shutdown procedure. But as soon as the node is fenced (see above), the lock is re-acquired by one of the other nodes. Then, when the original node comes back up, HA resources with a preference for this node (via HA groups) are immediately migrated back. In practice, the services in question are stopped on the failover nodes, and the returning node spends two minutes waiting for a lock timeout until it 'successfully acquired lock' and the services are started again.

Any idea why this is happening? Any logs or observations that can offer some insight?

Any help is appreciated.
 

The 160-second timeout is necessary for the watchdog-based fencing algorithm.

But if you know you want to shut down a node, you can move the VMs off it before you do.
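
For example, a single HA resource can be asked to move before the reboot. A minimal sketch wrapping the ha-manager CLI in Python; the resource ID vm:100 and target node pve2 are placeholders, not values from this thread:

    import subprocess

    # Ask the HA manager to migrate one HA resource to another node
    # before shutting this node down (placeholder IDs, adjust as needed).
    subprocess.run(["ha-manager", "migrate", "vm:100", "pve2"], check=True)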
 
I guess our workaround for node maintenance is going to be something like this:
  1. Enable "nofailback" for the HA group(s).
  2. Migrate all affected services to other nodes.
  3. Shut down/reboot node.
  4. Do maintenance and restart node.
  5. Disable "nofailback" and let services migrate back.
Could be smoother, but at least the tedious parts can be scripted.
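
A rough sketch of what such a script could look like, again in Python around the ha-manager CLI. This assumes that ha-manager groupset accepts a --nofailback option and that ha-manager migrate <sid> <node> is available; the group, node, and resource names below are placeholders and not taken from this thread:

    #!/usr/bin/env python3
    # Sketch of the maintenance workaround above, not an official Proxmox tool.
    # Assumptions: runs on a cluster node with the ha-manager CLI available;
    # the group/node/resource names are placeholders.
    import subprocess
    import sys

    HA_GROUP = "prefer-pve1"            # HA group preferring the node under maintenance
    TARGET_NODE = "pve2"                # node that should take over the services
    RESOURCES = ["vm:100", "ct:101"]    # HA resource IDs to move

    def run(cmd):
        # Echo and run a command, aborting on a non-zero exit code.
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    def drain():
        # Step 1: keep services from falling back automatically when the node returns.
        run(["ha-manager", "groupset", HA_GROUP, "--nofailback", "1"])
        # Step 2: migrate the affected services to another node.
        for sid in RESOURCES:
            run(["ha-manager", "migrate", sid, TARGET_NODE])
        print("Migrations requested; wait for them to finish, then reboot the node.")

    def restore():
        # Step 5: allow fallback again so the services migrate back.
        run(["ha-manager", "groupset", HA_GROUP, "--nofailback", "0"])

    if __name__ == "__main__":
        if len(sys.argv) != 2 or sys.argv[1] not in ("drain", "restore"):
            sys.exit("usage: maintenance.py drain|restore")
        drain() if sys.argv[1] == "drain" else restore()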

Most other clustering solutions (e.g. Pacemaker) handle planned reboots and shutdowns of a node more smoothly: Affected services are migrated away and the node leaves the cluster voluntarily, avoiding a fencing operation. There is room for improvement in how Proxmox does things.
 
Yes, I guess there is always room for improvement. First, there are patches for active fencing (which may speed up fencing). Second, we plan to introduce some kind of maintenance mode to automatically migrate VMs - but nothing is ready so far ...
 
