No HA?

cyruspy

Hello!

On a new PVE 8.3 testing cluster:
- 4 nodes
- Ceph
  * 4 nodes with OSDs
  * 3 nodes with managers & monitors
- Dedicated corosync VLANs on management bond (2*10GbE)
- Separate bonds for Ceph Access, Ceph Replica and VM traffic

I mistakenly rebooted a node without moving the VMs beforehand. From what I can remember, a similar operation in the past would trigger a batch migration of VMs, but in this case a batch shutdown was issued.

The VMs on the other 3 nodes kept running happily.

The thing is, I was not able to restart the VMs that had been running on the rebooted node until it came back to life (a firmware upgrade was in progress). The VMs were out of service for 10 minutes.

Is this expected? With 3 out of 4 nodes up, I would expect to have quorum to overrule any consistency doubt.

This is currently just a testing environment, but for production this would be a deal breaker.
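
To make the quorum expectation concrete, here is a minimal sketch of the arithmetic, assuming the usual corosync setup of one vote per node and a strict-majority rule (the function names are purely illustrative, not part of any PVE tool):

```python
def votes_needed(total_votes: int) -> int:
    """Smallest strict majority of the expected votes."""
    return total_votes // 2 + 1


def is_quorate(total_votes: int, online_votes: int) -> bool:
    """True if the online nodes hold a strict majority of all votes."""
    return online_votes >= votes_needed(total_votes)


# 4-node cluster, one vote per node (assumption)
print(votes_needed(4))   # 3
print(is_quorate(4, 3))  # True: one node down, cluster still quorate
print(is_quorate(4, 2))  # False: an even split is not a majority
```

With 3 of 4 votes present the cluster should still be quorate, so quorum alone would not explain why the VMs stayed down.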
 
You don't mention it, but you have to add every VM that you want to be highly available to HA.
If you did, and you issued a "reboot" of the PVE host, remember that there is a setting in Datacenter -> Options called "HA settings" where you can set the "Shutdown policy". By default ("conditional") it will:
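  • Stop the VMs and freeze them on that node if you told the host to reboot, so they are only started again on the same node once it is back up.
  • Stop the VMs and recover them on other nodes if you told the host to shut down.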
Change that to "migrate" so it will always try to migrate VMs to other nodes on reboot and shutdown.

Also, when using HA, it's good practice to put the node into maintenance mode [1] before any reboot, etc.

[1] https://pve.proxmox.com/wiki/High_Availability#ha_manager_node_maintenance
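
As a rough illustration of the steps above, the following dry-run sketch only builds and prints the `ha-manager` commands instead of executing them. The node name `pve1` and the VMIDs 100 and 101 are made-up placeholders, and the helper functions are hypothetical; check the commands against `man ha-manager` on your version before running anything.

```python
import shlex


def ha_add_cmd(vmid: int) -> list[str]:
    """Command that registers a VM as an HA resource."""
    return ["ha-manager", "add", f"vm:{vmid}"]


def maintenance_cmd(node: str, enable: bool) -> list[str]:
    """Command that puts a node into, or takes it out of, maintenance mode."""
    action = "enable" if enable else "disable"
    return ["ha-manager", "crm-command", "node-maintenance", action, node]


def reboot_plan(node: str, vmids: list[int]) -> list[list[str]]:
    """Ordered commands: register VMs, enable maintenance, then disable it."""
    plan = [ha_add_cmd(vmid) for vmid in vmids]
    plan.append(maintenance_cmd(node, enable=True))
    # ... reboot the node by hand here, once its HA resources have moved away ...
    plan.append(maintenance_cmd(node, enable=False))
    return plan


# Dry run with placeholder values: prints the commands, runs nothing.
for cmd in reboot_plan("pve1", [100, 101]):
    print(shlex.join(cmd))
```

Keeping this as a dry run is deliberate: it lets you review the exact command lines before typing them on a cluster node.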
 

Good morning! Yes, I see that under "HA settings" it is set to "Default (conditional)". I will change that to "migrate".

Thanks for the hint!