Why does a 3-node HA cluster die when 2 nodes are down?

Mayank006

Member
Dec 6, 2023
I am using Ceph as shared storage across all 3 servers in an HA setup. I powered off 2 of the 3 servers (servers 2 and 3).

I expected all VMs to be migrated to server 1, the one still running. Instead, server 1 was unresponsive for 1-2 minutes; after that I could access Proxmox on server 1 again, but none of the VMs were running. Even the VMs that were already on server 1 were dead.
 
Mayank006 said:
I am using Ceph as a shared storage on all 3 servers in an HA system. I powered off 2 servers (i.e. servers 2 & 3) out of 3.

I was expecting that all VMs will be migrated to the 1st running server.
That's not how Proxmox works: https://pve.proxmox.com/pve-docs/pve-admin-guide.html#_quorum. With only 1 of 3 votes, the single server cannot reach quorum, so it assumes it is in the non-functioning part of the cluster (and becomes read-only after a reboot, I believe).

EDIT: Always keep more than half of the nodes running. This also bites people with two-node clusters on this forum all the time.
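
To make the "more than half" rule concrete, here is a rough sketch of the majority arithmetic. It assumes every node has exactly one vote and there is no QDevice or other tie-breaker; `votes_needed` and `has_quorum` are made-up helper names for illustration, not Proxmox or Corosync functions.

```python
def votes_needed(total_votes: int) -> int:
    """Smallest strict majority of total_votes (assumes one vote per node)."""
    return total_votes // 2 + 1


def has_quorum(total_votes: int, online_votes: int) -> bool:
    """True if the online nodes hold a strict majority of all votes."""
    return online_votes >= votes_needed(total_votes)


# Hypothetical scenarios: (nodes in cluster, nodes still running)
for total, online in [(3, 3), (3, 2), (3, 1), (2, 1)]:
    print(f"{online}/{total} nodes up: quorate={has_quorum(total, online)}")
```

So a 3-node cluster survives one node going down but not two, and a plain 2-node cluster loses quorum as soon as either node is off. On a real node, `pvecm status` shows the actual vote counts and whether the cluster is quorate.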
 