I've been running PVE 7.x for about a year now on a three-node cluster. It's been working great. The cluster operates in HA mode and I see my nodes listed on the HA page.
But something very bad happened just now and I'd like to know if this is the expected behavior. Here is the timeline.
- I have had three 7.3-x nodes running in HA mode for over a year. No problems.
- I added a new (fourth) node running PVE 8.0.x. It joined correctly, then I powered it off so I could move it into my datacenter later.
- I removed all VMs and replications to/from node 3 to prepare to remove it (node 3) from my cluster.
- Not realizing that I was about to take half of my joined nodes offline and lose quorum, I powered off node 3. That left 2 of 4 nodes running.
- As node 3 powered down, ALL guest VMs unexpectedly shut down on nodes 1 and 2.
- Attempting to force the VMs back up on nodes 1 and 2 resulted in an error saying "cluster not ready - no quorum? (500)".
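For anyone following the numbers in the timeline, here is a minimal sketch of the vote arithmetic as I understand it. It assumes each node carries exactly one vote and that quorum means a strict majority of the expected votes (more than half). The function names are made up for illustration and are not part of any Proxmox or corosync tool.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Smallest number of votes that is a strict majority (assumed rule)."""
    return expected_votes // 2 + 1


def has_quorum(online_votes: int, expected_votes: int) -> bool:
    """True if the online nodes hold a strict majority of expected votes."""
    return online_votes >= quorum_threshold(expected_votes)


# Scenarios from the timeline, assuming one vote per node.
scenarios = [
    ("original 3-node cluster, all up", 3, 3),
    ("4 nodes joined, node 4 powered off", 3, 4),
    ("4 nodes joined, nodes 3 and 4 powered off", 2, 4),
]

for label, online, expected in scenarios:
    needed = quorum_threshold(expected)
    state = "quorate" if has_quorum(online, expected) else "NOT quorate"
    print(f"{label}: {online}/{expected} votes, need {needed} -> {state}")
```

Under those assumptions, joining the fourth node raised the majority needed from 2 to 3, so running only nodes 1 and 2 falls one vote short.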
But my more important question is: Why did nodes 1 and 2 suddenly power off all running VMs? Is this by design? Can I modify this behavior?