[SOLVED] Proxmox 5.3-6 HA - Setup/Node Reboot

vshaulsk

Perhaps the below has been answered before, but my quick search through the forums and documentation has yielded no results.

I run a couple of VMs from a central storage server using multipath iSCSI. The VMs function properly.

The complete Proxmox cluster currently consists of 4 nodes with dedicated NICs/VLANs for the different network segments, including a dedicated network for corosync ring 1.

The VMs that use central storage are set up using the built-in Proxmox HA manager with the built-in softdog fencing.

When I shut down or reboot one of the 4 nodes, everything works properly. The HA-managed VMs on that node migrate to the correct next node.

The issue I have is that if I am running only 3 nodes and one of those nodes reboots or shuts down, even if no VMs are running on it, all of the Proxmox hosts reboot.
- Is it normal for all the hosts in a 3-node cluster to reboot if one of the members is lost?
- Does this have to do with the cluster losing quorum?

Is there a way to set up a 3-node cluster so that if one node shuts down or reboots, the other nodes continue to function without rebooting as well?

Thank you!
 
Normally you need an odd number of nodes to run HA properly. If two nodes are missing you have lost 50%, and that can be a problem.

Why are you running 4 nodes and shutting one of them down? It might help to stop pve-ha-crm and pve-ha-lrm before you stop the third node.
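
If it helps, this is roughly what that looks like on the node you are about to power down (a sketch; pve-ha-lrm and pve-ha-crm are the standard PVE systemd units):

# stop the HA services so the node leaves the HA stack cleanly
systemctl stop pve-ha-crm
systemctl stop pve-ha-lrm
# then shut down as usual
systemctl poweroff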
 
I only turn on the 4th node when I need the extra resources. When I don't need them, I power down the 4th node and save some electricity. I do the same thing on my storage side to save some power (= money).

What is interesting is that if I move all the HA VMs to one node within the 3-node cluster and then reboot one of the other nodes, only the node with the HA VMs also reboots. The remaining node, which is only running local VMs, keeps running.

Now, about an odd number of nodes being needed to run HA properly: I thought it was required to have 3+ nodes, not necessarily an odd number, so 4, 5 or 6 would also work. I may be wrong on this.

I thought that HA would basically work until you lost all but two nodes.

Does anyone have a three-node HA cluster? Are you able to reboot one of the members without the other members rebooting?
 
What is interesting is that if I move all the HA VMs to one node within the 3-node cluster and then reboot one of the other nodes, only the node with the HA VMs also reboots. The remaining node, which is only running local VMs, keeps running.
I think that's normal, because the HA processes on the other nodes are idle and not needed when no HA VMs are running on them. You can see the status in the HA overview.

Now, about an odd number of nodes being needed to run HA properly: I thought it was required to have 3+ nodes, not necessarily an odd number, so 4, 5 or 6 would also work. I may be wrong on this.
Normally you are not wrong about this. It's possible to run an even number, but then you can end up in exactly the situation you are in now, so it's better to run an odd number. If you run a 5-node cluster, 2 nodes can be down, because you still have more than 50% of the votes. But if you have 4 nodes and lose 2, you have lost 50% and the remaining half cannot reach quorum. I think that's your problem here: your last nodes are not able to keep quorum.
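
You can verify this on any remaining node with pvecm status (the values below are illustrative, not from your cluster):

# check the cluster's quorum state
pvecm status
# look for lines like these (illustrative: 4-node cluster with 2 nodes down):
#   Expected votes:   4
#   Total votes:      2
#   Quorum:           3 Activity blocked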

Does anyone have a three-node HA cluster? Are you able to reboot one of the members without the other members rebooting?
Yes, I am running such a cluster with VMs configured in HA mode, and yes, I'm able to reboot one node without any problem.
 
Interesting... thank you for the help!

I wonder if it has to do with the number of expected votes; if it does, maybe I can decrease the expected votes so the cluster looks like it has three nodes instead of 4, 5, 6, etc.
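
For reference, the expected vote count can be lowered at runtime (a sketch; be careful, as lowering the quorum requirement can open you up to split brain):

# tell corosync to expect fewer votes, e.g. after permanently
# removing the 4th node -- dangerous if both halves stay online
pvecm expected 3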
 
You always need a quorum of > 50%, otherwise you can run into split-brain situations (you really do not want that!).

So the number of votes needed for quorum is floor(n/2) + 1:

3 nodes -> quorum = 2
4 nodes -> quorum = 3
5 nodes -> quorum = 3 (and so on)

So only in a cluster with at least 5 nodes can you lose (or switch off) two nodes. In your configuration, once you switch off the 4th node you have lost your HA tolerance, as every further node failure will fence the running nodes.

And yes, only the nodes running HA resources will be fenced (it should be obvious why!).
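
A quick way to see which nodes actually have active HA services (and are therefore fencing candidates) is ha-manager status; the output below is illustrative only:

# show HA manager state per node and per resource
ha-manager status
# quorum OK
# master node1 (active, ...)
# lrm node1 (active, ...)     <- runs HA resources, self-fences on quorum loss
# lrm node2 (idle, ...)       <- idle, not fenced
# service vm:100 (node1, started)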
 
OK, great, thanks for the explanation!

I decided to remove the extra node from the cluster, as I use it rarely.

I can increase the available resources on the existing nodes by adding more memory and rebalancing the VMs.

Again, thank you for the explanation! I will mark this thread as resolved :)
 
