cluster quorum

stuartbh

Active Member
Dec 2, 2019
I recently created a 2-node cluster and then had reason to take the second node offline. In doing so, I noticed that the entire PVE cluster stopped functioning until the second node rejoined. Is there a way to tell PVE that, when only one node of a two-node cluster is functioning, it should still be considered quorate (without adding a third node at the moment)?

I was contemplating temporarily removing the second node, but then I thought I read that doing so would require me to reinstall the second node entirely before it could rejoin the cluster. Is this correct?

I ran an update on the second node (using the standard apt commands to upgrade a PVE node) and doing so broke something, which I am looking into now. I may end up needing to reinstall it anyway; if so, I suppose I ought to delete the node from the cluster first and rejoin after the install.

Thanks in advance to anyone providing insights and do stay safe and healthy during these challenging times.

Stuart
 
Our wiki should answer your questions: https://pve.proxmox.com/wiki/Cluster_Manager
You can lower the required quorum (not recommended) or use an external quorum device.

Yes, removing the node from the cluster means you have to reinstall it.
 
Is there a way to tell PVE that when only one node on a two node cluster is functioning to still consider that a quorum
Yes, there is: pvecm expected 1

In the Cluster management documentation you will find two ways of removing a cluster node. While it is possible to remove a node without having to reinstall, it is not recommended.

While you technically don't need it, adding a QDevice for external vote support might be interesting to you. With this you don't need a third full node, but get the benefits of having a 3-node cluster.
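For reference, setting up a QDevice comes down to a few commands (a sketch based on the Cluster Manager documentation; `<QDEVICE-IP>` is a placeholder for the address of your external host):

```
# On the external host that will provide the extra vote:
apt install corosync-qnetd

# On all cluster nodes:
apt install corosync-qdevice

# On one cluster node, register the QDevice:
pvecm qdevice setup <QDEVICE-IP>

# Verify that the QDevice vote shows up:
pvecm status
```

The external host can be any always-on Linux machine reachable by both nodes; it does not need to run Proxmox itself.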
 
I noticed that there is a place in the /etc/pve/corosync.conf file to set the number of votes each node gets. What would be wrong with just giving each node two votes (or only the working node two votes)? Would this not also solve the problem for the moment?

The other temporary solution would be adding the line "pvecm expected 1" at the bottom of the /etc/pve/corosync.conf file?

I am not sure why the other node reports that pve-manager is not configured and will not be configured, but I may end up just deleting the node from the cluster and reinstalling it anyway.

Stuart
 
What would be wrong with just giving each node two votes?
Nothing really, but it would have the same effect. For your cluster to be quorate it needs to have >50% of votes. If both of your nodes have 2 votes and one goes down, it will still be non-quorate just like if both of them had one vote.
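To make the votes arithmetic concrete: quorum is strictly more than half of the total votes, so doubling every node's votes doubles the threshold along with it. A quick sketch:

```shell
# quorum = floor(total_votes / 2) + 1  (a strict majority of all votes)

# Two nodes with 1 vote each (total = 2):
echo $(( 2 / 2 + 1 ))   # quorum is 2 -> one surviving node (1 vote) is non-quorate

# Two nodes with 2 votes each (total = 4):
echo $(( 4 / 2 + 1 ))   # quorum is 3 -> one surviving node (2 votes) is still non-quorate
```

Whatever vote counts you assign symmetrically, a single survivor of a two-node cluster can never reach a majority on its own.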

Changing these values in general will most likely just lead to unintended consequences though and is most certainly not what you are looking for.

the other temporary solution would be adding the line "pvecm expected 1" to the /etc/pve/corosync.conf file at the bottom?
No, pvecm is a command-line tool (the Proxmox Virtual Environment Cluster Manager), not a config option. When your cluster is non-quorate (i.e. at least half of all nodes are dead), the remaining nodes switch PVE management into read-only mode. Because of that you will no longer be able to change and manage your VMs and containers, and you will also not be able to log into the GUI. This is done to avoid cluster split-brain problems, in which the separated parts run into inconsistent states.
Sometimes you still want to be able to manage your node, though (e.g. if you want to disband the cluster). In that case you can log in to the node (e.g. via SSH) and issue the command pvecm expected 1. This tells the node to override the expected-votes value for the time being.
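In practice the sequence on the surviving node looks roughly like this (a sketch; it only succeeds once the other node is actually down):

```
# Check the current quorum state first:
pvecm status

# Tell corosync to expect only one vote, making this single node quorate:
pvecm expected 1
```

Note that this is a runtime override, not a persistent change to corosync.conf.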
 

When I attempted to execute that command I got this:
Unable to set expected votes: CS_ERR_INVALID_PARAM

Once I shut the second node down, the command worked (I'd suggest that error message is rather nondescript).

Also, once I fix the second node, how do I revert the expected parameter back to its normal setting? Or would that be fixed as a side effect of deleting the second node from the cluster, reinstalling it, and having the newly reinstalled node rejoin the cluster?

I am surely going to try setting up a QDevice as a third vote for quorum elections. I have a laptop that I use daily that runs Linux, so doing that is really an easy cheat/solution. I hadn't really thought about it until just now.

I also have an HP T620 with 4GB of RAM and a 16GB internal SSD that runs pfSense, and I'm thinking of installing Proxmox on it and then running pfSense under Proxmox (as the only VM the T620 would have).

The pfSense instance (running directly on bare metal) is using only "23% of 3449 MiB" (approximately 800MB) of the 4GB of RAM, and 4.9GB of the 11GB root partition on the SSD (the other 4GB of the SSD is swap, which is 0% used). I imagine Proxmox itself does not use more than 2GB of RAM; does that sound right? At any rate, another 4GB of RAM would not run more than $30 USD. I know a total of 8GB of RAM would be plenty, though if the price differential is negligible maybe I'd get an 8GB or 16GB RAM module.

It seems that my Proxmox installation is using about 7GB of DASD and pfSense about 5GB of DASD, so a 16GB SSD might be cutting it too close, I think.

If I did this, it would allow me to have another node in the cluster and also to have HA with pfSense on the T620 and another pfSense instance installed on one of the other two cluster nodes.

Have a safe and healthy day during these most challenging times.

Stuart
 