[SOLVED] Node 1 Down in Cluster Can't turn on VMs on others

cshill

Member
May 8, 2024
Hi Proxmox community,
I'm still new to Proxmox, but I was wondering why, when I add multiple nodes to a cluster, they start acting like a chain, so that if one goes down the entire chain breaks.

I am working on two nodes for testing different filesystems and virtual machine disk setups, and one node has a problem booting up. This is an undiagnosed issue with the server, I think, but the main point is that now that this server has gone down, I can access the 2nd node but can't do anything to the virtual machines: start them, stop them, nothing. This makes me think that clusters in their current state are NOT what I want. I have an earlier post where someone already stated that there's been a long desire for adding servers directly under the "datacenter" tab as independent nodes but accessible in one spot unless you want to cluster them.

As of right now, what is my best option here? Just don't make a cluster? How do I break up this cluster? As of right now I see no option to break this cluster and let Node 2 do its own thing.
 
As of right now, what is my best option here? Just don't make a cluster? How do I break up this cluster? As of right now I see no option to break this cluster and let Node 2 do its own thing.
About 77% of forum posts are about two-node clusters and how it's not a supported or recommended configuration.

Clusters are meant to run a job and only one member of the cluster is allowed to run one particular job.
If your two cluster members lose connectivity - neither of them knows if the other one is running a job. They do know that if they start the job, when the other one is running the same job, there will be data corruption. So they put themselves in a suspended state.

To prevent this situation, called split brain, the cluster must contain at least 3 members, so there is always a majority, e.g. a 2/1 split. The majority always wins. When one member is left alone, it will reboot and release the job by not starting it. The majority will pick it up.
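
To make the majority rule concrete, here is a small shell sketch of the vote arithmetic. It assumes every node carries exactly one vote and that quorum means a strict majority of all votes. The function names `quorum_needed` and `has_quorum` are made up for illustration and are not part of any Proxmox tool.

```shell
#!/bin/sh
# Illustrative only: strict-majority quorum arithmetic, one vote per node.

# quorum_needed TOTAL_VOTES -> prints the minimum number of votes for a majority
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

# has_quorum VOTES_PRESENT TOTAL_VOTES -> exit status 0 if the group is quorate
has_quorum() {
    [ "$1" -ge "$(quorum_needed "$2")" ]
}

for total in 2 3; do
    survivors=$(( total - 1 ))   # one node has failed
    if has_quorum "$survivors" "$total"; then
        state="quorate"
    else
        state="NOT quorate"
    fi
    echo "$total nodes, 1 down: $survivors of $total votes, need $(quorum_needed "$total"): $state"
done
```

With two nodes the survivor holds 1 of 2 votes and needs 2, so it stops managing guests. With three nodes the two survivors hold 2 of 3 votes and carry on.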

So, in summary, don't run a 2-node cluster. Read the documentation, and pay attention to something called QDevice, which can help you achieve a majority in a compromised cluster.

Good luck


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
Although I see @bbgeek17's point, I usually run a two-node cluster (a sort of online-standby system) in which the main node has 2 votes and the secondary has only one vote.

I recently discovered the QDevice technique (you can run only one QDevice per cluster, if I'm not wrong), but when I run into trouble like yours, @cshill, I usually issue a pvecm expected 1 command on the only running node.

Try at your own risk, and make sure you have backups. Backups save lives, money and time.
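
For reference, a cautious way to wrap that command might look like the sketch below. `pvecm status` and `pvecm expected` are the real cluster-manager commands; the function name `force_quorum_on_survivor` and the guard around them are my own illustration. It assumes you are root on the surviving node and are certain the other node is really down, since forcing quorum while the other node is still running guests is exactly how split brain happens.

```shell
#!/bin/sh
# Illustrative wrapper (hypothetical name): lower the expected vote count
# to 1 on the only surviving node so it becomes quorate again.
force_quorum_on_survivor() {
    # Refuse to do anything on a machine without the Proxmox cluster tool.
    if ! command -v pvecm >/dev/null 2>&1; then
        echo "pvecm not found: this does not look like a Proxmox VE node" >&2
        return 1
    fi
    pvecm status          # look at the current quorum state first
    pvecm expected 1      # tell the cluster stack to expect a single vote
}
```

Call `force_quorum_on_survivor` by hand once you have confirmed the failed node is powered off.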
 
I usually run a two-node cluster (a sort of online-standby system) in which the main node has 2 votes and the secondary has only one vote.
I usually issue a pvecm expected 1 command on the only running node.
This is fine for a homelab when the operator understands the consequences and reasons for doing so. OP is at the very beginning of their journey by their own admission. They should understand the concepts before trying to circumvent them.

you can run only one QDevice per cluster if I'm not wrong
The goal of QDevice is to prevent split brain, i.e. to act as a tie-breaker. You would want one for a 2, 4, 6, etc. node cluster, so that it effectively becomes a 3, 5, or 7 vote cluster. There is no point in having 2 QDevices, because you'd put your cluster back in split-brain territory. Or you didn't need one in the first place.
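
The tie-breaker effect can be sketched with the same kind of vote arithmetic. The assumptions here are mine: one vote per node, one extra vote for the QDevice, a strict-majority quorum rule, and a network split that cuts an even-sized cluster exactly in half, with the QDevice siding with one half. The helper name `quorum_needed` is made up for illustration.

```shell
#!/bin/sh
# Illustrative only: how one extra vote breaks a tie in an even-sized cluster.
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

for nodes in 2 4 6; do
    half=$(( nodes / 2 ))
    # Without a QDevice each half holds exactly half of the votes.
    need_plain=$(quorum_needed "$nodes")
    # With a QDevice the total becomes odd and the favoured half gets +1.
    total_q=$(( nodes + 1 ))
    need_q=$(quorum_needed "$total_q")
    favoured=$(( half + 1 ))
    echo "$nodes nodes split $half/$half: need $need_plain without QDevice (nobody wins);" \
         "with QDevice need $need_q of $total_q, favoured half has $favoured"
done
```

On a real cluster the QDevice is added with `pvecm qdevice setup <address>`, after installing `corosync-qnetd` on the external host and `corosync-qdevice` on the cluster nodes, as described in the Proxmox documentation.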


 
The goal of QDevice is to prevent split brain, i.e. to act as a tie-breaker. You would want one for a 2, 4, 6, etc. node cluster, so that it effectively becomes a 3, 5, or 7 vote cluster. There is no point in having 2 QDevices, because you'd put your cluster back in split-brain territory. Or you didn't need one in the first place.

Thank you, now I got the point right.
 
@fmaione @bbgeek17 I appreciate your input, guys. I have more questions, but I will post them in a different thread. In general my journey is fresh and I'm picking things up. I'm intentionally trying to break and explore all sorts of things with Proxmox, as I want to, nay have to, move away from VMware.
 
As a general data point for people that do want to run a 2-node cluster seriously, you'll need to defang the watchdog software on both nodes as well.

Otherwise, about a minute after one host goes down, the remaining host will spontaneously reboot itself. Yep, that's as bad as it sounds. ;)

It happens because after a node goes offline, the remaining host won't have quorum (unless pvecm expected 1 has been run), so it thinks it's in trouble and its watchdog software will automatically reboot it 1 minute later in an attempt to fix things.

A reasonable way to defang the watchdog software is by telling the softdog kernel module to ignore reboot requests.

I do it by creating a modprobe override like this:

Code:
# echo "# Tell softdog to NOT reboot" > /etc/modprobe.d/disable-softdog-reboot.conf
# echo "options softdog soft_noboot=1" >> /etc/modprobe.d/disable-softdog-reboot.conf

With that in place (and after a reboot), the watchdog can't automatically reboot the system.

For a 2 node system, that's far more useful than having the watchdog be allowed to muck things up. :)
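
Before rebooting, it can be worth checking that the override actually landed where modprobe will read it. The helper below is only a sketch: the name `softdog_noboot_configured` is made up, and it merely greps the configuration directory (defaulting to `/etc/modprobe.d`) for the option line shown above. It does not prove that the running kernel module has picked the setting up.

```shell
#!/bin/sh
# Illustrative helper (hypothetical name): succeed if any *.conf file in the
# given modprobe directory sets soft_noboot=1 for the softdog module.
softdog_noboot_configured() {
    dir="${1:-/etc/modprobe.d}"
    grep -Eqs '^[[:space:]]*options[[:space:]]+softdog[[:space:]].*soft_noboot=1([[:space:]]|$)' "$dir"/*.conf
}

if softdog_noboot_configured; then
    echo "softdog override found"
else
    echo "softdog override not found"
fi
```

`modinfo softdog` lists the parameters the module accepts, if you want to confirm that `soft_noboot` exists on your kernel.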
 
