Clustering issue

ztek

Aug 20, 2023
I recently set up a cluster with two nodes. One of my workmates warned me that there is a big downside to clustering: if one node is down, the VMs on the other node cannot start. I simply could not believe it until today, when I had a power outage. One of the nodes failed to start, and the other node started but without its VMs. All the VMs were hanging. I had to power on the failed node, and only then did I see the VMs starting!

Question: is it really normal operation that if one node is down, the cluster's other node does not start properly?

This is a HUGE downside.
 
Yes, a lot of disappointed people have been told the same here on this forum. Every thread about a two-node cluster (without a QDevice for a third vote) has warnings about this.
 
Can you explain what needs to be done to make it work as a "normal" cluster that doesn't die when one of the nodes is down?
 
You need a third vote to prevent split brain.
You can use a Raspberry Pi as a QDevice, which is probably the cheapest option.

As an alternative, you can install PVE on an old PC just to act as a voting node for the cluster.
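
For anyone wondering why the third vote matters, here is the arithmetic as a minimal shell sketch. It assumes the usual strict-majority rule (more than half of the expected votes must be present); the function names are made up for illustration and are not part of any Proxmox tool.

```shell
# quorum_needed TOTAL_VOTES
# Prints the minimum number of votes that forms a strict majority.
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

# has_quorum TOTAL_VOTES VOTES_ONLINE
# Succeeds (exit status 0) if the online votes reach the majority.
has_quorum() {
    [ "$2" -ge "$(quorum_needed "$1")" ]
}

# Compare a plain two-node cluster with two nodes plus a QDevice.
for total in 2 3; do
    echo "total=$total needs $(quorum_needed "$total") votes"
done
```

With two votes, both are needed, so losing either node costs quorum. With three votes, two are enough, so any single voter (a node or the QDevice) can be down.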
 
Thanks, I have a spare Raspberry Pi, so I would use that. Is there any documentation on that setup somewhere?
 
Can you explain what needs to be done to make it work as a "normal" cluster that doesn't die when one of the nodes is down?
Run
pvecm expected 1
on the running node.

Once the second node is running again, the cluster works normally again.
 
Can I use this method with the latest Proxmox VE as an alternative, without the Pi? https://youtu.be/sjS9oDEw9EQ?feature=shared

If you lose the node with 2 votes, the other node will not be able to form quorum and VMs will not be allowed to start.
The video applies to a specific homelab scenario in which a given node is frequently shut down (I don't know if that's your use case too). Just get a QDevice and use the Proxmox cluster the way it is designed to work ;)
Oh, well, you could also make a cron job that "pings" your other node and issues pvecm expected 1 if it doesn't reply within a few minutes. But think twice about this option in production, as there are corner cases that may produce data loss (e.g. with storage replication).
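
A rough sketch of what such a cron job could look like, for illustration only. The peer address, the failure threshold, the state file path, and the function names are all assumptions of mine, and the same caveat applies: a ping cannot tell a dead node from a broken network link, so both nodes could end up running on their own.

```shell
#!/bin/sh
# Hypothetical watchdog sketch, not an official Proxmox tool.
PEER="${PEER:-192.0.2.2}"                         # placeholder address of the other node
MAX_FAILS="${MAX_FAILS:-3}"                       # consecutive failed checks before acting
STATE_FILE="${STATE_FILE:-/run/peer-fail-count}"  # assumed location for the counter

# Succeeds if the peer answers a single ping within 2 seconds.
peer_alive() {
    ping -c 1 -W 2 "$PEER" >/dev/null 2>&1
}

# next_fail_count CURRENT ALIVE
# Prints 0 if the peer is alive ("yes"), otherwise CURRENT + 1.
next_fail_count() {
    if [ "$2" = "yes" ]; then
        echo 0
    else
        echo $(( $1 + 1 ))
    fi
}

# One check: update the counter and lower the expected votes
# only after MAX_FAILS consecutive failures.
check_peer() {
    count=$(cat "$STATE_FILE" 2>/dev/null || echo 0)
    if peer_alive; then alive=yes; else alive=no; fi
    count=$(next_fail_count "$count" "$alive")
    echo "$count" > "$STATE_FILE"
    if [ "$count" -ge "$MAX_FAILS" ]; then
        pvecm expected 1
    fi
}

# Only act when called with --run, so sourcing the file has no side effects.
if [ "${1:-}" = "--run" ]; then
    check_peer
fi
```

Called once a minute from cron with --run, the default threshold of 3 would give the "few minutes" of grace mentioned above.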
 
Run
pvecm expected 1
on the running node.

Once the second node is running again, the cluster works normally again.
Please don't do this.

The recommendation to avoid split-brain situations for any cluster with an even number of nodes is to set up a QDevice; you can find more info and instructions at [1]. Note that a QDevice can run on any Linux device and, unlike Corosync itself, does not have a strict requirement for low latency between the nodes.

[1] https://pve.proxmox.com/pve-docs/pve-admin-guide.html#_corosync_external_vote_support
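
A rough outline of the steps described in [1], as I understand them, assuming a Debian-based external host such as a Raspberry Pi. The IP address is a placeholder, and the setup step expects the cluster node to be able to reach the external host over SSH as root. Check the linked documentation for the current procedure before running anything.

```shell
# On the external host (the QDevice, e.g. the Raspberry Pi):
apt install corosync-qnetd

# On every cluster node:
apt install corosync-qdevice

# On one cluster node; 192.0.2.10 is a placeholder for the QDevice's address:
pvecm qdevice setup 192.0.2.10

# Afterwards, check the vote count and quorum state:
pvecm status
```

If the setup worked, the status output should list three expected votes, so either node can keep quorum together with the QDevice while the other node is down.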
 
