"Best practise" for a two node proxmox/storage cluster

HA in Proxmox means automatic fail-over to another node. For a non-HA cluster you could get by with only two nodes by adding two_node: 1 to corosync.conf
thanks for this explanation!
but the problem is that the nodes do not get fenced, which means that both nodes can operate in a quorate state independently of each other. Therefore a third node acting as a witness is preferred.
Am I right in assuming that this is no problem for my scenario?
Or in other words: When I don't use HA, is there any need for fencing?
Even in a split-brain scenario it's fine for me if both nodes keep running.

Best regards
Stephan
 
If you are running VMs from shared storage you could end up in a situation where the same VM is running on both servers, effectively corrupting the disk.
Yes, but in a non-HA setup this can't happen automatically, can it? And even if this worst case (the VM runs on both nodes at the same time) happens: every vDisk has its own DRBD target, so I could decide for every single vDisk which "version" I'd like to keep and which one I'd like to throw into /dev/null.
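For illustration, a per-vDisk DRBD 8 resource definition of the kind described here might look roughly like the following; the resource name, host names, backing devices and IP addresses are placeholders, not the actual setup:

    # /etc/drbd.d/vm-100-disk-1.res -- illustrative: one DRBD resource per vDisk
    resource vm-100-disk-1 {
        protocol C;                    # synchronous replication between the two nodes
        on pve1 {
            device    /dev/drbd10;
            disk      /dev/vg0/vm-100-disk-1;
            address   10.0.0.1:7789;
            meta-disk internal;
        }
        on pve2 {
            device    /dev/drbd10;
            disk      /dev/vg0/vm-100-disk-1;
            address   10.0.0.2:7789;
            meta-disk internal;
        }
    }

With one resource per vDisk, each volume can be resolved independently after a split brain (discard the changes on one side for that resource only).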

Regards
Stephan
 
If you are running VMs from shared storage you could end up in a situation where the same VM is running on both servers, effectively corrupting the disk.
Actually, no, this could not happen, because this isn't an HA scenario. Well, I should say it can't happen unless you manually decide to start the VM on the other node, and you can't even do that accidentally, because Proxmox will refuse to start the machine until you manually override the expected quorum. If you do that without truly confirming a dead node, then the results are on you. The process that I (and others in this thread) want is this: when a node fails, whatever VMs are running on that node go down, but because of DRBD the data for those machines is also on the other node. So when a node goes down, those VMs go down and I can intervene remotely to confirm what's going on (log in via IPMI, etc.). If we are unable to fix the node remotely and can confirm that it's down, we can then launch the VMs on the node that survived and deal with the down node when we are able to get to the data center to be physically in front of the machine.
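For reference, the manual quorum override mentioned above is done on the surviving node with pvecm; a minimal sketch, where VMID 100 is a placeholder for the machine you want to bring back:

    # on the surviving node, only after confirming the other node is really dead
    pvecm status        # quorum state; shows "Quorate: No" while the peer is missing
    pvecm expected 1    # lower the expected votes so this single node becomes quorate
    qm start 100        # the VM can now be started here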
 
Unfortunately you're wrong concerning the licensing topic. Microsoft and Oracle see three nodes and say: "Cash, please!" They don't care about HA groups, because it's possible to reconfigure it after the Microsoft/Oracle guys leave the house...
This is why a third node isn't an option for me.

To be honest, I have a feeling this is Proxmox saying "Cash please" as well, but one that will likely backfire. In order to run a two-node cluster you'd need to license a third Proxmox server to install on the "crap PC" so that all versions are in sync with each other. The alternative, where it backfires, is that you just don't license any of them and only run OS security updates. Neither scenario is ideal, so I keep leaning towards the other option of just forgetting about Proxmox, since they forgot about people like me.

Aside from the licensing aspect of things, there is also the fact that many pay per U for their colo. Not all SMBs can afford a three-node cluster; I barely managed to work a two-node cluster in.
 
To be honest, I have a feeling this is Proxmox saying "Cash please" as well, but one that will likely backfire. In order to run a two-node cluster you'd need to license a third Proxmox server to install on the "crap PC" so that all versions are in sync with each other. The alternative, where it backfires, is that you just don't license any of them and only run OS security updates. Neither scenario is ideal, so I keep leaning towards the other option of just forgetting about Proxmox, since they forgot about people like me.

Aside from the licensing aspect of things, there is also the fact that many pay per U for their colo. Not all SMBs can afford a three-node cluster; I barely managed to work a two-node cluster in.
If you have the patience and can wait a few days I will have a solution based on a cheap Raspberry Pi acting as a third "node". This one will not require a license since it will not have the Proxmox management interface installed.
 
If you have two_node: 1 in corosync.conf and a VM has the option 'start on boot' set, then the VM will be started on the evicted node on reboot. This means the VM is running on both Proxmox nodes.
If the evicted node is booting then it would be rejoining the quorum, in which case it would see the VM running on the other node. If it's not joining the quorum then it would not be starting the VMs, because of what I stated before: that node would have to be manually overridden for the expected votes on the quorum before it would start VMs without talking to the other node.
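As an aside, the 'start on boot' behaviour being discussed is a per-VM flag; a quick sketch of checking and disabling it from the CLI (VMID 100 is a placeholder):

    qm config 100 | grep onboot   # show whether automatic start at node boot is enabled
    qm set 100 --onboot 0         # disable automatic start for VM 100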
 
If you have the patience and can wait a few days I will have a solution based on a cheap Raspberry Pi acting as a third "node". This one will not require a license since it will not have the Proxmox management interface installed.
I am interested in this. I'm not sure if I could make it work or not in my colo as far as getting power, etc. One might say to power the Pi off a USB port, but any time the node it's plugged into reboots, so does the Pi, and if the hardware failure were a power-supply issue then the Pi would also go down.
 
If the evicted node is booting then it would be rejoining the quorum, in which case it would see the VM running on the other node. If it's not joining the quorum then it would not be starting the VMs, because of what I stated before: that node would have to be manually overridden for the expected votes on the quorum before it would start VMs without talking to the other node.
You have misunderstood this. The whole point of two_node is that the expected number of votes to have quorum is one, so a node coming up which cannot see the other node will have quorum and start VMs accordingly.
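For context, here is roughly where two_node lives, using corosync 2.x syntax (only the quorum section is shown):

    # /etc/pve/corosync.conf (respectively /etc/corosync/corosync.conf) - quorum section only
    quorum {
        provider: corosync_votequorum
        two_node: 1
        # two_node implicitly enables wait_for_all: a freshly booted node waits until
        # it has seen its peer at least once before claiming quorum; set wait_for_all: 0
        # if a lone node should become quorate on a cold boot
    }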
 
I am interested in this. I'm not sure if I could make it work or not in my colo as far as getting power, etc. One might say to power the Pi off a USB port, but any time the node it's plugged into reboots, so does the Pi, and if the hardware failure were a power-supply issue then the Pi would also go down.
1) A Pi cannot reliably be powered from a standard USB port: a USB 2.0 port is only specified for 0.5 A at 5 V, while the Pi needs an absolute minimum of about 1.2 A.
2) You should use the standard Pi power supply, which delivers between 2 and 2.5 A from its own plug. A Pi 2 consumes 3.5 - 4 W, a Pi 3 about 4.5 - 5 W.
 
1) A Pi cannot reliably be powered from a standard USB port: a USB 2.0 port is only specified for 0.5 A at 5 V, while the Pi needs an absolute minimum of about 1.2 A.
2) You should use the standard Pi power supply, which delivers between 2 and 2.5 A from its own plug. A Pi 2 consumes 3.5 - 4 W, a Pi 3 about 4.5 - 5 W.
1) Some machines will provide the needed power. I've actually run more than one Pi off another machine's USB port.
2) This requires another power outlet, which is an additional cost in most colo scenarios.

Aside from the power issue, I'd actually also need an additional network port, which complicates this scenario even more. I like the concept but I don't think I can make it work because of the physical requirements.
 
You have misunderstood this. The whole point of two_node is that the expected number of votes to have quorum is one, so a node coming up which cannot see the other node will have quorum and start VMs accordingly.
So I don't think I actually run the two_node option on DRBD8/Proxmox 3.4; my servers are in a cluster with the typical expected votes for a quorum to be complete. If one of the physical nodes is down, the other one will refuse to start, migrate, back up, etc. any virtual machine until the quorum is back or I manually override the expected votes.
 
Hello, I'm very interested in the HA configurations with two servers and one low-power server (like the Raspberry Pi) for quorum purposes. Would this allow full-blown high availability with local mirrored storage (like DRBD9)? Is anyone running this setup right now? I would LOVE to see some documentation about this. It would be awesome!
 
Hey Kei,

Would this allow full-blown high availability with local mirrored storage
Yes, it would, if it were already completely functional (more on that below).
From your other thread I found @wosp's objection that you can't install PVE on an ARM platform. I think he's right, so take at least an old PC or a small Zotac/Intel NUC/...

I would LOVE to see some documentation about this.
There is not much special about it to document - you just have to follow the two wiki articles for "DRBD9" and for "HA". And maybe at some point you can sneak a peek at the official documentation from Linbit (they are the people developing DRBD).
https://pve.proxmox.com/wiki/DRBD
https://pve.proxmox.com/wiki/High_Availability_Cluster_4.x
https://www.drbd.org/en/doc/users-guide-90
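For orientation only, and from memory rather than from the wiki: with the drbdmanage-based DRBD9 plugin, the shared-storage definition ends up as an entry in /etc/pve/storage.cfg along these lines, where the storage ID and the redundancy value are placeholders:

    drbd: drbd1
        content images
        redundancy 2

The redundancy value tells the plugin on how many nodes each volume should be replicated; check the wiki article above for the exact, current syntax.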

But consider yourself warned - DRBD9 is not production ready yet. I wasted some time finding that out the hard way and decided for myself to wait at least until DRBD9 reaches its final release (I read somewhere that this should happen by the end of June).

Cheers, Johannes
 
Thank you for your answer; so I assume DRBD8 is obsolete for VE 4.x? I just said DRBD9 because I thought it was available, but of course the older version would do the job if compatible.
However, I'm not sure what the HW requirements would be for this low-power server. For example, ARM is not supported and the Raspberry Pi runs ARM. This is a problem, I guess, right?
 
This is a problem, I guess, right?
Yes.

However, I'm not sure what the HW requirements would be for this low-power server
As I suggested - take an old used Zotac ZBox or an Intel NUC (looks like a Mac Mini). https://www.zotac.com/us/product/mini_pcs/all or http://www.intel.com/content/www/us/en/nuc/overview.html
There are no real hardware requirements. My quorumgetter runs as a VM on another server, has 20GB of storage, 1 vCPU core and 512MB RAM, and runs a plain Debian with PVE installed on top. You can hardly even buy real hardware that small anymore ;-)
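In case it helps, the "plain Debian plus PVE on top" approach for such a quorum node is covered by the "Install Proxmox VE on Debian Jessie" wiki article; a rough sketch of the core steps for PVE 4.x (repository key import and /etc/hosts setup omitted - follow the wiki for those):

    # add the no-subscription repository (PVE 4.x on Debian Jessie)
    echo "deb http://download.proxmox.com/debian jessie pve-no-subscription" \
        > /etc/apt/sources.list.d/pve-install-repo.list
    apt-get update && apt-get dist-upgrade
    apt-get install proxmox-ve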

so I assume DRBD8 is obsolete for VE 4.x?
Somewhere I found someone who made the effort to implement DRBD8 on PVE4 - don't try it (as you admitted in your other thread, you are more of a beginner than an expert ;-) ). He said explicitly that it was a lot of work, you have to pay attention all the time when updating, and on top of that it is not that simple to do the upgrade to DRBD9 when it is ready. They made a lot of (very positive) fundamental changes in the architecture of DRBD and it will be a very, very nice thing - as soon as it is ready. But they more or less lost backward compatibility in the process.
 
As I suggested - take an old used Zotac ZBox or an Intel NUC
I already have some spare Atoms that I could use, but I strongly believe they're all 32-bit and without virtualization support. Is this a problem even for just the "quorumgetter" purpose?

don't try it (as you admitted in your other thread, you are more of a beginner than an expert ;-) )
I sure am :) Would I be better off waiting for DRBD9 or should I try Ceph?
 
https://forum.proxmox.com/threads/cluster-and-ha-in-4-x-would-love-to-learn-more.27625/#post-139049

Is this a problem even for just the "quorumgetter" purpose?
As @wosp said in your other thread - 64-bit is necessary. The Proxmox packages are only provided for this architecture. But some Atoms have 64-bit support. Just check their model numbers.
EDIT: The only exception is, as @mir said, if you just install, for example, a regular Debian and on it the corosync package (which is available for more platforms); then you can use your Raspberry Pi. But you have to configure it manually.
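Alternatively, on a machine that is already running Linux, you can check directly whether the CPU is 64-bit capable:

    lscpu | grep -i 'op-mode'   # "CPU op-mode(s): 32-bit, 64-bit" means 64-bit capable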
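A rough sketch of that manual setup, assuming Raspbian/Debian with a corosync 2.x package that closely matches the version on the PVE nodes (node name, ID and address below are placeholders):

    # on the Raspberry Pi
    apt-get update && apt-get install corosync
    # copy /etc/corosync/authkey and the cluster's corosync.conf from a PVE node,
    # then add a third entry to the nodelist on all three machines, e.g.:
    #   node {
    #       name: quorum-pi
    #       nodeid: 3
    #       ring0_addr: 192.168.1.3
    #   }
    systemctl restart corosync
    corosync-quorumtool -s    # should now report three expected votes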

Would I be better off waiting for DRBD9 or should I try Ceph?
For Ceph you need at least three nodes - also see your other thread. I can't really tell you more about Ceph; I've never used it.
 
The only exception is, as @mir said, if you just install, for example, a regular Debian and on it the corosync package (which is available for more platforms); then you can use your Raspberry Pi. But you have to configure it manually.

Yes, I believe this would be a great solution for my needs. And no, I did not cross-post on purpose; I only realized this thread existed after I had started a new one. Sorry for that.

About DRBD9, it appears that some people are using it even in production. What is so risky about it that many others are suggesting not to use it yet? And when should it be safe to use? Could it take months?
 
