"Best practice" for a two-node Proxmox/storage cluster

Oct 28, 2013
Hi everybody,

yes, I know - the recommendation is to use at least three nodes to build a safe virtualization/storage cluster.
But please keep in mind that there is software out there that you have to pay for per node. At the moment we are running a two-node cluster (RHEV+DRBD), and switching to a three-node solution would mean paying another 6k bucks to Microsoft and 28k to Oracle. :(
So I would love to build a two-node Proxmox/Ceph or Proxmox/DRBD cluster. Is there a (supported) way to do this?

Thanks a lot and many greets
Stephan
 
Hi Wolfgang,

thanks for your reply!

That's really a pity that a two-node cluster is not possible. Please let me know: There are hundreds of thousands of VMware/Hyper-V/whatever clusters out there running on two nodes. Is every single one of these solutions bad design, or can these solutions do something that Proxmox VE can't?

I'm not an MS/Oracle licensing expert, but AFAIK you have to pay only if the VM runs on the host.
What I mean is that you can restrict your VMs on the 3-node cluster to 2 nodes.

That sounds interesting... MS and Oracle say you have to pay for each node the VM *could* run on.
How can I reliably prevent a VM from ever running on a certain node?

Many greets
Stephan
 
That sounds interesting... MS and Oracle say you have to pay for each node the VM *could* run on.
How can I reliably prevent a VM from ever running on a certain node?

You can create a pool and, in this pool, add the 2 hosts and the VM.


But as for Oracle licensing, I'm not sure that's enough. I'm currently looking into an Oracle license for a customer, and it seems that all host nodes that are (physically) attached to the storage where the VM can run need to be licensed.

I'm not sure that segmenting with a pool would be valid.
 
Hi mir,
If you don't run HA, why is a two-node cluster impossible?
The three-node requirement is only valid if you want to run HA on pve-4?
Thanks a lot for that hint... I never thought about the differences between a cluster and an HA cluster.
For testing purposes I built up a two node cluster with ceph storage - and everything looks so fine... ;-)
Code:
Datacenter -> HA -> quorum: OK
node1 -> Ceph -> Status -> health: HEALTH_OK
node1 -> Ceph -> Status -> quorum: Yes {0 1}

So please tell me: What can go terribly wrong with this kind of setup?

Thanks a lot and greets
Stephan
 
I'm personally really trying my best to figure out what to do, and everywhere I turn I'm getting more and more upset with Proxmox as a company, mainly because I saw Proxmox as a solution for the little guy, and it now seems to have suddenly forgotten about us and wants to deal only with bigger companies. I have been running the same as you, m.ardito, only not as long (only a couple of years) and using DRBD instead of Ceph. I finally decided to upgrade to Proxmox 4 and worked with my host to allow me to temporarily install an additional node so I could perform the upgrade. I installed Proxmox 4 as a single node, migrated my VMs to that node, and have started the process of reinstalling the 2-node cluster, and I can't get any straight answers... well, except for wolfgang saying "no" in nearly any thread that starts to look promising and offering no explanation as to why. It seems like everyone just keeps saying no to HA and forgetting that there are places where non-HA clusters work. From what I'm reading about the technology used in Proxmox 4, there is nothing wrong with using a two-node cluster, other than that Proxmox wants us to license additional nodes where it's not really warranted.

The scenario is this.... I only have 2U of space and I cannot obtain an additional U for a third server. I am frequently hours away from my co-location but generally always have access to the internet. While it's ideal that my servers constantly stay up and running, it's not the end of the world if they drop for a short bit; what would be a problem is them staying down while I travel to the data center and go through the steps of troubleshooting hardware. All I want is to have 2 nodes that are replicated (which DRBD does fine) and let them do their job. Should a hardware failure occur on one node and the VMs on that node go down, I want to be able to check the status and, if I find that one node genuinely is down, start those VMs on the node that is still alive. Then I can deal with the hardware failure when I can get to the data center, without being rushed, because everything is still functioning fine on the other node. This is the scenario that I used in Proxmox 3 and it has worked fine for me, even in a hardware failure I had.

Proxmox staff seems to be blaming this all on Corosync 2.x not supporting it, but you don't have to look far to find that Corosync 2.x supports two_node just fine.... Sure, it doesn't support qdisk, but that's HA.... I'm not looking for my servers to resolve this automatically. Looking into it more seems to indicate that Corosync 2.x is actually even working on implementing solutions, which brings me to my next frustration with Proxmox: Why did they make a major (non-beta) release that is based on such new software that isn't completed yet (DRBD is included in this statement as well)? I'm completely caught with my pants down because I currently have 3 servers where I'm only allowed to have 2, and after 3 weeks I've got nowhere and every turn I make is seemingly (intentionally?) clouded by Proxmox staff.

Bottom line is this: There are a bunch of us that have been Proxmox users that fully understand the implications of 2 node systems.... We don't want/need HA and are fine with manual intervention.... We are tired of being referred to as "idiots" because we want a scenario that Proxmox doesn't think makes sense.

Instead of insulting us, alienating us (many of whom could eventually grow into 4, 5, or even more nodes that would eventually be license revenue) and pushing us away, how about telling us the best ways to do what we need now, rather than pushing us to another platform that isn't afraid to offer what is needed?
 
Hey everybody,

thanks a lot for all the replies - and sorry for my late answer!

Bob's thoughts on the topic are very close to mine - let me try to ask some questions from my point of view:

Why do I like to have some cluster stuff?
I'm responsible for the server infrastructure of a small/midsize company consisting of about 50 virtual servers.
All these VMs on one node? I wouldn't sleep well in this case, because the impact of a hardware failure would be massive.
That's why I'd like to have two nodes which fulfill the following wishes (again very similar to what Bob described):

Two nodes with Proxmox 4 and DRBD9. Very charming that every vDisk is a single DRBD target which is automatically "primary" on the node where the VM runs - so nice! So if one node fails, of course all VMs running on this node go down. Now I'd like to say: "Ok, this node is dead. Please make all DRBD targets primary on the second node and start all the VMs there (manually, not automatically!)." Then business can go on, and I can repair the first node. Is this too complicated to become real?

And of course there is the sword of Damocles hanging over every cluster: split brain.
Ok, let's assume that every connection between the two nodes is interrupted from one moment to the next. What do I wish?
I wish that all VMs keep running. Should be possible, because every VM has its vDisks as "primary" on its node. Of course, syncing is not possible, but who cares? Business can go on (as far as possible without the broken connections), and I can repair the connection between the two nodes.

For this scenario I would love to buy a standard subscription for four CPUs.

I'm looking forward to hearing your opinions on my thoughts.

Greets
Stephan
 
Is this too complicated to become real?
After my own extensive and unsuccessful tests, I would lean towards the answer "Yes!" :rolleyes:

But let's be honest - it should be doable and it's my dream concept as well. Just having a small 2-node setup where I can handle possible hardware failures manually. Personally I see HA as an optional upgrade, and I even set up a small VM to get cluster quorum, so that is not the problem. But I came down to earth very quickly when my DRBD9-based vdisks got destroyed multiple times, sending the VM into the grub rescue shell and showing me that it is not usable at the moment.

I partially agree with the things that Bob said - the feeling that we're left alone a little bit, and the whole problematic DRBD9 tech-preview situation. But on the other hand I totally understand that sometimes you need to make a jump and take in new concepts/versions to make progress overall. And I know that the guys at Linbit are working hard to get their bugs out, as are the guys here - hopefully we just have to wait for some time.

Regarding your sword of Damocles: I think you are right - the whole concept of DRBD9 with the control volumes and no longer having a dual-primary situation is sooooo nice, it should be very safe by design already. My tests, however, show me that there are still some problems that need to be taken care of.

Cheers Johannes
 
Hi All!

You can add a third node for quorum purposes only. For this third node you can use
any PC (even a low-spec one) capable of running a Debian server. It is not necessary to
buy a real server for this purpose. Even a VM in a different cluster, or one running on your work
PC (via VirtualBox, for example), is suitable for this.
But in this case you get a complete HA cluster.
You can disable VM migration to the third node (via HA groups).
And you won't have licensing problems.

P.S.
A two-node cluster in Proxmox VE 3.x is realized through the qdisk. For this you need
a PC that provides it. This PC effectively acts as the third node.

--
Best regards,
Gosha
 
Hi All!

Here's a real example of a cluster in which the third node is a VM in a different cluster:

pic.png

The first and second nodes are real servers. This is my backup cluster. :)

--
Best regards,
Gosha
 
Hi Gosha,

You can make the third node for quorum needs only.
You may disable VMs migration to the third node (via HA-groups).
And you won't have licensing problems.

Unfortunately you're wrong concerning the licensing topic. Microsoft and Oracle see three nodes and say: "Cash, please!" They don't care about HA groups, because it's possible to reconfigure them after the Microsoft/Oracle guys have left the house...
This is why a third node isn't an option for me.

Furthermore I'm a big friend of "keep it simple". So again: What's the problem with my suggestions?

Thanks a lot and greets
Stephan
 
Unfortunately you're wrong concerning the licensing topic. Microsoft and Oracle see three nodes and say: "Cash, please!" They don't care about HA groups, because it's possible to reconfigure them after the Microsoft/Oracle guys have left the house...

Hi Stephan!

But if the third node, placed on your PC as a VM, is not capable of running the VMs that are on the two real nodes, then what claim about reconfiguration could those guys have? :)
 
But if the third node, placed on your PC as a VM, is not capable of running the VMs that are on the two real nodes, then what claim about reconfiguration could those guys have? :)

And what happens when I'm on holiday and my PC is down for three weeks? ;-) Sorry, but such a setup doesn't feel very professional to me... I'd love to have a simple setup on strong server hardware.

Greets
Stephan
 
Hi all,

This specific problem after the upgrade to pve-4.x has struck me too. The way I will handle it is to custom-tailor a Raspberry Pi to act as a quorum provider only, i.e. no Proxmox packages but a plain vanilla Raspbian with corosync2. If anybody is interested I will create a wiki page with detailed instructions and a hardware shopping list.
 
Hi mir,
If anybody is interested I will create a wiki page with detailed instructions and a hardware shopping list.
I'm very interested in detailed instructions for building your solution!
But nevertheless, could someone please explain to me why my suggestion is not doable? Why is it necessary to build these lousy workarounds? What's the problem with having no quorum in the scenarios described above?

Think about it: You go to your boss and say: "Here we go: Let's build a two-node cluster with 20k-buck machines - but we also need a Raspberry Pi to get it to work." Sounds ridiculous, doesn't it? :)

Thanks and greets
Stephan
 
Could you please explain to me why exactly my wishes from post #10 are not doable?

If you don't run HA, why is a two-node cluster impossible?
The three-node requirement is only valid if you want to run HA on pve-4?
Do my wishes from post #10 need HA? When we talk about Proxmox and DRBD: What are the HA features, and what is "only" cluster stuff?

Thanks for your support and many greets
Stephan
 
What are the HA features, and what is "only" cluster stuff?
HA in Proxmox means automatic fail-over to another node. For a non-HA cluster you could get by with only two nodes by adding two_node: 1 to corosync.conf, but the problem is that nodes do not get fenced, which means that both nodes can operate in a quorate state independently of each other. Therefore a third node acting as a witness is preferred.
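
To make the quorum argument concrete, here is a minimal sketch in Python. It is a simplified model of majority voting, not corosync code: the function names quorate and split_outcome are made up for illustration, and it assumes one vote per node. It only shows which side of a network split would still consider itself quorate, with and without two_node.

```python
def quorate(votes_visible: int, expected_votes: int, two_node: bool = False) -> bool:
    """Return True if a partition that sees `votes_visible` votes stays quorate.

    Simplified model (assumption): one vote per node, no fencing, no witness device.
    """
    if two_node and expected_votes == 2:
        # two_node mode: a single remaining vote is treated as enough.
        return votes_visible >= 1
    # Default rule: a strict majority of the expected votes is required.
    return votes_visible >= expected_votes // 2 + 1


def split_outcome(partition_sizes, two_node=False):
    """For a network split into partitions, report which partitions stay quorate."""
    expected = sum(partition_sizes)
    return [quorate(size, expected, two_node) for size in partition_sizes]


if __name__ == "__main__":
    # Two nodes, link between them lost, default majority rule.
    print(split_outcome([1, 1]))                 # [False, False]
    # Same split with two_node enabled: both sides think they are in charge.
    print(split_outcome([1, 1], two_node=True))  # [True, True]
    # Two real nodes plus a witness vote; one real node gets cut off.
    print(split_outcome([2, 1]))                 # [True, False]
```

With two_node enabled, both halves of a split two-node cluster report quorum, which is exactly the split-brain risk when nothing fences the other node. With a third vote, only the side that still sees two votes keeps quorum.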
 
