2-Node HA Cluster: DRBD or Ceph?

Dexter23

Active Member
Dec 23, 2021
Hi all,

I want to build a two-node HA cluster plus a third machine as a QDevice. In this case, is it better to use DRBD or Ceph?

The configuration of the servers could be:

Server A:

SSD 120 GB (Proxmox)

2 HDD 1 TB RAID 1 (VM Storage)

Thanks
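For the QDevice part, a minimal sketch of wiring the third machine into the two-node cluster (the IP is a placeholder; assumes Proxmox VE's `pvecm` tooling and Debian-based nodes):

```shell
# On the third (non-cluster) machine: install the QNet daemon
apt install corosync-qnetd

# On both cluster nodes: install the QDevice client
apt install corosync-qdevice

# From one cluster node: register the QDevice (10.0.0.3 is a placeholder IP)
pvecm qdevice setup 10.0.0.3

# Verify that the cluster now reports 3 expected votes
pvecm status
```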
 
Alternatively, since Ceph needs at least 3 nodes (more are better), have a look at ZFS in combination with VM replication.
Setting up a ZFS pool with the same name on both nodes, so that the same storage is available on both, makes it possible to use VM replication.

The only downside is that it is asynchronous. Depending on the replication interval, you might lose some data if a node dies and the VMs need to be recovered on the remaining node.

For example, I personally run such a setup and replicate important VMs, like the mail server, every minute. Other VMs, where a bit of data loss is okay, are replicated at a longer interval.
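Such a setup can be sketched roughly as follows (the pool name `tank`, the VM IDs, the disk devices, and the target node name `nodeB` are all placeholders; assumes the `pvesr` replication tooling):

```shell
# On BOTH nodes: create a ZFS pool with the same name
zpool create tank mirror /dev/sdb /dev/sdc

# Once, cluster-wide: register the pool as a storage
pvesm add zfspool tank --pool tank --content images,rootdir

# Replicate VM 100 to the other node every minute (job id 100-0)
pvesr create-local-job 100-0 nodeB --schedule "*/1"

# A less critical VM: replicate every 15 minutes
pvesr create-local-job 101-0 nodeB --schedule "*/15"
```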
 
Hi @aaron,

Yes, I had considered ZFS with VM replication. My only concern is that I have to run a mail server for my clients, and I had excluded this option because with DRBD I don't lose any mail if a node fails.
 
Hey, I am in the exact (!) same situation

Did you find a reliable solution?

Thanks in advance!
 
Some experiences with the topics in question here:

a) DRBD / LinStor

I personally operated DRBD on two clusters for some time. I found it quite an effort to keep up to date, since it is not tightly integrated with Proxmox VE. My experience was that DRBD lost its sync now and then, and I had to fix that manually. I also wrote some integration scripts myself on one of the clusters, and they did not always work so well (I would assume my scripts and my setup were flawed, which explains this). DRBD is also a bit more complex, especially the new LinStor architecture.

Last but not least, DRBD is slow, since it writes synchronously over the network. This is because DRBD needs to be operated in dual-primary mode, which is a strict requirement here, and in dual-primary mode replication protocol C (fully synchronous writes) is the only option.

But: It is guaranteed to replicate the data in real time.
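In DRBD terms, that combination looks roughly like the following resource configuration (hostnames, disks, and addresses are placeholders; syntax follows DRBD 8-style `drbd.conf`, newer releases phrase some options differently):

```
resource r0 {
    protocol C;                  # fully synchronous replication
    net {
        allow-two-primaries;     # dual-primary mode, required here
    }
    on nodeA {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   10.0.0.1:7788;
        meta-disk internal;
    }
    on nodeB {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   10.0.0.2:7788;
        meta-disk internal;
    }
}
```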

I assume LinStor will work better than the old plain DRBD. But for me, the experience of managing these Proxmox VE clusters had been quite stressful from time to time.

If you really need synchronized real-time data on a two-node cluster and only have 1 GbE networking, LinStor is a viable option.

Here is a short guide:
https://linbit.com/blog/linstor-setup-proxmox-ve-volumes/

You probably need to work through the full documentation.
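Once LinStor itself is running, the linstor-proxmox plugin exposes it as a regular storage; a sketch of the corresponding `/etc/pve/storage.cfg` entry (the storage ID, controller IP, and resource group name are placeholders taken from my assumptions, not from the guide above):

```
drbd: drbdstorage
    content images,rootdir
    controller 10.0.0.1
    resourcegroup pve_rg
```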

b) Ceph

Regarding Ceph, I'm operating two clusters with 3 nodes each: cluster one with 9 SATA SSD OSDs, cluster two with 6 NVMe SSD OSDs (less ideal, but at the time of cluster setup the optimal choice of SSD was not available). They are both working absolutely well. I'm far happier with Ceph than with DRBD. I had some minor issues when upgrading Ceph, but then you have to read the fine Proxmox upgrade guides thoroughly.

If you have / can afford at least three nodes and (at least) 10 GbE networking: go for Ceph!
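A rough sketch of bootstrapping such a Ceph setup with Proxmox's `pveceph` tooling (the network and device names are placeholders; monitor and OSD steps are repeated per node):

```shell
# On every node: install the Ceph packages
pveceph install

# Once, on the first node: initialize with a dedicated cluster network
pveceph init --network 10.10.10.0/24

# On each node: create a monitor and one OSD per data disk
pveceph mon create
pveceph osd create /dev/nvme0n1

# Create a replicated pool for VM disks and register it as a storage
pveceph pool create vmpool --add_storages
```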

c) Storage Replication with ZFS

I just set up a cluster with ZFS storage replication. Compared with the two previously mentioned techniques, it's a piece of cake; I like it very much. Since I'm operating a rather important Postgres database server, I added a real-time Postgres standby server to make sure I always have the latest data. Since high availability is not needed, I'll keep it that way.

If real-time synchronization is not a requirement: storage replication is your friend.

----

GlusterFS is another option. I played around with it 5-10 years ago and never got it into an acceptable working state.
 
Did you find a reliable solution?
The only reliable and low-maintenance solution is NOT to use a "two-node cluster": use at least a three-node (or more, but odd-numbered) cluster with proper shared storage for a real HA cluster. This cannot be stressed enough. If it were that easy, it would be documented in the official documentation.

Sorry for the upcoming rant ... yet it has to be said...
People: don't ask this at least once a month; just believe the people who have been doing this for many years or even decades. It's stated over and over again in the forums, and the manual already states it too. Nice that VMware and others are able to do this; PVE (and Ceph) need at least 3 nodes. This is the lower limit. Just accept it.

I've also used a two-node DRBD cluster in the older days, which worked, yet as others already mentioned, needed some work from time to time. It worked, yet the amount of work I had to put in was much, much more than with our proper PVE clusters nowadays. It's Apple-like nowadays ... it just works, if you do it properly as documented.
 
Sorry for the upcoming rant ... yet it has to be said...
People: don't ask this at least once a month

I think there's nothing wrong with adding to an existing thread, especially after a delay. There are many weirder questions on the forum (which might as well go to, e.g., a hardware reddit) and nobody minds.

I've also used a two-node DRBD cluster in the older days, which worked, yet as others already mentioned, needed some work from time to time. It worked, yet the amount of work

I can't speak for the OP, but usually on a forum like this, when someone tells me "do not do that" I am still after the "but why" part.
 
I have been running DRBD since 2017, starting with two data nodes and a third dummy node for quorum. DRBD 8 was not the easiest to learn, and I was still mostly new to Linux/Debian itself at the time. It has performed well enough for the couple of RDP servers and the NAS I was running on it, but I haven't tried anything like a high-load/high-performance database. Being able to live-migrate a running VM in only the time it takes to sync the RAM between nodes is pretty amazing.

I've always run DRBD at 10 Gbps through an isolated/offline switch, and these days DRBD is on 5 data nodes managed with LINSTOR. LINSTOR is the "newer" management tool and is so much better and more reliable than the old days of directly editing the configuration files on each of the nodes. There is still a learning curve in becoming acquainted with Resource Definitions, Storage Pools, Volume Definitions, Volume Groups, etc., plus a whole bunch of other stuff if you want to get into tuning it for performance. I never lost data due to any DRBD problem, though I have made some mistakes that cost me many stressful late-night hours of troubleshooting a few times.
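For reference, the LINSTOR objects mentioned above map onto CLI commands roughly like this (node names, IPs, the LVM volume group, and the resource name are placeholders; a sketch, not a complete setup):

```shell
# Register the data nodes with the LINSTOR controller
linstor node create nodeA 10.0.0.1
linstor node create nodeB 10.0.0.2

# Create a storage pool backed by an LVM volume group on each node
linstor storage-pool create lvm nodeA pool_ssd vg_ssd
linstor storage-pool create lvm nodeB pool_ssd vg_ssd

# Define a resource and its volume, then auto-place it on two nodes
linstor resource-definition create vm-100-disk-1
linstor volume-definition create vm-100-disk-1 10G
linstor resource create vm-100-disk-1 --auto-place 2
```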

The data is replicated in real time, but I don't run HA/automatic failover in my setup, because I prefer to intervene if there is a problem, and 100% 24/7 operation isn't that critical for my situation. That said, I wouldn't want to do it without having at least 3 data nodes, but it is possible with just 2.
 