Bring back DRBD8 kernel module

e100

I have 38 DRBD volumes totaling around 50TB of usable storage across sixteen production nodes.
We have held off upgrading to Proxmox 4.x in the hope that DRBD9 and its storage plugin would become stable, but after a year I'm still waiting.

I need to upgrade to 4.x, but DRBD9 is not production-ready, which makes that upgrade more difficult for me.

Problems with DRBD9:
  1. The storage plugin still lacks disk resize.
  2. If I used LVM on top of DRBD9 I could still resize disks, but that's not an option either because primary-primary is not stable; the Linbit docs say to wait for 9.1.
  3. The code is too young; it needs more time to mature and prove its reliability before it is used in production.
  4. The drbdmanage license change, the fallout and the change back. What can we expect to happen next month?

Can Proxmox please do one of these?:
A) Compile the kernel with whatever DRBD8 version is in the upstream kernel, but also provide a binary package that installs the DRBD9 kernel module for people who are already using it or might want to try it.
Or
B) Provide a DRBD8 binary kernel module package so I can easily choose to downgrade.
 
I would consider Ceph as a (more feature-complete and stable) alternative (or Gluster, or Sheepdog), or maybe ZFS with zfs-sync for really small setups.
 
I would consider Ceph as a (more feature-complete and stable) alternative (or Gluster, or Sheepdog), or maybe ZFS with zfs-sync for really small setups.
Hi Dietmar,
but in small setups Ceph often isn't an alternative (and neither is ZFS), for two main reasons:

1. Performance
If I have some X-TB RAID-backed DRBD volumes (between two nodes), I get good IOPS and single-thread performance. With Ceph it's much, much slower (or you need a full SSD setup, which isn't comparable to a multi-TB setup on a normal budget).
With ZFS you get lower performance (on some/many? systems) and no real-time sync between the nodes.

2. Licensing
To virtualise Windows Server you need a Windows license for every node on which the server can run. With Ceph the Windows VM can run on all nodes, so you need 6 licenses on a six-node cluster.
With DRBD between two nodes, the server can run on two nodes only, even on a six-node cluster.

I can't say anything about Gluster or Sheepdog, but from all I have read the performance does not look comparable; perhaps I'm wrong.

Udo
 
@dietmar

My only problem with DRBD in Proxmox 4.x is that it includes the non-production-ready DRBD9.
I should not have to compile code myself to get enterprise stability and features out of the Proxmox enterprise repo.

@udo is spot on

Ceph is too slow because of the single-threaded IO in KVM; it does not even come close to the performance I can get with DRBD.
Also, vzdump backup performance with Ceph is HORRIBLE. It's like backing up to a 'fast' tape drive in 1999. Some of my VMs in Ceph take over 48 hours to back up with vzdump; sorry, but that is not production-ready.

ZFS is not real-time sync. If I have a node failure I want to start the VMs on the other node without any data loss.
Gluster: it's been a long time since I used it, but when I did it performed much worse than DRBD, and I've not seen anything to indicate that it has improved by the orders of magnitude needed for me to use it.
Sheepdog: while it has always looked promising, it still does not seem ready for production.

There simply is nothing comparable to DRBD that meets my safety and performance needs.
 
@dietmar

My only problem with DRBD in Proxmox 4.x is that it includes the non-production-ready DRBD9.
I should not have to compile code myself to get enterprise stability and features out of the Proxmox enterprise repo.

Please complain to Linbit. According to them it is stable and production ready.
 
If I have some X-TB RAID-backed DRBD volumes (between two nodes), I get good IOPS and single-thread performance. With Ceph it's much, much slower (or you need a full SSD setup, which isn't comparable to a multi-TB setup on a normal budget).

We switched to SSD-only setups several years ago, and you can build a Ceph cluster with incredible performance that way...
But yes, performance has its price. IMHO Ceph is the way to go if you want future-proof, full-featured and expandable storage. Ceph even provides a distributed file system now (CephFS) ...
 
My only problem with DRBD in Proxmox 4.x is that it includes the non-production-ready DRBD9.

DRBD9 is provided as a plugin from Linbit; it's not part of a default Proxmox VE installation. We only include the DRBD9 kernel module, which has been marked as stable since June 2015 (AFAIR). The DRBD9 plugin/drbdmanage was always marked as "experimental" in 4.x, for obvious reasons.

Almost all our users failed to run a stable DRBD9 setup. In addition, the features that are still missing and the recent license changes do not help to grow the user base. Also, the promised speed was not always seen.

Therefore we do not force new users to go down this road. On the other hand, the maintainers (Linbit) said recently that it is now working better. So it looks like you need to get in touch with Linbit. I personally have never tested their latest plugin in a real production setup; all our production systems are based on ZFS or Ceph (all SSD) and have been running well for quite some time now.

Backup performance on Ceph is not great, but I can achieve acceptable values. All our big fileservers are on NFS, so we do not back up that data using vzdump. Our VMs are small.

A VM with a 256 GB hard disk and an LZO-compressed 50 GB backup file takes about an hour to get written (vzdump to NFS).
 
We only include the DRBD9 kernel module, which has been marked as stable since June 2015 (AFAIR).

The DRBD9 kernel module is precisely what I have an issue with; Linbit's documentation says "running in dual-primary is not recommended":
http://www.drbd.org/en/doc/users-guide-90/s-dual-primary-mode

I don't want to use drbdmanage and the feature-incomplete plugin.

I want to run a manually configured dual-primary setup using a DRBD8 kernel module, because this configuration is "not recommended" with the DRBD9 kernel module.

A VM with a 256 GB hard disk and an LZO-compressed 50 GB backup file takes about an hour

Exactly my point, much slower than DRBD.
In my environment the backup speed is limited by LZO compression and the write speed of my backup media.
Code:
149: Jan 28 03:55:00 INFO: transferred 569083 MB in 3059 seconds (186 MB/s)
149: Jan 28 03:55:00 INFO: archive file size: 233.52GB

Code:
299: Jan 28 04:03:44 INFO: transferred 96636 MB in 327 seconds (295 MB/s)
299: Jan 28 04:03:44 INFO: archive file size: 22.53GB
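
As a rough cross-check of the figures quoted in this thread, the rates in the vzdump log lines can be recomputed from the transferred MB and the elapsed seconds. This is a minimal sketch and not part of vzdump: the function name is made up for illustration, and the Ceph row assumes the full 256 GB disk was read in exactly 3600 seconds ("about an hour" in the post quoted above), which is only an approximation.

```python
def throughput_mb_s(transferred_mb: float, seconds: float) -> float:
    """Average throughput in MB/s from a 'transferred X MB in Y seconds' log line."""
    return transferred_mb / seconds


# (transferred MB, elapsed seconds)
runs = {
    # Values copied from the two log excerpts above.
    "149 (DRBD)": (569083, 3059),
    "299 (DRBD)": (96636, 327),
    # ASSUMPTION: whole 256 GB disk read, "about an hour" taken as 3600 s.
    "256 GB VM on Ceph (approx.)": (256 * 1024, 3600),
}

for name, (mb, secs) in runs.items():
    # Truncate to whole MB/s so the DRBD rows can be compared with the logs.
    print(f"{name}: {int(throughput_mb_s(mb, secs))} MB/s")
```

Truncated to whole numbers, the two DRBD runs come out at 186 MB/s and 295 MB/s, matching the log lines, while the assumed Ceph run works out to roughly 72 MB/s.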
 
The DRBD9 kernel module is precisely what I have an issue with; Linbit's documentation says "running in dual-primary is not recommended"

So you suggest removing the DRBD9 module and shipping the old DRBD8 one included with the Ubuntu kernel?
 
I have the same problem. Tomorrow I have to tell my customer to migrate away from DRBD to SCSI storage because 3.4 is only supported until 28 Feb.
 
I'm quite new to Proxmox, but for small setups (like the one I plan) DRBD seems to be a good solution. I would also prefer stability (= DRBD8). ;)

Regards

Macavity
 
2. Licensing
To virtualise Windows Server you need a Windows license for every node on which the server can run. With Ceph the Windows VM can run on all nodes, so you need 6 licenses on a six-node cluster.
With DRBD between two nodes, the server can run on two nodes only, even on a six-node cluster.


Udo

Hi Udo, for licensing, you can create a pool in the Ceph cluster for your Windows VMs and define that storage in Proxmox on only 2 nodes.

We had a license audit some months ago, and it was OK.
 
@dietmar

Will Proxmox provide a DRBD8 kernel module or will we be forced to compile our own on every kernel upgrade?

+1 vote for DRBD8
(I have been using DRBD for many years)

LINBIT sells you the illusion that DRBD9 is stable, but if you look at the number of complaints and inquiries about this version on its mailing list and compare that with DRBD8 (latest version), you can very quickly draw your own conclusions about the gigantic difference.

Moreover, if PVE also sells you the illusion that its product is stable and ready for use in production environments, I guess PVE should have DRBD8 in its repository, so that the end user is the one who chooses which version to use.
(and then you will be able to see how many downloads DRBD8 gets vs DRBD9)

BR
Cesar
 
I'm in a similar spot here: a 5-node cluster with two pairs of DRBD8 nodes and no viable path to upgrade to Proxmox 4.x with comparable performance and redundancy. I second offering a DRBD8 kernel module, either in the kernel or as a package.

EDIT: It seems DRBD 8 can be upgraded to DRBD 9 smoothly: https://www.drbd.org/en/doc/users-guide-90/s-upgrading-drbd
Just make sure you've set up Linbit's Proxmox repo: https://www.drbd.org/en/doc/users-guide-90/s-proxmox-install

But the lack of dual-primary does defeat the purpose, so we still need a DRBD 8.4.x kernel module for the next few months at least.
 
Because some people are using the DRBD9 kernel module, I'm asking that you make both available by having one in the kernel and the other in a package.
I would say the latest DRBD8 version in the kernel and DRBD9 as a DKMS module. The other way around is more difficult to maintain (DKMS will not automatically downgrade from version 9 to 8).
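
If both modules end up being available, it helps to check which one is actually loaded after a kernel upgrade. The following is a minimal sketch, assuming the first line of /proc/drbd has the form "version: X.Y.Z ..."; the helper names and the sample string are illustrative and not taken from any Proxmox or Linbit tool.

```python
import re
from pathlib import Path
from typing import Optional


def drbd_major_version(proc_text: str) -> Optional[int]:
    """Extract the major version from /proc/drbd-style text, or None if absent."""
    match = re.search(r"^version:\s*(\d+)\.\d+\.\d+", proc_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def loaded_drbd_major(path: str = "/proc/drbd") -> Optional[int]:
    """Return the major version of the loaded DRBD module, or None if not loaded."""
    proc_file = Path(path)
    if not proc_file.exists():
        return None  # no DRBD module loaded (or not a Linux host)
    return drbd_major_version(proc_file.read_text())


# ASSUMPTION: illustrative sample line, not captured from a real system.
sample = "version: 8.4.7-1 (api:1/proto:86-101)\n"
print(drbd_major_version(sample))
```

A check like this could run after each kernel update to warn when the loaded major version is not the one the existing volumes were set up for.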

@e100:
Read this post and follow the links there for DRBD8 Kernel support and DRBD8 storage plugin.

BR,
Jasmin