KVM on top of DRBD and out of sync: long term investigation results

Source: http://www.drbd.org/users-guide/re-drbdconf.html

I've been using DRBD since around 2005, and I remember reading in an example drbd.conf that the auth is performed only once.
The only references I can find are old example drbd.conf files; search Google for:
"Authentication is only done once" DRBD

Maybe that's not true anymore, but it sure would be silly and inefficient to perform the auth on every single request.

e100, many thanks. I believe the same, and if we have doubts, we only need to run the DRBD verification twice: the first time with the secret phrase and encryption enabled, noting how long DRBD needs to complete the task, and the second time with...?

Best regards
Cesar
 
Many thanks Cesar... are these settings fast enough for my 10GbE DRBD backlink connection? Or could I use higher values (resync-rate, c-max-rate, etc.)?

I don't know; it depends on the maximum write speed of your storage and the maximum speed of your DRBD network link. I think that if you have 10 Gb/s NICs, the storage will be the slower of the two.

To run such a test, I always configure the DRBD sync rate to a very high value for the first sync, "AND ALWAYS IN STATIC MODE" (a little higher than the maximum speed of my DRBD network link). Then, while the first DRBD sync is in progress, I can see the maximum sustained write speed that my setup supports (hardware, software, all my systems in general).

For example, if I have configured DRBD with a static sync rate set to the maximum value, and my maximum sustained write speed was 100 MB/s, then my DRBD setup in dynamic mode will end up with "c-max-rate 80M;" configured (15% to 20% below the maximum sustained write speed from the test) and, as the minimum sync speed, "c-min-rate 33M;" (33% of the maximum sustained write speed from the test).
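The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the rule of thumb described in this post: the function name `drbd_sync_rates` and its percentage parameters are hypothetical, and the 80% / 33% defaults are simply the figures from the example, not values recommended by the DRBD documentation.

```python
def drbd_sync_rates(sustained_mb_s, max_percent=80, min_percent=33):
    """Derive dynamic-mode sync rates from a measured sustained write speed.

    sustained_mb_s: sustained write speed (MB/s) observed during a first
    sync that was run in static mode at a very high rate.
    max_percent / min_percent: rule-of-thumb percentages from the post
    (hypothetical parameter names, not DRBD options).
    """
    if sustained_mb_s <= 0:
        raise ValueError("sustained write speed must be positive")
    # Integer arithmetic avoids float rounding surprises such as 28.999...
    c_max = int(sustained_mb_s * max_percent // 100)
    c_min = int(sustained_mb_s * min_percent // 100)
    return {"c-max-rate": f"{c_max}M", "c-min-rate": f"{c_min}M"}


if __name__ == "__main__":
    # Print the two settings in drbd.conf syntax for a 100 MB/s measurement.
    for option, value in drbd_sync_rates(100).items():
        print(f"{option} {value};")
```

For a measured 100 MB/s this yields the same two values quoted in the example. Any other measurement should be checked against what the storage and network can really sustain before it goes into drbd.conf.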

I hope that this information is useful to you.

Best regards
Cesar
 
directsync does NOT prevent this problem.

I've recently noticed a larger number of inconsistencies on upgraded, faster hardware, so I enabled data-integrity-alg on a couple of nodes.
A few hours later DRBD broke and split-brained due to the error:
buffer modified by upper layers during write

This is the precise problem that giner described in the first post.
All of my VMs have directsync set for all of their VM disks, so it is obvious that directsync does NOT prevent the buffer from being modified while DRBD writes are in flight.

writethrough, in my experience, does not perform well so I am not sure what I will do to prevent this problem.

I wonder if DRBD 8.4 in the 3.10 kernel is less susceptible to this problem.

For reference the cause of this problem is explained by Lars, one of the main DRBD developers, here:
http://lists.linbit.com/pipermail/drbd-user/2014-February/020607.html

Hi There,

I have some updates regarding the issue.
1) Upgrading to the latest DRBD 8.4 does not help, and the issue is easily reproducible.
2) In most cases directsync prevented out-of-sync blocks for me; however, I have a single VM where directsync did not help. However (!), switching to writethrough helped. This means that 'writethrough' is the only mode that has not produced out-of-sync blocks for me so far. Here is a VM config that produces out-of-sync blocks with directsync:
# Installed OS Windows 7 Enterprise SP1 with all updates
boot: c
bootdisk: ide0
cores: 1
ide0: drbd-lvm-0:vm-123-disk-1,cache=directsync,backup=no,size=40G
ide2: none,media=cdrom
memory: 512
name: zelpc00000001
net0: rtl8139=AE:8E:CA:56:D8:BD,bridge=vmbr2
onboot: 1
ostype: wxp
sockets: 1

Best regards,
Stanislav
 
Hi all,

I also still have some OOS problems after booting one of the Proxmox hosts... I also switched back to the default (no cache)... but with writethrough these problems were there. Now I have the latest updates with kernel 3.10-8 and pve-qemu-2.2... perhaps this is the cure :)

mac
 
Hello macday. How long have you been running that setup?
 
Hi Rob,

Do you mean how long until the OOS occurs? About 50 days. For now I have to reboot my cluster every month. After a reboot the OOS blocks are resynced (about 1.5 GB per DRBD resource), and the state changes from Inconsistent to UpToDate. But I need a solution without rebooting and bringing DRBD down. I have a primary/primary setup with two different RAIDs: one DRBD is on a SAS RAID and one DRBD is on nearline SAS. I have already tuned some settings to sync to disk more often:
max-buffers 8000;
max-epoch-size 8000;
unplug-watermark 8000;
 
Hi There,

I have some updates regarding the issue.

Recent tests show that all cache modes with O_DIRECT produce out-of-sync blocks, and modes without O_DIRECT do not. So both writethrough and writeback can be considered safe in relation to this issue.
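As a quick reference for which cache modes this covers, here is a minimal Python sketch. The table follows the cache mode properties listed in the QEMU documentation (cache.direct corresponds to opening the device with O_DIRECT); it is not taken from the tests above. The helper names are hypothetical, and "unsafe" is listed only for completeness since it was not mentioned in the tests.

```python
# QEMU cache modes and the properties they set, per the QEMU documentation.
# "direct" means the host device is opened with O_DIRECT (page cache bypassed).
CACHE_MODES = {
    "writeback":    {"writeback": True,  "direct": False, "no_flush": False},
    "none":         {"writeback": True,  "direct": True,  "no_flush": False},
    "writethrough": {"writeback": False, "direct": False, "no_flush": False},
    "directsync":   {"writeback": False, "direct": True,  "no_flush": False},
    "unsafe":       {"writeback": True,  "direct": False, "no_flush": True},
}


def uses_o_direct(mode):
    """Return True if the given cache mode opens the device with O_DIRECT."""
    return CACHE_MODES[mode]["direct"]


def o_direct_modes():
    """List the cache modes that, per the report above, produced OOS blocks."""
    return sorted(mode for mode in CACHE_MODES if uses_o_direct(mode))


if __name__ == "__main__":
    for mode in CACHE_MODES:
        status = "O_DIRECT" if uses_o_direct(mode) else "page cache"
        print(f"{mode:13s} {status}")
```

By this table the O_DIRECT modes are none (the Proxmox default, "no cache") and directsync, which matches the modes reported as problematic earlier in the thread.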

Best regards,
Stanislav
 
Thanks Stanislav,

So should I try writeback mode with a fast RAID controller which also has writeback with a BBU?

Best regards,
Mac

Generally, writeback cache mode for QEMU is not safe and can cause data loss on power failure. My previous message was only related to DRBD and out-of-sync blocks.
 
hi @ll
Just a quick update... no more OOS after 30 days of uptime with heavy writes and cache mode none on all VMs... I think I found a good trick for my setup: do a sync and flush the page cache once a day (in the morning).
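The post does not say which commands are used, so the following Python sketch is only a guess at what "sync and flush the page cache" could look like on a Linux host. It assumes the standard /proc/sys/vm/drop_caches interface with the value 1 (page cache only); the function name and the dry_run parameter are hypothetical. A real run needs root, and it would be scheduled once a day from cron or a systemd timer.

```python
import os

DROP_CACHES = "/proc/sys/vm/drop_caches"  # standard Linux sysctl file


def sync_and_drop_page_cache(dry_run=True):
    """Flush dirty data to disk, then ask the kernel to drop the page cache.

    With dry_run=True nothing is touched and only the planned steps are
    returned, which makes the sketch safe to try as an unprivileged user.
    """
    steps = ["sync", f"write 1 to {DROP_CACHES}"]
    if not dry_run:
        os.sync()  # write all dirty pages to their backing devices first
        with open(DROP_CACHES, "w") as handle:
            handle.write("1\n")  # 1 = free page cache only (assumption)
    return steps


if __name__ == "__main__":
    for step in sync_and_drop_page_cache(dry_run=True):
        print("would run:", step)
```

Dropping the page cache only discards clean pages, which is why the sync comes first. Whether this actually avoids the out-of-sync blocks is only what is reported in this post and has not been verified here.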

let me hear what you think about it.

cheers mac
 
Stanislav: On our production systems we are using DRBD on ZFS, with a zvol for each KVM guest. Cache is set to writeback. In your post to pve-dev you mention " - if block device is open without O_DIRECT - the issue never happen ". Is there a cache setting in PVE to prevent the issue?

PS: We've had VM stability issues using DRBD, Ceph and non-shared storage. System time always goes out of sync when VMs are unstable, so I use a script to check the time on data entry systems. If the time is off, key users get a red alert email. The stability issues here occur when there is heavy network traffic, like backups of multiple systems to NFS. So I wonder if this issue occurs with any storage, not just DRBD? Sometime soon I'll test the script you posted to see if the issue will occur with ZFS zvols.
 
