DRBD Performance Problem

TheReelaatiiv

Member
Mar 29, 2012
Hello,

I have the following setup:

Two Proxmox 2.2 nodes, each with exactly the same HDDs in a software RAID 10. On top of the software RAID I configured DRBD, and on top of that LVM2.
The network configuration looks like this:
Server 1 and Server 2 are configured identically except for the IPs:
eth1 <--> Switch (Internet)
eth0 and eth2 are bonded to bond0 and connected directly to the other node with CAT 5e cables.
eth0 and eth1 are Realtek NICs, eth2 is an Intel NIC.

I created two LVM LVs named "ovz-glowstone" and "ovz-bedrock", which I mount on the nodes "glowstone" and "bedrock":
For example, on node "bedrock": on boot, activate the LV ovz-bedrock and mount it at /var/container/.
The same happens on node "glowstone" with "ovz-glowstone".

dd with oflag=direct shows me about 30 MB/s, which is far below what I expected.

I already tried the following to fix the problem (but replication is still very slow):

echo 127 > /proc/sys/net/ipv4/tcp_reordering
ifconfig bond0 mtu 2000 (an MTU of 4000 does not work because the Realtek NIC does not accept such a high value)
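
For reference, this is roughly how I would make those two tweaks persistent across reboots (just a sketch; the file locations are the standard Debian ones):

Code:
# /etc/sysctl.conf (or a file under /etc/sysctl.d/) -- applied at boot
net.ipv4.tcp_reordering = 127

# /etc/network/interfaces -- add this single line to the existing bond0 stanza
mtu 2000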

My DRBD configuration:

global_common.conf:
Code:
global {
    usage-count no;
    # minor-count dialog-refresh disable-ip-verification
}

common {
    protocol C;

    handlers {
        #pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        #pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        #local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
        # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        split-brain "/usr/lib/drbd/notify-split-brain.sh root@dedilink.eu";
        out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root@dedilink.eu";
        # before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
        # after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
    }

    startup {
        # wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb
    }

    disk {
        # on-io-error fencing use-bmbv no-disk-barrier no-disk-flushes
        # no-disk-drain no-md-flushes max-bio-bvecs
    }

    net {
        # sndbuf-size rcvbuf-size timeout connect-int ping-int ping-timeout max-buffers
        # max-epoch-size ko-count allow-two-primaries cram-hmac-alg shared-secret
        # after-sb-0pri after-sb-1pri after-sb-2pri data-integrity-alg no-tcp-cork
    }

    syncer {
        # rate after al-extents use-rle cpu-mask verify-alg csums-alg
    }
}
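
I have not touched any of the commented tuning options yet. From what I have read in tuning guides, a tuned version would look roughly like the sketch below. The values are examples from those guides, not something I have verified on this hardware, and no-disk-barrier/no-disk-flushes are only safe with a battery-backed write cache, which plain software RAID does not have:

Code:
    net {
        sndbuf-size 512k;
        max-buffers 8000;
        max-epoch-size 8000;
    }

    disk {
        # only safe with a battery-backed / non-volatile write cache!
        no-disk-barrier;
        no-disk-flushes;
    }

    syncer {
        al-extents 3389;
    }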

r0.res:
Code:
resource r0 {
        protocol C;
        syncer {
                rate 2G;
        }
        startup {
                wfc-timeout 60;
                degr-wfc-timeout 60;
                become-primary-on both;
        }
        net {
                cram-hmac-alg sha1;
                shared-secret "*****";
                allow-two-primaries;
                after-sb-0pri discard-zero-changes;
                after-sb-1pri discard-secondary;
                after-sb-2pri disconnect;
        }
        on bedrock {
                device /dev/drbd0;
                disk /dev/md0p1;
                address 10.0.0.2:7788;
                meta-disk internal;
        }
        on glowstone {
                device /dev/drbd0;
                disk /dev/md0p1;
                address 10.0.0.3:7788;
                meta-disk internal;
        }
}

To explain the storage stack again:

/dev/sd[abcd] -MDRAID-> /dev/md0
/dev/md0p1 (partition type "Linux LVM") -DRBD-> /dev/drbd0
LVM uses /dev/drbd0 as its physical volume.
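
For completeness: so that LVM does not see the same physical volume twice (once through /dev/md0p1 and once through /dev/drbd0), my understanding is that a filter like this belongs in /etc/lvm/lvm.conf (a sketch, not copied from my config; adjust the device names):

Code:
# /etc/lvm/lvm.conf -- reject the DRBD backing device so LVM only scans the PV via /dev/drbd0
filter = [ "r|/dev/md0p1|", "a|.*|" ]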


And sorry for my bad English.
 
The network link between the nodes seems to work:

Code:
root@glowstone ~ # iperf -c 10.0.0.2 -d
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 10.0.0.2, TCP port 5001
TCP window size:   734 KByte (default)
------------------------------------------------------------
[  4] local 10.0.0.3 port 38305 connected with 10.0.0.2 port 5001
[  5] local 10.0.0.3 port 5001 connected with 10.0.0.2 port 35842
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  1.37 GBytes  1.18 Gbits/sec
[  5]  0.0-10.0 sec  1.71 GBytes  1.47 Gbits/sec
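
(Side note on the bonding: as far as I know DRBD replicates over a single TCP connection, so only balance-rr can spread that one stream over both links. The active mode can be checked like this:)

Code:
grep "Bonding Mode" /proc/net/bonding/bond0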

The disks themselves are also fast:

Code:
root@glowstone ~ # hdparm -Tt /dev/sd[abcd] /dev/md0

/dev/sda:
 Timing cached reads:   24662 MB in  2.00 seconds = 12345.51 MB/sec
 Timing buffered disk reads: 524 MB in  3.00 seconds = 174.49 MB/sec

/dev/sdb:
 Timing cached reads:   24876 MB in  2.00 seconds = 12451.90 MB/sec
 Timing buffered disk reads: 528 MB in  3.01 seconds = 175.56 MB/sec

/dev/sdc:
 Timing cached reads:   24662 MB in  2.00 seconds = 12345.36 MB/sec
 Timing buffered disk reads: 526 MB in  3.01 seconds = 174.82 MB/sec

/dev/sdd:
 Timing cached reads:   24930 MB in  2.00 seconds = 12479.81 MB/sec
 Timing buffered disk reads: 518 MB in  3.03 seconds = 170.98 MB/sec

/dev/md0:
 Timing cached reads:   24966 MB in  2.00 seconds = 12497.47 MB/sec
 Timing buffered disk reads: 888 MB in  3.00 seconds = 295.65 MB/sec

What is going wrong here?
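
For anyone wanting to double-check the DRBD side: the connection and disk state can be verified like this (what I would expect to see is cs:Connected, ro:Primary/Primary and ds:UpToDate/UpToDate):

Code:
cat /proc/drbd
drbdadm cstate r0    # connection state only
drbdadm dstate r0    # disk state only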

/Edit:

Tested on /dev/drbd0 and /dev/storage/ovz-glowstone, both also seem to be VERY fast:

Code:
root@glowstone ~ # hdparm -Tt /dev/drbd0

/dev/drbd0:
 Timing cached reads:   25350 MB in  2.00 seconds = 12689.29 MB/sec
 Timing buffered disk reads: 946 MB in  3.00 seconds = 314.93 MB/sec
root@glowstone ~ # hdparm -Tt /dev/storage/ovz-glowstone 

/dev/storage/ovz-glowstone:
 Timing cached reads:   24186 MB in  2.00 seconds = 12106.79 MB/sec
 Timing buffered disk reads: 748 MB in  3.01 seconds = 248.33 MB/sec

So why is dd as slow as the ~30 MB/s I mentioned above?

/Edit:

Sorry, my mistake: hdparm only shows read performance.
Write performance to the LV /dev/storage/ovz-bedrock:

Code:
root@bedrock ~ # dd if=/dev/zero of=/dev/storage/ovz-bedrock bs=1G count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 25.9118 s, 41.4 MB/s

Writing to /dev/drbd0 would destroy my data and my customers would not be very happy :)
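
What I can do instead, without touching customer data, is carve a small scratch LV out of the same DRBD-backed VG and run the write test against that (a sketch; "storage" is my VG, the LV name "ddtest" is made up):

Code:
lvcreate -L 4G -n ddtest storage
dd if=/dev/zero of=/dev/storage/ddtest bs=1M count=2048 oflag=direct
lvremove -f /dev/storage/ddtest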
 
Your "problem" is that you have a lot of software layers between your applications and the physical disk.

apps->
- OVZ ->
- filesystem ->
- LVM ->
- DRBD ->
- RAID 10 ->
- Disk

Each of these layers adds complexity and costs performance. Remember that all of this is software-based, which means every layer competes for CPU time.
One way to improve your setup would be to replace the software RAID with a hardware RAID controller.
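
A quick way to see where the cost appears is to read the same amount of data through each layer with direct I/O and compare the numbers (a rough sketch using the device names from your post; reads are non-destructive):

Code:
dd if=/dev/md0 of=/dev/null bs=1M count=1024 iflag=direct
dd if=/dev/drbd0 of=/dev/null bs=1M count=1024 iflag=direct
dd if=/dev/storage/ovz-glowstone of=/dev/null bs=1M count=1024 iflag=direct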
 
I don't think it's the software RAID layer that decreases the speed.

I "degraded" the DRBD resource and tested the speed again:

Code:
root@glowstone /dev # dd if=/dev/zero of=/dev/storage/ovz-150 bs=5G count=1 oflag=direct
0+1 records in
0+1 records out
2147479552 bytes (2.1 GB) copied, 10.0859 s, 213 MB/s

/Edit:

The same command (with a different "of" target) on an Adaptec 6405E RAID controller in RAID 10 (the same level as the software RAID):
Code:
root@bedrock ~ # dd if=/dev/zero of=/dev/storage/ovz-bedrock bs=5G count=1 oflag=direct
0+1 records in
0+1 records out
2147479552 bytes (2.1 GB) copied, 12.3438 s, 174 MB/s

Do you see now that hardware RAID isn't faster than software RAID at the moment?

I've set the stripe size to 1 MB because I've read somewhere that a higher value is supposed to be much faster. Is that right?
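
(For a fair comparison, the chunk size of the software RAID can be checked like this, so both arrays are benchmarked with the same stripe/chunk size:)

Code:
cat /proc/mdstat
mdadm --detail /dev/md0 | grep -i chunk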
 
If you see DRBD performance issues, I suggest the DRBD user mailing list as the better place to get help. Properly configured, DRBD is really fast.