OpenVZ over NFS over DRBD is unusably slow

chammers

New Member
Oct 21, 2013
Hello

We are trying to set up Proxmox on a two-node cluster that shares the data partition via master-master DRBD and runs an NFS server on exactly one node at a time. Both nodes (even the one running the NFS server) have mounted the exported directory at /mnt/pve/nfs-store1.
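
For context, the export on the active NFS node looks roughly like this (the client network and the option list are placeholders, not a verbatim copy of our config; "sync" is the default):

Code:
# /etc/exports on whichever node currently runs the NFS server
/srv/nfs-store1  192.168.1.0/24(rw,sync,no_subtree_check)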

While that works fine for KVM machines with one big qcow2 file, the performance was *horrible* (10x slower) for OpenVZ containers, which store their files individually in a directory on the NFS mount.

To narrow down the problem, I ran bonnie++ on the host itself (i.e. not in a VM!), first in the DRBD directory that the NFS server exports (/srv/nfs-store1) and then in the NFS-mounted directory (/mnt/pve/nfs-store1).
On the DRBD directory the run took 8 min; on the NFS-mounted directory on the same host it took 91 min!!!

The NFS directory was mounted using the kernel defaults (v4, TCP, rsize/wsize=1M), but v3, UDP and other block sizes didn't change the speed significantly. bonnie++ was called as "time bonnie++ -r1024 -s2048 -c 1 -x 3 -u chammers -d t -m vm-office03-nfs > /bonnie3-nfs.csv" (I know -s should be double the RAM, but I have 40 GB RAM and not enough disk space on the test server, so I used -r1024).
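
For reference, the mount variants looked roughly like this (the server name is a placeholder; none of them changed the results significantly):

Code:
# kernel defaults: NFSv4, TCP, rsize/wsize=1M
mount -t nfs4 nfs-server:/srv/nfs-store1 /mnt/pve/nfs-store1
# one of the NFSv3/UDP variants with smaller block sizes
mount -t nfs -o vers=3,proto=udp,rsize=32768,wsize=32768 nfs-server:/srv/nfs-store1 /mnt/pve/nfs-store1

Directly on the DRBD directory (/srv/nfs-store1):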

Code:
Version      1.96   ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
vm-office03-drbd 2G   490  91 29294   7 30690   4  3166  99 2371587  99  1216  17
Latency               295ms     286us     114ms    2774us     184us    4177us
vm-office03-drbd 2G   513  95 30049   8 30817   4  3027  99 2355940 100  1183  17
Latency             32431us     193us     123ms    3189us     140us    1813us
vm-office03-drbd 2G   517  97 29572   8 30665   5  3038  99 2256731  99  1258  18
Latency             61063us     140us     118ms    7100us     170us    3506us
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files:max:min        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
vm-office03-drbd 16  3020   5 +++++ +++ 10217  15 13712  23 +++++ +++ 10190  16
Latency                92us     466us     519us     105us      52us     136us
vm-office03-drbd 16  6066  10 +++++ +++ 15073  23 11230  19 +++++ +++ 15355  23
Latency               493us     463us     502us     124us      54us      74us
vm-office03-drbd 16  6988  12 +++++ +++ 12766  19 11513  19 +++++ +++ 15119  22
Latency               114us     457us     504us     105us      37us      80us

Using NFS over DRBD on the same host:

Code:
Version      1.96   ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
vm-office03-nfs  2G   936  93  9257   1 25522   5  3105  99 2233857 100  1195  13
Latency              8293us     113us     201ms    3918us     168us   48877us
vm-office03-nfs  2G   934  94 23826   2 25416   5  2851  99 2345217  99 952.8   8
Latency              8470us     121us     228ms    3098us     169us   11901us
vm-office03-nfs  2G   957  95 10247   1 24874   5  2698 100 2327259  99  1047  12
Latency              8435us      97us     229ms    3331us     102us   58897us
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files:max:min        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
vm-office03-nfs  16    18   0 10816  54   658   9    21   0 18730  57  4372  30
Latency              2349ms   17447us    1215ms     975ms     289us     660us
vm-office03-nfs  16    22   0 14587  52   698  10    22   0 16040  57  4255  32
Latency              1008ms   17714us     951ms     741ms    1745us    1666us
vm-office03-nfs  16    21   0 12179  52   681  11    21   0 17977  57  4394  27
Latency               873ms   19002us    1024ms     864ms    1774us     443us


Does anybody have more ideas on how to tweak NFS, or what else is recommended for OpenVZ on a high-availability cluster of just two nodes? I don't want to buy a NetApp filer for this.

Using qcow2 for OpenVZ is not supported/possible, right?

Best Regards

-christian-
 
After some more tests I found out that using the NFS export option "async" not only reproducibly solves all performance issues with NFS-over-DRBD, it even makes it faster than the same bonnie++ test on the DRBD mount directly (probably due to some caching effects)!
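
For the record, the change was just one word in the export (the network and the other options here stand in for our actual line):

Code:
# /etc/exports -- "sync" (the default) replaced by "async"
/srv/nfs-store1  192.168.1.0/24(rw,async,no_subtree_check)
# re-export without restarting the NFS server:
exportfs -ra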

If somebody can explain the huge difference between sync and async when using DRBD as the underlying block device, I would be interested...
 
We use async too; sync is way too slow, but we really need sync (reliable). Did you also try without DRBD? We had the same issue without DRBD, running on another volume.
 
Sync:
When the server tells the client that the data is stored to persistent storage, the data is stored on persistent storage.

Async:
The server lies to the client about data being stored to persistent storage; it might only be stored in RAM. This WILL result in data loss should the server crash.
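
A quick way to see the difference yourself is a small synchronous-write workload, which is exactly the pattern that hurts in the Create/Delete columns above (a sketch; the mount path is from this thread):

Code:
# each 4k write is forced to stable storage before dd continues,
# so under a "sync" export every single write waits for the disks
time dd if=/dev/zero of=/mnt/pve/nfs-store1/ddtest bs=4k count=1000 oflag=dsync
rm /mnt/pve/nfs-store1/ddtest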

Do you have a RAID card with battery-backed write cache?

This would help both DRBD and NFS in sync mode. The data can be safely stored in the card's cache RAM (way faster than the disk) and be considered persistently stored. The larger the cache, the longer you can sustain high I/O rates; once the cache is full, you drop back to the speed of the disks.

Battery-backed write cache is especially helpful for DRBD; it drastically reduces the seeks caused by DRBD writing its metadata.
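
And if you do have a battery- or flash-backed cache, you can also tell DRBD to skip its explicit flushes, which removes most of that metadata overhead (a sketch in DRBD 8.4 syntax; check the manual for your version, and only ever do this with a non-volatile cache):

Code:
# /etc/drbd.d/r0.res (fragment)
resource r0 {
  disk {
    disk-flushes no;   # data writes rely on the controller's non-volatile cache
    md-flushes no;     # metadata writes too; this is where the extra seeks come from
  }
}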