(poor) NFS performance

neodg

Renowned Member
Mar 11, 2012
Hi there!

It's late and I can't figure out what my problem is.

I have 3 servers: 1 storage server and 2 nodes.

The file server has 8 Gbit NICs that are bonded (802.3ad) to a switch (jumbo frames are enabled),
and the 2 nodes each have 4 Gbit NICs, also bonded (802.3ad). So far, so good.
The file server is the NFS target and exports /storage/ 10.10.10.10(rw,async,no_root_squash,no_subtree_check).
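
For reference, the bonding and the export on the file server look roughly like this (interface names, netmask and the slave list are only example values, not a copy of my real config):

# /etc/network/interfaces on the file server (sketch)
auto bond0
iface bond0 inet static
        address 10.10.10.10
        netmask 255.255.255.0
        bond-slaves eth0 eth1 eth2 eth3 eth4 eth5 eth6 eth7
        bond-mode 802.3ad
        bond-miimon 100
        mtu 9000

# /etc/exports
/storage/ 10.10.10.10(rw,async,no_root_squash,no_subtree_check)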

The nodes are connected to the storage with this storage definition:

nfs: Storage
        path /mnt/pve/Storage
        server 10.10.10.10
        export /storage/
        options noatime,async
        content images,iso,vztmpl,rootdir,backup
        maxfiles 2
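
(The options that actually get negotiated on a node can be checked with something like this; just a sketch, the grep pattern depends on the storage name:)

# on one of the nodes
mount | grep /mnt/pve/Storage
nfsstat -m
# rsize/wsize and the NFS version show up here; an unexpectedly small
# rsize/wsize or a sync mount would already cost a lot of throughput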

Everything works "fine", but the data transfer rate is horrible.

The storage itself does about 400 MB/s (and I plan to expand it), but the VMs on the nodes only get about 120 MB/s.

I have tested several configurations on the storage and the nodes, but I never get past 120 MB/s. Why?!

Where is my error?
 
The file server has 8 Gbit NICs that are bonded (802.3ad) to a switch (jumbo frames are enabled),
and the 2 nodes each have 4 Gbit NICs, also bonded (802.3ad).

All NICs are Gbit. But with the trunk, shouldn't it add up to more?
 
It says what it should!

A test with tiotest is below. But the results tell me that the bonding isn't doing its job: if one NIC can do about 120 MB/s, then 3 or 4 NICs should be in use when I hit the server, and that just doesn't happen :(
 
tiobench --size 40000
Run #1: /usr/bin/tiotest -t 8 -f 5000 -r 500 -b 4096 -d . -T-T

Unit information
================
File size = megabytes
Blk Size = bytes
Rate = megabytes per second
CPU% = percentage of CPU used during the test
Latency = milliseconds
Lat% = percent of requests that took longer than X seconds
CPU Eff = Rate divided by CPU% - throughput per cpu load

Sequential Reads
                               File   Blk Num                      Avg     Maximum     Lat%     Lat%   CPU
Identifier                     Size  Size Thr   Rate (CPU%)    Latency     Latency      >2s     >10s   Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
3.2.0-3-amd64                 40000  4096   1 106.25 51.21%     0.036      336.29  0.00000  0.00000   207
3.2.0-3-amd64                 40000  4096   2  99.13 95.38%     0.078      824.76  0.00000  0.00000   104
3.2.0-3-amd64                 40000  4096   4  88.71 170.6%     0.175     2695.70  0.00003  0.00000    52
3.2.0-3-amd64                 40000  4096   8  87.22 323.7%     0.356     3146.79  0.00007  0.00000    27

Random Reads
                               File   Blk Num                      Avg     Maximum     Lat%     Lat%   CPU
Identifier                     Size  Size Thr   Rate (CPU%)    Latency     Latency      >2s     >10s   Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
3.2.0-3-amd64                 40000  4096   1   6.25 29.13%     0.622        1.54  0.00000  0.00000    21
3.2.0-3-amd64                 40000  4096   2   7.47 47.44%     1.008      100.43  0.00000  0.00000    16
3.2.0-3-amd64                 40000  4096   4   7.73 22.37%     1.834      379.34  0.00000  0.00000    35
3.2.0-3-amd64                 40000  4096   8   7.36 29.56%     3.230      433.16  0.00000  0.00000    25

Sequential Writes
                               File   Blk Num                      Avg     Maximum     Lat%     Lat%   CPU
Identifier                     Size  Size Thr   Rate (CPU%)    Latency     Latency      >2s     >10s   Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
3.2.0-3-amd64                 40000  4096   1 113.25 33.69%     0.034    63596.46  0.00007  0.00005   336
3.2.0-3-amd64                 40000  4096   2 118.60 97.65%     0.064    69192.40  0.00008  0.00004   121
3.2.0-3-amd64                 40000  4096   4 114.63 198.8%     0.134    65512.43  0.00031  0.00008    58
3.2.0-3-amd64                 40000  4096   8 117.76 424.7%     0.262    59627.34  0.00032  0.00031    28

Random Writes
                               File   Blk Num                      Avg     Maximum     Lat%     Lat%   CPU
Identifier                     Size  Size Thr   Rate (CPU%)    Latency     Latency      >2s     >10s   Eff
---------------------------- ------ ----- --- ------ ------ --------- ----------- -------- -------- -----
3.2.0-3-amd64                 40000  4096   1   0.35 0.533%     0.006        0.43  0.00000  0.00000    65
3.2.0-3-amd64                 40000  4096   2   0.36     0%     0.007        0.08  0.00000  0.00000     0
3.2.0-3-amd64                 40000  4096   4   0.45 0.034%     0.008        0.06  0.00000  0.00000  1302
3.2.0-3-amd64                 40000  4096   8   0.42     0%     0.009        0.23  0.00000  0.00000     0


Or as a picture: Benchmark_2.jpg
 
You can use iperf to do network tests.
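
For example, something like this (the IP and options are only placeholders for your setup):

# on the storage server
iperf -s

# on a node: one stream, then 4 parallel streams, 30 seconds each
iperf -c 10.10.10.10 -t 30
iperf -c 10.10.10.10 -t 30 -P 4

# over an LACP bond a single stream should stay near ~1 Gbit/s;
# whether parallel streams go higher depends on the hash policy and the switch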
 
Is the switch configured for 802.3ad, and does it have full 802.3ad support? It looks like an HP switch from the picture you uploaded. My Netgear switch claims to have 802.3ad support, but it isn't the complete 802.3ad protocol :S ALB bonding seemed to cause more problems than it solved, so I'm running my cluster in active/backup at the moment.
 
Hi, this is normal. With 802.3ad (LACP) you can't load-balance a single connection across multiple links; the load balancing is based on a source-IP/destination-IP hash, so each connection is pinned to one physical link.
So you can't get more than 1 Gbit/s for one NFS share (one connection).

pNFS can help you (it can use multiple links), but your storage needs to support pNFS (the latest NetApp storage systems do).
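
To see what the bond is actually doing, something like this helps (assuming the bond is called bond0; layer3+4 hashing only spreads multiple connections, a single NFS connection still stays on one link):

# current mode, hash policy and slave status
cat /proc/net/bonding/bond0

# Debian /etc/network/interfaces fragment (example values)
iface bond0 inet manual
        bond-mode 802.3ad
        bond-xmit-hash-policy layer3+4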
 
