Network performance issues

agh

May 2, 2013
Hello,

Here is the situation:

4 Proxmox nodes in a cluster.
Each node is identical:

Code:
# pveversion -v
proxmox-ve-2.6.32: 3.1-109 (running kernel: 2.6.32-23-pve)
pve-manager: 3.1-3 (running version: 3.1-3/dc0e9b0e)
pve-kernel-2.6.32-20-pve: 2.6.32-100
pve-kernel-2.6.32-23-pve: 2.6.32-109
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.0-2
pve-cluster: 3.0-8
qemu-server: 3.1-8
pve-firmware: 1.0-23
libpve-common-perl: 3.0-8
libpve-access-control: 3.0-7
libpve-storage-perl: 3.0-17
pve-libspice-server1: 0.12.4-2
vncterm: 1.1-4
vzctl: 4.0-1pve4
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.4-17
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.1-1

Code:
# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                32
On-line CPU(s) list:   0-31
Thread(s) per core:    2
Core(s) per socket:    8
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 45
Stepping:              7
CPU MHz:               1995.329
BogoMIPS:              3989.84
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              20480K
NUMA node0 CPU(s):     0-7,16-23
NUMA node1 CPU(s):     8-15,24-31

Code:
# lspci  |grep Ether
03:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
03:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
06:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection (rev 01)
06:00.1 Ethernet controller: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection (rev 01)

and the network configuration:
Code:
# cat /etc/network/interfaces
auto lo
iface lo inet loopback

auto eth2
iface eth2 inet manual

auto eth3
iface eth3 inet manual

auto bond0
iface bond0 inet manual
    slaves eth2 eth3
    bond_miimon 100
    bond_mode 802.3ad
    post-up ifconfig bond0 mtu 9000

auto vmbr0
iface vmbr0 inet static
    address 192.168.0.xxx
    netmask 255.255.255.0
    gateway 192.168.0.254
    bridge_ports bond0
    bridge_stp off
    bridge_fd 0
    mtu 1500



So, I have big network performance issues.
Look at these tests.

  • proxmox-1 and proxmox-2 are two proxmox physical hosts
  • vm-1 and vm-2 are two KVM guests on proxmox-1, and vm-3 is a KVM guest on proxmox-2.
  • VirtIO is used on the VMs.

iperf between proxmox-1 and proxmox-2 => result is OK.
Code:
root@proxmox-1:~# iperf -c proxmox-2 -i1 
------------------------------------------------------------
Client connecting to ix6-proxmox-3, TCP port 5001
TCP window size: 23.8 KByte (default)
------------------------------------------------------------
[  3] local  xx.xxx.xxx.xxx port 38319 connected with xxx.xxx.xxx.xxx port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 1.0 sec  1.02 GBytes  8.75 Gbits/sec
[  3]  1.0- 2.0 sec   988 MBytes  8.29 Gbits/sec
[  3]  2.0- 3.0 sec   979 MBytes  8.21 Gbits/sec
[  3]  3.0- 4.0 sec   950 MBytes  7.96 Gbits/sec
[  3]  4.0- 5.0 sec   976 MBytes  8.19 Gbits/sec
[  3]  5.0- 6.0 sec  1.06 GBytes  9.07 Gbits/sec
[  3]  6.0- 7.0 sec  1.06 GBytes  9.07 Gbits/sec
[  3]  7.0- 8.0 sec   972 MBytes  8.16 Gbits/sec
[  3]  8.0- 9.0 sec   950 MBytes  7.97 Gbits/sec
[  3]  9.0-10.0 sec   996 MBytes  8.35 Gbits/sec
[  3]  0.0-10.0 sec  9.78 GBytes  8.40 Gbits/sec


iperf between vm-1 (on proxmox-1) and vm-2 (on proxmox-1) => result is OK
Code:
[root@vm-1 ~]# iperf -c vm-2 -i1
------------------------------------------------------------
Client connecting to xxx.xxx.xxx.xxx TCP port 5001
TCP window size: 19.3 KByte (default)
------------------------------------------------------------
[  3] local xxx.xxx.xxx.xxx port 38669 connected with xxx.xxx.xxx.xxx port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 1.0 sec  1.68 GBytes  14.4 Gbits/sec
[  3]  1.0- 2.0 sec  1.74 GBytes  15.0 Gbits/sec
[  3]  2.0- 3.0 sec  1.70 GBytes  14.6 Gbits/sec
[  3]  3.0- 4.0 sec  1.73 GBytes  14.8 Gbits/sec
[  3]  4.0- 5.0 sec  1.75 GBytes  15.0 Gbits/sec
[  3]  5.0- 6.0 sec  1.75 GBytes  15.0 Gbits/sec
[  3]  6.0- 7.0 sec  1.75 GBytes  15.1 Gbits/sec
[  3]  7.0- 8.0 sec  1.75 GBytes  15.0 Gbits/sec
[  3]  8.0- 9.0 sec  1.57 GBytes  13.5 Gbits/sec
[  3]  0.0-10.0 sec  16.8 GBytes  14.4 Gbits/sec


iperf between vm-1 (on proxmox-1) and vm-3 (on proxmox-2) => result is KO!
Code:
[root@vm-1 ~]# iperf -c vm-3 -i1
------------------------------------------------------------
Client connecting to xxx.xxx.xxx.xxx, TCP port 5001
TCP window size: 19.3 KByte (default)
------------------------------------------------------------
[  3] local xxx.xxx.xxx.xxx port 47869 connected with xxx.xxx.xxx.xxx port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 1.0 sec   330 MBytes  2.77 Gbits/sec
[  3]  1.0- 2.0 sec   341 MBytes  2.86 Gbits/sec
[  3]  2.0- 3.0 sec   345 MBytes  2.89 Gbits/sec
[  3]  3.0- 4.0 sec   341 MBytes  2.86 Gbits/sec
[  3]  4.0- 5.0 sec   348 MBytes  2.92 Gbits/sec
[  3]  5.0- 6.0 sec   340 MBytes  2.86 Gbits/sec
[  3]  6.0- 7.0 sec   347 MBytes  2.91 Gbits/sec
[  3]  7.0- 8.0 sec   346 MBytes  2.90 Gbits/sec
[  3]  8.0- 9.0 sec   343 MBytes  2.88 Gbits/sec
[  3]  9.0-10.0 sec   283 MBytes  2.37 Gbits/sec
[  3]  0.0-10.0 sec  3.28 GBytes  2.82 Gbits/sec


So, do you have any idea that could explain these bad figures?
Note that the performance gets even worse if I put more VMs on each Proxmox node.
I suspect it's because of poor performance of the Linux bridges.

Thanks for your help.
Alexis
 
Hi,

can you try to disable LRO on your 10GbE interfaces and on the bond?

ethtool -K ethX lro off


(LRO causes bad performance with bridges, and it is enabled by default on a lot of network cards; I'm not sure about Intel.)
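
As a minimal sketch, assuming the eth2/eth3/bond0 names from the config posted above (adjust to your actual 10GbE ports):

Code:
# check the current offload state of the 10GbE ports
ethtool -k eth2 | grep large-receive-offload
ethtool -k eth3 | grep large-receive-offload

# disable LRO on both slaves and on the bond itself
# (the bond line may be a no-op, depending on the driver)
ethtool -K eth2 lro off
ethtool -K eth3 lro off
ethtool -K bond0 lro off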
 
Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
Code:
# ethtool -k eth1
Features for eth1:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off
receive-hashing: on

So the default is off on Intel (at least on this 82574L).
 
Thanks for your answer.

I've compiled a newer version of ixgbe, with LRO set to off.
But it does not solve my issue.

I can now be more precise about the problem:

iperf from a VM to a physical server (not a Proxmox node, just a physical server with 10 GbE) => OK
but
iperf from the physical server to a VM => KO

So the issue is on the input side of the bridge, or something like that.
Any idea?
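
For reference, a minimal sketch of how to reproduce both directions (the addresses below are placeholders, not taken from the thread):

Code:
# on the VM: run the iperf server
iperf -s

# on the physical 10GbE server: send traffic towards the VM
# (<vm-ip> is a placeholder for the VM's address)
iperf -c <vm-ip> -i1

# then swap the roles to test VM -> physical:
# run 'iperf -s' on the physical server and
# 'iperf -c <server-ip> -i1' inside the VM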
 

Hi,
which OS is inside the VM and which network adapter do you use (virtio)?

Udo
 

Maybe disable generic receive offload? (on both host and guest)

ethtool -K ethX gro off
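
A minimal sketch, assuming the eth2/eth3/bond0 names from the config above and an eth0 inside the guest; the post-up lines are just one way to make it survive a reboot:

Code:
# on the host (slaves and bond):
ethtool -K eth2 gro off
ethtool -K eth3 gro off
ethtool -K bond0 gro off

# inside the guest (interface name assumed to be eth0):
ethtool -K eth0 gro off

# to persist on the host, e.g. under the bond0 stanza in /etc/network/interfaces:
#   post-up ethtool -K eth2 gro off
#   post-up ethtool -K eth3 gro off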