Poor network performance on guest

The VM is CentOS 6.7, the switch is a Dell PowerConnect 5524. The vmbr2 port on the PVE node, and port 1 on the FreeNAS box, are connected to the SFP+ ports on the switch using 2 m twinax patch cables. The vmbr3 port on the PVE node, and port 2 on the FreeNAS box, are connected directly to each other using the same kind of cable. The switch reports an operating temperature of 46 °C, and there's not a great deal of other load on it.

When doing node to FreeNAS via DAC, load on the node ranges from 2.5-3.0, and IOWait is under 1%. CPU usage is hovering around 10%. Load on the FreeNAS server increases from a baseline of 0.8 to about 2.0. Speed is 9.4 Gb/sec. Going via the switch, CPU usage is 8-9%, IOWait is still under 1%, load average is 3.2-3.5. Load average on the FreeNAS server is as high as 3 during this run. Speed is 4.6 Gb/sec.

When doing VM to FreeNAS via DAC, load on the node climbs a bit higher, to about 3.5. CPU usage is a bit higher as well, 12-14%. Load on the FreeNAS server is a bit lower, 1.4-1.6; load on the VM goes from a baseline of 0 to 1. IOWait remains under 1%. Speed is 5.8 Gb/sec. Going via the switch, CPU usage on the node is more variable, but hits peaks of about 12%. Load average on the node is around 3.2. Load average on the FreeNAS box is 1.25, and on the VM about 0.6. Speed on that run is 1.4 Gb/sec.

You have at least the following vCores (physical cores + HT) available, right?
  • Proxmox node: >= 4
  • VM: >= 4 assigned
  • ZFS box: >= 3

If so, load issues are not what's causing your slow speeds.
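A quick way to double-check those counts (just a sketch; nproc for the Linux boxes, sysctl for FreeNAS since it is FreeBSD-based):
Code:
# on the Proxmox node and inside the VM: number of logical CPUs
nproc
# on the FreeNAS box: number of logical CPUs
sysctl hw.ncpu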

Just for the sake of comparison, I ran iperf from the VM to the node on the "via switch" interface and on the "via DAC" interface. The "via switch" interface showed 8.4 Gb/sec, and the "via DAC" interface showed 9.4 Gb/sec.
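(For anyone reproducing this, the tests are plain TCP iperf runs along these lines; iperf2 syntax, and the node-side IPs on each bridge are placeholders:)
Code:
# on the node, start the server side
iperf -s
# from the VM, one run per interface
iperf -c 192.168.1.1 -t 60   # via switch (vmbr2), node IP assumed
iperf -c 192.168.2.1 -t 60   # via DAC (vmbr3), node IP assumed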

This is the biggest weirdness ever. The VM <-> Proxmox connection should be statistically the same regardless of whether you use vmbr2 or vmbr3, not show a 12% difference.

Edit: What else runs on the "via switch" path? Other Proxmox VMs via that vmbr, and/or other physical nodes connected through the switch?

On the switch side of things, AIUI, my cabling isn't officially supported. I could look for a set of Dell-branded optics for the switch, Chelsio-branded optics for the NICs, and a couple of fiber patch cables. That should eliminate any potential incompatibility on that side of things, though at a bit of expense. The switch itself is still under warranty, though I might have a hard time getting it replaced without a hard failure to show.

How far apart are the ping times between the VM and FreeNAS when you compare via switch to via DAC?
Code:
ping -i 0.01 -c 100 <IP>
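# note: intervals below 0.2 s usually require root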

For the hell of it, can you create an openvswitch-based bridge?
Code:
apt-get install openvswitch-switch

Via the Proxmox GUI, do the following:
On the Proxmox node, create an OVS Bridge with no NICs attached (empty).
On the Proxmox node, add an OVS IntPort for the node itself (a 3rd IP) on a separate subnet.
On the VM, add a 3rd vNIC attached to the OVS bridge, with its IP in the same subnet as above.
Reboot.

Should the OVS bridge not come up the first time, execute:
/etc/init.d/networking restart

Then run the same iperf test again, VM <-> Proxmox, via the OVS bridge.
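(If it helps, the same setup done by hand looks roughly like this; the bridge/port names and the subnet are only placeholders, and the GUI writes the persistent config to /etc/network/interfaces instead:)
Code:
# create an empty OVS bridge with no physical NICs attached
ovs-vsctl add-br vmbr4
# add an internal port and give the node its 3rd IP on a separate subnet
ovs-vsctl add-port vmbr4 ovsint0 -- set Interface ovsint0 type=internal
ip addr add 192.168.3.1/24 dev ovsint0
ip link set ovsint0 up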
 
The PVE node has 12 physical cores, which with HT should equal 24. The VM is assigned 4 cores, two other running VMs are assigned 2 cores each, and one other running VM is assigned a single core. The FreeNAS box is running on a Xeon E3-1230v2, which has 4 physical cores and HT.

The other VMs aren't connected to either vmbr2 or vmbr3.

This is the biggest weirdness ever. The VM <-> Proxmox connection should be statistically the same regardless of whether you use vmbr2 or vmbr3, not show a 12% difference.
It seems quite variable. Three consecutive runs on vmbr2 showed 9.29, 9.49, and 8.46 Gb/sec. Then three consecutive runs on vmbr3 showed 10.1, 11.3, and 12.4 Gb/sec. Three more consecutive runs on vmbr2 then showed 12.0, 9.15, and 9.63 Gb/sec.

Pings via switch:
Code:
--- 192.168.1.10 ping statistics ---
100 packets transmitted, 100 received, 0% packet loss, time 992ms
rtt min/avg/max/mdev = 0.174/0.255/0.416/0.051 ms

Pings via DAC:
Code:
--- 192.168.2.2 ping statistics ---
100 packets transmitted, 100 received, 0% packet loss, time 991ms
rtt min/avg/max/mdev = 0.152/0.269/0.572/0.063 ms
 
Pings via switch:
Code:
--- 192.168.1.10 ping statistics ---
100 packets transmitted, 100 received, 0% packet loss, time 992ms
rtt min/avg/max/mdev = 0.174/0.255/0.416/0.051 ms
Pings via DAC:
Code:
--- 192.168.2.2 ping statistics ---
100 packets transmitted, 100 received, 0% packet loss, time 991ms
rtt min/avg/max/mdev = 0.152/0.269/0.572/0.063 ms

So it's not like the switch is delaying packets or anything; that is not the issue, then.
A 0.014 ms difference in average RTT should not make any difference whatsoever (and that is over 100 pings).
 
The other VMs aren't connected to either vmbr2 or vmbr3.
And there are no other physical nodes connected to the switch either (besides Proxmox/FreeNAS), right?

You have seen this one, right?


For the hell of it, can you create an openvswitch-based bridge?
Code:
apt-get install openvswitch-switch

Via the Proxmox GUI, do the following:
On the Proxmox node, create an OVS Bridge with no NICs attached (empty).
On the Proxmox node, add an OVS IntPort for the node itself (a 3rd IP) on a separate subnet.
On the VM, add a 3rd vNIC attached to the OVS bridge, with its IP in the same subnet as above.
Reboot.

Should the OVS bridge not come up the first time, execute:
/etc/init.d/networking restart
Then run the same iperf test again, VM <-> Proxmox, via the OVS bridge.
 
Well, yes, there are other devices plugged into the switch. I don't think I can avoid that; the only way I have to connect to the VM in question is via SSH, and I don't have a trusted public key on the PVE host. But trying again with all the other devices unplugged from the switch (i.e., everything except the PVE host, the FreeNAS box, and my desktop machine, which is connected via a GbE port), and with all other VMs shut down, yields the following iperf results (60 seconds each):
VM <-> FreeNAS via switch: 722 Mb/sec
VM <-> FreeNAS via DAC: 6.27 Gb/sec
VM <-> FreeNAS via switch, repeated: 1.26 Gb/sec
Node <-> FreeNAS via switch: 4.30 Gb/sec
Node <-> FreeNAS via DAC: 9.41 Gb/sec
VM <-> FreeNAS via switch, second repeat: 432 Mb/sec
Node <-> FreeNAS via switch, repeated: 4.20 Gb/sec

I saw the openvswitch suggestion; I'm going to need to wait a little bit on that.
 
The only things I can imagine helping are topics we already talked about:

Jumbo Frames.
But trying again with all the other devices unplugged from the switch (i.e., everything except the PVE host, the FreeNAS box, and my desktop machine, which is connected via a GbE port), and with all other VMs shut down, yields the following iperf results (60 seconds each):
VM <-> FreeNAS via switch: 722 Mb/sec
VM <-> FreeNAS via DAC: 6.27 Gb/sec
VM <-> FreeNAS via switch, repeated: 1.26 Gb/sec
Node <-> FreeNAS via switch: 4.30 Gb/sec
Node <-> FreeNAS via DAC: 9.41 Gb/sec
VM <-> FreeNAS via switch, second repeat: 432 Mb/sec
Node <-> FreeNAS via switch, repeated: 4.20 Gb/sec
So it's not other nodes creating traffic on your "switch" vmbr, e.g. multicast traffic or whatever.
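(If you want to be completely sure, a short capture on the switch-facing bridge should show next to no foreign traffic; vmbr2 and the filter here are just a suggested check:)
Code:
# count stray broadcast/multicast frames on the switch-facing bridge
tcpdump -i vmbr2 -nn -c 50 'multicast or broadcast'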


I'm afraid the only courses of action I have left in my toolbox at this time are the ones already mentioned:
  • Open vSwitch for a separate vmbr
  • Jumbo frames on the vmbrs / IPs / switch ports connecting the switch and the VM to FreeNAS (a minimal sketch follows below)
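A minimal sketch for the jumbo-frame route, assuming eth2 is the NIC behind vmbr2 and reusing the FreeNAS switch-side IP from above; the switch ports, the FreeNAS interface, and the guest vNIC all have to be raised to MTU 9000 as well for this to hold end to end:
Code:
# on the PVE node: raise the MTU on the physical port and its bridge (names assumed)
ip link set dev eth2 mtu 9000
ip link set dev vmbr2 mtu 9000
# verify the path with a non-fragmenting ping (8972 = 9000 - 28 bytes of IP/ICMP headers)
ping -M do -s 8972 -c 4 192.168.1.10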