Poor Network Performance from VM to Physical server

foreverjake

New Member
Nov 12, 2023
Hello,
I have two Dell servers, an R730XD and an R720XD, each with quad SFP+ 10Gb/s ports connected to a single Cisco Nexus 3172T on two QSFP+ ports. The four ports on each server are placed in a LAG running LACP and hashed at layer2+3. The Cisco port channel is a trunk carrying the same VLANs to each server, and all ports are set to an MTU of 9000 on the switch and on both servers.

I previously ran TrueNAS SCALE bare-metal on both servers, with one acting as a periodic backup target for the other. The backups would generally saturate one link of the LAG and transfer at about 9.1Gb/s. I have since converted the R730XD to Proxmox VE 8.0.4 and installed TrueNAS SCALE as a VM with passthrough of the HBA controlling the server's hard drives and one virtio network interface. I enabled a multiqueue value of 12 to match the vCPU count and ran ethtool -L ens18 combined 12 inside the guest so the VM actually uses all the queues. Now when I back up the TrueNAS SCALE VM to the bare-metal TrueNAS SCALE server, the transfer never exceeds 550Mb/s. Any ideas why the traffic is so slow?
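
(For reference, the multiqueue setup roughly looks like this; the VM ID and VLAN tag below are just examples, adjust to your own setup:)

# On the Proxmox host: give the virtio NIC 12 queues to match the vCPU count
qm set 100 --net0 virtio,bridge=vmbr0,tag=100,queues=12
# Inside the TrueNAS VM: tell the virtio NIC to use all 12 queues
ethtool -L ens18 combined 12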

R730XD
72 x Intel(R) Xeon(R) CPU E5-2697 v4 @ 2.30GHz (2 Sockets)
384GB Ram
HBA330 Mini in passthrough mode
SR-IOV enabled

Interface config:
auto lo
iface lo inet loopback

auto eno1
iface eno1 inet manual
mtu 9000

auto eno2
iface eno2 inet manual
mtu 9000

auto eno3
iface eno3 inet manual
mtu 9000

auto eno4
iface eno4 inet manual
mtu 9000

auto bond0
iface bond0 inet manual
bond-slaves eno1 eno2 eno3 eno4
bond-miimon 100
bond-mode 802.3ad
bond-xmit-hash-policy layer2+3
mtu 9000

auto vmbr0
iface vmbr0 inet manual
bridge-ports bond0
bridge-stp off
bridge-fd 0
bridge-vlan-aware yes
bridge-vids 2-4092
mtu 9000

auto vmbr0.100
iface vmbr0.100 inet static
address 10.9.240.24/24
gateway 10.9.240.1
mtu 9000
#Lan

auto vmbr0.200
iface vmbr0.200 inet manual
#Public

auto vmbr0.300
iface vmbr0.300 inet manual
mtu 9000
#WiFi

auto vmbr0.400
iface vmbr0.400 inet manual
mtu 9000
#Private

auto vmbr0.500
iface vmbr0.500 inet manual
mtu 9000
#DMZ
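
(For completeness, the LACP negotiation and jumbo-frame settings can be verified on the Proxmox host with standard commands like these:)

# Should report "Bonding Mode: IEEE 802.3ad Dynamic link aggregation",
# the layer2+3 hash policy, and all four slaves with MII Status: up
cat /proc/net/bonding/bond0

# MTU 9000 should show on the bond, the bridge, and the VLAN interfaces
ip -d link show bond0
ip -d link show vmbr0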

TrueNAS SCALE VM
12 vCPU (2 sockets, 6 cores each)
128 GB of Ram
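
(To separate raw network throughput from ZFS/replication overhead, a plain iperf3 run between the VM and the bare-metal box is probably the first thing to check; the receiver IP below is just a placeholder for the R720XD's LAN address:)

# On the bare-metal TrueNAS SCALE box (receiver)
iperf3 -s

# Inside the TrueNAS VM (sender); a single TCP stream hashes onto one LAG
# member, so roughly 9.x Gb/s is the expected ceiling
iperf3 -c 10.9.240.25 -t 30

# Repeat with parallel streams to see whether the limit is per-stream
iperf3 -c 10.9.240.25 -t 30 -P 4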
 
Same problem here. When I run iperf3 from the hypervisor there is no loss, but from a guest running on the same hypervisor the losses are very high.
 
01:00.0 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 02)
01:00.1 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 02)
 
You could try 2 different things:

- deactivating the Intel NIC's firmware LLDP agent by executing

ethtool --set-priv-flags <interface name> disable-fw-lldp on

- or disabling tx/rx VLAN offload with

ethtool -K <interface name> tx-vlan-offload off rx-vlan-offload off

Note: to make these changes permanent, you can add either of these ethtool commands to /etc/network/interfaces by appending a post-up /sbin/ethtool ... line (with the respective setting from above) below the last line of every interface stanza, e.g.:

auto eno1
iface eno1 inet manual
mtu 9000
post-up /sbin/ethtool -K eno1 tx-vlan-offload off rx-vlan-offload off
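
For the LLDP setting the post-up line is analogous, e.g.:

post-up /sbin/ethtool --set-priv-flags eno1 disable-fw-lldp on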
 
The card I'm using is a Dell Intel X710-DA4 4x 10GbE SFP+ quad-port mezzanine daughter card for the R730.

ethtool --set-priv-flags eno1 disable-fw-lldp on
Reports back as "netlink error: Operation not supported"

and

ethtool -K eno1 tx-vlan-offload off rx-vlan-offload off
ethtool -K eno2 tx-vlan-offload off rx-vlan-offload off
ethtool -K eno3 tx-vlan-offload off rx-vlan-offload off
ethtool -K eno4 tx-vlan-offload off rx-vlan-offload off

Seems to have no effect on network performance.

@zemzema Did either of these help out in your situation?

I think I'm just going to put a 2-port 10Gb/s card in the system and pass it through to the TrueNAS VM. I have a free 8-lane PCIe slot and a QLogic NIC, so I should be able to isolate the driver module and tell Proxmox not to use the new card.
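
(Something along these lines should keep the Proxmox host off the card; the vendor:device ID below is only a placeholder that would come from lspci -nn for the actual QLogic card:)

# Find the card's vendor:device ID
lspci -nn | grep -i qlogic

# Bind it to vfio-pci at boot so the host never drives it
echo "options vfio-pci ids=1077:8020" > /etc/modprobe.d/vfio-qlogic.conf
update-initramfs -u -k all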
 
Turns out this is not a network problem. I can get 10Gb/s file transfers when copying from cache, but not when reading directly from the hard drives. The HBA330 Mini is mapped directly to the VM, and the 12 SATA drives (2 x 6-drive RAIDZ2 vdevs) should be able to keep up with a 10Gb/s interface. The pool also has a special metadata vdev backed by 2 x 4TB PCIe flash cards, which are likewise passed through to the TrueNAS VM. Anyone have any idea where the bottleneck is?
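
(In case it helps narrow it down, here is what a raw-pool read test inside the TrueNAS VM might look like; the dataset path and size are placeholders, and the file should be large enough not to fit in ARC:)

# Watch the physical disks while the test runs
zpool iostat -v 1

# Sequential 1M reads from a test file on the pool
fio --name=seqread --directory=/mnt/tank/fio-test --rw=read \
    --bs=1M --size=50G --ioengine=posixaio --iodepth=16 \
    --runtime=60 --time_based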
 
