Hi!
I'm looking for some performance advice. Got a rather beefy PVE test host:
CPU: 2 x Xeon Gold 5220R @ 2.2 GHz (48C/96T total)
RAM: 512 GB
NIC: 2 x Intel XL710 40 GbE
The host is empty and not used by anything except for test VMs. The network is a simple OVS bridge with LACP LAG configured as follows:
Code:
auto enp59s0f0
iface enp59s0f0 inet manual
    ovs_mtu 9000

auto enp59s0f1
iface enp59s0f1 inet manual
    ovs_mtu 9000

auto bond0
iface bond0 inet manual
    ovs_bridge vmbr0
    ovs_type OVSBond
    ovs_bonds enp59s0f0 enp59s0f1
    ovs_options bond_mode=balance-tcp lacp=active other_config:lacp-time=fast
    ovs_mtu 9000

auto vmbr0
iface vmbr0 inet manual
    ovs_type OVSBridge
    ovs_mtu 9000
    ovs_ports bond0 mgmt
Proxmox version is 7.3, kernel 5.15.74-1. Test VMs are Ubuntu 22.04, kernel 5.15.0-56-generic. The PVE firewall is disabled. Test VMs attach to vmbr0 with VirtIO NICs (4 queues per NIC for vhost-net) and a VLAN tag; the VM firewalls are disabled as well.
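For reference, each test VM's NIC is set up roughly like this (the VM ID 101 and VLAN tag 100 are placeholders, not my actual values):
Code:
# on the PVE host: VirtIO NIC with 4 queues, VLAN tag, firewall off (IDs are placeholders)
qm set 101 --net0 virtio,bridge=vmbr0,queues=4,tag=100,firewall=0
# inside the guest: confirm the queues are visible (ens18 is the usual interface name here)
ethtool -l ens18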
The network throughput between two VMs on this host, on the same bridge and in the same VLAN, appears to be limited to roughly 27-30 Gbps (iperf3 TCP stream, default settings). Adding more vCPUs, more NIC queues, or more parallel streams has no effect.
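Nothing fancy about the test itself; it is essentially this (10.0.0.11 stands in for the server VM's address):
Code:
# on VM 1 (server)
iperf3 -s
# on VM 2 (client): single stream, then with parallel streams
iperf3 -c 10.0.0.11 -t 30
iperf3 -c 10.0.0.11 -t 30 -P 8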
Throughput over the network between the host and an external test machine, connected with 2 x 40 GbE over a low-latency network, is limited to roughly 30 Gbps (iperf3 TCP stream, default settings). Throughput between a test VM on the host and the same external test machine is limited to roughly 25-27 Gbps (same test).
I am wondering whether these throughput numbers are expected and whether they can be improved. I tried tuning the physical NIC RX/TX ring buffers on the host, TX queue lengths on the host and VM interfaces, etc., but that didn't make much of a difference. Is there anything else worth trying, or am I hitting the limits of kernel traffic processing, so the next logical step would be trying DPDK?
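To be specific, the tuning I tried so far was along these lines (interface names from my config above; the exact values varied between runs):
Code:
# host: bump the physical NIC RX/TX ring buffers
ethtool -G enp59s0f0 rx 4096 tx 4096
ethtool -G enp59s0f1 rx 4096 tx 4096
# host: longer TX queue on the bond members
ip link set dev enp59s0f0 txqueuelen 10000
ip link set dev enp59s0f1 txqueuelen 10000
# guest: longer TX queue on the VirtIO NIC (ens18 is the usual name in my VMs)
ip link set dev ens18 txqueuelen 10000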
I would appreciate any advice or reference to any relevant materials, documentation or other resources. Thank you for your time!
Cheers!