Network Optimization for High-Volume UDP Traffic in PVE

PG1024

Feb 27, 2026
[Screenshots attached: PVE and LVS configuration]
Hardware Specification: The physical server network card model is the 82599ES 10-Gigabit SFI/SFP.
Problem Description: A virtual machine is running an LVS (Linux Virtual Server) service. When a single-source IP generates UDP traffic exceeding approximately 300 Mbps, packet loss begins to occur on the network interface. The actual business requires handling peak traffic rates of around 400 Mbps.
Requirement: Besides PCIe Passthrough (NIC Direct Pass-through), are there any other solutions that can ensure a single-source IP can receive UDP traffic exceeding 1 Gbps normally?
Traffic Characteristics: The UDP packets primarily consist of firewall logs. While individual packets are not large, the volume is high, with an estimated rate of 100,000 to 200,000 packets per second.
Baseline Observation: The same physical machine running VMware vSphere 6.7 with an identical virtual machine configuration appears to handle the 400 Mbps traffic load stably during testing.
 
I have dealt with similar high PPS UDP cases, and 100k to 200k packets per second is usually the real problem, not the Mbps. Small packets kill you with interrupt overhead and softirq saturation.

Before jumping to passthrough, I would check vNIC type. Make sure you are using something like vmxnet3, not e1000. Then tune multiqueue, RSS, RPS, and RFS so traffic spreads across CPU cores. Also increase ring buffers with ethtool and check for drops at the driver level.

Pin vCPUs, isolate IRQs, and confirm CPU is not the bottleneck. In my experience, careful CPU and queue tuning often fixes it without full PCIe passthrough.
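For reference, the kind of checks and tuning I mean look roughly like this (a sketch only; the interface name enp6s18, the queue count and the CPU mask are examples, adjust to the actual VM):

Code:
# Inside the VM: see where packets are being dropped
ethtool -S enp6s18 | grep -i drop      # per-queue / driver drop counters
netstat -su                            # look at UDP "receive buffer errors"

# Spread receive processing across cores (multiqueue + RPS)
ethtool -L enp6s18 combined 4                          # match the vCPU count
echo f > /sys/class/net/enp6s18/queues/rx-0/rps_cpus   # RPS mask for CPUs 0-3

# Grow the ring buffers (check the maximums first with: ethtool -g enp6s18)
ethtool -G enp6s18 rx 4096 tx 4096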
 
Hi PG1024,

I made several tests using iperf, with these results:

Server
VM, Ubuntu 24.04, 4CPU, 4GB RAM, PVE node 1
$iperf -su

Client 1
Physical, Ubuntu 25.04, 1Gbps LAN
$iperf -c 192.168.1.69 -u -t 120 -b 500M
[ ID] Interval Transfer Bandwidth Jitter Lost/Total Datagrams
[ 1] 0.0000-119.9998 sec 7.32 GBytes 524 Mbits/sec 0.027 ms 231/5349881 (0.0043%)

$iperf -c 192.168.1.69 -u -t 120 -b 1000M
[ ID] Interval Transfer Bandwidth Jitter Lost/Total Datagrams
[ 2] 0.0000-120.0009 sec 13.4 GBytes 957 Mbits/sec 0.024 ms 661/9765300 (0.0068%)

Client 2
VM, Ubuntu 24.04, 4CPU, 4GB RAM, 10Gbps LAN, PVE node 2
$iperf -c 192.168.1.69 -u -t 120 -b 1000M
[ ID] Interval Transfer Bandwidth Jitter Lost/Total Datagrams
[ 1] 0.0000-119.9999 sec 14.6 GBytes 1.05 Gbits/sec 0.011 ms 460/10699758 (0.0043%)

$iperf -c 192.168.1.69 -u -t 120 -b 5000M
[ ID] Interval Transfer Bandwidth Jitter Lost/Total Datagrams
[ 2] 0.0000-119.9996 sec 43.8 GBytes 3.14 Gbits/sec 0.006 ms 19684/32038968 (0.061%)

Can you specify the percentage of your high packet loss?

The limitation may be in the VM configuration. You wrote that you use the same VM image and configuration as in VMware. Can you switch the Network Device model to VirtIO? (Please post the complete VM configuration.)
Can you post the CPU and CPU Pressure Stall graphs from the high packet loss period?

R.
 
As far as I remember, the virtio-net limit is around 2 million pps per core (depending on the CPU frequency). The only way to go higher is to increase the number of queues on the virtio NIC.

(If you are CPU limited, you should see a vhost-net process at 100% on the PVE host.)

Running iperf with big packets will not help to test this (virtio-net can easily reach 20~40 Gbit/s, but only with big packets).

Running iperf with "-l 64" to test the worst case (a SYN-flood-like stream of small packets) should show the pps limit.
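For example (a sketch; the VM ID 100, the NIC name and the queue count are placeholders): the queues are set on the PVE side in the VM's NIC definition, enabled inside the guest, and the small-packet test then looks like this:

Code:
# On the PVE host: give the virtio NIC multiple queues
# (keep the VM's existing MAC and bridge; only queues= is new here)
qm set 100 --net0 virtio=AA:BB:CC:DD:EE:FF,bridge=vmbr0,queues=8

# Inside the VM: enable the extra queues
ethtool -L eth0 combined 8

# On the PVE host while the test runs: watch for vhost-net saturating a core
top -b -n 1 | grep vhost

# Small-packet UDP test from the client (worst case for pps)
iperf -c 192.168.1.69 -u -b 1000M -l 64 -t 60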
 

My tests were made with the default 1470 B packet size. The measured limit is 250566 pps; I guess in my case the limit is single-thread CPU performance.
Using multiple threads in iperf I get 5.95 Gbits/sec (505548 pps), but with high packet loss (16%) - this may be a switch limitation.
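Such a multi-threaded run can be reproduced with iperf 2's parallel-streams option, roughly like this (a sketch; the stream count and per-stream bandwidth are illustrative):

Code:
iperf -c 192.168.1.69 -u -b 2000M -P 4 -t 120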

I posted these numbers so the questioner can compare against real-world numbers, even if they were made in our test lab.

R.
 
Have the tests been performed without any interfering L2/L3 equipment between the VM and the client? If not, how did you rule out that it's not the networking equipment in between?
 
I am not sure I understand the question.

"Client 1" is physical machine (actually my work computer) traffic to VM goes thru two switches. "Client 2" is on another physical Proxmox node than "Server" (we have 3-node PVE cluster) in this case traffic goes thru one switch. In both cases the trafic goes thu physical ethernet cards. Common traffic in our company is around 0.5-2MB/s and it is less than 1% of measured numbers.

So I think my numbers can be trusted. There is one big "but": these are only network traffic measurements, because PG1024's question is about Proxmox's incoming UDP limits. The packets are not processed. When you actually process packets it takes time, which can become the limiting factor and cause packet loss, because the receive queue size is limited and UDP may simply be dropped, as the specification allows.
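Whether that receiver-side processing is already dropping packets can be checked with the standard Linux counters and sysctls, roughly like this (a sketch):

Code:
# "receive buffer errors" under Udp: means the socket queue overflowed
netstat -su

# Current default and maximum receive buffer sizes
sysctl net.core.rmem_default net.core.rmem_max

# Raising the limit lets the receiving application request a bigger socket buffer
sysctl -w net.core.rmem_max=8388608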

Does this answer your question?

R.
 
I'm totally with you. It's not about the traffic amount, it's about the pps; that's the bottleneck. My question was just aiming at the possibility that your network equipment might be a bottleneck as well, and I was curious whether you had ruled that possibility out.
 

My "Client 1" connection is 1Gbps and the number are very close (957 / 1024 = 93%) I guess here is not much space to improve. This twice as much then PG1024 issue point.

VM "Client 2" <-> "Server" connection is 10Gbis I measure (3.14 / 10Gbps = 31%) as I wrote it is probably single thread CPU limit of iperf. But this is almost 10x higher than PG1024's limit. On same envitonment using TCP I transfers 9.29 / 10 Gbps.

Actually I am not thinking about my bottleneck but about PG1024's bottleneck. If you want, I can run more tests and try to find the limits; first of all I would need to solve the VM CPU limits. But that is out of scope for this thread, where PG1024 needs help with his VM's limits. Do you want to create such a thread?

R.
 
Is your client able to multithread said application? I saw (a long time ago) a very nice talk from Valve about defending against DDoS attacks in online games. Different topic, but with the same technical aspect - PPS vs. bandwidth. The latter is negligible, while PPS is the real bummer. In case you're interested: https://youtu.be/2CQ1sxPppV4?t=462 (the video starts at the interesting point for this discussion).

P.S.: Thanks for your answers and clarifications; they helped me understand your problem in more detail :)
 
My tests were made with the default 1470 B packet size. The measured limit is 250566 pps; I guess in my case the limit is single-thread CPU performance.
Using multiple threads in iperf I get 5.95 Gbits/sec (505548 pps), but with high packet loss (16%) - this may be a switch limitation.

I posted these numbers so the questioner can compare against real-world numbers, even if they were made in our test lab.

R.
250566 pps is quite low; I mean, you should reach 1~2 Mpps for any packet size. I remember easily reaching 7~9 Gbit/s with 1 core/thread at the standard 1500 MTU (on an EPYC v3 at 3.5 GHz, with the CPU forced to max frequency).
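Forcing the CPU to its maximum frequency for such a test can be done roughly like this (a sketch; requires the cpupower tooling):

Code:
# Switch all cores to the performance governor for the duration of the test
cpupower frequency-set -g performance

# Verify the per-core frequencies
grep MHz /proc/cpuinfo | sort | uniq -c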
 
[Screenshots attached: VM and PVE host configuration]
  1. The situation is more severe when the virtual machine's network card model is set to vmxnet3. Monitoring with iftop -i enp6s18 shows that the bandwidth on the VM's interface does not exceed 200M, indicating significant data loss.
  2. iperf3 tests do not reveal obvious anomalies and show normal bandwidth figures.
  3. We have attempted several optimizations, including ethtool -G enp129sf01 rx 4096 tx 4096, enabling RPS (a sketch is shown after this list), and removing the bonding configuration. However, the results were unsatisfactory and unstable.
  4. The related screenshots show the configurations for both the virtual machine and the PVE host. Could you please review them to see if there are any misconfigurations? Thank you.
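For reference, enabling RPS as mentioned in point 3 typically looks like this (a sketch; the interface name and CPU mask are examples, not taken from the screenshots):

Code:
# Allow CPUs 0-7 (mask ff) to process received packets for every RX queue of the NIC
for q in /sys/class/net/enp129s0f0/queues/rx-*; do
    echo ff > "$q/rps_cpus"
done

# Optionally enable RFS as well by sizing the flow tables
echo 32768 > /proc/sys/net/core/rps_sock_flow_entries
echo 4096 > /sys/class/net/enp129s0f0/queues/rx-0/rps_flow_cnt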
 
  1. The situation is more severe when the virtual machine's network card model is set to vmxnet3. Monitoring with iftop -i enp6s18 shows that the bandwidth on the VM's interface does not exceed 200M, indicating significant data loss
This is normal; don't use vmxnet3 or e1000, they are full software emulation. You need to use VirtIO, which uses vhost-net offloading on the PVE host.

Your CPU is quite old, and it's possible that the Spectre/Meltdown/... mitigations impact performance.

Code:
# edit /etc/default/grub and set: GRUB_CMDLINE_LINUX_DEFAULT="quiet mitigations=off"
nano /etc/default/grub
update-grub
reboot
 
To VM configuration I have some notes:
  • You are using multiqueue. My numbers are without it. Did you set multiqueue inside the VM too? In the VM, execute ethtool -L <ethX> combined 16; the adapter name may differ under PVE compared to VMware.
  • As the SCSI controller you use VMware PVSCSI. I recommend migrating to VirtIO SCSI for better performance.
To PVE configuration - consider:
  • disabling HT - it can increase performance in some scenarios, and I strongly recommend disabling HT, especially when you disable mitigations
  • upgrading to PVE v9.1
I expect bottleneck:
  1. Network card Model -> use VirtIO not vmxnet3 ... you already have this
  2. CPU bottleneck. If the CPU is overloaded, it simply drops UDP packets. That's why I asked for CPU graphs - not only the VM's graphs but the host's graphs too. How many VMs run on this node? You use "VMware PVSCSI", so post "IO Pressure Stall" too. PVE v8 doesn't have Pressure Stall graphs, so you will need to collect this information another way (see the sketch after this list).
  3. Network configuration in PVE. Can you post network configuration (bonding, VLAN, bridge), firewall, and SDN? I don't need to know IPs.
  4. Post all special tuning like "ethtool -G enp129sf01 rx 4096 tx 4096"
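On hosts without the Pressure Stall graphs, the same information can be read directly from the kernel's PSI interface (a sketch; available on any reasonably recent kernel):

Code:
cat /proc/pressure/cpu
cat /proc/pressure/memory
cat /proc/pressure/io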
R.
 
Supplementary Information on Physical Server Resource Monitoring:
[Monitoring screenshots attached]

Code:
auto enp129s0f0
iface enp129s0f0 inet manual
        pre-up ethtool -G $IFACE rx 2048 tx 2048
        pre-up ethtool -C $IFACE rx-usecs 10 tx-usecs 10

iface enp3s0f0 inet manual

iface enp3s0f1 inet manual

iface enp130s0f0 inet manual

iface enp130s0f1 inet manual

iface enp4s0f0 inet manual

iface enp4s0f1 inet manual

auto enp129s0f1
iface enp129s0f1 inet manual
    pre-up ethtool -G $IFACE rx 2048 tx 2048
    pre-up ethtool -C $IFACE rx-usecs 10 tx-usecs 10

auto bond0
iface bond0 inet manual
    bond-slaves enp129s0f0 enp129s0f1
    bond-miimon 100
    bond-mode balance-tlb

auto vmbr0
iface vmbr0 inet static
    address X.X.X.X/24
    gateway X.X.X.X
    bridge-ports bond0
    bridge-stp off
    bridge-fd 0
    post-up echo 2000 > /sys/class/net/$IFACE/tx_queue_len

source /etc/network/interfaces.d/*

Thanks for the input, everyone. Here is the update for synchronization:
  1. On a physical server with CentOS 7 installed and identical configurations, the LVS service handled a single-source UDP flow stably at 400 Mbps. With multiple source IPs, it easily reached over 1 Gbps without issues.
  2. A PVE 9.1 host has been deployed on identical hardware. The overall performance shows no significant degradation, with a slight improvement observed.
  3. Mitigations=off was applied at the PVE level on the afternoon of March 2, 2026. We will monitor the results and share the outcome later.
  4. The physical server is currently running minimal workloads. Most services are in a 1:1 mapping state (physical-to-VM), where VM configurations are not identical to the physical hardware specs.
  5. Disabling HT was tested on the physical CentOS 7 + LVS setup, resulting in no significant performance gain. We will test this configuration on the PVE environment at a later time.
  6. The firewall policy ACCEPTs entire subnets for the interacting addresses. Additionally, the NOTRACK strategy has been implemented (an example rule is sketched below).
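For reference, a notrack rule of this kind usually lives in the raw table so the UDP log flood never enters conntrack (a sketch; port 514 is only an example syslog port, not taken from the thread):

Code:
# Skip connection tracking for the high-volume UDP log traffic
iptables -t raw -A PREROUTING -p udp --dport 514 -j NOTRACK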
 

Below are the supplementary details for the virtual machine
[VM configuration screenshots attached]

Code:
Channel parameters for enp6s18:
Pre-set maximums:
RX:        n/a
TX:        n/a
Other:        n/a
Combined:    16
Current hardware settings:
RX:        n/a
TX:        n/a
Other:        n/a
Combined:    16
 
Why do you use balance-tlb? It can cause out-of-order packet delivery (and out-of-order UDP packets may be classified as invalid and dropped) as well as delivery issues, because the second interface can send packets but not receive them (TCP will resend, UDP is just dropped/lost).

Can you set the bond to active-backup (or disconnect the second interface) and post the packet loss status?
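For reference, the change in /etc/network/interfaces would look roughly like this (a sketch based on the bond0 stanza posted above; bond-primary is optional):

Code:
auto bond0
iface bond0 inet manual
    bond-slaves enp129s0f0 enp129s0f1
    bond-miimon 100
    bond-mode active-backup
    bond-primary enp129s0f0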
R.
 
Yes, I was thinking exactly the same.
 
  1. Adding the mitigations=off parameter to /etc/default/grub yielded a noticeable, significant improvement. Performance on version 9.1 is particularly better compared to 8.4, so the hosts have been upgraded from 8.4 to 9.1. Below is the performance comparison observed during last night's peak hours on PVE 9.1 (a quick way to verify the mitigation setting is sketched after this list).
    [Performance comparison screenshots attached]
  2. Previous tests showed that balance-tlb performs slightly better than active-backup for the bond-mode. Today, we will conduct a comparative analysis of this difference on two PVE hosts.
  3. Hyper-Threading (HT) has also been disabled today, and we will monitor its impact simultaneously.
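A quick way to confirm the mitigations are really disabled after the reboot (a sketch using the standard kernel interfaces):

Code:
# Boot command line should contain mitigations=off
cat /proc/cmdline

# Entries should now report "Vulnerable" (or "Not affected") rather than an active mitigation
grep . /sys/devices/system/cpu/vulnerabilities/*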
 