Strange Issue Using Virtio on 10Gb Network Adapters

Hello everyone, I've been scratching my head over what's causing my VM workers to run such slow pipelines.
It turns out to be exactly this: since a couple of updates ago, virtio-net has issues with the MTU. Reducing it from 1500 fixes the problem.
Pings run fine, but something like a docker pull will hang. Switching to the Intel e1000 model likewise fixes the issue.
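In case it helps anyone else, this is roughly how I'd lower the MTU on a virtio NIC and confirm where it breaks. The VM ID, bridge name and the ping target are just placeholders for your own setup:

Code:
    # Lower the MTU on the VM's virtio NIC (Proxmox passes mtu= through to the guest device)
    qm set 101 --net0 virtio,bridge=vmbr0,mtu=1400

    # From inside the guest, test which payload sizes still pass with the DF bit set
    # (ICMP payload + 28 bytes of IP/ICMP headers = effective MTU)
    ping -M do -s 1372 -c 3 192.0.2.1   # should succeed with MTU 1400
    ping -M do -s 1472 -c 3 192.0.2.1   # fails if the path or virtio MTU is below 1500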

Edit: Maybe related to https://lore.proxmox.com/pve-devel/20250417104855.144882-1-s.hanreich@proxmox.com/T/#u ?
Edit1: Seems to be a kernel regression, using kernel 6.5.13-6-pve instead fixes the issue.
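For anyone who wants to stay on the older kernel until this is sorted out, pinning it with proxmox-boot-tool is one option. This is just a sketch, assuming the 6.5.13-6-pve package is still installed on your host:

Code:
    # list installed kernels, then pin the known-good one for all future boots
    proxmox-boot-tool kernel list
    proxmox-boot-tool kernel pin 6.5.13-6-pve
    # later, to return to the default kernel selection:
    proxmox-boot-tool kernel unpin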
 
I spoke too soon about my issue being resolved by using a different i350 NIC in my pfSense machine. It worked just fine when Proxmox was using a Realtek 2.5GbE or 5GbE NIC and the VirtIO drivers (with the MTU set to 1435). However, if I use an x520 or x550 NIC, I still get major frame drops.
 
I've spoken to some friends who have the same issue right now; theirs was also fixed by fiddling with the MTU.
I haven't narrowed it down yet. I thought the older kernel worked fine, but that seems not to be the case.
 
Hi there, I recently had similar issues to what you describe with the exact same switch, the Mikrotik CRS309-1G-8S+IN. I found out that some SFP modules get too hot; this primarily affects SFP+ to RJ45 modules, but sometimes also SFP+ fiber modules. The reason is the passive cooling of the Mikrotik switch. For me the issues started once the affected SFPs hit around 85°C: packets get dropped and connections lag. If the temperature climbs above 90°C, most SFPs drop the link entirely. I solved it by manually setting the link speed to 5Gb/s, which keeps the affected modules at acceptable temperatures.
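If you want to rule this out from the Proxmox side, most DDM-capable SFP+ modules report their temperature via the module EEPROM, which you can read on the host. The interface name here is just an example:

Code:
    # dump the SFP/SFP+ module diagnostics and look for the module temperature line
    ethtool -m enp5s0f0 | grep -i temperature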
 
I believe my issue was due to not having the host vmbr set to vlan-aware, although it functioned fine until the last couple of updates.
I will report back if it stays stable.
 
If you are using VLANs and don't set your vmbr to vlan-aware, you won't have any connectivity on your VMs at all, not just intermittent connection issues.

Do you get the packet loss only on connections toward the Internet, or also on your local network?
MTU size certainly can affect Internet connectivity, especially if you have a DSL or 4G based uplink. In the past I sometimes had issues with certain ISPs that use tunnel protocols like IPinIP, GRE, PPTP and so on. Setting an MTU of 1450, or sometimes even 1250, solved those kinds of issues.

On the other hand, I never had any issues with the default MTU size on the LAN itself, only with jumbo frames up to 16k.
 
You will still get connectivity to your VMs; vlan-aware is not strictly required. From the Proxmox docs:
  • VLAN awareness on the Linux bridge: In this case, each guest’s virtual network card is assigned to a VLAN tag, which is transparently supported by the Linux bridge. Trunk mode is also possible, but that makes configuration in the guest necessary.
  • "traditional" VLAN on the Linux bridge: In contrast to the VLAN awareness method, this method is not transparent and creates a VLAN device with associated bridge for each VLAN. That is, creating a guest on VLAN 5, for example, would create two interfaces eno1.5 and vmbr0v5, which would remain until a reboot occurs.
And as stated, it worked fine until recently :)
https://pve.proxmox.com/wiki/Network_Configuration <- look at the example where vmbr0 has no VLAN awareness.
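For reference, this is roughly what switching vmbr0 to a VLAN-aware bridge looks like in /etc/network/interfaces, with the guest then getting its tag in the VM config. The physical port name, addresses and VM ID are placeholders for your own setup:

Code:
    auto vmbr0
    iface vmbr0 inet static
        address 192.0.2.10/24
        gateway 192.0.2.1
        bridge-ports enp5s0f0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094

    # tag the guest's traffic on the bridge instead of creating per-VLAN vmbr0vX devices
    qm set 101 --net0 virtio,bridge=vmbr0,tag=5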
 
I've been using VLANs for 6 months without any interface having vlan-aware checked. I am going to enable it now to see if that resolves my issues. FWIW, I have the port tied to the VM set to tag the traffic, so I highly doubt it matters.

Edit: Just wanted to add that I am seeing these issues specifically on my LAN. It's especially bad if the Linux bridge in Proxmox sits on a 10Gb NIC and the client device uses a 2.5Gb NIC. The brand of NIC does not seem to matter, so it does not look like a driver issue. It also happens when a 5Gb NIC backs the Linux bridge, but it is not as bad. It's as if a buffer is getting filled and frames are being dropped on the client device.
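One way to see whether frames are actually being dropped on the host side is to watch the interface and bridge counters while reproducing the slowdown. The interface names here are examples:

Code:
    # per-interface RX/TX drop counters on the physical NIC and on the bridge
    ip -s link show dev enp5s0f0
    ip -s link show dev vmbr0
    # NIC-specific counters (ring buffer overruns, missed packets, errors)
    ethtool -S enp5s0f0 | grep -iE 'drop|miss|err'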
 
Alright, so I have another big clue here.

In my Mikrotik's stats page, I can see a ton of Rx Pauses when streaming across VLANs from my laptop over Wi-Fi. They only occur on the port that is connected to my pfSense machine, and stopping the game stream stops the Rx Pause counter completely. This only happens when the Linux bridge on Proxmox sits on a 5Gb NIC or faster, and it is much worse with a 10Gb NIC. So my original thought, that the i350 in my pfSense machine is getting slammed by the faster NIC and dropping frames, seems correct.

Now, does the issue occur in the virtio stack, the transceiver, pfSense, or the i350 itself? I don't really know how to troubleshoot further. I've been trying to virtualize my pfSense machine for testing, but it's been a giant PITA to get my VLANs moved over from a 4-port i350 to a 2-port x550.
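Since Rx Pause frames point at Ethernet flow control, one thing worth checking on the Proxmox host (and, if possible, on the pfSense side) is whether the NIC is negotiating pause frames and how often they fire. The interface name is an example, and disabling flow control is only an experiment since it can trade pauses for silent drops elsewhere:

Code:
    # show negotiated flow-control (pause) settings on the 10Gb uplink
    ethtool -a enp5s0f0
    # Intel igb/ixgbe NICs expose pause / flow-control counters here
    ethtool -S enp5s0f0 | grep -iE 'pause|flow'
    # to experiment: turn pause frames off on this port (revert with "rx on tx on")
    ethtool -A enp5s0f0 rx off tx off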
 
Did you attempt the vlan-aware bridge?