pfSense VM slower than expected inter-vlan routing

mshorey · Jun 3, 2024

pve 8.2.2
I'm hoping someone else who runs pfSense as a VM on Proxmox, and who maybe isn't seeing the speeds they expect between VLANs, might have some insight for me.

With iperf3 between VMs on the same VLAN I can reach ~30Gbps. Between VMs on different VLANs, where the traffic has to be routed through the pfSense VM, I'm seeing maybe 5-6Gbps. CPU utilization on the pfSense VM never goes over 15% or so while it routes these iperf3 tests. I have set multiqueue to 4 or 8 on each VM depending on its vCPU count, and that hasn't seemed to make a difference. All VMs use virtio NICs and tag their traffic for specific VLANs.

I was previously doing everything over a single Linux bridge (vmbr0), but I added a second bridge (vmbr1) to pfSense just for my VMs' VLAN (100) to see if that would make a difference, and it did, a little. Previously I was seeing about 4Gbps; now I'm seeing just under 6Gbps testing from a VM on VLAN 1 (vmbr0) to VLAN 100 (vmbr1), but with more retransmits than I'd expect.

Everything on my network is still at 1500 MTU. I would change it if I thought it would make a difference, but CPU utilization on the pfSense VM is so low that, IMO, it doesn't point to needing jumbo frames.

I'm open to any suggestions y'all might have and would be extremely appreciative. My pfSense version is 2.7.2 and the VM config is as follows:
root@pve-1:/etc/pve/qemu-server# cat 107.conf
agent: 1
balloon: 0
boot: order=scsi0;ide2
cores: 8
cpu: host
hostpci0: 0000:01:00.1,pcie=1
ide2: local:iso/pfSense-CE-2.7.2-RELEASE-amd64.iso,media=cdrom,size=854172K
machine: q35
memory: 8192
meta: creation-qemu=8.1.2,ctime=1702956906
name: PFSENSE-2
net0: virtio=BC:24:11:EF:4B:41,bridge=vmbr0,queues=8
net1: virtio=BC:24:11:89:B5:BB,bridge=vmbr1,queues=8
numa: 0
onboot: 1
ostype: l26
scsi0: VMs:vm-107-disk-0,iothread=1,size=12G
scsihw: virtio-scsi-single
smbios1: uuid=7aa0d7e5-90b6-444c-98ec-2bcdab0a0e43
sockets: 1
startup: order=1,up=30
vmgenid: 331695e6-6024-4a0f-a672-cd39aac55e20
And here's one of the testing VMs:
root@pve-1:/etc/pve/qemu-server# cat 103.conf
agent: 1
balloon: 0
boot: order=scsi0
cores: 4
cpu: host
memory: 16384
meta: creation-qemu=7.0.0,ctime=1662999076
name: DOCKER
net0: virtio=36:C7:81:59:D4:93,bridge=vmbr0,queues=4
numa: 0
onboot: 1
ostype: l26
scsi0: VMs:vm-103-disk-0,discard=on,iothread=1,size=64G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=f2c40f91-177b-410f-ab0d-c930e7b6160f
sockets: 1
startup: order=4,up=30
tags:
vmgenid: 3abb9524-cb3e-46a2-8862-52d37bce2907
And the other testing VM:
root@pve-1:/etc/pve/qemu-server# cat 112.conf
agent: 1
balloon: 0
boot: order=scsi0;ide2;net0
cores: 4
cpu: host
ide2: none,media=cdrom
memory: 4096
meta: creation-qemu=7.1.0,ctime=1672979532
name: NPM
net0: virtio=2a:ed:b7:28:34:63,bridge=vmbr1,queues=4,tag=100
numa: 0
onboot: 1
ostype: l26
scsi0: VMs:vm-112-disk-0,discard=on,iothread=1,size=42G
scsihw: virtio-scsi-single
smbios1: uuid=47b48916-70d2-4543-9220-c350461b565e
sockets: 1
vmgenid: 28600f0b-74d5-494d-a64e-c4559cfc5edc
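The multiqueue settings in the configs above can be cross-checked against each VM's core count with a few lines of Python. This is only a sketch: it parses the `key: value` lines as pasted here, the helper name `queue_report` is hypothetical, treating a missing `queues=` option as 1 is an assumption, and nothing inside the guests is checked.

```python
def queue_report(conf_text: str) -> dict:
    """Map each netN device in a pasted Proxmox VM config to its 'queues'
    value, alongside the VM's core count."""
    cores, queues = None, {}
    for line in conf_text.splitlines():
        key, _, value = line.partition(": ")
        if key == "cores":
            cores = int(value)
        elif key.startswith("net"):
            # Options are comma-separated key=value pairs, e.g. bridge=vmbr0.
            opts = dict(p.split("=", 1) for p in value.split(",") if "=" in p)
            # Assumption: no explicit queues= option means a single queue.
            queues[key] = int(opts.get("queues", 1))
    return {"cores": cores, "queues": queues}


# Lines copied from the pfSense VM config (107.conf) above.
pfsense = """cores: 8
net0: virtio=BC:24:11:EF:4B:41,bridge=vmbr0,queues=8
net1: virtio=BC:24:11:89:B5:BB,bridge=vmbr1,queues=8"""
print(queue_report(pfsense))
```

For the pfSense VM this reports 8 cores with 8 queues on both NICs, which matches the description in the post.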
And I've attached a screenshot of the iperf3 testing performed after making the changes mentioned.
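On the MTU question, a rough back-of-the-envelope calculation may help frame it. The sketch below estimates how many packets per second have to be forwarded at the throughputs mentioned in this post, assuming every packet is MTU-sized and ignoring header and framing overhead. The function name and the 9000-byte jumbo MTU are illustrative choices, not values from the thread.

```python
def packets_per_second(throughput_gbps: float, mtu_bytes: int) -> float:
    """Approximate packet rate needed to carry throughput_gbps using
    packets of mtu_bytes each (headers and framing overhead ignored)."""
    bits_per_packet = mtu_bytes * 8
    return throughput_gbps * 1e9 / bits_per_packet


if __name__ == "__main__":
    for gbps in (6, 30):
        for mtu in (1500, 9000):
            pps = packets_per_second(gbps, mtu)
            print(f"{gbps:>2} Gbps at MTU {mtu}: ~{pps:,.0f} packets/s")
```

Under those assumptions, 6 Gbps at a 1500-byte MTU works out to about 500,000 packets per second, and a 9000-byte MTU would cut that to roughly a sixth. Whether that would translate into higher routed throughput in this setup is not something the calculation alone can show.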

rit1001 · Jun 4, 2024

How does the CPU utilization of the whole system change when you run your tests?

You are moving from using layer 2 switching handled at the kernel level (near hardware level) to layer 3 switching/routing where the packets must be passed from the kernel to the pfSense VM and then back to the kernel. The result is far more work being carried out.

mshorey · Jun 4, 2024

It looks to hover around 20% when I'm running iperf3 with 4 parallel streams between the VMs on different VLANs.

rit1001 · Jun 4, 2024

I think this is all indicating that you should use VLANs to isolate traffic, rather than route high volumes of traffic between VLANs using software.

From the details posted, you are

- On the pfSense VM, pinning one of the 8 vCPUs at 100% (about 15% overall load).

- On the host, pinning roughly 8 of the reported 44 CPUs at 100%, which will come from kernel-level work plus pfSense and the test VMs you are running.
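The arithmetic behind those two estimates can be sketched as follows. It assumes the load is concentrated on fully busy cores rather than spread evenly, which is an interpretation and not something the utilization figures prove. The 15%, 20%, 8-vCPU and 44-CPU numbers are the ones quoted in this thread; the function name is made up for illustration.

```python
def equivalent_busy_cpus(utilization_pct: float, cpu_count: int) -> float:
    """Convert an average utilization percentage into the number of CPUs
    that would be 100% busy if the load were concentrated on them."""
    return utilization_pct / 100 * cpu_count


print(f"pfSense VM: ~{equivalent_busy_cpus(15, 8):.1f} of 8 vCPUs")
print(f"Host:       ~{equivalent_busy_cpus(20, 44):.1f} of 44 CPUs")
```

That gives roughly 1.2 vCPUs inside the pfSense VM and roughly 8.8 CPUs on the host, consistent with a low average hiding a small number of saturated cores.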

All this load and the resulting low performance come from the fact that you are trying to use pfSense to emulate a high-performance layer 3 switch, which is something normally built at the silicon level in an ASIC. You may see better performance by deploying something like openswitch, but you are also likely to see even greater CPU load.

mshorey · Jun 4, 2024

All of this makes good sense. I appreciate your input, and there may be an L3 switch in my near future.

rit1001 · Jun 4, 2024

If you are hoping for a ~30Gbps or even ~10Gbps L3 switch you may find the cost a little on the high side. This is a market where people will install openswitch onto a dedicated server with a few high-speed NICs. They get to throw CPU cores at the problem, but without the virtualization overheads.

Weehooey-HSS · Jun 4, 2024

Have you disabled hardware checksums? I have seen it cause performance issues in the past.

https://docs.netgate.com/pfsense/en/latest/recipes/virtualize-proxmox-ve.html

mikos · Oct 2, 2024

I don't think it's a pfSense issue by itself; I have experienced the same problem with a simple Debian VM NATing/routing L3 traffic.
It seems to be a bug or misconfiguration on the host: the problem on my end is on an AMD EPYC server, and when I move the virtual router to an Intel server I get near-native performance.

That's my original post:
https://forum.proxmox.com/threads/p...limiting-throughput-maybe-cpu-related.155279/

PLKMafia · Oct 12, 2024

Came here to say that I have the exact same problem right now. It's been driving me crazy for quite a while trying to determine if the bottleneck is occurring on my switch, firewall, or Proxmox box. It appears to be a Proxmox issue.

Rather than pfSense, I run OPNsense. My CPU utilization looks almost the same as yours (it never runs high). Intra-VLAN switching (same VLAN) between VMs gives me ~30-45 Gbps when carried out by the Proxmox host with VirtIO. With intra-VLAN switching between two distinct devices separated by my switch, I saturate my 2.5 GbE NIC.

But once I move to inter-VLAN routing (different VLANs), I see a massive reduction in speed with many retries/retransmissions in iperf. For the record, I have Intel i225 Rev 3 (2.5 GbE) NICs in my Proxmox box and OPNsense box.

When iperf testing from the Proxmox host itself, I don't see any inter-VLAN routing bottlenecks. Here is a link to my current post. I also linked another person's post who is having the same issue. So far, the Proxmox admin who answered my question is surmising the problem stems from OPNsense. I don't think it does, but I will be completing a test this weekend to determine it once and for all.
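For anyone comparing runs like these, iperf3 can emit its results as JSON with `-J`/`--json`, which makes throughput and retransmit counts easier to compare than reading screenshots. A minimal sketch, assuming a TCP test whose summary sits under `end.sum_sent` and `end.sum_received`; the helper name and the sample numbers are made up for illustration and are not results from this thread.

```python
import json


def summarize_iperf3(report_json: str) -> dict:
    """Extract throughput (Gbps) and sender retransmits from the JSON
    report of an iperf3 TCP test."""
    end = json.loads(report_json)["end"]
    return {
        "sent_gbps": end["sum_sent"]["bits_per_second"] / 1e9,
        "received_gbps": end["sum_received"]["bits_per_second"] / 1e9,
        # Retransmit counts are not reported everywhere; fall back to None.
        "retransmits": end["sum_sent"].get("retransmits"),
    }


# Hypothetical, trimmed-down report; a real one has many more fields.
sample = json.dumps({
    "end": {
        "sum_sent": {"bits_per_second": 5.8e9, "retransmits": 1200},
        "sum_received": {"bits_per_second": 5.7e9},
    }
})
print(summarize_iperf3(sample))
```

A real report could be produced with something like `iperf3 -c <server> -P 4 -J > run.json` and then read from that file, once for a same-VLAN run and once for a routed run.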
