Asymmetric iperf3 throughput over L2TP/IPsec (IKEv2) tunnel

Elfy

Renowned Member
Dec 29, 2016
I’m troubleshooting an asymmetric bandwidth problem over a site-to-site L2TP/IPsec (IKEv2) VPN between two sites. Upload and download should be symmetric (1 Gbps fiber at both ends), but TCP from Janeway (Proxmox node) → TrueNAS collapses under heavy retransmits, while TrueNAS → Janeway is stable and fast. The evidence points to directional packet loss/drops somewhere between Janeway and TrueNAS (or a NIC/driver quirk on Janeway affecting its egress). UDP can hit line rate, but TCP collapses under retransmits in the Janeway → TrueNAS direction. Disabling VLAN offload gave a small improvement, which suggests an offload/driver/bridge interaction.

Environment / topology:

  • Site-to-site: L2TP over IPsec (IKEv2), pfSense 2.7.2 on both gateways.
  • Physical links: symmetric 1 Gbps fiber between gateways.
  • Host A (Proxmox), Janeway: Intel 1 Gbps NIC (eno2).
  • Host B (TrueNAS): Intel 1 Gbps NIC (enp3s0f1).
  • Proxmox also has separate networking for Ceph (dedicated 25 Gbps NICs), but these tests run between the external VLANs over the VPN.
  • No intentional QoS or shaping on pfSense.

What I’ve already tried:

  • Disabled NIC and tunnel offloads (TSO/GSO/GRO/LRO, VLAN offloads tested).
  • Increased NIC ring buffers on TrueNAS to max (4096) and switched to a NIC with larger rings.
  • Increased net.core.rmem_max, wmem_max, netdev_max_backlog, enabled window scaling.
  • Ensured no per-connection shaping on pfSense.
  • Verified both ISPs report symmetric up/down.
  • Tested both single and parallel TCP streams (iperf3 -P 1 and -P 4) — parallel streams do not eliminate the issue.
  • Performed UDP tests — raw UDP capacity is fine (near line rate in one direction), but receiver stats indicate huge UDP loss in some test forms.
  • Tested direct HTTP download/cURL and SCP real-world transfers — observed low real-world speeds (tens of Mbps).
  • Swapped TrueNAS to a different physical NIC (enp3s0f1) with 4k RX/TX buffers. Slight improvement but still severe asymmetry.
  • Disabling VLAN offloads produced a small improvement.
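For concreteness, the offload/ring/sysctl tuning in the list above looked roughly like this (eno2 is the Janeway NIC; ring size 4096 matches the TrueNAS NIC max, and the sysctl values are illustrative, not necessarily the exact ones I used):

```shell
# Sketch of the offload/ring/sysctl tuning from the list above.
IF=eno2

# NIC offloads and ring buffers (guarded so the snippet is safe to run
# on a machine without this interface):
if [ -e "/sys/class/net/$IF" ]; then
    ethtool -K "$IF" tso off gso off gro off lro off rxvlan off txvlan off
    ethtool -G "$IF" rx 4096 tx 4096   # raise rings to the hardware max
fi

# Socket buffer / backlog tuning, staged as a sysctl drop-in
# (install to /etc/sysctl.d/ and run `sysctl --system` to apply):
cat > ./99-vpn-tuning.conf <<'EOF'
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.netdev_max_backlog = 5000
net.ipv4.tcp_window_scaling = 1
EOF
grep -c '=' ./99-vpn-tuning.conf   # four settings staged
```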

Representative results
Janeway → TrueNAS (bad)
Code:
root@Janeway:~# iperf3 -c 192.168.50.2 -t 5
...
[  5]   0.00-5.00   sec  3.15 MBytes  5.29 Mbits/sec   19             sender
[  5]   0.00-5.05   sec  2.33 MBytes  3.87 Mbits/sec                  receiver

TrueNAS → Janeway (good)
Code:
root@Janeway:~# iperf3 -c 192.168.50.2 -t 5 --reverse
...
[  5]   0.00-5.05   sec   154 MBytes   255 Mbits/sec  112             sender
[  5]   0.00-5.00   sec   141 MBytes   236 Mbits/sec                  receiver

UDP tests (weird receiver loss numbers)

Janeway → TrueNAS (UDP): sender reports 954 Mbps; receiver sees near 0 (huge loss).
TrueNAS → Janeway (UDP): sender reports 953 Mbps; receiver sees ~243 Mbps with ~74% reported 'lost'.
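Cross-checking the reverse-direction UDP numbers above (sender ~953 Mbit/s, receiver ~243 Mbit/s) gives the ~74% loss figure; the pacing suggestion in the comment is the obvious next bisection step:

```shell
# Recompute the receiver-side loss from the reverse UDP run above.
awk 'BEGIN {
    sent = 953; recv = 243                     # Mbit/s, from the run above
    printf "receiver loss ~= %.1f%%\n", (sent - recv) / sent * 100
}'
# To bisect: pace UDP well below line rate and walk it up, e.g.
#   iperf3 -c 192.168.50.2 -u -b 300M -t 10   # then 450M, 600M, ...
```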

cURL tests

curl on TrueNAS (downloading from public object store)
Total: 316176600 bytes
Time: 127.16 s
Speed: 2,486,493 bytes/s (~19.9 Mbps)

same file from another site/ISP:
Total: 316176600 bytes
Time: 66.656714 s
Speed: 4,743,357 bytes/s (~37.9 Mbps)
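Converting those byte counts and durations back to Mbit/s confirms the quoted figures:

```shell
# Recompute the cURL throughput figures above from bytes and seconds.
awk 'BEGIN {
    bytes = 316176600
    printf "from DC:        %.1f Mbit/s\n", bytes / 127.16    * 8 / 1e6
    printf "from other ISP: %.1f Mbit/s\n", bytes / 66.656714 * 8 / 1e6
}'
```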

Janeway /etc/network/interfaces:
Code:
auto lo
iface lo inet loopback

auto eno1
iface eno1 inet manual

auto eno2
iface eno2 inet manual

auto ens1f0np0
iface ens1f0np0 inet static
        mtu 9000
#OpenFabric1

auto ens1f1np1
iface ens1f1np1 inet static
        mtu 9000
#OpenFabric2

auto vmbr1
iface vmbr1 inet static
        address 10.11.0.3/28
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
#Storage

auto vmbr0
iface vmbr0 inet manual
        bridge-ports eno2
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
#VLAN Trunk

auto External
iface External inet static
        address 10.10.0.3/24
        gateway 10.10.0.1
        bridge-ports vmbr0.10
        bridge-stp off
        bridge-fd 0
#External bridge

auto vlan12
iface vlan12 inet static
        address 10.12.0.3/28
        vlan-raw-device vmbr0
#Cluster

post-up /usr/bin/systemctl restart frr.service


Speedtest CLI links (on nodes):
https://www.speedtest.net/result/c/8df733b7-b3a7-4113-846a-44179c27f902
https://www.speedtest.net/result/c/78e43aba-60c1-4b52-9ebb-bafb8efb7228
https://www.speedtest.net/result/c/2e4761f2-e3c0-4081-ac10-afdc300f2d84

PCAPs / captures: I have packet captures from both physical NICs (Janeway eno2, TrueNAS enp3s0f1) for the same iperf runs and can share them.

Questions:

  • I suspect hardware limitation somewhere but how can I pin down the bottleneck?
  • Could this be a known Proxmox / kernel / Intel driver interaction with IPsec (ESP) that breaks TCP one direction? Any known module parameters or kernel versions to avoid?
  • What specific ethtool/driver counters should I paste to prove whether drops are NIC/driver related vs mid-path (switch/gateway)?
  • If the packet captures are useful, what exact slices/filters should I include for others to quickly see the loss point?
  • Is there anything in pfSense IPsec settings (ESP cipher, MTU/MSS clamp, hardware crypto offloading) that commonly causes exactly this symptom (fast one way, collapse the other way)?
  • Any Proxmox-specific tunables (bridge offload, qdisc, IRQ affinity, netdev_backlog) that Proxmox users have found helpful for IPsec + Intel NICs?
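For anyone willing to dig in: the counters and captures I can pull on request would come from something like the following (eno2/enp3s0f1 are the NICs in this setup; run as root):

```shell
# Drop/error counters that separate NIC/driver problems from mid-path loss.
IF=eno2   # use enp3s0f1 on the TrueNAS side

if [ -e "/sys/class/net/$IF" ]; then
    ethtool -S "$IF" | grep -Ei 'drop|miss|err|fifo|discard'  # driver stats
    ip -s link show "$IF"          # kernel-level RX/TX errors and drops
    nstat -az TcpRetransSegs       # cumulative TCP retransmits
fi

# Matching captures on both ends of the same iperf3 run (5201 is the
# iperf3 default port); comparing SEQ/ACK progress shows where segments
# vanish:
#   tcpdump -i eno2 -s 96 -w janeway.pcap 'host 192.168.50.2 and port 5201'
echo "repeat on TrueNAS with IF=enp3s0f1"
```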

UPDATE 03-09-2025

After more digging, this doesn’t look like a host or VPN config issue at all; it’s a routing problem between my datacenter ISP (Flexential), Lumen, and Google Fiber. MTRs/traceroutes show Flexential handing traffic to Lumen, which then detours it out of Salt Lake City through Kansas City/Dallas/Denver before reaching Google Fiber. That adds ~50 ms of latency and correlates with the poor throughput on DC → home transfers, whereas the reverse path (home → DC) is more direct and performs normally. So the asymmetric speeds I’m seeing are almost certainly down to suboptimal BGP routing/peering, not anything wrong with my Proxmox, TrueNAS, or VPN setup... I’ll open a ticket with my colo provider.
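For reference, the detour is visible by comparing AS-level traces from both sides; the endpoint IP below is a placeholder for the far end's public address:

```shell
# How the Lumen detour shows up: an mtr report with AS numbers from each
# side. 203.0.113.10 stands in for the far-end public IP.
TARGET=203.0.113.10

# Report mode (-r), wide output (-w), AS numbers (-z), 50 probes (-c):
CMD="mtr -r -w -z -c 50 $TARGET"
echo "run from the DC:  $CMD"
echo "run from home:    $CMD  (toward the DC's public IP)"
# A ~50 ms latency jump inside Lumen (SLC -> KC/Dallas/Denver) in only one
# direction matches the one-way throughput collapse.
```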
 