SFF with PVE + OPNsense. Am I overly optimistic about performance?

theUtmost

New Member
Jan 23, 2024
Content trigger warning: this post is long
I wanted to include a lot of detail, though, in the hope that some of it is useful. If your attention span is short then this isn't the thread you're looking for - move along!

I'm a relative newbie to Proxmox VE; I've been dabbling with it for less than a year.
At the start of this year I replaced my ISP router with a single PVE node, and OPNsense VM.
I managed to get that running by following instructions without too many issues, and initially I was very pleased with the results:
getting ~870-880Mbps down and ~420Mbps up on a plan the ISP advertises as 900Mbps down/450Mbps up.

Sometime in June or July (I really can't recall; those months were very busy for me) there was an update for either PVE or OPNsense (or, yeah, both), and since then my throughput has tanked.

Having read around on the forums, it seems I am not alone, and from some threads marked "resolved" it seems this might be the new normal?
Upstream kernel to blame, blah blah. I'm nowhere near enough of a geek to understand such language. I can usually follow a simple bash script and largely figure out how it works, but much more than that and I'm running everything through ChatGPT for explanations.

I'm currently testing some of the OPNsense tunable options for TCP offload and the like, and I have attained a modest improvement.
I'm now getting ~250-280Mbps down and ~200-240Mbps up.
A shadow of its former self, but still better than the ~100Mbps down/up I saw back in the months starting with J.
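
To put those numbers in perspective, here is a quick sketch of the arithmetic in Python. The stage labels and the `share` helper are just names picked for illustration, the figures are the rough ranges quoted above, and the ~100Mbps figure is assumed to apply to both directions:

```python
# Rough share of the advertised rate at each stage (all figures in Mbps,
# taken from the approximate ranges quoted in this post).
PLAN = {"down": 900, "up": 450}  # ISP-advertised rates

observed = {
    "before the update": {"down": (870, 880), "up": (420, 420)},
    "after the update": {"down": (100, 100), "up": (100, 100)},
    "after tunables": {"down": (250, 280), "up": (200, 240)},
}


def share(mbps_range, plan_mbps):
    """Return the (low, high) share of the plan rate as whole percentages."""
    low, high = mbps_range
    return round(100 * low / plan_mbps), round(100 * high / plan_mbps)


for stage, rates in observed.items():
    down = share(rates["down"], PLAN["down"])
    up = share(rates["up"], PLAN["up"])
    print(f"{stage}: down {down[0]}-{down[1]}%, up {up[0]}-{up[1]}%")
```

So download went from roughly 97-98% of the plan rate to about 11%, and is now back to around 28-31%.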

What I'd like to understand is - will it be possible to fix this properly on the hardware I have or am I being unreasonably optimistic?

The main reason I went for a PVE+OPNsense router on a DIY SFF PC was that I already had most of the hardware, and I wanted something more performant and more flexible than my ISP router. While I do have the flexibility with OPNsense, I'm no longer getting better speeds than the lowly ISP router.

Some specifics:
OS: Proxmox VE 8.2.4 x86_64 (latest at time of writing)
Host: 10T7S1HQ00 ThinkCentre M720q
Kernel: 6.8.12-1-pve
CPU: Intel i5-9400T (6) @ 3.400GHz
GPU: Intel CoffeeLake-S GT2 [UHD Grap
Memory: 12582MiB (used) / 15865MiB (i.e. 16GB)
Drive: 256GB Kingston NVMe SSD, I forget the exact model
OPNsense: 24.7.4_1-amd64 - actually, now that I see that written out - is this my problem???
The host CPU is of course Intel, not AMD, but I am not sure how significant it is that the guest running in the QEMU VM is labelled as a different architecture.

I run only that single VM. I occasionally spin up another for speed testing, but in regular daily use the PVE node is ONLY running OPNsense.

Initially, I only allocated 8GB RAM to the OPNsense VM.
I did have a few system lockups where, although I could still ping both the OPNsense VM and the PVE host on their IPs, I could not log in to the WebUI or SSH on either, and it needed a hardware restart to bring everything back up again. I did notice the OPNsense VM showed around 90% RAM utilisation of the allocated 8GB, so when the internet throughput tanked, I increased the allocation to 10GB. OPNsense again used 90% of that, so I have now increased it again to 12GB RAM and, you guessed it, OPNsense is now using 90% of that RAM.

CPU typically sits at around 10-15% utilisation inside OPNsense, with 1 vCPU allocated.
If I hammer the connection, it does spike to 100%, so another thing I should try is increasing the cores available to OPNsense, seeing as the other cores on the Intel i5-9400T aren't being used for anything else!

Networking:

eno1
- this is the onboard GbE NIC on the ThinkCentre motherboard. It might be Intel or Broadcom, honestly I don't recall, but it's not of much interest because it's deliberately NOT used for main network duties. Rather, I use it as a management port for the PVE node. It has a static IP address in the same subnet as my main LAN but will never have another device assigned to that IP (excluded from DHCP scope) and NO gateway defined (deliberately). The PVE node WebUI and SSH are set to listen on this interface only.
eno1 > vmbr0 > static IP in LAN

I then have a dual-NIC PCIe card plugged in with a riser adapter (I found models for these risers in old Lenovo support listings and bought a kit, including baffle and even screws, from AliExpress).
The card itself is an Inspur X540-T2, so yep, that's an Intel X540-T2 clone, i.e. dual 10GbE NICs, also from AliExpress.

It identifies as:
enp1s0f0 and
enp1s0f1

enp1s0f0 > vmbr1 > WAN
enp1s0f1 > vmbr2 > LAN

Inside OPNsense, I defined:
vtnet0 (WAN) > vmbr1, with the settings required for my ISP (VLAN tag = 10, a PPPoE client and a DHCP client IP setting)
vtnet1 (LAN) > vmbr2, with a static IP (gateway for the LAN)
ISC DHCPv4 is enabled on the LAN interface, and I use Unbound DNS. There is quite a lot of gear downstream of this, and a number of MAC addresses have been given IP reservations.
The reason I went for a bridge setup rather than PCIe passthrough was so that I could spin up another VM/CT and do things like a speedtest on the same PVE node.
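
In case it helps to see the layout in one place, here is a small Python sketch of the mapping described above. The interface and bridge names are the ones from this post; the dict layout and the `path` helper are only an illustration, not anything PVE or OPNsense actually uses:

```python
# The bridge layout described above, written out as plain data.
HOST_BRIDGES = {
    "vmbr0": {"port": "eno1", "role": "PVE management"},
    "vmbr1": {"port": "enp1s0f0", "role": "WAN"},
    "vmbr2": {"port": "enp1s0f1", "role": "LAN"},
}

# OPNsense VirtIO interfaces and the host bridge each one is attached to.
GUEST_NICS = {
    "vtnet0": "vmbr1",
    "vtnet1": "vmbr2",
}


def path(guest_nic):
    """Trace a guest interface through its bridge to the physical port."""
    bridge = GUEST_NICS[guest_nic]
    info = HOST_BRIDGES[bridge]
    return f"{guest_nic} > {bridge} > {info['port']} ({info['role']})"


for nic in GUEST_NICS:
    print(path(nic))
```

Every packet routed between LAN and WAN crosses both bridges and both VirtIO interfaces, which is the path the throughput figures above are measuring.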

Anyway.
Super long post, but I wanted to include some specifics rather than hijacking anyone else's post with a pointless "me too i have the same issue even though my setup is totally different".
If anyone made it this far, well done!
If you have suggestions on what else to try that's even better, I'd be very grateful!
I'm changing ONE setting at a time and then monitoring for a few days to see if there is any improvement, but I guess my main questions are:
1. Is my hardware not up to the demands I'm asking of it? Especially maybe in the RAM department?
2. Should I give up on 10GbE NICs?
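
For the one-setting-at-a-time testing, a change log could be kept as something like the Python sketch below. The `Change` fields and the `deltas` helper are hypothetical names, and the sample figures are only the ~100Mbps figure and the midpoints of the ranges quoted earlier, not new measurements:

```python
from dataclasses import dataclass


@dataclass
class Change:
    label: str        # the single setting changed for this entry
    down_mbps: float  # download measured after the change
    up_mbps: float    # upload measured after the change


# Illustrative entries only, based on the rough figures in this post.
log = [
    Change("baseline after the update", 100, 100),
    Change("OPNsense offload tunables", 265, 220),
]


def deltas(entries):
    """Return a list of (label, down change, up change) versus the previous entry."""
    out = []
    for prev, cur in zip(entries, entries[1:]):
        out.append((cur.label,
                    cur.down_mbps - prev.down_mbps,
                    cur.up_mbps - prev.up_mbps))
    return out


for label, d_down, d_up in deltas(log):
    print(f"{label}: {d_down:+.0f} Mbps down, {d_up:+.0f} Mbps up")
```

Comparing each entry only against the one before it keeps the effect of each individual setting visible.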

I don't ACTUALLY have an internet connection faster than 1GbE, nor do I have a switch faster than 1GbE (though I'm looking at buying a replacement switch soon for faster LAN transfers from workstation to NAS etc).

I'm mulling setting up another system in parallel and trying either a bare metal OPNsense install, or (gasp) something totally different like VyOS, IPFire or even Sophos XG Firewall. Yeah, I know it's not OSS, but the UI is pretty good and it's free for home use, so if it does what I need and gives me decent throughput that might be the end of this experiment. Shame, as I like the idea of using OPNsense way more than a proprietary OS for routing, but currently it's just not meeting my needs.
I have another ThinkCentre M720q of the same model available and I've just ordered another riser kit from AliExpress, so I can tinker a bit with alternatives without breaking the current connection (and then do a hasty hardware swap!).

All ideas and input will be gratefully received, and if I find any promising results from changes made I will share what I changed and the outcome.
Yes, I am recording each change! I know, amazing right?
Thanks in advance and cheers!
 
