VM as Gateway. NAT for local IPs

inDane

Hey Proxmoxers,
I'm facing a weird problem.

Proxmox Version 8.2.4.

I have a bond0, which is split into VLANs. I have vmbr0, which is VLAN aware.

Let's say I have a private network 192.168.1.0/24 with some VMs on it, each with a virtio NIC on vmbr0 with VLAN tag 100. Now I am adding another VM that has two virtio NICs, both on vmbr0: one with tag 999 (the public network) and one with tag 100. This machine, let's call it the gateway, MASQUERADEs (NATs) the traffic of the private-network machines.
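For readers who want to reproduce the setup, here is a minimal sketch of what such a gateway VM typically runs. The interface names eth0 (tag 999, public) and eth1 (tag 100, private) are assumptions, not taken from the post, and the actual rules on the gateway in this thread may differ. The script only prints the commands unless APPLY=1 is set.

```shell
#!/bin/sh
# Assumed names inside the gateway VM (hypothetical, adjust to your system):
#   WAN_IF = NIC on vmbr0 with tag 999 (public)
#   LAN_IF = NIC on vmbr0 with tag 100 (private)
WAN_IF="${WAN_IF:-eth0}"
LAN_IF="${LAN_IF:-eth1}"
LAN_NET="${LAN_NET:-192.168.1.0/24}"

# Dry-run helper: print the command unless APPLY=1 is set.
run() {
    if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

setup_nat() {
    # Let the kernel route packets between the two NICs.
    run sysctl -w net.ipv4.ip_forward=1
    # Rewrite the source address of private clients to the gateway's public IP.
    run iptables -t nat -A POSTROUTING -s "$LAN_NET" -o "$WAN_IF" -j MASQUERADE
    # Allow outbound traffic and the matching replies.
    run iptables -A FORWARD -i "$LAN_IF" -o "$WAN_IF" -j ACCEPT
    run iptables -A FORWARD -i "$WAN_IF" -o "$LAN_IF" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
}

setup_nat
```

Run as root with APPLY=1 to actually apply the rules; the private VMs then use the gateway's address on VLAN 100 as their default route.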

This does work, but it is INSANELY slow and I am super confused as to why. I could have sworn that this was working before, but I cannot pinpoint when it got so slow.

iperf between a "private client VM" (vmbr0, VLAN tag 100) and the gateway machine is fast (enough), ~8 Gbit/s.
iperf between the "internet" and the gateway machine on vmbr0 with VLAN tag 999 is also fast (enough).
Now, downloading from the private client VM through the gateway is utterly slow: < 80kb/s. I tried replacing virtio with E1000E and I fiddled with MTUs. Nothing changed... what am I missing?

I also tried OPNsense as my gateway. Since I was seeing the problems there, I thought OPNsense was the cause, but no: a regular Ubuntu VM acting as gateway shows the same thing.

Any hints are appreciated. I spent many hours already on this...

All NICs have at least multiqueue=2.

Best
inDane
 
Hey,
it smells like a bonding problem.
Which mode are you using for bonding?
In the GUI, go to PVE Host -> System -> System Log and check whether there are messages about duplicated MAC addresses on your bonded interfaces. If so, try changing your bonding mode to failover mode, apply the configuration, then retry your iperf tests.

If this "solves" your problem, you're facing an issue that has been present since Debian 11 (from memory; older versions may be affected too). I'm dealing with it myself and haven't found a workaround :'(

PS: personal opinion: playing with MTU and multiqueue only adds more problems x)
 
Bond mode is LACP with layer 2+3

Unfortunately there is no such thing in the log: neither "duplicated" nor anything mentioning "mac" or MAC addresses. :-(
 
OK, sorry. I could only have helped you diagnose the aggregation problem if that duplicated-MAC error had shown up in your log file. I haven't tested with LACP, since I have no layer 2 switch at home ^^
 
Hi InDane,

I'm trying to figure out what your goal is with this architecture. If you can describe your motivation for implementing this, it may be easier to help you find a better solution. Anyway, I've identified three major points:
  1. Link aggregation – you are using a bond. Are you using it to have a redundant physical connection and better speed?
  2. VLANs – You are using VLANs. Have you tried using Proxmox SDN (Datacenter -> SDN)?
  3. NAT – You tried to use NAT for VLAN 999, right? Is there a special reason for it? Will the machines on VLAN 100 use valid public IP addresses?
You also mention that you tried to use OPNsense as a gateway. Maybe this will be the fast and robust way to solve it. The problems that you are facing sound like routing problems.
  1. Even if you are trying to aggregate and balance traffic between two different Internet providers, an OPNsense solution can handle it as a dual-WAN configuration.
  2. VLANs can be created in OPNsense and used by Proxmox through a VLAN-type SDN. With this configuration, you set the VLAN ID in just one place for the Proxmox VMs.
  3. NAT and DHCP can easily be configured in OPNsense. They will be used by the Proxmox SDNs and even by machines outside Proxmox that use tagged VLANs.
  4. Plus: you can add plugins to OPNsense for features like WAF, reverse proxy, DNS filters, etc.
 
Hey Juliosene,
the motivation is to use a lower amount of public IPv4 addresses. I have got a /25 net with public addresses and it is getting full.
1. Yes
2. No
3. VLAN 100 Private (192.168.1.0/24) and VLAN 999 is public from my /25 net.

1. There is no dual WAN, nor is it needed!
2. I don't need to create them, they are there already!
3. ?
4. ?

I have the feeling, that this post was made with some AI.

Best
inDane
 
I have the feeling, that this post was made with some AI.
No... This was made with almost 30 years' experience in the tech industry. I'm not a Proxmox specialist, but I'm trying to help. :cool:

the motivation is to use a lower amount of public IPv4 addresses. I have got a /25 net with public addresses and it is getting full.

Ok.

A /25 subnet means 126 usable hosts. First of all, an important step is to map and review the services using those IPs and classify them (criticality, traffic, security). If you find services that can be addressed by domain name behind the same IP, or services that can be moved behind a reverse proxy, that may be something to think about.

Back to your network segmentation: if all your servers are running on Proxmox, you can create two different VLAN-type SDNs.
Datacenter -> SDN -> Zones -> Add (Top left) -> VLAN

Screenshot 2024-10-21 144735.png

If you have different physical connections for VLAN 100 and VLAN 999, you must create 2 Zones, otherwise, one is good enough.

Datacenter -> SDN -> VNets -> Create

Screenshot 2024-10-21 144915.png

Click “Create”

For Subnet, select the VNet that you created and just add the subnet (change it to match your own). Gateway, DHCP and the other fields must be left blank.

Screenshot 2024-10-21 145100.png

Do the same procedure to create your internal network SDN.

Screenshot 2024-10-21 145439.png

Go to Datacenter -> SDN and hit “Apply”.

Now you can go to your OPNsense VM and add two new network interfaces with the bridge pointed to the VNets that you created before. You don't need to create VLANs in OPNsense, since Proxmox exposes each VLAN to OPNsense on a separate interface.

Using OPNsense, you can create routes, NAT and DHCP on the new interfaces to add features and controls as you need.

If your OPNsense is running on hardware external to the Proxmox cluster, the solution is to create those VLANs in OPNsense. In this case the vmbr0 bridge (or whichever bridge is connected to OPNsense) must be able to carry VLANs (VLAN aware: checked).

Finally, you must move the VMs' interfaces to the correct network (VLAN 999 or 100) by simply changing the bridge to the associated VNet.

I hope it helps!


Now the AI are creating images with your use case! :D:D:D:D:D
 
OK, thanks and sorry for being judgemental.

addressable by domain name behind the same IP or services that moved behind a reverse proxy, maybe this is an something to think about.
This is exactly what I'm doing. But they need a route to the outside world again. That's where my gateway comes in.

I don't think the SDN approach will solve this. I currently suspect something wonky in our backbone network or the ISP's network.

I will try SDN nonetheless, it looks like a neat feature. Thanks for the hint.


I made another absolutely weird observation.
All tests were done on one host: Docker with NAT (default bridge network) and Docker without NAT (docker --network host).



sudo docker run -it ubuntu:latest /bin/bash

Code:
root@1050476385a5:/# iperf3 -c ping.online.net -p 5201
Connecting to host ping.online.net, port 5201
[  5] local 172.17.0.2 port 60888 connected to 51.158.1.21 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   117 MBytes   977 Mbits/sec    0   8.01 MBytes       
[  5]   1.00-2.00   sec   133 MBytes  1.12 Gbits/sec    0   8.01 MBytes       
[  5]   2.00-3.00   sec   133 MBytes  1.12 Gbits/sec    0   8.01 MBytes       
[  5]   3.00-4.00   sec   136 MBytes  1.14 Gbits/sec    0   8.01 MBytes       
[  5]   4.00-5.00   sec   134 MBytes  1.12 Gbits/sec    0   8.01 MBytes       
[  5]   5.00-6.00   sec   135 MBytes  1.13 Gbits/sec    0   8.01 MBytes       
[  5]   6.00-7.00   sec   131 MBytes  1.10 Gbits/sec    0   8.01 MBytes       
[  5]   7.00-8.00   sec   134 MBytes  1.13 Gbits/sec    0   8.01 MBytes       
[  5]   8.00-9.00   sec   136 MBytes  1.14 Gbits/sec    0   8.01 MBytes       
[  5]   9.00-10.00  sec   131 MBytes  1.10 Gbits/sec  309   3.98 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.29 GBytes  1.11 Gbits/sec  309             sender
[  5]   0.00-10.04  sec  1.29 GBytes  1.10 Gbits/sec                  receiver

iperf Done.
root@1050476385a5:/# iperf3 -c ping.online.net -p 5201 -R
Connecting to host ping.online.net, port 5201
Reverse mode, remote host ping.online.net is sending
[  5] local 172.17.0.2 port 56964 connected to 51.158.1.21 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  0.00 Bytes  0.00 bits/sec                 
[  5]   1.00-2.00   sec   128 KBytes  1.05 Mbits/sec                 
[  5]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec                 
[  5]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec                 
[  5]   4.00-5.00   sec   128 KBytes  1.05 Mbits/sec                 
[  5]   5.00-6.00   sec  0.00 Bytes  0.00 bits/sec                 
[  5]   6.00-7.00   sec   128 KBytes  1.05 Mbits/sec                 
[  5]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec                 
[  5]   8.00-9.00   sec   128 KBytes  1.05 Mbits/sec                 
[  5]   9.00-10.00  sec   128 KBytes  1.05 Mbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.04  sec   867 KBytes   707 Kbits/sec  308             sender
[  5]   0.00-10.00  sec   640 KBytes   524 Kbits/sec                  receiver

iperf Done.

sudo docker run --network host -it ubuntu:latest /bin/bash

Code:
root@bi-ubu-srv-2404:/# iperf3 -c ping.online.net -p5202
Connecting to host ping.online.net, port 5202
[  5] local <my-public-ip> port 35822 connected to 51.158.1.21 port 5202
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   115 MBytes   965 Mbits/sec    0   8.02 MBytes       
[  5]   1.00-2.00   sec   142 MBytes  1.19 Gbits/sec    0   8.02 MBytes       
[  5]   2.00-3.00   sec   139 MBytes  1.16 Gbits/sec    0   8.02 MBytes       
[  5]   3.00-4.00   sec   135 MBytes  1.14 Gbits/sec    0   8.02 MBytes       
[  5]   4.00-5.00   sec   133 MBytes  1.12 Gbits/sec  354   3.95 MBytes       
[  5]   5.00-6.00   sec   133 MBytes  1.12 Gbits/sec    0   4.04 MBytes       
[  5]   6.00-7.00   sec   140 MBytes  1.17 Gbits/sec    0   4.04 MBytes       
[  5]   7.00-8.00   sec   118 MBytes   994 Mbits/sec   71   2.95 MBytes       
[  5]   8.00-9.00   sec   122 MBytes  1.03 Gbits/sec    0   3.08 MBytes       
[  5]   9.00-10.00  sec   128 MBytes  1.07 Gbits/sec    0   3.19 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.27 GBytes  1.09 Gbits/sec  425             sender
[  5]   0.00-10.04  sec  1.27 GBytes  1.09 Gbits/sec                  receiver

iperf Done.
root@bi-ubu-srv-2404:/# iperf3 -c ping.online.net -p5202 -R
Connecting to host ping.online.net, port 5202
Reverse mode, remote host ping.online.net is sending
[  5] local <my-public-ip> port 43804 connected to 51.158.1.21 port 5202
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   128 MBytes  1.07 Gbits/sec                 
[  5]   1.00-2.00   sec   156 MBytes  1.31 Gbits/sec                 
[  5]   2.00-3.00   sec   156 MBytes  1.31 Gbits/sec                 
[  5]   3.00-4.00   sec   157 MBytes  1.32 Gbits/sec                 
[  5]   4.00-5.00   sec   157 MBytes  1.31 Gbits/sec                 
[  5]   5.00-6.00   sec   157 MBytes  1.31 Gbits/sec                 
[  5]   6.00-7.00   sec   155 MBytes  1.30 Gbits/sec                 
[  5]   7.00-8.00   sec   157 MBytes  1.32 Gbits/sec                 
[  5]   8.00-9.00   sec   156 MBytes  1.31 Gbits/sec                 
[  5]   9.00-10.00  sec   157 MBytes  1.32 Gbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.04  sec  1.57 GBytes  1.34 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  1.50 GBytes  1.29 Gbits/sec                  receiver

iperf Done.
 
I currently suspect something wonky in our backbone-network or the ISPs network.

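Did you check whether your network MTU is correct?

packet size = MTU - 28 (20 bytes IPv4 header + 8 bytes ICMP header)

ping ping.online.net -f -l <packet size>

(This is the Windows ping syntax: -f sets "don't fragment", -l sets the payload size.)

Find the highest packet size that still gets a reply; your MTU is this number plus 28.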
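Since the hosts in this thread run Linux, here is a rough equivalent of that check using iputils ping, where -M do forbids fragmentation and -s sets the ICMP payload size. The function names are made up for illustration, and the list of candidate MTUs is only an example.

```shell
#!/bin/sh
# ICMP payload that exactly fills one packet of the given MTU:
# MTU minus 20 bytes IPv4 header minus 8 bytes ICMP header.
payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Send one ping that must not be fragmented (Linux iputils syntax).
# Succeeds only if a packet of the given MTU gets a reply.
probe_mtu() {
    host="$1"
    mtu="$2"
    ping -c 1 -W 2 -M do -s "$(payload_for_mtu "$mtu")" "$host" >/dev/null 2>&1
}

# Try a few candidate MTUs from largest to smallest and report the first that passes.
find_mtu() {
    host="$1"
    for mtu in 9000 1500 1492 1400; do
        if probe_mtu "$host" "$mtu"; then
            echo "MTU $mtu passes"
            return 0
        fi
    done
    echo "none of the candidate MTUs passed"
    return 1
}

payload_for_mtu 1500   # prints 1472
```

Calling `find_mtu ping.online.net` from the private client, the gateway and the host would show whether the path MTU differs between them. Note that a target may also drop large ICMP packets for reasons unrelated to MTU.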


About your network: I'm building an infrastructure in my homelab that is like yours (except that I don't have valid public IPs here).

I'm using OPNsense as my EW router/firewall. OPNsense has some plugins that can help to create a reverse proxy (Caddy, Nginx, HAProxy), plus WAF and security tools.

Firewall-Proxmox-Proxmox-Single Router.drawio.png
 
The culprit was rx-gro-hw, i.e. hardware generic receive offload, on Broadcom NICs (latest firmware 22.92.07.50) with kernel 6.8.


Here is what I guess happened:
I did not notice the bad RX performance after upgrading to PVE 8, because the problem only shows up under specific conditions:
1. Docker inside a VM -> TX good, RX bad
2. A VM using another VM as a gateway (as described in this topic).
Basically it boils down to this: both scenarios use NAT.

Disabling rx-gro-hw fixes this:
ethtool -K eno12429np3 rx-gro-hw off

Make it permanent in /etc/network/interfaces with:
Code:
iface bond0 inet manual
    bond-slaves eno12399np0 eno12409np1 eno12419np2 eno12429np3
    bond-miimon 100
    bond-mode 802.3ad
    bond-xmit-hash-policy layer2+3
    mtu 9000
    pre-up /sbin/ethtool -K eno12399np0 rx-gro-hw off
    pre-up /sbin/ethtool -K eno12409np1 rx-gro-hw off
    pre-up /sbin/ethtool -K eno12419np2 rx-gro-hw off
    pre-up /sbin/ethtool -K eno12429np3 rx-gro-hw off
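To check the current state on a running host without editing the config, something like the following sketch could be used. It reads the bond members from sysfs, prints each NIC's rx-gro-hw state, and switches it off where it is still on. The helper names are made up for illustration, and bond0 is assumed as in the config above. It must run as root to change settings.

```shell
#!/bin/sh
BOND="${BOND:-bond0}"

# List the member interfaces of a bond from sysfs (prints nothing if the bond does not exist).
bond_slaves() {
    cat "/sys/class/net/$1/bonding/slaves" 2>/dev/null
}

# Read `ethtool -k` output on stdin and print the rx-gro-hw state ("on" or "off").
gro_hw_state() {
    awk -F': ' '/^[ \t]*rx-gro-hw:/ { split($2, a, " "); print a[1] }'
}

for nic in $(bond_slaves "$BOND"); do
    state="$(ethtool -k "$nic" | gro_hw_state)"
    echo "$nic rx-gro-hw=$state"
    # Only touch NICs where hardware GRO is still enabled.
    if [ "$state" = "on" ]; then
        ethtool -K "$nic" rx-gro-hw off
    fi
done
```

A change made this way does not survive a reboot, which is why the pre-up lines above are still needed.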


Info:
Code:
Linux pm01 6.8.12-4-pve #1 SMP PREEMPT_DYNAMIC PMX 6.8.12-4 (2024-11-06T15:04Z) x86_64 GNU/Linux

63:00.0 Ethernet controller: Broadcom Inc. and subsidiaries BCM57504 NetXtreme-E 10Gb/25Gb/40Gb/50Gb/100Gb/200Gb Ethernet (rev 11)
63:00.1 Ethernet controller: Broadcom Inc. and subsidiaries BCM57504 NetXtreme-E 10Gb/25Gb/40Gb/50Gb/100Gb/200Gb Ethernet (rev 11)
63:00.2 Ethernet controller: Broadcom Inc. and subsidiaries BCM57504 NetXtreme-E 10Gb/25Gb/40Gb/50Gb/100Gb/200Gb Ethernet (rev 11)
63:00.3 Ethernet controller: Broadcom Inc. and subsidiaries BCM57504 NetXtreme-E 10Gb/25Gb/40Gb/50Gb/100Gb/200Gb Ethernet (rev 11)
 
