TCP Streams Drop between Proxmox VLANs Routed via Virtual pfSense

opcodeoeprator

Oct 5, 2023
I've been at this for a week and can't figure it out. Sorry for the information dump, but I'm hoping someone can look at the details below, help solve my problem, and offer a sanity check that I'm using VLANs and routing properly in my network and within Proxmox itself.

The problem:
When accessing servers on a different VLAN, connections are dropped after 10-40 seconds.
For example: from my desktop on the default VLAN 1 (IP 10.10.1.103), I SSH into the DNS server at 10.10.30.100 (its interface on the private VLAN). After roughly 10-40 seconds, the connection freezes and is dropped. I am able to open a new one in another SSH session window.
During that time, pings to the same IP continue without any loss.
If I SSH into the same server on the same subnet via its default/LAN IP 10.10.1.100, this does not happen and the connection is fine.
This applies not only to SSH but to web connections as well. For example, Nextcloud often drops with "Connection to server lost" and the page has to be refreshed. (Nextcloud is on VLAN 30 and is accessed via a reverse proxy on VLAN 20.)
This applies both to servers with a single NIC and to servers with more than one NIC on several VLANs tagged within Proxmox.

Preface:
  • All VLAN interfaces are open to each other via allow any/any rules on the pfSense VM until this issue is fixed.
  • All Proxmox hypervisors also have interfaces on each VLAN to manage them until this is fixed.
  • All the individual VMs have their network interfaces tagged with the required VLAN within Proxmox, except for pfSense, which has the bridge as its network interface and handles its VLANs internally.
  • No firewalls are enabled in Proxmox.
  • All servers have a single NIC.

Topology:
Default LAN:
10.10.1.1/24
Three VLANS:
10-Management (10.10.10.0/24)
20-Public (10.10.20.0/24)
30-Private (10.10.30.0/24)
Individual Configuration as follows:
Server1:
vmbr0 - IP: 10.10.1.11/24 Gateway: 10.10.1.1
vmbr.10 - IP: 10.10.10.11/24
vmbr.20 - IP: 10.10.20.11/24
vmbr.30 - IP: 10.10.30.11/24
Server2:
vmbr0 - IP: 10.10.1.12/24 Gateway: 10.10.1.1
vmbr.10 - IP: 10.10.10.12/24
vmbr.20 - IP: 10.10.20.12/24
vmbr.30 - IP: 10.10.30.12/24
Server3:
vmbr0 - IP: 10.10.1.13/24 Gateway: 10.10.1.1
vmbr.10 - IP: 10.10.10.13/24
vmbr.20 - IP: 10.10.20.13/24
vmbr.30 - IP: 10.10.30.13/24
pfSense is running on Server1 with two virtual NICs, both attached to vmbr0 from Proxmox with no VLAN tagging.
Within pfSense:
NIC 1 has one VLAN, 777, and is set as the WAN interface.
NIC 2 has four interfaces: nic2 (for default/LAN), VLAN10, VLAN20, VLAN30.

The Switch:
Port 1: PVID: 777, untagged 777 -> MODEM WAN
Port 2: Tagged 10, 20, 30, 777 / Untagged 1 -> Server1
Port 3: Tagged 10, 20, 30 / Untagged 1 -> Server2
Port 4: Tagged 10, 20, 30 / Untagged 1 -> Server3
Port 4: Untagged 1 -> Desktop

What I've tried:
  1. Set allow all/all as a floating rule on the firewall in pfSense.
  2. Increased the state table count.
  3. System >> Advanced >> Firewall & NAT >> Bypass firewall rules for traffic on the same interface
  4. System -> Advanced -> Miscellaneous -> Gateway Monitoring -> (State Killing on Gateway Failure On) Not checked: Skip rules when gateway is down
  5. I thought it could be an asymmetric routing issue, so I disabled all but one NIC on the DNS server and tried to SSH to an interface other than the default one, but the issue kept happening.
  6. Using a router running OpenWrt with all interfaces as tagged VLANs solved this issue, but I want to use pfSense within the VM and get rid of the router.
 
I'm having the exact same issue with pfSense on a VM. I had the same issue with xcp-ng and now on Proxmox. Have you been able to figure out a solution for this?
 
I am also having nearly the same issue. I am using one Proxmox host on which a pfSense VM does my routing, and I have separate VLANs tagged in Proxmox. pfSense has 4 NICs, one per VLAN. My VNC / SPICE sessions drop after 10-30 seconds when connecting from VLAN10 to VLAN20, for example. In the pfSense logs I see allow entries for the session, but also deny entries after those 10-30 seconds (although I have a rule allowing the needed ports between the two interfaces/VLANs). It seems like some kind of TCP session issue is happening, but I'm not sure what needs to be changed.
 
I was having the same issue and discovered that the "Firewall Optimization Options" setting affects the time until the connection drops. Changing it from "Normal" to "Conservative" improves the situation, as it extends the window to ~15 min. However, it does not fix the issue. At least it seems like a clue as to where the issue may be.
 
A few hours later...
I've discovered that it was due to asymmetric routing. Perhaps the machine you're connecting to has two IPs in different VLANs. This causes the response to return via a different route than the one the request took. As a result, the firewall state for that connection ends up as CLOSED:SYN_SENT instead of ESTABLISHED:ESTABLISHED, and the firewall drops the connection once the timeout window expires.
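
To make the mechanism concrete, here is a minimal Python sketch that models route selection (longest-prefix match) for the OP's desktop and DNS server. The addresses come from the first post; the routing tables themselves, the DNS server's default gateway, and the helper names `pick_route` and `is_asymmetric` are assumptions made for illustration, not output from any real machine.

```python
import ipaddress

def pick_route(routes, dest):
    """Return the most specific route (longest prefix) whose network contains dest."""
    dest_ip = ipaddress.ip_address(dest)
    matches = [r for r in routes if dest_ip in ipaddress.ip_network(r["net"])]
    return max(matches, key=lambda r: ipaddress.ip_network(r["net"]).prefixlen)

def is_asymmetric(request, reply):
    """True when only one direction of the flow passes through a gateway."""
    return (request["via"] is None) != (reply["via"] is None)

# Desktop: one interface on the default LAN, everything else via the router (assumed table).
desktop_routes = [
    {"net": "10.10.1.0/24", "via": None},
    {"net": "0.0.0.0/0", "via": "10.10.1.1"},
]

# DNS server: directly connected to both the default LAN and VLAN 30 (assumed table).
dns_routes = [
    {"net": "10.10.1.0/24", "via": None},
    {"net": "10.10.30.0/24", "via": None},
    {"net": "0.0.0.0/0", "via": "10.10.1.1"},
]

# The desktop reaches 10.10.30.100 through the router ...
request = pick_route(desktop_routes, "10.10.30.100")
# ... but the server answers 10.10.1.103 over its directly connected LAN interface.
reply = pick_route(dns_routes, "10.10.1.103")

print("request via:", request["via"])
print("reply via:", reply["via"])
print("asymmetric:", is_asymmetric(request, reply))
```

In this model the request crosses the router while the reply does not, so the firewall only ever sees one half of the TCP conversation. Connecting to 10.10.1.100 instead keeps both directions on the same subnet, which matches the OP's observation that the same-subnet session stays up.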

The way to discover whether there is asymmetric routing is to ask the remote machine which route it will use to reach your machine:
ip route get [host IP]
If this returns a source IP different from the one you SSH into, then you've discovered where the problem lies.
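
If you want to script that comparison, a small Python sketch along these lines could parse the command's output. The sample string is an illustrative example of the usual `ip route get` output format, not captured from the OP's server; the interface name `ens18` and the helper names are hypothetical.

```python
import re

def parse_route_get(output):
    """Extract the gateway, device and source address from `ip route get` output."""
    fields = {}
    for key in ("via", "dev", "src"):
        match = re.search(rf"\b{key}\s+(\S+)", output)
        fields[key] = match.group(1) if match else None
    return fields

def reply_leaves_from(output, expected_ip):
    """True if the kernel would answer from the address you connected to."""
    return parse_route_get(output)["src"] == expected_ip

# Illustrative output of `ip route get 10.10.1.103` on a server that also has
# an address on the desktop's subnet (assumed, not real data).
sample = "10.10.1.103 dev ens18 src 10.10.1.100 uid 0 \n    cache "

route = parse_route_get(sample)
print(route)

# The SSH session targeted 10.10.30.100, but the reply route uses 10.10.1.100.
if not reply_leaves_from(sample, "10.10.30.100"):
    print("possible asymmetric routing: reply uses", route["src"])
```

A missing `via` field means the destination is directly connected, so the reply would not pass through the router at all.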

I'm not suggesting this is the only possible reason, but it may be one that can solve some of these issues.
 
In the specific case of the OP, this is for sure the reason, since the server has a different IP on each VLAN. In general, you only want the host to have an IP on the interface that is on the network you use for managing the servers. The other interfaces do not need an IP. Addresses on those VLANs are given to the VMs that use the tagged bridge to connect to other machines, not to the host.
 
