[SOLVED] Turning on firewall with ACCEPT policies everywhere makes hosts unreachable

Jul 14, 2024
Hello,

I am just getting started on PVE firewalling in my test lab, so I thought I'd start the firewall with an ACCEPT policy everywhere and then work my way forward. But I am failing.

I enabled the firewall for LXC 103:

Code:
# /etc/pve/firewall/103.fw
[OPTIONS]

enable: 1
policy_in: ACCEPT

and for the host:

Code:
# /etc/pve/nodes/pve/host.fw
[OPTIONS]

enable: 1
nftables: 1

The proxmox-firewall package is installed.

So far, so good. The UI now tells me that the datacenter firewall is still not active, and in fact, `nft list ruleset` is empty.

If I now flick on the datacenter firewall:

Code:
# /etc/pve/firewall/cluster.fw
[OPTIONS]

policy_out: ACCEPT
policy_in: ACCEPT
policy_forward: ACCEPT
enable: 1

then things go weird. Like packet loss kinda weird. I cannot quite figure it out, but TCP connections between two hosts are no longer reliable once I flick that switch, despite the ACCEPT policies everywhere: a TCP connection only succeeds about once every 10 tries.

What am I missing? What could be going on?

PS: I've disabled `nftables` and verified that the exact same problem appears with `iptables`.

Thank you,
martin
 
Here are some more details: two LXCs on PVE, in different VLANs. I run an HTTPS request from A to `https://b:3443`. It works once every 10 tries. All the other times, this happens:

A sends SYN and receives SYNACK:

Code:
20:45:41.134168 eth0  Out IP 192.168.231.98.36840 > 192.168.231.144.3443: Flags [S], seq 2778498582, win 64240, options [mss 1460,sackOK,TS val 1055619827 ecr 0,nop,wscale 7], length 0
20:45:41.134405 eth0  In  IP 192.168.231.144.3443 > 192.168.231.98.36840: Flags [S.], seq 1840706678, ack 2778498583, win 65160, options [mss 1460,sackOK,TS val 889876648 ecr 1055619827,nop,wscale 7], length 0

So far, so good. This is identical on the receiving host:

Code:
20:45:41.134316 eth0  In  IP 192.168.231.98.36840 > 192.168.231.144.3443: Flags [S], seq 4172194215, win 64240, options [mss 1460,sackOK,TS val 1055619827 ecr 0,nop,wscale 7], length 0
20:45:41.134328 eth0  Out IP 192.168.231.144.3443 > 192.168.231.98.36840: Flags [S.], seq 1082068254, ack 4172194216, win 65160, options [mss 1460,sackOK,TS val 889876648 ecr 1055619827,nop,wscale 7], length 0

But now things go weird. As said before, roughly every 10th attempt succeeds; the rest of the time, the following happens.

A sends the ACK completing the 3-way handshake:

Code:
20:45:41.134417 eth0  Out IP 192.168.231.98.36840 > 192.168.231.144.3443: Flags [.], ack 1, win 502, options [nop,nop,TS val 1055619828 ecr 889876648], length 0

but this is not received at B. B instead resends SYNACK:

Code:
20:45:42.142330 eth0  Out IP 192.168.231.144.3443 > 192.168.231.98.36840: Flags [S.], seq 1082068254, ack 4172194216, win 65160, options [mss 1460,sackOK,TS val 889877656 ecr 1055619827,nop,wscale 7], length 0
20:45:44.190337 eth0  Out IP 192.168.231.144.3443 > 192.168.231.98.36840: Flags [S.], seq 1082068254, ack 4172194216, win 65160, options [mss 1460,sackOK,TS val 889879704 ecr 1055619827,nop,wscale 7], length 0
20:45:48.222348 eth0  Out IP 192.168.231.144.3443 > 192.168.231.98.36840: Flags [S.], seq 1082068254, ack 4172194216, win 65160, options [mss 1460,sackOK,TS val 889883736 ecr 1055619827,nop,wscale 7], length 0
[…]

and A gets pushy, resending its first data packet:

Code:
20:45:41.136034 eth0  Out IP 192.168.231.98.36840 > 192.168.231.144.3443: Flags [P.], seq 1:518, ack 1, win 502, options [nop,nop,TS val 1055619829 ecr 889876648], length 517
20:45:41.342330 eth0  Out IP 192.168.231.98.36840 > 192.168.231.144.3443: Flags [P.], seq 1:518, ack 1, win 502, options [nop,nop,TS val 1055620036 ecr 889876648], length 517
20:45:41.550328 eth0  Out IP 192.168.231.98.36840 > 192.168.231.144.3443: Flags [P.], seq 1:518, ack 1, win 502, options [nop,nop,TS val 1055620244 ecr 889876648], length 517

And that's it. No connection possible. Until a couple retries later, it suddenly works for a single connection.
 
Hey, the sequence numbers in your packet traces are the real giveaway here: from A's side the SYN has seq 2778498582, but B receives it with seq 4172194215. Those are different numbers, which means something NAT-like is happening in the path between them. Even with ACCEPT policies on the Proxmox firewall, if you have masquerade or SNAT rules for inter-VLAN routing, conntrack is still doing sequence number translation, and a race condition in that translation could cause the ACK to be dropped.

The first thing to check is `iptables -t nat -L -n -v` or `nft list ruleset`, looking for masquerade/SNAT rules, to see whether traffic between those VLANs is being masqueraded. If you are routing inter-VLAN through the host IP stack rather than pure bridging, there is likely a MASQUERADE rule somewhere.

Also worth checking: `sysctl net.bridge.bridge-nf-call-iptables` and `net.bridge.bridge-nf-call-ip6tables`. If these are 1, bridged traffic goes through the iptables/nftables chains twice, once at bridge level and once at IP level, which can cause duplicate conntrack entries and exactly this kind of intermittent ACK-drop behaviour.

Setting `net.netfilter.nf_conntrack_tcp_loose=1` may help as a quick workaround, but fixing the underlying routing/NAT config is the proper solution.
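A sketch of those read-only checks in one place (an assumption: run on the PVE node with `nft`, `iptables`, and `sysctl` available; adjust to taste):

```shell
# 1) Any masquerade/SNAT rules that could affect inter-VLAN traffic?
{ nft list ruleset 2>/dev/null; iptables -t nat -S 2>/dev/null; } \
  | grep -iE 'masquerade|snat' && echo "NAT rules present" || echo "no NAT rules found"

# 2) Is bridged traffic also traversing the IP-level netfilter hooks?
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables \
  2>/dev/null || echo "br_netfilter not loaded"
```

Neither check changes any state, so they are safe to run on a production node.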
 
@PaddraighOS thank you so much for looking into this and your reply. Great eyes you got. There is no masquerading going on whatsoever on the PVE, but you are still 100% right I think. Our PVE is connected to a firewall on a trunk port, and traffic between VLANs does actually go via that firewall. Here is a packet from one VLAN to the other, seen on PVE, once outbound and once inbound:

Code:
07:07:01.810348 c6:6a:5c:5a:6d:30 > 40:84:93:16:a6:67, ethertype 802.1Q (0x8100), length 78: vlan 2396, p 0, ethertype IPv4 (0x0800), 192.168.231.98.50656 > 192.168.231.144.3443: Flags [S], seq 1914431976, win 64240, options [mss 1460,sackOK,TS val 1348500504 ecr 0,nop,wscale 7], length 0
07:07:01.810452 40:84:93:16:a6:67 > ee:17:55:fd:08:b0, ethertype 802.1Q (0x8100), length 78: vlan 2428, p 0, ethertype IPv4 (0x0800), 192.168.231.98.50656 > 192.168.231.144.3443: Flags [S], seq 2107436627, win 64240, options [mss 1460,sackOK,TS val 1348500504 ecr 0,nop,wscale 7], length 0

The firewall has MAC address 40:84:93:16:a6:67. And yes, the TCP sequence numbers are different, meaning that the firewall is actually doing some sort of rewriting, though it's not NAT, or if it is NAT, then it actually translates addresses to… themselves.
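For what it's worth, the rewrite offset the firewall applies can be read straight off those two capture lines (a quick throwaway calculation; the two ISNs are copied from the trace above):

```python
# Same SYN, seen before and after the upstream firewall (VLAN 2396 -> 2428):
seq_before = 1914431976
seq_after = 2107436627

# If this is per-connection ISN randomization, the firewall adds a fixed
# 32-bit offset to every sequence number of the flow:
offset = (seq_after - seq_before) % 2**32
print(offset)  # 193004651
```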

Unfortunately, I am not in control of the firewall, and the people running it have very little clue. This is one of the reasons why I was dabbling in PVE firewalling, as it would be easier for me to maintain my own ruleset, than to explain to them again the difference between UDP and TCP ports.

While I figure out how to solve this (the firewall contract is running out soon, and a Debian machine is already sitting there and ready to go), I am still unclear about why the ACKs are dropped. Where is that race condition you allude to?
 
@PaddraighOS Do you have any more information about the race condition you mentioned? I mean, I get what is happening, and the firewall is obviously doing stuff it should not do, but that's more of an inefficiency and shouldn't cause breakage.

Why do things fall over as soon as PVE turns on packet filtering?

Is the issue that Proxmox sees a packet leaving from socket A to B with sequence number 123, and then the same packet comes back in on another VLAN, still from socket A to B, but now with sequence number 456?
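If so, a toy model of a per-flow window check shows why the second copy would be classified as invalid (a sketch under the assumption that conntrack keys its state on the 5-tuple, which is identical on both VLANs; the real logic lives in the kernel's nf_conntrack_proto_tcp.c and is considerably more involved):

```python
def in_window(tracked, seq, win=65535):
    """Accept a segment only if its sequence number lies within the
    window expected for this flow (modulo 32-bit wraparound)."""
    return (seq - tracked) % 2**32 < win

# First sighting (on A's VLAN) initializes the tracked numbers:
tracked = 2778498583           # A's ISN + 1, from the first capture

# The same SYN reappears on B's VLAN after the upstream firewall
# randomized the sequence number:
rewritten = 4172194216         # rewritten ISN + 1, from B's capture

print(in_window(tracked, tracked))    # True:  the VLAN-A copy fits
print(in_window(tracked, rewritten))  # False: the VLAN-B copy looks invalid
```

If packets classified as invalid get dropped once the firewall is active, that would match the intermittent loss in the traces.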
 
Is that a Cisco ASA by chance? Those have been known to cause issues in the past because of TCP sequence number randomization.

In any case, the problem is most likely that turning on the firewall enables conntrack, and conntrack also looks at TCP sequence numbers, classifies out-of-window packets as invalid, and then drops them. That usually happens when the guests are located on the same host. You can disable this behavior via the `nf_conntrack_allow_invalid` setting in the host firewall configuration, but you lose some security in the process.
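For anyone finding this later, a minimal sketch of what that could look like (assuming the `nf_conntrack_allow_invalid` option name from the PVE host firewall options; mind the security trade-off mentioned above):

```
# /etc/pve/nodes/pve/host.fw
[OPTIONS]

enable: 1
nf_conntrack_allow_invalid: 1
```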
 