[SOLVED] PVE Firewall not filtering anything

net.bridge.bridge-nf-call-arptables
net.bridge.bridge-nf-call-ip6tables
net.bridge.bridge-nf-call-iptables
Those are set by /etc/sysctl.d/pve.conf since 2012 (https://git.proxmox.com/?p=pve-cluster.git;a=commitdiff;h=501839cac97f68d4dcba21df6fb3797b976e9e56) because passing bridged traffic through netfilter caused performance regressions on hosts with many guests. Bridge separation should rather use a real separation technology like VLAN or VXLAN; see the following mail for some details:
https://lists.proxmox.com/pipermail/pve-devel/2012-March/002418.html
(Other distros like RHEL were cited as evidence that this was an accepted and working default behavior.)

If you want to apply rules directly on the bridge, rather than on the actual tapX or ethX devices as pve-firewall does, then you can just drop those lines from that config (or add an override in a lexically later-sorted filename to set them to 1).
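A minimal override could look like this (a sketch; the filename is arbitrary, it just has to sort lexically after pve.conf):

Code:
# /etc/sysctl.d/zz-bridge-nf.conf - example override file
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1

It can be applied without a reboot via "sysctl -p /etc/sysctl.d/zz-bridge-nf.conf".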

This could be mentioned in the docs, but besides that I see no issue here: the PVE firewall works after all, and other setups do not suddenly stop working due to this. The values are set on boot, so any simple test would show that the setting is off.
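For reference, such a simple test is just reading the value back:

Code:
# prints 0 when bridged traffic bypasses iptables, 1 when it is filtered
sysctl net.bridge.bridge-nf-call-iptables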
 
@t.lamprecht, this is marked as solved but I don't know what the solution is.

I am seeing this same problem. We have a prod cluster and a dev cluster that we upgraded to 6.3 & Octopus about 5 weeks ago. We recently noticed spoofed traffic coming out of a prod cluster node and (with the help of support ticket #1879015) identified that net.bridge.bridge-nf-call-iptables was set to 0 and that none of the VM firewall rules were being enforced anymore. They had been working prior to the upgrade.

We checked the lab cluster but FW wasn't enabled at the datacenter level. We enabled it and can now see that PVE has set net.bridge.bridge-nf-call-iptables=1.

These are 2 totally standard installations, with FW rules configured solely through the web UI. Your support engineer says that net.bridge.bridge-nf-call-iptables should be 1 for the firewall to work as expected. I suggested editing /etc/sysctl.d/pve.conf, but he said that, as we didn't know why PVE was disabling this, we couldn't be sure it wouldn't do so again at some stage.

The iptables chains are called 'tapXiY' and are children of FORWARD. /etc/sysctl.d/pve.conf on both clusters sets the br-nf settings to 0. Yet on the dev cluster PVE has set net.bridge.bridge-nf-call-iptables to 1 while the prod cluster is leaving it at 0. Which is correct, and can you give me any idea what to look at to work out what in PVE is changing that value on one cluster but not the other?
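For reference, this is the kind of check I've been running on each node (plain iptables tooling; the chain-name pattern follows the tapXiY naming above):

Code:
# list the per-NIC firewall chains pve-firewall created
iptables-save | grep -E '^:tap'
# check whether bridged traffic is handed to iptables at all
sysctl net.bridge.bridge-nf-call-iptables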
 
This problem does NOT appear to be solved.

This seems like a CRITICAL SECURITY PROBLEM that has persisted for WAY too long!

I'm seeing this problem on 2 production machines, 6.3-4 and 6.4-4. I have the firewall enabled at the datacenter level and the host level. When I add a DROP or REJECT rule to the host, I'm still able to ping the host from the IP that is supposed to be dropped. The firewall also fails to work on VMs and containers. Nothing relevant in journalctl or syslog or the console. "pve-firewall restart" doesn't help. "pve-firewall status" shows running. Restarting the PVE server doesn't help. I'm not using Ceph.

Every once in a while, I can get the firewall to work, but I'm unable to reproduce it. When it's working, changing the DROP rule to ALLOW and then back to DROP results in it continuing to allow. I'm waiting at least 60 seconds to make sure.
 
The comments on this thread regarding a Ceph upgrade without a reboot match our experience. We upgraded to 6.3, then, after testing confirmed everything was OK, we upgraded Ceph to Octopus. We rebooted during the PVE upgrade but did not reboot after the Ceph upgrade. That matches what others have said. Running "pve-firewall restart" sets things back to normal.

We've had a ticket open for this but it has been unproductive. We've spent too long on this with no interest from support in finding the root cause ('try upgrading to 6.4', 'it must be something else you've installed', etc.). We've written a plugin for our monitoring system to alert if this happens again so we can manually restart the firewall, as sketched below. It's not ideal but we're going to leave it at that.
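For anyone who wants to do the same, a rough sketch of our check (standard monitoring-plugin exit codes; the exact wording that "pve-firewall status" prints is an assumption and may vary between versions):

Code:
#!/bin/sh
# Alert when bridged traffic bypasses iptables although the PVE firewall runs.
VAL=$(sysctl -n net.bridge.bridge-nf-call-iptables)
if pve-firewall status | grep -qi running && [ "$VAL" != "1" ]; then
    echo "CRITICAL: net.bridge.bridge-nf-call-iptables=$VAL while pve-firewall is running"
    exit 2
fi
echo "OK: net.bridge.bridge-nf-call-iptables=$VAL"
exit 0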
 
@t.lamprecht, this is marked as solved but I don't know what the solution is.
Note that this was not marked as solved by any staff here, but by the thread OP, so please do not take that thread prefix as "Proxmox staff just ignores this"; the original author deemed their issue solved and marked the thread as such.

We plan to improve the situation by dropping the sysctl override from pve-cluster, which is quite outdated nowadays and can be better handled by admins themselves. Also, we will enforce the relevant settings more proactively and periodically from pve-firewall, as long as the FW is configured to be enabled.

Nothing relevant in journalctl or syslog or the console. "pve-firewall restart" doesn't help. "pve-firewall status" shows running. Restarting the PVE server doesn't help. I'm not using Ceph.
Then you're probably not affected by this thread's issue, as a restarted FW always updates the relevant sysctls. Ensure you actually enabled the FW on all levels, that the rules and settings are correct, and that you do not open a connection to the target first, as connection tracking would then allow the other side even if it wouldn't otherwise be allowed.
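If connection tracking is a suspect, the existing entries can be inspected or flushed with the conntrack tool (from the conntrack package, which may need to be installed first; the address is a placeholder):

Code:
# list tracked connections from the source you are testing with
conntrack -L -s 192.0.2.10
# flush the whole table so DROP rules can be tested without stale state
conntrack -F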
 
I think I just narrowed down the problem some. The firewall rules appear to work for TCP connections, but not ICMP. When I add a firewall rule to block an IP, the TCP ports, such as SSH, become unusable almost instantly, but ICMP pings keep working no matter how hard I try to make the firewall block them.
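One way to double-check that case (a sketch; the address is a placeholder for the IP the rule should block):

Code:
# show ICMP-related rules and their hit counters
iptables -L -v -n | grep -i icmp
# ping the blocked address and see whether the counters move
ping -c 3 192.0.2.20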
 
Has Proxmox provided a solution for this issue? I recently upgraded to v7.0-9 and none of my FW rules are working, even after a "pve-firewall restart".

I have checked and rechecked: the FW is enabled in all places. It was all working before the upgrade.
 
To see if it's the same problem, check the output of
Code:
sysctl net.bridge.bridge-nf-call-iptables
If that's set to 0, then the VM traffic isn't being passed through iptables. Setting that to 1 will fix things.
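For example (a runtime-only change; persist it via a file in /etc/sysctl.d/ if it should survive reboots):

Code:
# re-enable iptables filtering of bridged traffic immediately
sysctl -w net.bridge.bridge-nf-call-iptables=1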
 
hey Ozdjh,

Thanks for your reply. This is the output of the sysctl:

Code:
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0
net.bridge.bridge-nf-filter-vlan-tagged = 0
fs.aio-max-nr = 1048576

I saw a couple of posts here mentioning that after restarting pve-firewall the above was set to 1; is that not the case?

I am reluctant to use this option in production because I don't want to risk the firewall rules being disabled again all by themselves.
 
Hey d1_sen.

I would also have assumed that a firewall restart would have fixed this. And I was under the impression that PVE 7 set the correct forwarding any time a firewall rule was pushed. Have you tried editing the rules of a VM to see if that enables net.bridge.bridge-nf-call-iptables?
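One way to watch that live while saving a rule change in the UI (just standard tooling):

Code:
# refresh the value every second while you edit a VM firewall rule
watch -n 1 sysctl net.bridge.bridge-nf-call-iptables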
 
hey Ozdjh,

Thanks for your reply. This is the output of the sysctl:

Code:
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0
net.bridge.bridge-nf-filter-vlan-tagged = 0
fs.aio-max-nr = 1048576

I saw a couple of posts here mentioning that after restarting pve-firewall the above was set to 1; is that not the case?

I am reluctant to use this option in production because I don't want to risk the firewall rules being disabled again all by themselves.
I can confirm that on the latest PVE 6.4, as well as on PVE 5.4, a pve-firewall restart sets them all to 1 and fixes the problem (and it can be safely run in production, since it doesn't interrupt networking; I've run it dozens of times on hosts running tens of mission-critical VMs).
Unfortunately I don't have any PVE 7 to test.
 
I had the same problem today with a fresh Proxmox 6.3-2 install!
The firewall was working a few hours ago; after some tests, including creating a cluster of 2 servers and installing Ceph, "net.bridge.bridge-nf-call-iptables" had been changed to 0. The problem was fixed with a "pve-firewall restart", and "net.bridge.bridge-nf-call-iptables" is now back to 1.
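So for anyone hitting the same symptom, the fix-and-verify sequence boils down to:

Code:
# re-apply all firewall rules and the related sysctls
pve-firewall restart
# verify the result; this should now print 1
sysctl -n net.bridge.bridge-nf-call-iptables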
 
The firewall feature in pve-manager/7.1-7/df5740ad (running kernel: 5.13.19-2-pve) is flaky at best.

For example, when I enable the firewall at the DataCenter, Cluster and Container levels, I have seen it allow all traffic through to the container.

When I change a setting at the Cluster level, e.g., to allow all icmp traffic, then all traffic is blocked.

When I restart the firewall, with either `systemctl restart pve-firewall` or `pve-firewall restart` with the icmp ALLOW all rule in place, one ping will be allowed each time I run either command.

Does getting the subscription fix this issue? (my guess is that it probably does not)

This is basic stuff that really should work.

(And another unrelated nit: the Debian LXC containers are super slow to start and require privileged and nested settings. So, that's a non-starter for me.)

Otherwise, proxmox 7.1 is very nice so far.
 
The firewall feature in pve-manager/7.1-7/df5740ad (running kernel: 5.13.19-2-pve) is flaky at best.

For example, when I enable the firewall at the DataCenter, Cluster and Container levels, I have seen it allow all traffic through to the container.
Datacenter is the Cluster level, so did you enable it on the Nodes too? Remember that PVE won't cut off established, existing connections (connection tracking), so traffic on an already-open connection keeps flowing even after rules change.

When I change a setting at the Cluster level, e.g., to allow all icmp traffic, then all traffic is blocked.
That sounds wrong, and I cannot reproduce it at all here. Maybe post your firewall config and more details about your setup, especially network-related ones (how VM traffic is routed out in general):

Code:
/etc/pve/firewall/cluster.fw
/etc/pve/nodes/nina/host.fw
/etc/pve/firewall/<VMID>.fw

And maybe open a new post for that; this one is rather old and probably doesn't have anything to do with whatever you're seeing.

Does getting the subscription fix this issue? (my guess is that it probably does not)
Proxmox VE is 100% open source, so no, not in the way you seem to imply - features are the same for everyone in Proxmox products.
That said, subscriptions can give you enterprise support, and yes, that support team could help get this sorted out and advise you on the symptoms you're seeing.
This is basic stuff that really should work.
If configured correctly it will.

The Debian LXC containers are super slow to start and require privileged and nested settings. So, that's a non-starter for me

Which Debian LXC container? Neither Debian 10 Buster (oldstable) nor Debian 11 Bullseye (current stable) needs to be run as a privileged CT; on the contrary, IIRC they run better as unprivileged, and nesting is just a safe and OK feature to use for unprivileged CTs, so I'm not sure what the "non-starter" is in that regard.
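For reference, an unprivileged CT with nesting can be created from the CLI roughly like this (a sketch; the CT ID, template file name and storage are placeholders):

Code:
# create an unprivileged Debian 11 container with nesting enabled
pct create 200 local:vztmpl/debian-11-standard_11.0-1_amd64.tar.gz \
    --unprivileged 1 --features nesting=1 --hostname demo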
 
