[SOLVED] PVE Firewall not filtering anything

net.bridge.bridge-nf-call-arptables
net.bridge.bridge-nf-call-ip6tables
net.bridge.bridge-nf-call-iptables
Those are set by /etc/sysctl.d/pve.conf since 2012 (https://git.proxmox.com/?p=pve-cluster.git;a=commitdiff;h=501839cac97f68d4dcba21df6fb3797b976e9e56) because passing bridged traffic through netfilter caused performance regressions on hosts with many guests. Bridge separation should rather use a real separation technology like VLAN or VXLAN; see the following mail for some details:
https://lists.proxmox.com/pipermail/pve-devel/2012-March/002418.html
(Other distros like RHEL were cited as evidence that this was an accepted and working default behavior.)

If you want to apply rules directly on the bridge, rather than on the actual tapX or ethX devices as pve-firewall does, then you can just drop those lines from that config (or add an override in a lexically later-sorted filename to set them to 1).
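A minimal override could look like this (a sketch; the filename is arbitrary, it just has to sort lexically after pve.conf):

Code:
# /etc/sysctl.d/zz-bridge-nf.conf - example override file
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1

It can be applied without a reboot via "sysctl -p /etc/sysctl.d/zz-bridge-nf.conf".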

This could be mentioned in the docs, but besides that I see no issue here: the PVE firewall works after all, and other setups do not suddenly stop working due to this. The values are set on boot, so any simple test would show that the setting is off.
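For reference, such a simple test is just reading the value back:

Code:
# prints 0 when bridged traffic bypasses iptables, 1 when it is filtered
sysctl net.bridge.bridge-nf-call-iptables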
 
@t.lamprecht, this is marked as solved but I don't know what the solution is.

I am seeing this same problem. We have a prod cluster and a dev cluster that we upgraded to 6.3 & Octopus about 5 weeks ago. We recently noticed spoofed traffic coming out of a prod cluster node and (with the help of support ticket #1879015) identified that net.bridge.bridge-nf-call-iptables was set to 0 and that none of the VM firewall rules were being enforced anymore. They had been working prior to the upgrade.

We checked the lab cluster but FW wasn't enabled at the datacenter level. We enabled it and can now see that PVE has set net.bridge.bridge-nf-call-iptables=1.

These are 2 totally standard installations, with FW rules configured solely through the web UI. Your support engineer says that net.bridge.bridge-nf-call-iptables should be 1 for the firewall to work as expected. I suggested editing /etc/sysctl.d/pve.conf, but he said that, as we didn't know why PVE was disabling this, we couldn't be sure it wouldn't do so again at some stage.

The iptables chains are called 'tapXiY' and are children of FORWARD. /etc/sysctl.d/pve.conf on both clusters sets the br-nf settings to 0. Yet on the dev cluster PVE has set net.bridge.bridge-nf-call-iptables to 1 while the prod cluster is leaving it at 0. Which is correct, and can you give me any idea what to look at to work out what in PVE is changing that value on one cluster but not the other?
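For reference, this is the kind of check I've been running on each node (plain iptables tooling; the chain-name pattern follows the tapXiY naming above):

Code:
# list the per-NIC firewall chains pve-firewall created
iptables-save | grep -E '^:tap'
# check whether bridged traffic is handed to iptables at all
sysctl net.bridge.bridge-nf-call-iptables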
 
This problem does NOT appear to be solved.

This seems like a CRITICAL SECURITY PROBLEM that has persisted for WAY too long!

I'm seeing this problem on 2 production machines, 6.3-4 and 6.4-4. I have the firewall enabled at the datacenter level and the host level. When I add a DROP or REJECT rule to the host, I'm still able to ping the host from the IP that is supposed to be dropped. The firewall also fails to work on VMs and containers. Nothing relevant in journalctl or syslog or the console. "pve-firewall restart" doesn't help. "pve-firewall status" shows running. Restarting the PVE server doesn't help. I'm not using Ceph.

Every once in a while, I can get the firewall to work, but I'm unable to reproduce it. When it's working, changing the DROP rule to ALLOW and then back to DROP results in it continuing to allow. I'm waiting at least 60 seconds to make sure.
 
The comments on this thread regarding a Ceph upgrade without a reboot match our experience. We upgraded to 6.3, then, after testing confirmed everything was OK, we upgraded Ceph to Octopus. We rebooted during the PVE upgrade but did not reboot after the Ceph upgrade. That matches what others have said. Running "pve-firewall restart" sets things back to normal.

We've had a ticket open for this but it has been unproductive. We've spent too long on this with no interest from support in finding the root cause ('try upgrading to 6.4', 'it must be something else you've installed', etc.). We've written a plugin for our monitoring system to alert if this happens again so we can manually restart the firewall, as sketched below. It's not ideal but we're going to leave it at that.
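For anyone who wants to do the same, a rough sketch of our check (standard monitoring-plugin exit codes; the exact wording that "pve-firewall status" prints is an assumption and may vary between versions):

Code:
#!/bin/sh
# Alert when bridged traffic bypasses iptables although the PVE firewall runs.
VAL=$(sysctl -n net.bridge.bridge-nf-call-iptables)
if pve-firewall status | grep -qi running && [ "$VAL" != "1" ]; then
    echo "CRITICAL: net.bridge.bridge-nf-call-iptables=$VAL while pve-firewall is running"
    exit 2
fi
echo "OK: net.bridge.bridge-nf-call-iptables=$VAL"
exit 0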
 
@t.lamprecht, this is marked as solved but I don't know what the solution is.
Note that this was not marked as solved by any staff here, but by the thread OP, so please do not take that thread prefix as "Proxmox staff just ignores this"; the original author deemed their issue solved and marked the thread as such.

We plan to improve the situation by dropping the sysctl override from pve-cluster, which is quite outdated nowadays and can be better handled by admins themselves. Also, we will enforce the relevant settings more proactively and periodically from pve-firewall, as long as the FW is configured to be enabled.

Nothing relevant in journalctl or syslog or the console. "pve-firewall restart" doesn't help. "pve-firewall status" shows running. Restarting the PVE server doesn't help. I'm not using Ceph.
Then you're probably not affected by this thread's issue, as a restarted FW always updates the relevant sysctls. Ensure you actually enabled the FW on all levels, that the rules and settings are correct, and that you do not open a connection to the target first, as connection tracking would then allow the other side even if it wouldn't otherwise be allowed.
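If connection tracking is a suspect, the existing entries can be inspected or flushed with the conntrack tool (from the conntrack package, which may need to be installed first; the address is a placeholder):

Code:
# list tracked connections from the source you are testing with
conntrack -L -s 192.0.2.10
# flush the whole table so DROP rules can be tested without stale state
conntrack -F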
 
I think I just narrowed down the problem some. The firewall rules appear to work for TCP connections, but not ICMP. When I add a firewall rule to block an IP, the TCP ports, such as SSH, become unusable almost instantly, but ICMP pings keep working no matter how hard I try to make the firewall block them.
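One way to double-check that case (a sketch; the address is a placeholder for the IP the rule should block):

Code:
# show ICMP-related rules and their hit counters
iptables -L -v -n | grep -i icmp
# ping the blocked address and see whether the counters move
ping -c 3 192.0.2.20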
 
Has Proxmox provided a solution for this issue? I recently upgraded to v7.0-9 and none of my FW rules are working, even after a "pve-firewall restart".

I have checked and rechecked: the FW is enabled in all places. It was all working before the upgrade.
 
To see if it's the same problem, check the output of
Code:
sysctl net.bridge.bridge-nf-call-iptables
If that's set to 0, then the VM traffic isn't being passed through iptables. Setting that to 1 will fix things.
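For example (a runtime-only change; persist it via a file in /etc/sysctl.d/ if it should survive reboots):

Code:
# re-enable iptables filtering of bridged traffic immediately
sysctl -w net.bridge.bridge-nf-call-iptables=1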
 
hey Ozdjh,

Thanks for your reply. This is the output of the sysctl:

Code:
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0
net.bridge.bridge-nf-filter-vlan-tagged = 0
fs.aio-max-nr = 1048576

I saw a couple of posts here mentioning that after restarting pve-firewall the above was set to 1; is that not the case?

I am reluctant to use this option in production because I don't want to risk the firewall rules being disabled again all by themselves.
 
Hey d1_sen.

I would also have assumed that a firewall restart would have fixed this. And I was under the impression that PVE 7 set the correct forwarding any time a firewall rule was pushed. Have you tried editing the rules of a VM to see if that enables net.bridge.bridge-nf-call-iptables?
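One way to watch that live while saving a rule change in the UI (just standard tooling):

Code:
# refresh the value every second while you edit a VM firewall rule
watch -n 1 sysctl net.bridge.bridge-nf-call-iptables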
 
hey Ozdjh,

Thanks for your reply. This is the output of the sysctl:

Code:
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0
net.bridge.bridge-nf-filter-vlan-tagged = 0
fs.aio-max-nr = 1048576

I saw a couple of posts here mentioning that after restarting pve-firewall the above was set to 1; is that not the case?

I am reluctant to use this option in production because I don't want to risk the firewall rules being disabled again all by themselves.
I can confirm that on the latest PVE 6.4, as well as on PVE 5.4, a pve-firewall restart sets them all to 1 and fixes the problem (and it can be safely run in production, since it doesn't interrupt networking; I've run it dozens of times on hosts running tens of mission-critical VMs).
Unfortunately I don't have any PVE 7 to test.
 
I had the same problem today with a fresh Proxmox 6.3-2 install!
The firewall was working a few hours ago; after some tests, including creating a cluster of 2 servers and installing Ceph, "net.bridge.bridge-nf-call-iptables" had been changed to 0. The problem was fixed with a "pve-firewall restart", and "net.bridge.bridge-nf-call-iptables" is now back to 1.
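So for anyone hitting the same symptom, the fix-and-verify sequence boils down to:

Code:
# re-apply all firewall rules and the related sysctls
pve-firewall restart
# verify the result; this should now print 1
sysctl -n net.bridge.bridge-nf-call-iptables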
 
The firewall feature in pve-manager/7.1-7/df5740ad (running kernel: 5.13.19-2-pve) is flaky at best.

For example, when I enable the firewall at the DataCenter, Cluster and Container levels, I have seen it allow all traffic through to the container.

When I change a setting at the Cluster level, e.g., to allow all icmp traffic, then all traffic is blocked.

When I restart the firewall, with either `systemctl restart pve-firewall` or `pve-firewall restart` with the icmp ALLOW all rule in place, one ping will be allowed each time I run either command.

Does getting the subscription fix this issue? (my guess is that it probably does not)

This is basic stuff that really should work.

(And another unrelated nit: the Debian LXC containers are super slow to start and require privileged and nested settings. So, that's a non-starter for me.)

Otherwise, proxmox 7.1 is very nice so far.
 
The firewall feature in pve-manager/7.1-7/df5740ad (running kernel: 5.13.19-2-pve) is flaky at best.

For example, when I enable the firewall at the DataCenter, Cluster and Container levels, I have seen it allow all traffic through to the container.
Datacenter is the Cluster level, so did you enable it on the Nodes too? Remember that PVE won't cut off established, existing connections (connection tracking), so traffic on an already-open connection keeps flowing even after rules change.

When I change a setting at the Cluster level, e.g., to allow all icmp traffic, then all traffic is blocked.
That sounds wrong, and I cannot reproduce it at all here. Maybe post your firewall config and more details about your setup, especially network-related ones (how VM traffic is routed out in general):

Code:
/etc/pve/firewall/cluster.fw
/etc/pve/nodes/nina/host.fw
/etc/pve/firewall/<VMID>.fw

And maybe open a new post for that; this one is rather old and probably doesn't have anything to do with whatever you're seeing.

Does getting the subscription fix this issue? (my guess is that it probably does not)
Proxmox VE is 100% open source, so no, not in the way you seem to imply - features are the same for everyone in Proxmox products.
That said, subscriptions can give you enterprise support, and yes, that support team could help get this sorted out and advise you on the symptoms you're seeing.
This is basic stuff that really should work.
If configured correctly it will.

The Debian LXC containers are super slow to start and require privileged and nested settings. So, that's a non-starter for me

Which Debian LXC container? Neither Debian 10 Buster (oldstable) nor Debian 11 Bullseye (current stable) needs to be run as a privileged CT; on the contrary, IIRC they run better as unprivileged, and nesting is just a safe and OK feature to use for unprivileged CTs, so I'm not sure what the "non-starter" is in that regard.
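For reference, an unprivileged CT with nesting can be created from the CLI roughly like this (a sketch; the CT ID, template file name and storage are placeholders):

Code:
# create an unprivileged Debian 11 container with nesting enabled
pct create 200 local:vztmpl/debian-11-standard_11.0-1_amd64.tar.gz \
    --unprivileged 1 --features nesting=1 --hostname demo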
 
