NAT with nftables - How To

bofh

I'll post this as a fresh thread because I have a hunch this question will come up once people start using nftables.

We need two things:
- the file with the actual NAT rules
- a systemd service to load them


Proxmox does not use /etc/nftables. The proxmox-firewall service manages all of that directly, so to get our rules loaded we have to do it on our own.

I will assume our file is /etc/network/nat.nft.
Put it wherever you want, but then you need to adjust both paths in the systemd file.

/etc/systemd/system/nat.service:
Code:
[Unit]
Description=Load NAT nftables rules after Proxmox firewall
After=proxmox-firewall.service
Wants=proxmox-firewall.service
PartOf=proxmox-firewall.service

[Service]
Type=oneshot
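# Flush every table declared in nat.nft that already exists. $$ is systemd's escape for a
# literal $; POSIX character classes avoid backslashes, which systemd may unescape itself.
ExecStartPre=/bin/sh -c 'grep -E "^[[:space:]]*table[[:space:]]+[[:alnum:]_]+[[:space:]]+[^[:space:]{}]+" /etc/network/nat.nft | while read -r _ family table _; do nft list tables | grep -q "table $$family $$table" && nft flush table "$$family" "$$table" || true; done'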
ExecStart=/usr/sbin/nft -f /etc/network/nat.nft
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target


Now the actual example NAT file
/etc/network/nat.nft:
Code:
table inet nat {
chain prerouting {
    type nat hook prerouting priority -100;

    # DNAT rules
    iifname "eth0" ip protocol tcp tcp dport 80 ip daddr 203.0.113.10 dnat to 172.16.0.10    # webserver HTTP
    iifname "eth0" ip protocol tcp tcp dport 443 ip daddr 203.0.113.10 dnat to 172.16.0.10   # webserver HTTPS
    iifname "eth0" ip protocol tcp tcp dport 8443 ip daddr 203.0.113.10 dnat to 172.16.0.10  # webserver alternate
    iifname "eth0" ip daddr 203.0.113.11 dnat to 172.16.0.11                                # app
    iifname "eth0" ip daddr 203.0.113.12 dnat to 172.16.0.12                                # email
}

chain postrouting {
    type nat hook postrouting priority 100;

    # SNAT rules
    oifname "eth0" ip saddr 172.16.0.10 snat to 203.0.113.10    # webserver
    oifname "eth0" ip saddr 172.16.0.11 snat to 203.0.113.11    # app
    oifname "eth0" ip saddr 172.16.0.12 snat to 203.0.113.12    # email
    oifname "eth0" ip saddr 172.16.0.13 snat to 203.0.113.12    # legacy (we share with email like brothers)
    oifname "eth0" ip saddr 172.17.0.0/24 snat to 203.0.113.11  # vpn range (entire network can go out on this ip)
}

}


Keep in mind that the forward policy must be accept, OR you need forward accept rules on the node that runs this NAT.


What our systemd unit (ExecStartPre) does:
- we read the NAT file
- we extract all tables declared in this file
- we flush them if they exist
- we also bind this service to proxmox-firewall (PartOf=), so it gets restarted when the firewall restarts (just to be sure)
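
The table extraction done in ExecStartPre can be tried on its own without touching the live ruleset. A minimal sketch, assuming a throwaway sample file instead of /etc/network/nat.nft; the function name list_declared_tables and the second table "ip extra" are made up for illustration, and the script only prints what would be flushed:

```shell
# list_declared_tables FILE: print "family name" for every table declared in FILE
list_declared_tables() {
    grep -E '^[[:space:]]*table[[:space:]]+[[:alnum:]_]+[[:space:]]+[^[:space:]{}]+' "$1" |
        while read -r _ family table _; do
            printf '%s %s\n' "$family" "$table"
        done
}

# Hypothetical sample file standing in for /etc/network/nat.nft
sample=$(mktemp)
cat > "$sample" <<'EOF'
table inet nat {
    chain prerouting { type nat hook prerouting priority -100; }
}
table ip extra {
}
EOF

# Dry run: show the flush commands instead of running nft
list_declared_tables "$sample" | while read -r family table; do
    echo "would flush: table $family $table"
done
rm -f "$sample"
# prints:
# would flush: table inet nat
# would flush: table ip extra
```

Replacing the echo with the nft existence check and flush gives the behaviour of the unit file.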

This ensures there are no duplicate rules after a reload.
Meanwhile, you can't put a flush in your NAT file, because at system start the table does not exist yet and nft rejects the whole file.
Yes, flushing a non-existent table is an error, hence this solution.

I made this read the tables automatically, so you don't have to be careful when adding other tables to this file.

Normally you should not have to, but now you can... enjoy
 
Hi!

Thanks for your guide! Some additional suggestions:

"meanwhile you cant do a flush in your natfile because at system start that table will not exist and nftable pukes it out and breaks."
You can destroy and recreate the table instead; destroy doesn't fail if the table does not exist:

Code:
       delete    Delete the specified table.

       destroy   Delete the specified table, it does not fail if it does not
                 exist.

       list      List all chains and rules of the specified table.

       flush     Flush all chains and rules of the specified table.

You can create and delete a table in the same nft file, so the replacement should be atomic.
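
For illustration, a minimal sketch of what that could look like at the top of /etc/network/nat.nft, assuming your nft and kernel are recent enough to know destroy. Only one of the DNAT rules from the example file is shown:

```
# remove the old table if present; no error when it does not exist yet
destroy table inet nat

# recreate it in the same file, so both steps are applied as one transaction
table inet nat {
    chain prerouting {
        type nat hook prerouting priority -100;
        iifname "eth0" ip protocol tcp tcp dport 80 ip daddr 203.0.113.10 dnat to 172.16.0.10
    }
}
```

On versions without destroy, a commonly used variant (as far as I know) is to put "add table inet nat" first, which does nothing if the table already exists, followed by "flush table inet nat". That also avoids the error at boot.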


You can also use maps to make the NAT rules more efficient, see [1]. You might not even need to flush / delete the whole ruleset then since you can just update the maps.

[1] https://wiki.nftables.org/wiki-nftables/index.php/Multiple_NATs_using_nftables_maps
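
A rough sketch of the map variant, using the addresses from the example file and following the idea of the wiki page in [1]. The map names dnat_hosts and snat_hosts are made up here, and the port-specific webserver rules and the VPN range are left out, so treat it as an illustration rather than a drop-in replacement:

```
table inet nat {
    # public address -> internal host
    map dnat_hosts {
        type ipv4_addr : ipv4_addr
        elements = { 203.0.113.11 : 172.16.0.11,
                     203.0.113.12 : 172.16.0.12 }
    }

    # internal host -> public address
    map snat_hosts {
        type ipv4_addr : ipv4_addr
        elements = { 172.16.0.11 : 203.0.113.11,
                     172.16.0.12 : 203.0.113.12,
                     172.16.0.13 : 203.0.113.12 }
    }

    chain prerouting {
        type nat hook prerouting priority -100;
        # packets whose destination is not in the map are left alone
        iifname "eth0" dnat ip to ip daddr map @dnat_hosts
    }

    chain postrouting {
        type nat hook postrouting priority 100;
        oifname "eth0" snat ip to ip saddr map @snat_hosts
    }
}
```

A new mapping could then be added at runtime without reloading the file, for example with: nft add element inet nat dnat_hosts '{ 203.0.113.14 : 172.16.0.14 }' (both addresses are hypothetical).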
 
Yeah, personally I dislike maps a bit. While the idea is great, the readability is a bit lacking, at least to me.
By the way, these few rules are just a basic barebones example that answers 99% of the questions on this topic.


I didn't want to destroy the table on a reload without a reboot.
Unlike flushing, it could have an impact on current connections and conntrack.
Maybe I'm getting old, but I like to avoid destructive operations.

I also wanted to leave it out of the rules file.
That means I have to read that file anyway to find out which tables to clear, either way.

These mandatory "I have to do this before I can do that" steps are kind of dangerous in configs long term,
at least for me, because I will forget why I put them in there.


So I wanted to make sure that whatever is in the rules file gets flushed before it is rewritten, with no kill commands or anything like that inside the config.


We could skip the "read the config file and flush/destroy" part if we only used maps or destroy in the config file. We would still need the systemd unit.
But I wanted a more solid approach: a simple rules-only file that works without reading any notes two years later and that I can still read three years later.

For a public how-to I would avoid maps even more. It should be simple and easy to read without misreading half of it :)