How to deal with very large IPSet?

Jun 2, 2021
I have a service that automates the Proxmox Firewall based on a few inputs, like HTTP traffic analysis. The goal here is to block malicious traffic on the firewall level (primarily LLM crawling bots and vulnerability scanners). To achieve that, I have a tool that fires a lot of requests to the IPSet API endpoint at /cluster/firewall/ipset/{name}/{cidr}, and that works fine. Usually.

Earlier today, I hit a state where the list of blocked IPs exceeded something like 250k, and then everything became unhappy. pve-firewall started failing with "status update error: No buffer space available" errors, and the HTTP API and the browser UI started failing with the same error. So I think what happened is that my /etc/pve/firewall/cluster.fw file exceeded some sort of maximum buffer size? I was able to recover by editing the firewall config file manually and restarting the daemon, but of course, that's not great.

I now wonder how best to deal with this. One option, of course, is to manage iptables or nftables myself, which, according to some earlier threads like this should maybe be fine? However, I do wonder if there's an alternative solution, so I'm curious whether there are other approaches to blocklisting a large number of IPs.
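For the "manage nftables myself" option, here is a minimal sketch of what generating the set could look like, assuming IPv4 only. The table name `blocklist` and set name `blocked_v4` are made up for illustration, and the output only defines and fills the set: it does not add a drop rule that references it, and it says nothing about how this coexists with the rules pve-firewall manages. The result is meant to be fed to `nft -f`.

```python
import ipaddress


def render_nft_blocklist(cidrs, table="blocklist", set_name="blocked_v4"):
    """Render an nft script that (re)fills an IPv4 interval set.

    The table and set names are hypothetical placeholders.
    """
    # Parsing validates each entry, so nothing unexpected ends up in the script.
    nets = [ipaddress.IPv4Network(c, strict=False) for c in cidrs]
    lines = [
        f"add table inet {table}",
        f"add set inet {table} {set_name} {{ type ipv4_addr; flags interval; }}",
        f"flush set inet {table} {set_name}",
    ]
    if nets:  # an empty element list would not be valid nft syntax
        elements = ", ".join(str(n) for n in nets)
        lines.append(f"add element inet {table} {set_name} {{ {elements} }}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_nft_blocklist(["192.0.2.0/30", "198.51.100.7"]), end="")
```

Interval sets reject overlapping elements unless told to merge them, so the input should probably be deduplicated and aggregated before rendering.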
 
There is a size limit of 128 MiB for /etc/pve, a size limit of 1 MiB for individual files, and an inode limit of 256k. It seems you hit at least one of these limits with such a large IPSet.

https://github.com/proxmox/pve-cluster/blob/master/src/pmxcfs/memdb.h said:
#define MEMDB_MAX_FILE_SIZE (1024 * 1024) // 1 MiB
#define MEMDB_MAX_FSSIZE (128 * 1024 * 1024) // 128 MiB
#define MEMDB_MAX_INODES (256 * 1024) // 256k
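To put the 1 MiB per-file limit in perspective, here is a rough back-of-the-envelope sketch. It assumes each IPSet entry takes one line in cluster.fw (the entry text plus a newline) and ignores comments and everything else in the file; the function names are made up for illustration.

```python
import ipaddress

MEMDB_MAX_FILE_SIZE = 1024 * 1024  # 1 MiB, taken from the memdb.h excerpt


def estimate_ipset_bytes(cidrs):
    """Approximate bytes needed if every entry is written as its own line."""
    return sum(len(c.encode("ascii")) + 1 for c in cidrs)


def fits_in_file(cidrs, other_bytes=0, limit=MEMDB_MAX_FILE_SIZE):
    """True if the entries plus `other_bytes` of other content stay within the limit."""
    return estimate_ipset_bytes(cidrs) + other_bytes <= limit


if __name__ == "__main__":
    # 250,000 consecutive example addresses starting at 10.0.0.0
    base = int(ipaddress.IPv4Address("10.0.0.0"))
    entries = [str(ipaddress.IPv4Address(base + i)) for i in range(250_000)]
    print(estimate_ipset_bytes(entries), fits_in_file(entries))
```

Even in the shortest IPv4 form (7 characters plus a newline, 8 bytes), 1 MiB has room for at most 131,072 such lines, so a list of 250k individual addresses cannot fit into a single file under this assumption.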

Just a thought: do you keep the IPs in the list forever, or do you remove them after, let's say, 24h?
 
TIL about the total size limit of /etc/pve. In that case, I should probably make sure I enforce a maximum limit for now, as that could break all kinds of other things. :/

I do remove IPs from the list again after a certain period of time, yeah. I also already aggregate entries when multiple consecutive IPs can be expressed as a single CIDR block. I don't yet speculatively aggregate subnets based on a threshold of "percentage of blocked IPs per subnet/ASN", primarily because that's annoying to implement and not risk-free either.
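For anyone landing here later, the expiry and exact aggregation described above can be sketched with Python's standard `ipaddress` module. This is only an illustration, not the actual tool: the function names, the `blocked_at` mapping of entry to Unix timestamp, and the 24h default are assumptions. It only merges blocks that are fully covered, so it never blocks an address that was not already in the list.

```python
import ipaddress


def drop_expired(blocked_at, now, ttl_seconds=24 * 3600):
    """Keep only entries that were blocked less than `ttl_seconds` ago."""
    return {ip: ts for ip, ts in blocked_at.items() if now - ts < ttl_seconds}


def aggregate(cidrs):
    """Merge adjacent and overlapping entries into the fewest exact CIDR blocks."""
    nets = [ipaddress.ip_network(c, strict=False) for c in cidrs]
    # collapse_addresses() refuses mixed IPv4/IPv6 input, so handle them separately.
    v4 = [n for n in nets if n.version == 4]
    v6 = [n for n in nets if n.version == 6]
    merged = list(ipaddress.collapse_addresses(v4))
    merged += list(ipaddress.collapse_addresses(v6))
    return [str(n) for n in merged]


if __name__ == "__main__":
    blocked_at = {"192.0.2.0": 1000, "192.0.2.1": 1000, "198.51.100.7": 10}
    live = drop_expired(blocked_at, now=1000 + 3600, ttl_seconds=3600 + 1)
    print(aggregate(live))
```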