I am encountering a problem on busy servers where nodes "inexplicably" lose connectivity with their cluster partners and fence themselves off. Some investigation shows that whenever this happens, pve-firewall is enabled and the conntrack table is full.
A quick look at the "virgin" iptables rules shows entries for the cluster and ceph interfaces, which means that cluster traffic is subject to conntrack. This is NOT DESIRABLE. I am now beginning to write an override procedure, but it occurred to me that all of these rules are hard coded into /usr/share/perl5/PVE/Firewall.pm. It is imperative that cluster interfaces are NOT hamstrung by conntrack.
1. For the immediate term I need to override the preset rules. What strategy do you guys suggest: systemd, cron, something else? Do I need to delete the existing rules, or would
iptables -A INPUT -i bond1 -p all -j NOTRACK
be sufficient at the top?
2. Devs, PLEASE revisit the logic of creating your chain rules by limiting them to interfaces identified as internet or intranet facing. You could add a "user facing" checkbox in the network definition that adds a comment to the interface stanza to identify it, though you may have cleverer logic. I will also file this as a feature request.
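
On point 1: according to the iptables-extensions man page, NOTRACK is equivalent to `-j CT --notrack` and can only be used in the raw table, so a bypass would have to sit in raw PREROUTING and OUTPUT rather than in filter INPUT. Below is a minimal sketch of that, untested on a live cluster. It assumes bond1 is the cluster interface as in the command above; the `notrack_rules` helper and the `IFACE`/`RUN` variables are hypothetical names of mine, and I have not verified whether pve-firewall keeps such rules across a reload.

```shell
#!/bin/sh
# Hypothetical helper: print (or apply) raw-table rules that exempt one
# interface from connection tracking. Not part of pve-firewall.

IFACE="${IFACE-bond1}"   # assumed cluster/ceph interface, as in the post
RUN="${RUN-echo}"        # default is a dry run that only prints the commands;
                         # export RUN= (empty) and run as root to apply them

notrack_rules() {
    # Inbound packets reach raw/PREROUTING before conntrack sees them.
    $RUN iptables -t raw -I PREROUTING -i "$1" -j CT --notrack
    # Locally generated packets go through raw/OUTPUT instead.
    $RUN iptables -t raw -I OUTPUT -o "$1" -j CT --notrack
}

notrack_rules "$IFACE"
```

Untracked packets will not match any ESTABLISHED/RELATED accept rule, so the filter table would still need an explicit accept for traffic on that interface.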