Questions on Full-Mesh-Routing (with Fallback) FRR

Hello,
we're in the process of setting up our PoC-Cluster and want to use the FRR full mesh routing for the ceph cluster. See Routed Setup (with fallback) on https://pve.proxmox.com/wiki/Full_Mesh_Network_for_Ceph_Server .

1st problem: Invalid position of "post-up" script
It says "Add this line to /etc/network/interfaces": "post-up /usr/bin/systemctl restart frr.service".
As far as I can see this is supposed to be added to the very bottom. But this leads to an error when running "ifreload -a" or clicking "apply configuration" in the proxmox network gui.
I've added these lines to the 2 interfaces that are used in FRR and the error is gone.
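For illustration, the placement that stopped the error looks roughly like the fragment below. The interface names ens19 and ens20 and the "inet manual" method are placeholders, not taken from our actual config. The point is only that "post-up" sits inside an iface stanza rather than at the bottom of the file.

```
auto ens19
iface ens19 inet manual
        # post-up belongs to the stanza of the interface it follows
        post-up /usr/bin/systemctl restart frr.service

auto ens20
iface ens20 inet manual
        post-up /usr/bin/systemctl restart frr.service
```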

Q1: Is this correct or does it have to be somewhere else? Could someone please edit the wiki with the correct information? Thank you!

2nd problem: Clicking "apply configuration" fails to restart frr
If you're working on the network configuration and click "Apply configuration" too quickly, frr will stop but won't start again.
`systemctl status frr`:
Feb 10 08:28:46 pve-poc-1 systemd[1]: frr.service: Start request repeated too quickly.
Feb 10 08:28:46 pve-poc-1 systemd[1]: frr.service: Failed with result 'start-limit-hit'.
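Once frr is in this state, a plain start keeps being refused until the limit window has passed. A minimal sketch for getting it back, assuming that `systemctl reset-failed` also clears the start rate limit counter (as the systemd documentation describes). The SYSTEMCTL variable and the recover_unit function are hypothetical helpers, added only so the commands can be dry-run:

```shell
#!/bin/bash
# Sketch: recover a unit that failed with 'start-limit-hit'.
# SYSTEMCTL is a hypothetical override for dry runs, e.g. SYSTEMCTL=echo.
SYSTEMCTL="${SYSTEMCTL:-/usr/bin/systemctl}"

recover_unit() {
    local unit="$1"
    # reset-failed clears the failed state and the start rate limit counter,
    # after which a normal start is accepted again.
    "$SYSTEMCTL" reset-failed "$unit" &&
        "$SYSTEMCTL" start "$unit"
}
```

Run as root: `recover_unit frr.service`.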

Looking into the systemd file:
StartLimitInterval=3m
StartLimitBurst=3

So restarting more than 3 times within 3 minutes leads to the error above. With the "post-up" script on two interfaces, frr fails after 2 clicks on "Apply configuration" within 3 minutes. With more interfaces in frr (which means more post-up scripts), this failure can occur even on a single click of "Apply configuration".
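The arithmetic can be checked with a small sketch. It assumes every matching "post-up" line triggers exactly one restart per apply. The function names are made up for this example, and the burst value has to be passed in by hand from the service file:

```shell
#!/bin/bash
# Count post-up lines that restart frr in an interfaces file.
count_frr_restarts() {
    local file="$1"
    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true"
    grep -Ec '^[[:space:]]*post-up[[:space:]].*systemctl restart frr' "$file" || true
}

# How many complete "apply" runs fit into StartLimitBurst.
full_applies_within_limit() {
    local burst="$1" per_apply="$2"
    if [ "$per_apply" -eq 0 ]; then
        echo "unlimited"
    else
        echo $(( burst / per_apply ))
    fi
}
```

Usage: `full_applies_within_limit 3 "$(count_frr_restarts /etc/network/interfaces)"`. With two post-up lines this prints 1, so the second click within the interval hits the limit. With four or more it prints 0, so even the first click does.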

Q2: Is a restart really necessary or would a reload suffice?
Q3: Is it safe to remove the StartLimit* options from the frr service file?

3rd problem: Configure two full-mesh-networks
How can I configure 2 full-mesh networks with FRR? I need one for Ceph and one for corosync. In the guide above we're using "lo", the standard loopback adapter.
What I figured is this:
- Add another lo adapter in /etc/network/interfaces "auto lo:10 inet loopback"
- create another router in frr.conf and use it in the interfaces

Q4: Is my assumption correct? Could you add a working example to the wiki please?

Thanks for your help!
 
Hi!
A1: No, you're right, this should be on the interfaces.
A2: We usually recommend installing frr-pythontools, which includes the frr-reload.py script (https://docs.frrouting.org/en/latest/frr-reload.html) and enables us to apply configuration without restarting. systemctl restart frr is used as a fallback if this package is not installed.
A3: It should be fine, it's there to prevent any frr loops and systemctl messing stuff up. Though you should be fine installing the pythontools package above.
A4: We don't recommend using a fabric with corosync, as corosync usually likes to manage these things by itself. Sadly, having two fabrics is not possible either, as (e.g.) lo:0 is just a loopback alias and is not addressable by frr. (Using dummy interfaces is also not possible, as the openfabric daemon does not handle them correctly.)
 
Yep, will do.
A small correction on A4: the dummy interface will not work with frr when using interfaces without addresses. If you configure IP addresses for every interface in the fabric, you can use dummy interfaces and thus have two openfabric areas (obviously you need to use different interfaces for the two fabrics).
 
Regarding frr-reload:
We've tested it extensively and it doesn't work: neither /usr/lib/frr-reload, nor /usr/lib/frr-reload.py --reload /etc/frr/frr.conf, nor systemctl reload frr.
After `ifreload -a` there are no openfabric routes. Each time.

We're now back to restarting frr, but only once via a systemd timer, regardless of how many interfaces are reloaded.
Add this script in /opt/frr/queue-restart.sh:
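#!/bin/bash
# Schedule frr-restart.service to start 1 second from now via a transient timer.
# Errors from repeated calls (e.g. while that timer already exists) are discarded,
# so several post-up hooks result in a single queued restart.
/usr/bin/systemd-run --on-active=1 --unit frr-restart.service >/dev/null 2>&1
# Always report success so the post-up hook never fails the interface reload.
exit 0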
Add this to /lib/systemd/system/frr-restart.service:
[Unit]
Description=Restart FRR

[Service]
ExecStart=/usr/bin/systemctl restart frr.service
And use this as post-up in /etc/network/interfaces on each interface:
post-up /opt/frr/queue-restart.sh