SDN: How is the router-id/correct_src IP chosen?

adofou

Hello,

In the configuration generated by Proxmox SDN, the router-id and the IP in the correct_src route-map are set to the only IPv4 address that was available during installation.
Since then, I've added a loopback for redundancy, and I would like to use it for router-id/correct_src.
Even though this loopback is announced by FRR's BGP itself, the choice also matters at the kernel level.
In fact, this source IP is pushed as the "src" parameter of the routes installed in the kernel:

default nhid 282 via 213.152.X.A dev vmbr0.3 proto bgp src 213.152.X.B metric 20

I get two default routes, via two different BGP sessions, but both are with the same src in the kernel. This source is the IP of the first physical interface.
The result is that when the first interface goes down, the second automatically takes over... but the host still tries to source each packet with the IP it has on the first interface.
Connections coming from outside keep working properly, but any connection initiated from the host fails.
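
To see at a glance which source address the kernel has recorded for each BGP route, the output of "ip route" can be filtered with a few lines of Python. This is only a minimal sketch: the function names are made up for illustration, and the sample addresses come from the documentation ranges, not from the setup described here.

```python
import re


def bgp_route_sources(ip_route_output):
    """Return (destination, src) pairs for routes installed by BGP.

    Routes that carry no src attribute are reported with src=None.
    """
    results = []
    for line in ip_route_output.splitlines():
        # Indented lines are nexthop continuations of ECMP routes; skip them.
        if not line or line[0].isspace():
            continue
        if " proto bgp" not in line:
            continue
        destination = line.split()[0]
        match = re.search(r"\bsrc (\S+)", line)
        results.append((destination, match.group(1) if match else None))
    return results


def routes_with_unexpected_src(ip_route_output, expected_src):
    """List the BGP routes whose src differs from the address we expect."""
    return [
        (dst, src)
        for dst, src in bgp_route_sources(ip_route_output)
        if src != expected_src
    ]


# Hypothetical sample; in practice the text would come from running "ip route".
sample = """\
default nhid 282 via 192.0.2.1 dev vmbr0.3 proto bgp src 192.0.2.2 metric 20
192.0.2.0/31 dev vmbr0.3 proto kernel scope link src 192.0.2.2
"""

print(routes_with_unexpected_src(sample, "198.51.100.1"))
```

Calling routes_with_unexpected_src() with the loopback address as expected_src lists every BGP route that is still sourced from another address.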

The problem is that I can't find where the IPv4 address for these parameters is chosen.
I don't see anything like it in the GUI or in the SDN configuration files.

In the meantime, I've tried overriding the "correct_src" route-map (in frr.conf.local), but the result isn't what I expected.
It seems that the PVE parser is smart enough not to override the existing entry and adds mine as a second rule, which never matches.

frr.conf.local
Code:
route-map correct_src permit 1
 match ip address prefix-list loopbacks_ips
 set src 213.152.LOOP.BACK
exit

Generated frr.conf
Code:
route-map correct_src permit 1
 match ip address prefix-list loopbacks_ips
 set src 213.152.PHY.INT <- IP on the first physical interface & router-id
exit
!
route-map correct_src permit 2
 match ip address prefix-list loopbacks_ips
 set src 213.152.LOOP.BACK
exit

I also tried adding "no route-map correct_src" to frr.conf.local, with no more success. I think that's because of how the PVE parser merges the configurations :-)

I see that correct_src is applied via "ip protocol bgp route-map correct_src".
So I tried setting the src in another route-map and applying that route-map directly to the BGP session:
Code:
route-map PUBLIC_RING0_IN permit 1
 match ip address prefix-list loopbacks_ips
 set local-preference 100
 set metric 100
 set src 213.152.LOOP.BACK
exit

[...]
  neighbor 213.152.X.Y route-map PUBLIC_RING0_IN in
Result:
default nhid 282 via 213.152.PHY.INT dev vmbr0.3 proto bgp src 213.152.PHY.GW metric 20
The local-preference and metric were applied, but the src seems to be simply ignored when set this way?


Does anyone know how to change/define the router-id/correct_src with a loopback?

Bonus: how can I apply a different src to different BGP sessions, so that routes learned over the RFC1918 sessions don't get my public IP as src? :-)
Code:
172.31.252.2/31 dev vmbr0.6 proto kernel scope link src 172.31.252.3
172.31.252.4/31 nhid 80 via 172.31.253.2 dev vmbr1.6 proto bgp src 213.152.PHY.INT metric 20
172.31.252.6/31 nhid 80 via 172.31.253.2 dev vmbr1.6 proto bgp src 213.152.PHY.INT metric 20
172.31.253.2/31 dev vmbr1.6 proto kernel scope link src 172.31.253.3
172.31.253.4/31 nhid 80 via 172.31.253.2 dev vmbr1.6 proto bgp src 213.152.PHY.INT metric 20
172.31.253.6/31 nhid 80 via 172.31.253.2 dev vmbr1.6 proto bgp src 213.152.PHY.INT metric 20


Thanks!
 
SDN tries to get the source IP from the interfaces file. You also need to have it configured in the interfaces file, if you haven't already.

Do you have a loopback interface set in the BGP controller (it is found in the advanced options)? If you have it configured there, then it should take the IP from the loopback interface that is in the interfaces file.
 
If you already have "correct_src" in frr.conf, you must already have defined an extra BGP controller in the SDN with the loopback interface? So it should use the IP of this loopback interface.

Note that the loopback IP is also used as the source of the VXLAN tunnel, so you can't have some kind of IP failover with a primary IP and a secondary failover IP on the loopback.

In my production setup, I'm using 2 physical interfaces, each with 1 different IP address (load balancing with ECMP), plus 1 loopback IP used as the source for the VXLAN tunnel and the routes.
 
I mean, the default behavior doesn't change the src value. If you have 2 BGP sessions, each session uses its own source.

The src is only changed if you declare a loopback in the BGP controller (and its purpose here is not a failover IP; it really is to have a single src IP for the VXLAN tunnel and the BGP sessions).

So, for this special use case, the best way is to remove the loopback from the BGP controller in the GUI and do some tuning in frr.conf.local.
(But as I said, the VXLAN tunnel can't fail over to a different IP.)

It would be great to have all the configs (/etc/network/interfaces, /etc/pve/sdn/*) to better understand the setup.


Here is an example of my setup:

/etc/network/interfaces

Code:
auto eth0
iface eth0 inet static
        mtu 9200
        address 10.3.30.9/31

auto eth1
iface eth1 inet static
        mtu 9200
        address 10.3.30.11/31

# this is my loopback, used as src
auto dummy0
iface dummy0 inet static
        address 10.3.99.41/32

/etc/pve/sdn/controllers.cfg

Code:
evpn: evpnctl
        asn 65000
        peers x.x.x.x,y.y.y.y   (This is my routes reflectors ip)

bgp: bgpnode1
        asn 65000
        node node1
        peers 10.3.30.8,10.3.30.10  (This is the bgp peers of both eth0 && eth1)
        loopback dummy0

#ip route

Code:
10.3.30.8/31 dev eth0 proto kernel scope link src 10.3.30.9     #the src is the iface ip
10.3.30.10/31 dev eth1 proto kernel scope link src 10.3.30.11  #the src is the iface ip

default nhid 1923 proto bgp src 10.3.99.41 metric 20 #the src is the loopback
        nexthop via 10.3.30.8 dev eth0 weight 1
        nexthop via 10.3.30.10 dev eth1 weight 1
default metric 2147483748
        nexthop via 10.3.30.8 dev eth0 weight 1
        nexthop via 10.3.30.10 dev eth1 weight 1

10.3.99.5 nhid 1923 proto bgp src 10.3.99.41 metric 20   #the src is the loopback
        nexthop via 10.3.30.8 dev eth0 weight 1
        nexthop via 10.3.30.10 dev eth1 weight 1

/etc/network/interfaces.d/sdn

Code:
...
auto vxlan_vnet100
iface vxlan_vnet100
        vxlan-id 100
        vxlan-local-tunnelip 10.3.99.41   #the src is the loopback
        bridge-learning off
        mtu 1500
 
By interfaces file, do you mean /etc/network/interfaces?
If so, the loopbacks (there are actually 2 to 3 of them) are configured directly in it:

Code:
auto lo
iface lo inet loopback
        address 213.152.XX.XX/32 <- Public lo0 for BGP routing redundancy; this is the one I want to use as src.

auto lo:0
iface lo:0 inet loopback
        address 172.31.0.1/32 <- lo0 for EVPN (transport redundancy via 2 ports + BGP). Correctly used via VTEP update-source.

auto lo:1
iface lo:1 inet loopback
        address 172.31.0.128/32 <- Future lo0 for a dedicated redundant LAN for migration/sync; same problem with src. Not used for the moment.

I have one BGP controller per node, but its loopback interface is the 172.31.0.0/25 loopback (lo:0), which is used for the VTEP update-source and works perfectly.



As mentioned above, I have one BGP controller per node, but its loopback interface is the 172.31.0.0/25 loopback (lo:0), which is used for the VTEP update-source and works perfectly (see the capture of one node, which may be more explicit).

This generates the following configuration:
Code:
 neighbor VTEP peer-group
 neighbor VTEP remote-as 3XXXX
 neighbor VTEP bfd
 neighbor VTEP update-source 172.31.0.1
 neighbor 172.31.0.2 peer-group VTEP
 neighbor 172.31.0.3 peer-group VTEP

The correct_src route-map uses another IP:
Code:
route-map correct_src permit 1
 match ip address prefix-list loopbacks_ips
 set src 213.152.XX.XX
exit

This IP is on vmbr0.3, the first vmbr0 IP address (apart from lo0) configured in /etc/network/interfaces.
There are other vmbr0.X and even several vmbr1.X interfaces, but for the moment only vmbr0.3 and vmbr1.3 have a public IP (plus lo).

Well, there seems to be a problem, because the IP used for update-source (the one from the controller in the GUI) is not the same as the one in correct_src (and that's a good thing, otherwise it wouldn't work at all on my side, because it would source with a private IP :D).
 

Attachment: 1734703828750.png (13.8 KB)
I think in this case it maybe tries to find the IP from the peer list or from ip route [1]. Maybe because the address is not set in the interfaces file?

[1] https://git.proxmox.com/?p=pve-netw...e0ab365c00fd858655ad977138d3ca30;hb=HEAD#l301

The "ip route" output of my node:

Important note: to avoid confusion, the subnet of lo0 is 213.152.X.X/24, whereas the subnet of the vmbr0 connection is 213.152.A.A/24 (another /24).
The second physical port is connected with 83.167.X.X/24.
The subnet on EVPN is 213.152.B.B/28.

Code:
default nhid 180 via 213.152.A.A dev vmbr0.3 proto bgp src 213.152.A.B metric 20 <- Gateway IP of vmbr0.3, same subnet as correct_src
83.167.X.X/31 dev vmbr1.3 proto kernel scope link src 83.167.X.Y <- Second port for redundancy, with another /31 subnet
172.31.0.2 nhid 161 via 172.31.2.2 dev vmbr1.5 proto bgp src 213.152.A.B metric 20
172.31.0.3 nhid 161 via 172.31.2.2 dev vmbr1.5 proto bgp src 213.152.A.B metric 20
172.31.0.129 nhid 162 via 172.31.253.2 dev vmbr1.6 proto bgp src 213.152.A.B metric 20
172.31.0.130 nhid 162 via 172.31.253.2 dev vmbr1.6 proto bgp src 213.152.A.B metric 20
172.31.1.2/31 dev vmbr0.5 proto kernel scope link src 172.31.1.3
172.31.1.4/31 nhid 161 via 172.31.2.2 dev vmbr1.5 proto bgp src 213.152.A.B metric 20
172.31.1.6/31 nhid 161 via 172.31.2.2 dev vmbr1.5 proto bgp src 213.152.A.B metric 20
172.31.2.2/31 dev vmbr1.5 proto kernel scope link src 172.31.2.3
172.31.2.4/31 nhid 161 via 172.31.2.2 dev vmbr1.5 proto bgp src 213.152.A.B metric 20
172.31.2.6/31 nhid 161 via 172.31.2.2 dev vmbr1.5 proto bgp src 213.152.A.B metric 20
172.31.252.2/31 dev vmbr0.6 proto kernel scope link src 172.31.252.3
172.31.252.4/31 nhid 162 via 172.31.253.2 dev vmbr1.6 proto bgp src 213.152.A.B metric 20
172.31.252.6/31 nhid 162 via 172.31.253.2 dev vmbr1.6 proto bgp src 213.152.A.B metric 20
172.31.253.2/31 dev vmbr1.6 proto kernel scope link src 172.31.253.3
172.31.253.4/31 nhid 162 via 172.31.253.2 dev vmbr1.6 proto bgp src 213.152.A.B metric 20
172.31.253.6/31 nhid 162 via 172.31.253.2 dev vmbr1.6 proto bgp src 213.152.A.B metric 20
172.31.254.0/24 via 172.31.254.2 dev vmbr0.4
172.31.254.2/31 dev vmbr0.4 proto kernel scope link src 172.31.254.3
172.31.255.0/24 via 172.31.255.2 dev vmbr1.4
172.31.255.2/31 dev vmbr1.4 proto kernel scope link src 172.31.255.3
213.152.A.A/31 dev vmbr0.3 proto kernel scope link src 213.152.A.B
213.152.B.0/28 nhid 208 dev public proto bgp src 213.152.19.141 metric 20
213.152.B.A nhid 224 via 172.31.0.3 dev vrfbr_public proto bgp src 213.152.A.B metric 20 onlink
213.152.B.B nhid 206 via 172.31.0.2 dev vrfbr_public proto bgp src 213.152.A.B metric 20 onlink
213.152.B.C nhid 206 via 172.31.0.2 dev vrfbr_public proto bgp src 213.152.A.B metric 20 onlink

Note: I can send you the full output with the unmasked IPs in private :)
 
The loopback interface setting expects the name of an interface, not the IP itself. I think it searches for an interface named '172[..]', doesn't find it, then looks up the next hop of the first peer via ip route and uses that IP.
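
The lookup order I have in mind can be sketched in Python as follows. This is only an illustration of the behaviour as I understand it, not the actual pve-network code: pick_source_ip, its parameters and the route_src_lookup callable are hypothetical names, and the sketch tries every peer in order rather than only the first one.

```python
import ipaddress


def pick_source_ip(loopback_setting, interfaces, peers, route_src_lookup):
    """Pick a source IP the way the lookup is described above.

    interfaces maps an interface name to its address in CIDR notation.
    route_src_lookup takes a peer IP and returns the source address of
    the route towards that peer, or None if there is none.
    """
    # Step 1: the setting is treated strictly as an interface name.
    if loopback_setting and loopback_setting in interfaces:
        return str(ipaddress.ip_interface(interfaces[loopback_setting]).ip)
    # Step 2: otherwise fall back to the source of the route towards a peer.
    for peer in peers:
        src = route_src_lookup(peer)
        if src:
            return src
    return None


# Hypothetical data modelled on the dummy0 example earlier in the thread.
interfaces = {"dummy0": "10.3.99.41/32", "eth0": "10.3.30.9/31"}
lookup = {"10.3.30.8": "10.3.30.9"}.get

print(pick_source_ip("dummy0", interfaces, ["10.3.30.8"], lookup))
print(pick_source_ip("10.3.99.41", interfaces, ["10.3.30.8"], lookup))
```

With an interface name the loopback address is returned. With an IP typed into the field, no interface of that name exists, so the sketch falls back to the address of the physical interface facing the peer.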

Instead of configuring all IPs on loopback, I would add a dummy interface for each IP and then use the name of the dummy interface, with the IP you want, in the configuration.

Edit: this is how I usually configure dummy interfaces:

Code:
auto dummy0
iface dummy0 inet static
        link-type dummy
        address 172.16.0.1/32
 

As requested, all the configuration files are attached :)

The IP used in correct_src is the IP in the subnet of the first BGP peer of the node's BGP controller.
I have no idea whether this is a clue or just an unimportant coincidence.

Thinking about it, I feel like I'm twisting the SDN system a bit.
I need to separate my public traffic from the underlay traffic of my EVPN, while keeping the redundancy of two L3 links on different routers (so via BGP + loopback).
I had assumed that the BGP controller loopback was only used for the update-source of the VTEP group.
But from what you write, I get the impression that it should have been used for correct_src too, except that for one reason or another this doesn't seem to be the case.
 


See above: I think it is because you entered the IP in the loopback setting of the BGP controller, not the name of the interface.

It would make sense to configure each router IP separately on its own dummy interface instead of configuring them all on the loopback, since otherwise FRR will also use the wrong one (because the key lo still has 213.152.XX.XX/32).
 
Interface aliases (with :X) have been deprecated for many years.

Try to create multiple loopback interfaces instead (lo0, lo1, lo2, ...).

The current SDN code really only handles 1 IP per loopback interface; I'm really not sure about the behaviour if you use multiple IPs on the lo0 interface.


Then I think you'll be able to manage your own prefix list without any problem, without overriding anything.
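
To spot both cases mentioned above (alias-style stanzas and interfaces carrying more than one address) in an existing interfaces file, a small Python check like the one below may help. It is a rough sketch: audit_interfaces is a made-up helper, it only looks at "iface" and "address" lines, and the sample uses documentation addresses.

```python
from collections import defaultdict


def audit_interfaces(text):
    """Return (alias_names, multi_address_interfaces) for an interfaces file."""
    aliases = []
    addresses = defaultdict(list)
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        words = line.split()
        if words[0] == "iface" and len(words) > 1:
            current = words[1]
            # Alias-style names look like "lo:0" or "eth0:1".
            if ":" in current:
                aliases.append(current)
        elif words[0] == "address" and current and len(words) > 1:
            addresses[current].append(words[1])
    multi = {name: ips for name, ips in addresses.items() if len(ips) > 1}
    return aliases, multi


# Hypothetical sample file content.
sample = """\
auto lo
iface lo inet loopback
        address 192.0.2.1/32

auto lo:0
iface lo:0 inet loopback
        address 198.51.100.1/32

auto dummy0
iface dummy0 inet static
        address 203.0.113.1/32
        address 203.0.113.2/32
"""

print(audit_interfaces(sample))
```

In practice the text would be read from /etc/network/interfaces; anything reported by the check is a candidate for being moved to its own interface.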
 
Oh, are you sure the setting only accepts an interface name? Both a name and an IP seem to be accepted by the GUI, and FRR's update-source accepts both as well (see "update-source" in the FRR documentation).
I tested both in my lab; to be honest, I don't really remember whether it made a difference.

I'll see if I can try with the interface name. But if it changes my VTEP update-source, that's annoying, because it will probably break the EVPN tunnels in my current configuration. And I don't know whether I can override it via frr.conf.local (unless I set it manually for each peer?).
In other words, I'd have to find a way to use a different source for BGP and for the VTEP :/

Yes, I'll try the dummy interfaces. I used dummy instead of loopback during my tests, but I don't remember why I rolled back.

I just know that the addresses must be configured on an interface and not added with "up/down" commands (which I tried in order to have one lo with multiple addresses).
If they are added that way (at least on lo), frr is restarted each time a new network configuration is applied via the GUI:
frr reload command fail. Restarting frr. at /usr/share/perl5/PVE/Network/SDN/Controllers/EvpnPlugin.pm line 640.
I imagine that ifupdown2 handles interface addresses and the routes added in up/down in a different way.
 
That's an interesting point indeed. That's the problem with doing things late: we sometimes fall back on old habits :)

In any case, I'm going to look into using several loopbacks (or dummies; honestly, I never know which is best for me) and see what effect that has on the generated configuration. The same goes for IP vs. interface name in the GUI.

I'll keep you posted.
Many thanks for your time on my case!
 
