Proxmox SDN with EVPN and BGP routing

marcovalle · Sep 30, 2024

Hello everyone.

I'm trying to set up pfSense so that traffic from outside the Proxmox cluster is routed correctly to the SDN networks.

I configured an EVPN zone with two exit nodes (node #1 and node #2), with node #1 as the primary exit node, and two BGP controllers (one for each node).
On pfSense I set up BGP priorities so that traffic is routed through the primary exit node under normal conditions and through the other node if the first one is offline.

The BGP routes on pfSense are:

Code:

BGP table version is 133, local router ID is 10.10.170.1, vrf id 0
Default local pref 100, local AS 65000
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

    Network          Next Hop            Metric LocPrf Weight Path
 * i10.10.198.0/24   10.10.170.31             0    100      0 ?
 *>i                 10.10.170.32             0    100     10 ?
 *>i10.10.198.111/32 10.10.170.31                  100      0 i

Displayed  3 routes and 5 total paths

If I don't contact the VM for some time, the last route (10.10.198.111/32 via 10.10.170.31) is not present.

10.10.170.1 is the IP of pfSense on EVPN/BGP Network
10.10.170.31 is the IP of node #1 on EVPN/BGP Network
10.10.170.32 is the IP of node #2 on EVPN/BGP Network
10.10.198.0/24 is a network managed by EVPN zone.
10.10.198.111 is the IP of a VM on node #2

With this configuration, all traffic has to pass through node #1 even if the VM is on node #2.
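
To illustrate how I understand the route selection on pfSense, here is a minimal Python sketch. It is only a simplified model, not what FRR actually runs: it assumes the most specific prefix wins and that, among paths for the same prefix, the higher weight wins, and it ignores all other BGP tie-breakers. The `PATHS` list and the `next_hop` function are names I made up for the example; the values are copied from the table above.

```python
import ipaddress

# (prefix, next hop, weight) copied from the BGP table above.
PATHS = [
    ("10.10.198.0/24", "10.10.170.31", 0),
    ("10.10.198.0/24", "10.10.170.32", 10),
    ("10.10.198.111/32", "10.10.170.31", 0),
]


def next_hop(destination, paths):
    """Return the next hop for destination, or None if no prefix covers it.

    Simplified model: longest prefix first, then highest weight.
    """
    addr = ipaddress.ip_address(destination)
    candidates = []
    for prefix, hop, weight in paths:
        network = ipaddress.ip_network(prefix)
        if addr in network:
            candidates.append((network.prefixlen, weight, hop))
    if not candidates:
        return None
    # max() compares prefix length first, then weight.
    return max(candidates)[2]


if __name__ == "__main__":
    for ip in ("10.10.198.111", "10.10.198.50"):
        print(ip, "->", next_hop(ip, PATHS))
```

In this model the /32 route decides where traffic for 10.10.198.111 goes, regardless of the weight set on the /24 paths.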

I have two questions:
1) Why does node #1 advertise via BGP the IP of a VM hosted on the other node, while node #2 doesn't? (This happens even if the primary exit node is not set.)
2) How can I avoid setting a primary exit node and have each node advertise via BGP only the VMs it is hosting (as host-specific /32 routes)?

Thanks in advance.

shanreich · Sep 30, 2024

Can you post your SDN configuration?

Code:

cat /etc/pve/sdn/*

marcovalle · Oct 1, 2024

Hi, thanks for your reply.

I revised my SDN configuration and now the situation is the following:

Code:

# cat /etc/pve/sdn/*

evpn: ctl-evpn
        asn 65000
        peers 10.10.170.31,10.10.170.32,10.10.170.33

bgp: bgppve01
        asn 65000
        node pve01
        peers 10.10.170.1
        bgp-multipath-as-path-relax 0
        ebgp 0

bgp: bgppve02
        asn 65000
        node pve02
        peers 10.10.170.1
        bgp-multipath-as-path-relax 0
        ebgp 0

subnet: sdn-10.10.198.0-24
        vnet vNet0
        gateway 10.10.198.1

vnet: vNet0
        zone sdn
        tag 7010

evpn: sdn
        controller ctl-evpn
        vrf-vxlan 7000
        exitnodes pve02,pve01
        exitnodes-primary pve01
        ipam pve
        mac BC:24:11:FA:81:1D
        mtu 1450
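
For anyone who wants to check the exit-node settings without reading the whole dump, here is a small Python sketch that parses this "type: id" plus indented "key value" layout. The parser and its names (`parse_sdn_config`, `SAMPLE`) are my own illustration, not a Proxmox tool, and it assumes the format is exactly as shown in the output above.

```python
def parse_sdn_config(text):
    """Parse 'type: id' sections with indented 'key value' lines.

    Returns a dict keyed by (type, id), because the same type (for example
    'evpn') is used both for the controller and for the zone.
    """
    sections = {}
    current = None
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and the '# cat ...' prompt line
        if not line[0].isspace():
            kind, _, name = line.partition(":")
            current = sections.setdefault((kind.strip(), name.strip()), {})
        elif current is not None:
            key, _, value = line.strip().partition(" ")
            current[key] = value.strip()
    return sections


SAMPLE = """\
evpn: sdn
        controller ctl-evpn
        vrf-vxlan 7000
        exitnodes pve02,pve01
        exitnodes-primary pve01
"""

if __name__ == "__main__":
    zone = parse_sdn_config(SAMPLE)[("evpn", "sdn")]
    print("exit nodes:", zone["exitnodes"].split(","))
    print("primary:", zone.get("exitnodes-primary"))
```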

When the VM is on node #1:

Code:

BGP table version is 169, local router ID is 10.10.170.1, vrf id 0
Default local pref 100, local AS 65000
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found


    Network          Next Hop            Metric LocPrf Weight Path
 *>i10.10.198.0/24   10.10.170.32             0    100     10 ?
 * i                 10.10.170.31             0    100      0 ?
 *>i10.10.198.111/32 10.10.170.32                  100     10 i


Displayed  2 routes and 3 total paths

When the VM is on node #2:

Code:

BGP table version is 167, local router ID is 10.10.170.1, vrf id 0
Default local pref 100, local AS 65000
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found


    Network          Next Hop            Metric LocPrf Weight Path
 *>i10.10.198.0/24   10.10.170.32             0    100     10 ?
 * i                 10.10.170.31             0    100      0 ?
 *>i10.10.198.111/32 10.10.170.31                  100      0 i


Displayed  2 routes and 3 total paths

When the VM is on node #3:

Code:

BGP table version is 171, local router ID is 10.10.170.1, vrf id 0
Default local pref 100, local AS 65000
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found


    Network          Next Hop            Metric LocPrf Weight Path
 *>i10.10.198.0/24   10.10.170.32             0    100     10 ?
 * i                 10.10.170.31             0    100      0 ?
 * i10.10.198.111/32 10.10.170.31                  100      0 i
 *>i                 10.10.170.32                  100     10 i


Displayed  2 routes and 4 total paths

In my previous post I forgot to mention the IP of Proxmox node #3 on the EVPN/BGP network (10.10.170.33).

When the VM is on node #2 or node #3, I can ping it from outside the PVE cluster, but when the VM is on node #1 it is unreachable.
I could probably avoid the VM being unreachable when it is hosted on node #1 by using prefix filtering on pfSense (disallowing /32 routes), but this would be a workaround.
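
To show what I mean by that workaround, here is a small Python sketch of the effect such a filter would have on the table from the node #1 case. It only models the filtering, it is not pfSense or FRR configuration, and I have not verified that it restores reachability. The function names and the `max_prefixlen` parameter are made up for the example.

```python
import ipaddress

# (prefix, next hop, weight) copied from the table for "VM on node #1".
PATHS = [
    ("10.10.198.0/24", "10.10.170.32", 10),
    ("10.10.198.0/24", "10.10.170.31", 0),
    ("10.10.198.111/32", "10.10.170.32", 10),
]


def drop_host_routes(paths, max_prefixlen=24):
    """Keep only paths whose prefix is no longer than max_prefixlen."""
    return [
        path for path in paths
        if ipaddress.ip_network(path[0]).prefixlen <= max_prefixlen
    ]


def preferred_hop(paths, prefix):
    """Among the paths for one prefix, return the hop with the highest weight."""
    matching = [path for path in paths if path[0] == prefix]
    return max(matching, key=lambda path: path[2])[1] if matching else None


if __name__ == "__main__":
    filtered = drop_host_routes(PATHS)
    print(filtered)
    print(preferred_hop(filtered, "10.10.198.0/24"))
```

With the /32 path removed, only the two /24 paths remain and the weight alone decides the next hop.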

I noticed that when the VM is on node #2, the /32 route is not advertised before the first ping (I checked the BGP routing table on pfSense).
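
To repeat this check without reading the table by eye, here is a rough Python sketch that parses output in the format shown above and reports whether a /32 path exists for a given VM IP. It assumes iBGP paths with an empty AS path, as in my tables, so that the weight is always the second-to-last column; with a non-empty AS path the column handling would need to change. The names `parse_bgp_table` and `has_host_route` are my own.

```python
import re

# Matches valid paths ("*"), with an optional network column (continuation
# lines for the same prefix leave it empty).
ROW = re.compile(
    r"^\s?\*(?P<best>[>= ])(?P<internal>[i ])"
    r"(?P<network>\d+\.\d+\.\d+\.\d+/\d+)?\s+"
    r"(?P<nexthop>\d+\.\d+\.\d+\.\d+)\s+(?P<rest>.+)$"
)


def parse_bgp_table(text):
    """Return one dict per path: network, nexthop, weight, best."""
    paths = []
    current = None
    for line in text.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # header, legend and summary lines
        current = match.group("network") or current
        fields = match.group("rest").split()
        paths.append({
            "network": current,
            "nexthop": match.group("nexthop"),
            "weight": int(fields[-2]),  # assumes an empty AS path
            "best": match.group("best") == ">",
        })
    return paths


def has_host_route(paths, ip):
    """True if any path advertises ip as a /32."""
    return any(path["network"] == f"{ip}/32" for path in paths)


TABLE = """\
    Network          Next Hop            Metric LocPrf Weight Path
 *>i10.10.198.0/24   10.10.170.32             0    100     10 ?
 * i                 10.10.170.31             0    100      0 ?
 *>i10.10.198.111/32 10.10.170.32                  100     10 i
"""

if __name__ == "__main__":
    parsed = parse_bgp_table(TABLE)
    print(has_host_route(parsed, "10.10.198.111"))
```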

spirit · Oct 1, 2024

1) Because of the primary exit node.
2) The primary exit-node option is optional, but I remember that the GUI had a bug that tried to force a primary. I thought it was fixed.

@shanreich
https://lists.proxmox.com/pipermail/pve-devel/2024-February/061924.html

2b) We can't force the VM IP to be announced only by the node where the VM is running. (The only way is to have upstream routers that support the EVPN protocol, so it's not possible with pfSense.)

shanreich · Oct 1, 2024

spirit said:
2) The primary exit-node option is optional, but I remember that the GUI had a bug that tried to force a primary. I thought it was fixed.

@shanreich
https://lists.proxmox.com/pipermail/pve-devel/2024-February/061924.html

I honestly thought that was already merged, I'll look into getting this in the next bump!

marcovalle · Oct 5, 2024

spirit said:
1) Because of the primary exit node.
2) The primary exit-node option is optional, but I remember that the GUI had a bug that tried to force a primary. I thought it was fixed.

@shanreich
https://lists.proxmox.com/pipermail/pve-devel/2024-February/061924.html

2b) We can't force the VM IP to be announced only by the node where the VM is running. (The only way is to have upstream routers that support the EVPN protocol, so it's not possible with pfSense.)

1/2) I observe the same behavior without a primary exit node set, and I still don't understand why, when the VM is on node #1 (which is the primary exit node), BGP advertises the route through node #2.

2b) It seems reasonable to me.

rentner · Oct 16, 2024

I can confirm this too on 8.2.7: the next hop advertised via eBGP (from the primary exit node via BGP to the outside router) is the /31 route from the node where the VM is running, not the IP of the primary exit node. Routing from outside into the Proxmox SDN cloud is not working; tcpdump confirmed that traffic from inside to outside leaves via the primary exit node correctly.
