Proxmox 7.0 SDN beta test

Question:

Why do you define the BGP controller with "ebgp:1", with your route reflector as peer, when you use the same ASN 65001 on both the nodes and the route reflector?

"ebgp:1" adds "neighbor BGP remote-as external", so the peer should have a different ASN.


If you just need a simple EVPN route reflector with iBGP (the same ASN on each Proxmox node), you don't even need to define a BGP controller in the SDN. Simply create the EVPN controller and add the route reflector as a peer (a minimal sketch follows the list below).



The BGP controller is mainly used for two things:

- if you want to use eBGP (a different ASN for each Proxmox node) for the EVPN;

- if you want to add an external BGP peer to a specific node (on an exit node, for example, to forward EVPN routes to a classic BGP router).
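
For example, a minimal controllers.cfg sketch for that iBGP + route-reflector case (the ASN 65001 and the route-reflector address 192.168.1.18 are just placeholder values, adjust them to your setup):

Code:
evpn: evpn400
        asn 65001
        peers 192.168.1.18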
 
The VMs' IPv4 /32 or IPv6 /128 addresses should be announced as EVPN type-2 routes.
It's possible to advertise the subnets defined in subnets.cfg with EVPN type-5 routes. It's not available in the GUI yet (I sent patches some months ago, but they haven't been applied; I need to check that), but it should be possible to add the option in zones.cfg.
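If you want to check which EVPN route types a node is actually announcing, vtysh can show them, for example:

Bash:
# all EVPN routes known to this node (type-2 MAC/IP, type-3 multicast, type-5 prefixes)
vtysh -c "show bgp l2vpn evpn route"
# only the type-5 prefix routes
vtysh -c "show bgp l2vpn evpn route type prefix"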
OK perfect! For what it's worth, here are my package versions (I haven't yet upgraded to 7.x as I was having issues with my Intel 10Gig NICs and still have to sort that out):

Code:
proxmox-ve: 6.4-1 (running kernel: 5.4.162-1-pve)
pve-manager: 6.4-13 (running version: 6.4-13/9f411e79)
pve-kernel-5.4: 6.4-12
pve-kernel-helper: 6.4-12
pve-kernel-5.4.162-1-pve: 5.4.162-2
pve-kernel-5.4.157-1-pve: 5.4.157-1
pve-kernel-5.4.151-1-pve: 5.4.151-1
pve-kernel-5.4.106-1-pve: 5.4.106-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.1.5-pve2~bpo10+1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: not correctly installed
ifupdown2: 3.0.0-1+pve4~bpo10
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.22-pve2~bpo10+1
libproxmox-acme-perl: 1.1.0
libproxmox-backup-qemu0: 1.1.0-1
libpve-access-control: 6.4-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.4-4
libpve-guest-common-perl: 3.1-5
libpve-http-server-perl: 3.2-3
libpve-network-perl: 0.6.0
libpve-storage-perl: 6.4-1
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.6-2
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.1.13-2
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.6-1
pve-cluster: 6.4-1
pve-container: 3.3-6
pve-docs: 6.4-2
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-4
pve-firmware: 3.3-2
pve-ha-manager: 3.1-1
pve-i18n: 2.3-1
pve-qemu-kvm: 5.2.0-6
pve-xtermjs: 4.7.0-3
qemu-server: 6.4-2
smartmontools: 7.2-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 2.0.7-pve1

On your route reflector, you should be able to redistribute the EVPN routes into your classic BGP routes.
Maybe it can help: I wrote some notes about an FRR route reflector last year:
https://git.proxmox.com/?p=pve-docs.git;a=blob_plain;f=vxlan-and-evpn.adoc;hb=HEAD
(see the Route Reflectors section at the end)
I think I took the info from here:
https://vincent.bernat.ch/en/blog/2017-vxlan-bgp-evpn#using-frr

I haven't tried it since then; it's almost the same as your config, but there are a few more options.
All of this is fantastic info, thanks for sharing! I had a good idea it was you who wrote the above documentation (on git); we were using it prior to the SDN, all REALLY useful. The SDN helps streamline this and makes deployments super easy (we're a service provider, so isolating subnets per VM is essential and can now be automated :)).

Yup, my bad. I was in a rush yesterday to get into a meeting and fired off that reply before I reviewed it, but I was overcomplicating everything.

Basically, all I needed was an external BGP peer to forward the EVPN routes to my upstream BGP router(s), as they don't support ECMP static routes, hence we want to eliminate the single point of failure on the exit node. I know the docs above mention using VRRP, but I wasn't sure how that would integrate with the SDN plugin and wanted to keep everything in one place.

This config achieved what we were looking to do:

Code:
root@myhostname-24:~# cat /etc/pve/sdn/*.cfg
evpn: evpn400
        asn 65001
        peers 192.168.1.24,192.168.1.25

bgp: bgpmyhostname-24
        asn 65001
        node myhostname-24
        peers 192.168.1.18
        ebgp 1

bgp: bgpmyhostname-25
        asn 65001
        node myhostname-25
        peers 192.168.1.18
        ebgp 1

subnet: evpn400-10.10.10.160-27
        vnet pubnet2
        gateway 10.10.10.190

vnet: pubnet2
        zone evpn400
        tag 14000

evpn: evpn400
        controller evpn400
        vrf-vxlan 10001
        exitnodes myhostname-25,myhostname-24
        ipam pve
        mac 9A:E1:F8:7C:C2:70
        mtu 1450

The EVPN subnet is now passed perfectly to 192.168.1.18 (this is a test environment; we're adding a redundant peer on the live setup)!
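
For anyone else following along, a couple of checks on the exit node can confirm the peering is up and what is being advertised to it (192.168.1.18 is the peer from the config above):

Bash:
# the external peer should show up as Established with a prefix count
vtysh -c "show bgp summary"
# prefixes advertised to the upstream router
vtysh -c "show ip bgp neighbors 192.168.1.18 advertised-routes"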

I cannot say enough good things about your work here @spirit - major kudos! Appreciate the support!
 
Ok, great!

Note that I no longer work on the Proxmox 6 version of the plugin.
Proxmox 7 has the new FRR 8 version with some EVPN fixes, and the SDN EVPN plugin also has some new features (like the advertise-subnets option).

I need to look at VRRP soon; I don't know yet whether I'll manage it with FRR directly or with keepalived. (Some users have requested it when they have non-BGP upstream routers with static routes.)
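
As a sketch, the advertise-subnets option should end up as a per-zone flag in zones.cfg on Proxmox 7, roughly like this (the exact key name is an assumption, so double-check before relying on it; the zone is the evpn400 example from earlier in this thread):

Code:
evpn: evpn400
        controller evpn400
        vrf-vxlan 10001
        advertise-subnets 1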
 
Thanks for letting me know! I think I'll start installing Proxmox 7 on my new clusters going forward and also look at upgrading some of my existing ones.

On that note, I played around with using VRRP with FRR but ran into some trouble with my macvlan devices, which I created on top of my Linux bridge that uses a Linux bond in active-backup mode. The multicast packets would only pass with the same virtual router ID, but kept creating loops, regardless of my STP settings (same switch VLAN). Changing them to use unique virtual router IDs solved the loop issue, but both stayed in Master mode. I tracked some of this down to my link-local IPv6 links, but eventually decided to just use keepalived.

For those interested in VRRP with keepalived:

Perform these steps on Proxmox exit nodes, making sure to modify keepalived.conf accordingly (i.e. master/backup):

Install:
Bash:
$ sudo apt-get install keepalived

Allow non-local IP binding:
Bash:
$ echo "net.ipv4.ip_nonlocal_bind=1" | sudo tee -a /etc/sysctl.conf

Apply the sysctl settings:
Bash:
$ sudo sysctl -p

Configure:
Bash:
$ sudo nano /etc/keepalived/keepalived.conf

Bash:
# Create VRRP instance
vrrp_instance VRRP_ALPHA {

    # The interface keepalived will manage
    interface vmbr0

    # The initial state to transition to. This option isn't
    # really all that valuable, since an election will occur
    # and the host with the highest priority will become
    # the master. The priority is controlled with the priority
    # configuration directive. MASTER or BACKUP
    state MASTER

    # The virtual router id number to assign the routers to
    virtual_router_id 1

    # The priority to assign to this device. This controls
    # who will become the MASTER and BACKUP for a given
    # VRRP instance. Higher has more weight 1-255.
    priority 255

    # How many seconds to wait until a gratuitous arp is sent
    garp_master_delay 2

    # How often to send out VRRP advertisements
    advert_int 1

    # IP Address of this device
    unicast_src_ip 192.168.1.1

    # IP Address of the peer device
    unicast_peer {
        192.168.1.2
    }

    # Authenticate the peers with a shared secret
    authentication {
        auth_type AH
        auth_pass monkey
    }

    # The virtual IP addresses to float between nodes. The
    # label statement can be used to bring an interface
    # online to represent the virtual IP.
    virtual_ipaddress {
        192.168.1.3 dev vmbr0 label vmbr0:vip
    }
}

Start service:
Bash:
$ sudo service keepalived start

Check status for errors:
Bash:
$ sudo systemctl status keepalived
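
To verify the failover, check which node currently holds the VIP and then stop keepalived on the master (interface and VIP as in the example config above):

Bash:
# the active node should show the 192.168.1.3 address with the vmbr0:vip label
ip -4 addr show dev vmbr0
# stop keepalived on the current master and watch the VIP move to the backup node
sudo systemctl stop keepalived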
 
Hi, I am not sure whether this is the right place to ask. I have a problem with DNS queries from a VM to a server outside the node. It seems that the DNS reply message gets mangled under the EVPN zone.

Topo: (172.20.132.1)VM --- Node --- DNS Server(10.88.0.1)

The VM network is attached to an EVPN zone, and there is an SNAT rule for traffic leaving the node.

iptables SNAT rule
Code:
Chain POSTROUTING (policy ACCEPT 351 packets, 33491 bytes)
 pkts bytes target     prot opt in     out     source               destination
 9005  654K MASQUERADE  all  --  *      eno1    172.20.128.0/21      0.0.0.0/0

The DNS reply message is mangled: the DNS server's source port is rewritten from 53 to something else (475 in this example).

capture with tcpdump on VM
Bash:
21:32:35.382963 IP 172.20.132.1.39787 > 10.88.0.1.53: 14240+ A? ntp.ubuntu.com. (32)
21:32:35.579912 IP 10.88.0.1.475 > 172.20.132.1.39787: UDP, length 96
21:32:35.579952 IP 172.20.132.1 > 10.88.0.1: ICMP 172.20.132.1 udp port 39787 unreachable, length 132

When I attached the VM network to a simple zone instead, the DNS reply message was OK.

Thanks for your answer.
 
I'm currently on holiday; I'll check that next week.
 
Hello, is there any tooling to troubleshoot VXLAN zones? I have the configuration applied on two nodes, but once I move a VM to the second node, connectivity with a second VM that stays on the original node is lost.
 
I don't think there is any tool for debugging kernel VXLAN tunnels yet.
Are the two VMs on the same vnet?
Do you use the Proxmox firewall? If yes, is port 4789 open between the two nodes?
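
To narrow it down, you can also watch for the VXLAN traffic on the physical NIC and check the entries learned on the VXLAN device, for example (replace eno1 with your uplink interface):

Bash:
# VXLAN between the nodes uses UDP port 4789 by default
tcpdump -ni eno1 udp port 4789
# remote VTEP / MAC entries learned on the vxlan devices
bridge fdb show | grep vxlan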
 

Interestingly enough, with tcpdump I see outgoing traffic on both sides but no incoming traffic. I haven't set up any firewall rules so far. Ping works with "don't fragment" and size=1600 bytes (I have MTU=9000 on the NIC and switch side).

edit: added a rule allowing UDP traffic on 4789, but no joy.
 
Ping works with "no fragment" and size=1600 bytes (I have MTU=9000 on the NIC and switch side).


Interestingly enough, with tcpdump I see outgoing traffic on both sides but no incoming traffic.
That"s really strange indeed.

The switches also run VXLAN as TEPs between themselves. Could that be the issue?

I don't know if that's a problem (maybe they "intercept" the VXLAN frames??). Have you tried with a simple switch (or a crossover cable between the two servers)?

For testing, maybe you can try to enable the IPsec tunnel, to hide the VXLAN traffic from your switches:
https://pve.proxmox.com/pve-docs/chapter-pvesdn.html#_vxlan_ipsec_encryption
 
That"s really strange indeed.



I don't known if it's a problem.. (maybe they "intercept" vxlan frame ??) . do you have tried with a simple switch ? (or a cross-cable between 2 servers).

For testing, maybe can you try to enable ipsec tunnel, to hide vxlan to your switches ?
https://pve.proxmox.com/pve-docs/chapter-pvesdn.html#_vxlan_ipsec_encryption

It's a remote location, I can't change the connectivity. I will look into the IPsec alternative; the DCN team assures me the switches shouldn't mess with the VXLAN traffic.
 
There is a lack of examples for using the SDN with an external network, like a LAN. How do you manage connectivity for LAN user -> Router -> Proxmox -> SDN -> VM?
 
With EVPN?

(Yes, I know, I need to provide some examples and diagrams ^_^)

It really depends on what your router is able to do (BGP? native EVPN? only static routes? ...).
 
Even with static routes I have issues.
I managed to create a simple zone with a VNet, of course, and one subnet for it, 10.0.101.0/24 with gateway 10.0.101.254.

And that's for the VMs in Proxmox.
Proxmox is connected on the SRV VLAN 11 and has 10.0.1.1.
The MikroTik router is 10.0.1.30 in that VLAN.

The user network (my computer) is behind a UniFi switch and has the address range 192.168.0.0/24.
The switch is connected to the MikroTik using InterVLAN (the UniFi default method) on network 10.255.253.0/24.

So basically ping works like this:
I can ping the InterVLAN addresses of the switch, the router, and Proxmox (10.0.1.1).
I can also ping the simple zone VNet subnet gateway, which is 10.0.101.254.

Code:
traceroute to 10.0.101.254 (10.0.101.254), 64 hops max, 52 byte packets
 1  192.168.0.254 (192.168.0.254)  1.807 ms  0.903 ms  0.868 ms
 2  10.255.253.1 (10.255.253.1)  0.208 ms  0.190 ms  0.176 ms
 3  10.0.101.254 (10.0.101.254)  0.358 ms  0.367 ms  0.363 ms

But the problem is between me and the VM: we can't ping each other.
So 192.168.0.1 (me) can't reach 10.0.101.1.

Code:
traceroute to 10.0.101.1 (10.0.101.1), 64 hops max, 52 byte packets
 1  192.168.0.254 (192.168.0.254)  1.367 ms  1.269 ms  0.922 ms
 2  10.255.253.1 (10.255.253.1)  0.247 ms  0.201 ms  0.184 ms
 3  10.0.1.1 (10.0.1.1)  0.382 ms  0.363 ms  0.339 ms
 4  *

The routing table on Proxmox is:
Code:
konrad@pve:~$ ip r
default via 10.0.1.30 dev vmbr0.11
10.0.1.0/27 dev vmbr0.11 proto kernel scope link src 10.0.1.1
10.0.101.0/24 dev mynet1 proto kernel scope link src 10.0.101.254
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown

On the MT:
Code:
 3 ADC  10.0.1.0/27        10.0.1.30       SRV_11                    0
 4 ADC  10.0.10.0/24       10.0.10.254     EX_110                    0
 5 A S  10.0.101.0/24                      10.0.1.1                  1
 6 ADC  10.255.0.0/27      10.255.0.30     MGMT_10                   0
 7 ADC  10.255.253.0/24    10.255.253.1    InterVLAN                 0
 8 ADC  10.255.255.0/24    10.255.255.1    IPMI                      0
10 A S  192.168.0.0/24                     10.255.253.2              1
11 A S  192.168.1.0/24                     10.255.253.2              1

So basically there is probably an issue with routes on Proxmox.
I was wondering whether this was an issue with the switch, because 192.168.0.0/24 is created on the switch, not on the MT, but then ping to the VNet subnet gateway 10.0.101.254 wouldn't work either, yet it does.

As for BGP and BGP-EVPN: with a single BGP-EVPN controller it won't connect to the MT, but when I also create a BGP controller on Proxmox it connects to the MT, I receive the VNet subnet routes on the MT, and advertising a network from the MT in the BGP settings works too; I see that network in the Proxmox host routing table, but the VM can't reach it.

It's strange, because RouterOS v6 has no EVPN capability but RouterOS v7 does. But why, then, is that additional BGP controller available in the Proxmox SDN options? I'm confused.
 
Mmm, I'll check that. Have you tried testing with proxy ARP enabled? (I'm not sure; until the IP allocation on the NIC is done, maybe I can route up to the NIC.)


As for BGP and BGP-EVPN: with a single BGP-EVPN controller it won't connect to the MT, but when I also create a BGP controller on Proxmox it connects to the MT, I receive the VNet subnet routes on the MT, and advertising a network from the MT in the BGP settings works too; I see that network in the Proxmox host routing table, but the VM can't reach it.

It's strange, because RouterOS v6 has no EVPN capability but RouterOS v7 does.

If your router doesn't do EVPN, you need to define exit nodes, and these nodes will route to your MikroTik router.
(You can also define an extra BGP peer (with an extra BGP controller) to announce the EVPN routes through BGP to your MikroTik router.)


If your router does do EVPN (with a full symmetric L3VNI implementation), you don't need to define exit nodes on your Proxmox nodes; you can configure your MikroTik router as the exit node (it should announce the 0.0.0.0/0 route).

But why, then, is that additional BGP controller available in the Proxmox SDN options? I'm confused.
The additional BGP controller is for when you need to tune specific BGP options per host (like a different ASN for eBGP, different peers, etc.).

For a basic implementation (iBGP EVPN full mesh), you only need to declare the EVPN controller.
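
For the proxy ARP test, a quick non-persistent way is a sysctl on the vnet interface, for example (mynet1 is the vnet from your routing table):

Bash:
# enable proxy ARP on the simple-zone vnet interface (temporary, for testing only)
sysctl -w net.ipv4.conf.mynet1.proxy_arp=1
# check the current value
sysctl net.ipv4.conf.mynet1.proxy_arp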
 


So I have added this on the MikroTik:
Code:
On MikroTik
/routing bgp peer
add address-families=ip,l2vpn in-filter=pve-in name=peer1 out-filter=pve-out remote-address=10.0.1.1 remote-as=65000 ttl=default

/routing bgp advertisements print
PEER     PREFIX               NEXTHOP          AS-PATH                                                           ORIGIN     LOCAL-PREF
peer1    192.168.0.0/16       10.0.1.30                                                                          igp               100

/ip route pr where dst-address=10.0.101.0/24
Flags: X - disabled, A - active, D - dynamic, C - connect, S - static, r - rip, b - bgp, o - ospf, m - mme,
B - blackhole, U - unreachable, P - prohibit
 #      DST-ADDRESS        PREF-SRC        GATEWAY            DISTANCE
 0 ADb  10.0.101.0/24                      10.0.1.1                200
That route to 10.0.101.0/24 is received through the BGP session with 10.0.1.1 (the Proxmox BGP controller). Seems to be working? Seems so.

So now on Proxmox

1. SDN/Options
- EVPN Controller (screenshot attached)
- BGP Controller (screenshot attached)

2. SDN/Zones (screenshot attached)

3. SDN/VNets (screenshot attached)

4. SDN/Subnets (screenshot attached)

So ping doesn't get through to 10.0.101.254 from the MikroTik (10.0.1.30):


Code:
ping 10.10.101.254 src-address=10.0.1.30
  SEQ HOST                                     SIZE TTL TIME  STATUS
    0 10.10.101.254                                           timeout
    1 10.10.101.254                                           timeout
    2 10.10.101.254                                           timeout

Traceroute from my computer also stops there:


Code:
traceroute to 10.0.101.254 (10.0.101.254), 64 hops max, 52 byte packets
 1  192.168.1.254 (192.168.1.254)  2.244 ms  1.582 ms  1.727 ms
 2  10.255.253.1 (10.255.253.1)  1.626 ms  2.425 ms  0.837 ms
 3  10.0.1.1 (10.0.1.1)  1.299 ms  2.039 ms  1.562 ms
 4  *^C


1. is the MikroTik on the USER VLAN (192.168.1.0/24); btw, all gateways are on .254 in each subnet/VLAN, with small exceptions
2. is the MikroTik on the InterVLAN that connects the MikroTik with the UniFi switch (10.255.253.2)
3. 10.0.1.1 is the Proxmox server on a VLAN attached from the MikroTik (the switch just passes the VLAN header through)

The networks 192.168.0.0/24 and 192.168.1.0/24 are created on the UniFi switch, with their own DHCP on that switch.

My guess is that something isn't correct on the Proxmox node, as every ICMP packet stops there, on the Proxmox server (10.0.1.1).
 
Sorry, I'm currently on holiday; I only have access from my phone with a poor connection.


If the MikroTik doesn't support EVPN correctly
---------------------------------------------------------
For the EVPN controller peers, you need to use all the Proxmox host IPs, to exchange the EVPN routes.

Then, for the exit node (PVE), you add the BGP controller with the additional MikroTik peer (exactly like on your screenshot).


If the MikroTik supports EVPN correctly (with full symmetric L3VNI)
--------------------------------------------------------------------------------------

For the EVPN controller peers, define all the Proxmox host IPs + the MikroTik IP.
Don't define an exit node on the zone.
Configure your MikroTik to announce an EVPN type-5 route 0.0.0.0/0 + an L3VNI VXLAN interface.
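
To verify which of the two setups you end up with, something like this on a Proxmox node should show the EVPN peerings and any type-5 default route learned from the router (replace vrf_myzone with your zone's VRF name):

Bash:
# are the EVPN sessions with the peers (Proxmox hosts and/or the MikroTik) established?
vtysh -c "show bgp l2vpn evpn summary"
# any type-5 prefix routes (e.g. a 0.0.0.0/0 announced by the router)?
vtysh -c "show bgp l2vpn evpn route type prefix"
# what ended up in the zone's VRF routing table
vtysh -c "show ip route vrf vrf_myzone"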
 
I'm trying to set up the following:
a 3-node Proxmox cluster using EVPN between the nodes. This works as expected.
Now I want to uplink the EVPN to a FortiGate via BGP.
I have added a BGP controller to one node and set this node as the exit node.
Routes are correctly advertised.
But traffic sent from an LXC container leaves on the wrong interface and is not sent to the correct gateway:

PCAP:
10:32:49.632523 veth124i0 P IP 10.182.3.100 > one.one.one.one: ICMP echo request, id 57889, seq 6, length 64
10:32:49.632530 fwln124i0 Out IP 10.182.3.100 > one.one.one.one: ICMP echo request, id 57889, seq 6, length 64
10:32:49.632531 fwpr124p0 P IP 10.182.3.100 > one.one.one.one: ICMP echo request, id 57889, seq 6, length 64
10:32:49.632531 evpn01 In IP 10.182.3.100 > one.one.one.one: ICMP echo request, id 57889, seq 6, length 64
10:32:49.632545 vmbr0_182 Out IP 10.182.3.100 > one.one.one.one: ICMP echo request, id 57889, seq 6, length 64
10:32:49.632551 eno2 Out IP 10.182.3.100 > one.one.one.one: ICMP echo request, id 57889, seq 6, length 64

As you can see, traffic is sent out via vmbr0_182, the interface with the default gateway, but it should be sent to 100.111.64.1, which is directly connected on vmbr0_164.

IP Route:
default via 10.182.2.1 dev vmbr0_182 proto kernel onlink
default nhid 149 via 100.111.64.1 dev vmbr0_164 proto bgp metric 20
10.42.1.0/24 nhid 149 via 100.111.64.1 dev vmbr0_164 proto bgp metric 20
10.42.42.0/24 nhid 149 via 100.111.64.1 dev vmbr0_164 proto bgp metric 20
10.42.55.0/24 nhid 149 via 100.111.64.1 dev vmbr0_164 proto bgp metric 20
10.182.0.0/24 nhid 150 via 100.111.64.2 dev vmbr0_164 proto bgp metric 20
10.182.1.0/24 nhid 150 via 100.111.64.2 dev vmbr0_164 proto bgp metric 20
10.182.2.0/24 dev vmbr0_182 proto kernel scope link src 10.182.2.101
10.212.134.254/31 nhid 149 via 100.111.64.1 dev vmbr0_164 proto bgp metric 20
100.111.64.0/29 dev vmbr0_164 proto kernel scope link src 100.111.64.3
100.111.64.10/31 nhid 149 via 100.111.64.1 dev vmbr0_164 proto bgp metric 20
100.111.64.12/31 nhid 149 via 100.111.64.1 dev vmbr0_164 proto bgp metric 20
151.248.130.0/24 nhid 149 via 100.111.64.1 dev vmbr0_164 proto bgp metric 20
192.168.101.0/24 dev vmbr1_101 proto kernel scope link src 192.168.101.101
192.168.102.0/24 dev vmbr1_102 proto kernel scope link src 192.168.102.101

FRR routing table:
root@chsfl1-cl01-pve01:~# vtysh -c "sh ip route"
B>* 0.0.0.0/0 [20/0] via 100.111.64.1, vmbr0_164, weight 1, 00:49:13
B>* 10.42.1.0/24 [20/0] via 100.111.64.1, vmbr0_164, weight 1, 00:59:57
B>* 10.42.42.0/24 [20/10] via 100.111.64.1, vmbr0_164, weight 1, 00:59:57
B>* 10.42.55.0/24 [20/10] via 100.111.64.1, vmbr0_164, weight 1, 00:59:57
B>* 10.182.0.0/24 [20/0] via 100.111.64.2, vmbr0_164, weight 1, 00:59:57
B>* 10.182.1.0/24 [20/0] via 100.111.64.2, vmbr0_164, weight 1, 00:59:57
B 10.182.2.0/24 [20/0] via 100.111.64.1, vmbr0_164, weight 1, 00:59:57
C>* 10.182.2.0/24 is directly connected, vmbr0_182, 1d14h32m
B>* 10.182.3.0/24 [20/0] is directly connected, evpn01 (vrf vrf_evpn), weight 1, 00:49:14
B>* 10.212.134.254/31 [20/0] via 100.111.64.1, vmbr0_164, weight 1, 00:59:57
B 100.111.64.0/29 [20/0] via 100.111.64.1 inactive, weight 1, 00:59:57
C>* 100.111.64.0/29 is directly connected, vmbr0_164, 1d14h31m
B>* 100.111.64.10/31 [20/0] via 100.111.64.1, vmbr0_164, weight 1, 00:59:57
B>* 100.111.64.12/31 [20/0] via 100.111.64.1, vmbr0_164, weight 1, 00:59:57
B>* 151.248.130.0/24 [20/0] via 100.111.64.1, vmbr0_164, weight 1, 00:59:57
C>* 192.168.101.0/24 is directly connected, vmbr1_101, 1d14h32m
C>* 192.168.102.0/24 is directly connected, vmbr1_102, 1d14h32m

How can I get the SDN to send traffic to the gateway received by FRR via BGP instead of to the default gateway of the PVE host?
 
Mmm. That's a good question.
Can't you remove the default gateway of the host in /etc/network/interfaces and let the host use the announced default route too?


If not, the only way is to change the frr.conf on the exit node
and move the extra peer to the VRF section directly.
Currently, for simplicity and to get it working for users with a simple default gateway, I'm doing it in the default VRF and leaking the routes from the default VRF to the EVPN VRF.

We could try to edit frr.conf and:

move the "neighbor BGP ..." lines to the "router bgp ... vrf ..." section,

remove "import vrf ...",
remove "redistribute connected",

then restart frr (rough sketch below).

(I'm currently on holiday with a poor internet connection and can't help too much, but if it works, I could look at adding an extra option in the GUI for this.)
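
For example, a rough, untested sketch of what that frr.conf change could look like on the exit node (vrf_evpn and 100.111.64.1 are taken from your output above; the ASNs 65001 and 65010 are just placeholders):

Code:
! sketch only -- untested fragment for /etc/frr/frr.conf on the exit node
! 65001 = placeholder local ASN, 65010 = placeholder ASN of the FortiGate
router bgp 65001 vrf vrf_evpn
 neighbor 100.111.64.1 remote-as 65010
 !
 address-family ipv4 unicast
  neighbor 100.111.64.1 activate
 exit-address-family
!
! and in the default "router bgp" section: drop the "neighbor BGP ..." lines for that peer,
! the "import vrf ..." and the "redistribute connected" statements, then restart frr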
 
