EVPN SDN issues after Upgrade Proxmox VE from 7 to 8

ddimarx
Oct 11, 2022
Hello all,

TL;DR: I upgraded my Proxmox VE cluster from 7.4-16 to 8.0.3 and my SDN/EVPN setup stopped working properly. I fixed some issues by removing a BGP default route, unchecking "Disable arp-nd suppression" on the EVPN zone, and setting rp_filter to 0. But I still have random connectivity problems between VMs and LXCs on different nodes.

I'm running multiple 3-node clusters with Proxmox VE installed on top of plain Debian Bullseye (without using the Proxmox VE ISO), on version 7.4-16 with the no-subscription repository. I recently upgraded one 3-node (lab/test) PVE cluster to 8.0.3 successfully, following this guide: https://pve.proxmox.com/wiki/Upgrade_from_7_to_8

I was already using SDN with EVPN in the cluster, and it worked fine prior to the upgrade: VMs and LXCs on different nodes of the cluster, connected to the same EVPN zone and VNet, could reach/ping each other, and at the same time they could reach the internet without issues via the VNet gateway with SNAT enabled.

But after the upgrade to 8.0.3, VMs and LXCs connected to the same EVPN zone and VNet can no longer reach/ping each other across nodes; they can only ping guests on the same cluster node, and none of the VMs and LXCs can reach the internet.

I've noticed the following in the 8.0.3 routing table of each node

Code:
root@labpve2:/etc/frr# ip ro sh
default via 10.2.0.1 dev vmbr0 proto kernel onlink
default nhid 46 proto bgp metric 20
        nexthop via 10.2.0.21 dev vrfbr_evpnzone weight 1 onlink
        nexthop via 10.2.0.23 dev vrfbr_evpnzone weight 1 onlink
10.2.0.0/24 dev vmbr0 proto kernel scope link src 10.2.0.22
10.2.3.0/24 dev eth1 proto kernel scope link src 10.2.3.8
10.10.10.0/24 nhid 24 dev evpnet10 proto bgp metric 20
10.10.10.4 nhid 33 via 10.2.0.21 dev vrfbr_evpnzone proto bgp metric 20 onlink
10.10.10.12 nhid 33 via 10.2.0.21 dev vrfbr_evpnzone proto bgp metric 20 onlink
10.10.10.19 nhid 34 via 10.2.0.23 dev vrfbr_evpnzone proto bgp metric 20 onlink
10.10.10.31 nhid 34 via 10.2.0.23 dev vrfbr_evpnzone proto bgp metric 20 onlink
10.10.10.100 nhid 33 via 10.2.0.21 dev vrfbr_evpnzone proto bgp metric 20 onlink
192.168.200.0/24 dev zt2k2mncp5 proto kernel scope link src 192.168.200.31

Code:
root@labpve3:/etc/frr# ip ro sh
default via 10.2.0.1 dev vmbr0 proto kernel onlink 
default nhid 50 proto bgp metric 20 
        nexthop via 10.2.0.21 dev vrfbr_evpnzone weight 1 onlink 
        nexthop via 10.2.0.22 dev vrfbr_evpnzone weight 1 onlink 
10.2.0.0/24 dev vmbr0 proto kernel scope link src 10.2.0.23 
10.2.3.0/24 dev eth1 proto kernel scope link src 10.2.3.9 
10.10.10.0/24 nhid 24 dev evpnet10 proto bgp metric 20 
10.10.10.4 nhid 40 via 10.2.0.21 dev vrfbr_evpnzone proto bgp metric 20 onlink 
10.10.10.10 nhid 41 via 10.2.0.22 dev vrfbr_evpnzone proto bgp metric 20 onlink 
10.10.10.12 nhid 40 via 10.2.0.21 dev vrfbr_evpnzone proto bgp metric 20 onlink 
10.10.10.25 nhid 41 via 10.2.0.22 dev vrfbr_evpnzone proto bgp metric 20 onlink 
10.10.10.29 nhid 41 via 10.2.0.22 dev vrfbr_evpnzone proto bgp metric 20 onlink 
10.10.10.32 nhid 41 via 10.2.0.22 dev vrfbr_evpnzone proto bgp metric 20 onlink 
10.10.10.40 nhid 41 via 10.2.0.22 dev vrfbr_evpnzone proto bgp metric 20 onlink 
10.10.10.43 nhid 41 via 10.2.0.22 dev vrfbr_evpnzone proto bgp metric 20 onlink 
10.10.10.100 nhid 40 via 10.2.0.21 dev vrfbr_evpnzone proto bgp metric 20 onlink 
192.168.200.0/24 dev zt2k2mncp5 proto kernel scope link src 192.168.200.32


Compared with that, the working 7.4-16 routing table had only one default gateway:
Code:
default via 10.2.0.1 dev vmbr0 proto kernel onlink

and no additional default gateway propagated through BGP.

So I managed to remove this BGP default gateway by running the following command on each node:

Code:
vtysh -c "configure terminal" -c "router bgp 65000 vrf vrf_evpnzone" -c "address-family l2vpn evpn" -c "no default-originate ipv4" -c "no default-originate ipv6"

But every time a cluster node restarted or the SDN was reloaded I had to re-run it, so I made the change permanent by creating a frr.conf.local file on each node under /etc/frr:

root@labpve1:/etc/frr# cp frr.conf frr.conf.local

and then editing frr.conf.local so that, under "address-family l2vpn evpn" of the VRF, the "default-originate ipv4" and "default-originate ipv6" lines are prefixed with "no":

Code:
root@labpve1:/etc/frr# cat  frr.conf.local
frr version 8.5.1
frr defaults datacenter
hostname labpve1
log syslog informational
service integrated-vtysh-config
!
!
vrf vrf_evpnzone
 vni 10000
exit-vrf
!
router bgp 65000
 bgp router-id 10.2.0.21
 no bgp default ipv4-unicast
 coalesce-time 1000
 neighbor VTEP peer-group
 neighbor VTEP remote-as 65000
 neighbor VTEP bfd
 neighbor 10.2.0.22 peer-group VTEP
 neighbor 10.2.0.23 peer-group VTEP
 !
 address-family ipv4 unicast
  import vrf vrf_evpnzone
 exit-address-family
 !
 address-family ipv6 unicast
  import vrf vrf_evpnzone
 exit-address-family
 !
 address-family l2vpn evpn
  neighbor VTEP route-map MAP_VTEP_IN in
  neighbor VTEP route-map MAP_VTEP_OUT out
  neighbor VTEP activate
  advertise-all-vni
 exit-address-family
exit
!
router bgp 65000 vrf vrf_evpnzone
 bgp router-id 10.2.0.21
 !
 address-family ipv4 unicast
  redistribute connected
 exit-address-family
 !
 address-family ipv6 unicast
  redistribute connected
 exit-address-family
 !
 address-family l2vpn evpn
  no default-originate ipv4
  no default-originate ipv6
 exit-address-family
exit
!
route-map MAP_VTEP_IN deny 1
 match evpn route-type prefix
exit
!
route-map MAP_VTEP_IN permit 2
exit
!
route-map MAP_VTEP_OUT permit 1
exit
!
line vty
!

After this change, the routing table on the 8.0.3 cluster nodes looks like it did on 7.4-16 again:

Code:
root@labpve1:/etc/frr# ip ro sh
default via 10.2.0.1 dev vmbr0 proto kernel onlink
10.2.0.0/24 dev vmbr0 proto kernel scope link src 10.2.0.21
10.2.3.0/24 dev eth1 proto kernel scope link src 10.2.3.7
10.10.10.0/24 nhid 81 dev evpnet10 proto bgp metric 20
10.10.10.10 nhid 84 via 10.2.0.22 dev vrfbr_evpnzone proto bgp metric 20 onlink
10.10.10.19 nhid 85 via 10.2.0.23 dev vrfbr_evpnzone proto bgp metric 20 onlink
10.10.10.20 nhid 84 via 10.2.0.22 dev vrfbr_evpnzone proto bgp metric 20 onlink
10.10.10.25 nhid 84 via 10.2.0.22 dev vrfbr_evpnzone proto bgp metric 20 onlink
10.10.10.26 nhid 85 via 10.2.0.23 dev vrfbr_evpnzone proto bgp metric 20 onlink
10.10.10.29 nhid 84 via 10.2.0.22 dev vrfbr_evpnzone proto bgp metric 20 onlink
10.10.10.31 nhid 85 via 10.2.0.23 dev vrfbr_evpnzone proto bgp metric 20 onlink
10.10.10.32 nhid 84 via 10.2.0.22 dev vrfbr_evpnzone proto bgp metric 20 onlink
10.10.10.40 nhid 84 via 10.2.0.22 dev vrfbr_evpnzone proto bgp metric 20 onlink
10.10.10.43 nhid 84 via 10.2.0.22 dev vrfbr_evpnzone proto bgp metric 20 onlink
10.10.10.204 nhid 84 via 10.2.0.22 dev vrfbr_evpnzone proto bgp metric 20 onlink
192.168.200.0/24 dev zt2k2mncp5 proto kernel scope link src 192.168.200.30

Code:
root@labpve2:/etc/frr# ip ro sh
default via 10.2.0.1 dev vmbr0 proto kernel onlink
10.2.0.0/24 dev vmbr0 proto kernel scope link src 10.2.0.22
10.2.3.0/24 dev eth1 proto kernel scope link src 10.2.3.8
10.10.10.0/24 nhid 24 dev evpnet10 proto bgp metric 20
10.10.10.4 nhid 32 via 10.2.0.21 dev vrfbr_evpnzone proto bgp metric 20 onlink
10.10.10.12 nhid 32 via 10.2.0.21 dev vrfbr_evpnzone proto bgp metric 20 onlink
10.10.10.19 nhid 33 via 10.2.0.23 dev vrfbr_evpnzone proto bgp metric 20 onlink
10.10.10.26 nhid 33 via 10.2.0.23 dev vrfbr_evpnzone proto bgp metric 20 onlink
10.10.10.31 nhid 33 via 10.2.0.23 dev vrfbr_evpnzone proto bgp metric 20 onlink
10.10.10.100 nhid 32 via 10.2.0.21 dev vrfbr_evpnzone proto bgp metric 20 onlink
192.168.200.0/24 dev zt2k2mncp5 proto kernel scope link src 192.168.200.31

Code:
root@labpve3:/etc/frr# ip ro sh
default via 10.2.0.1 dev vmbr0 proto kernel onlink
10.2.0.0/24 dev vmbr0 proto kernel scope link src 10.2.0.23
10.2.3.0/24 dev eth1 proto kernel scope link src 10.2.3.9
10.10.10.0/24 nhid 76 dev evpnet10 proto bgp metric 20
10.10.10.4 nhid 81 via 10.2.0.21 dev vrfbr_evpnzone proto bgp metric 20 onlink
10.10.10.10 nhid 80 via 10.2.0.22 dev vrfbr_evpnzone proto bgp metric 20 onlink
10.10.10.12 nhid 81 via 10.2.0.21 dev vrfbr_evpnzone proto bgp metric 20 onlink
10.10.10.20 nhid 80 via 10.2.0.22 dev vrfbr_evpnzone proto bgp metric 20 onlink
10.10.10.25 nhid 80 via 10.2.0.22 dev vrfbr_evpnzone proto bgp metric 20 onlink
10.10.10.29 nhid 80 via 10.2.0.22 dev vrfbr_evpnzone proto bgp metric 20 onlink
10.10.10.32 nhid 80 via 10.2.0.22 dev vrfbr_evpnzone proto bgp metric 20 onlink
10.10.10.35 nhid 80 via 10.2.0.22 dev vrfbr_evpnzone proto bgp metric 20 onlink
10.10.10.40 nhid 80 via 10.2.0.22 dev vrfbr_evpnzone proto bgp metric 20 onlink
10.10.10.43 nhid 80 via 10.2.0.22 dev vrfbr_evpnzone proto bgp metric 20 onlink
10.10.10.100 nhid 81 via 10.2.0.21 dev vrfbr_evpnzone proto bgp metric 20 onlink
10.10.10.204 nhid 80 via 10.2.0.22 dev vrfbr_evpnzone proto bgp metric 20 onlink
192.168.200.0/24 dev zt2k2mncp5 proto kernel scope link src 192.168.200.32

After these changes all the VMs and LXCs were able to reach the internet, but they still could not reach/ping each other across nodes; they could only ping guests on the same cluster node.
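
To narrow down where the cross-node traffic dies, comparing the EVPN state on the source and destination nodes can help. A diagnostic sketch (standard FRR/iproute2 commands; the VNI and interface names are taken from my SDN config shown further down):

Code:
# BGP sessions to the other VTEPs should be Established
vtysh -c "show bgp l2vpn evpn summary"
# type-2 (MAC/IP) routes for the guests should show up on every node
vtysh -c "show bgp l2vpn evpn route type macip"
# local vs. remote MACs learned for the VNet's VNI
vtysh -c "show evpn mac vni 11000"
# forwarding entries on the VNet's VXLAN device
bridge fdb show dev vxlan_evpnet10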

Here is the network configuration of node 1:

Code:
root@labpve1:~# cat /etc/network/interfaces
# network interface settings; autogenerated
# Please do NOT modify this file directly, unless you know what
# you're doing.
#
# If you want to manage parts of the network configuration manually,
# please utilize the 'source' or 'source-directory' directives to do
# so.
# PVE will preserve these directives, but will NOT read its network
# configuration from sourced files, so do not attempt to move any of
# the PVE managed interfaces into external files!

auto lo
iface lo inet loopback

iface enP57647s1 inet manual

iface eth0 inet manual

iface enP55469s1 inet manual

auto eth1
iface eth1 inet static
        address 10.2.3.7/24

auto vmbr0
iface vmbr0 inet static
        address 10.2.0.21/24
        gateway 10.2.0.1
        bridge-ports eth0
        bridge-stp off
        bridge-fd 0
        post-up   echo 1 > /proc/sys/net/ipv4/ip_forward
        post-up   echo 1 > /proc/sys/net/ipv4/conf/vmbr0/proxy_arp
        post-up iptables -t nat -A PREROUTING -i vmbr0 -p tcp -m multiport --dport 80,443,2222,3379,3389,5555,8007,5050 -j DNAT --to 10.10.10.10
        post-up iptables -t nat -A PREROUTING -i vmbr0 -p tcp -m multiport --dport 25,587,465,110,143,993 -j DNAT --to 10.10.10.10
        post-down iptables -t nat -D PREROUTING -i vmbr0 -p tcp -m multiport --dport 80,443,2222,3379,3389,5555,8007,5050 -j DNAT --to 10.10.10.10
        post-down iptables -t nat -D PREROUTING -i vmbr0 -p tcp -m multiport --dport 25,587,465,110,143,993 -j DNAT --to 10.10.10.10
source-directory /etc/network/interfaces.d
source-directory /run/network/interfaces.d
source /etc/network/interfaces.d/*


Here is the EVPN/SDN configuration:

Code:
root@labpve1:~# cat  /etc/network/interfaces.d/sdn
#version:229

auto evpnet10
iface evpnet10
        address 10.10.10.1/24
        post-up iptables -t nat -A POSTROUTING -s '10.10.10.0/24' -o vmbr0 -j SNAT --to-source 10.2.0.21
        post-down iptables -t nat -D POSTROUTING -s '10.10.10.0/24' -o vmbr0 -j SNAT --to-source 10.2.0.21
        post-up iptables -t raw -I PREROUTING -i fwbr+ -j CT --zone 1
        post-down iptables -t raw -D PREROUTING -i fwbr+ -j CT --zone 1
        hwaddress 32:04:D8:3A:7F:E9
        bridge_ports vxlan_evpnet10
        bridge_stp off
        bridge_fd 0
        mtu 1450
        ip-forward on
        arp-accept on
        vrf vrf_evpnzone

auto vrf_evpnzone
iface vrf_evpnzone
        vrf-table auto
        post-up ip route del vrf vrf_evpnzone unreachable default metric 4278198272

auto vrfbr_evpnzone
iface vrfbr_evpnzone
        bridge-ports vrfvx_evpnzone
        bridge_stp off
        bridge_fd 0
        mtu 1450
        vrf vrf_evpnzone

auto vrfvx_evpnzone
iface vrfvx_evpnzone
        vxlan-id 10000
        vxlan-local-tunnelip 10.2.0.21
        bridge-learning off
        mtu 1450

auto vxlan_evpnet10
iface vxlan_evpnet10
        vxlan-id 11000
        vxlan-local-tunnelip 10.2.0.21
        bridge-learning off
        mtu 1450

I resolved this by unchecking "Disable arp-nd suppression:" in the EVPN zone configuration.


And this is the current EVPN/SDN configuration after the arp-nd change

Code:
root@labpve1:/etc/network# cat interfaces.d/sdn
#version:231

auto evpnet10
iface evpnet10
        address 10.10.10.1/24
        post-up iptables -t nat -A POSTROUTING -s '10.10.10.0/24' -o vmbr0 -j SNAT --to-source 10.2.0.21
        post-down iptables -t nat -D POSTROUTING -s '10.10.10.0/24' -o vmbr0 -j SNAT --to-source 10.2.0.21
        post-up iptables -t raw -I PREROUTING -i fwbr+ -j CT --zone 1
        post-down iptables -t raw -D PREROUTING -i fwbr+ -j CT --zone 1
        hwaddress 32:04:D8:3A:7F:E9
        bridge_ports vxlan_evpnet10
        bridge_stp off
        bridge_fd 0
        mtu 1450
        ip-forward on
        arp-accept on
        vrf vrf_evpnzone

auto vrf_evpnzone
iface vrf_evpnzone
        vrf-table auto
        post-up ip route del vrf vrf_evpnzone unreachable default metric 4278198272

auto vrfbr_evpnzone
iface vrfbr_evpnzone
        bridge-ports vrfvx_evpnzone
        bridge_stp off
        bridge_fd 0
        mtu 1450
        vrf vrf_evpnzone

auto vrfvx_evpnzone
iface vrfvx_evpnzone
        vxlan-id 10000
        vxlan-local-tunnelip 10.2.0.21
        bridge-learning off
        bridge-arp-nd-suppress on
        mtu 1450

auto vxlan_evpnet10
iface vxlan_evpnet10
        vxlan-id 11000
        vxlan-local-tunnelip 10.2.0.21
        bridge-learning off
        bridge-arp-nd-suppress on
        mtu 1450
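
To confirm that the suppression flag really changed on the VXLAN bridge port after applying the SDN config, the detailed bridge output can be checked (a sketch; as far as I understand, bridge-arp-nd-suppress maps to the kernel's neigh_suppress port flag):

Code:
# look for "neigh_suppress on" in the output
bridge -d link show dev vxlan_evpnet10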

I also set "net.ipv4.conf.default.rp_filter" and "net.ipv4.conf.all.rp_filter" to "0" on all 3 nodes, as recommended in the SDN guide: https://pve.proxmox.com/pve-docs/chapter-pvesdn.html

Code:
root@labpve1:/etc/frr# sysctl -a | grep net.ipv4.conf.default.rp_filter
net.ipv4.conf.default.rp_filter = 0
root@labpve1:/etc/frr# sysctl -a | grep net.ipv4.conf.all.rp_filter
net.ipv4.conf.all.rp_filter = 0
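
To make these settings survive a reboot, the usual approach is a drop-in under /etc/sysctl.d (a sketch; the file name is my own choice):

Code:
cat > /etc/sysctl.d/99-sdn-rp-filter.conf <<'EOF'
net.ipv4.conf.default.rp_filter = 0
net.ipv4.conf.all.rp_filter = 0
EOF
sysctl --system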

The odd issue that still persists is that, randomly and without any obvious cause, some VMs or LXCs (not all of them) on one cluster node lose connectivity with some VMs or LXCs (not all of them) on another cluster node, and vice-versa:

Example
---------------
VM/LXC_A on Node1 can ping VM/LXC_B on Node2 and vice-versa
VM/LXC_A on Node1 can not ping VM/LXC_C on Node3 and vice-versa
VM/LXC_B on Node2 can ping VM/LXC_C on Node3 and vice-versa
VM/LXC_C on Node3 can ping VM/LXC_D on Node1 and vice-versa
VM/LXC_D on Node1 can ping VM/LXC_B on Node2 and vice-versa
---------------

The issue sometimes gets resolved if I restart VM_C, disconnect/reconnect its network interface, or migrate VM/LXC_C to another node and back to its original one.
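
When a pair of guests loses connectivity like that, comparing the EVPN ARP/MAC state for the affected guest on both nodes (before restarting anything) might show whether a stale entry is to blame. A sketch, with 10.10.10.19 standing in for the unreachable guest's IP:

Code:
# ARP entries the anycast gateway currently holds for the guest
ip neigh show dev evpnet10 | grep 10.10.10.19
# the EVPN type-2 route for the guest, as each node's FRR sees it
vtysh -c "show bgp l2vpn evpn route type macip" | grep -B2 10.10.10.19
# zebra's EVPN ARP cache for the VNet's VNI
vtysh -c "show evpn arp-cache vni 11000" | grep 10.10.10.19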

Thank you for any suggestions on how to resolve this odd issue.
 
I'm currently on holiday with a limited connection, but regarding your first question about default-originate: it is there because you have defined the node as an exit node, so it announces a 0.0.0.0 default route.
 
A little late, but I have been plagued by these issues for weeks now (after upgrading to PVE 8). Well, beta feature ^^.


So my problem is similar: hosts in the same subnet/VXLAN cannot ping hosts on different nodes.
It sometimes works and sometimes does not.


The reason for this is apparently a bug in FRR 8.5.1.


Disabling optimization for the 2 route-maps Proxmox creates in frr.conf (or for another route-map) seems to be a workaround: https://github.com/FRRouting/frr/issues/13792

Cannot validate that yet, because I applied that change a few minutes ago. Will see how this turns out...
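
For reference, if I understand the linked issue correctly, the workaround boils down to adding lines like these to frr.conf (or frr.conf.local), using the route-map names Proxmox generates; please double-check the exact syntax against the FRR docs for your version:

Code:
no route-map MAP_VTEP_IN optimization
no route-map MAP_VTEP_OUT optimization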


There is already a bug for this: https://bugzilla.proxmox.com/show_bug.cgi?id=4810

The solution seems to be for the frr package shipped in the Proxmox repos to get updated.
 
FYI: I've tested the provided frr*.deb files and they resolved this issue.

Also, these packages have recently been officially updated:

Code:
frr (8.5.2-1+pve1) bookworm; urgency=medium

  * update upstream sources to current stable/8.5 (commit
    1622c2ece2f68e034b43fb037503514c2195aba5) fixing among other things:
    - critical bug evpn bug with Type-3 EVPN route
    - problematic BGP session resets with corrupted tunnel encapsulation
      attributes, breaking RFC 7606

 -- Proxmox Support Team <support@proxmox.com>  Wed, 30 Aug 2023 16:58:08 +0200

frr (8.5.1-1+pve1) bookworm; urgency=medium
 
I kinda have also encountered this issue with my virtual Proxmox 8 SDN lab.

I had no success at all getting my VMs onto the internet, and my ip ro sh output looks very similar. I also installed the packages that were shared here, but that didn't fix it at all.

So I guess I will wait until an official fix for this issue is released for Proxmox 8.

Today I reinstalled my lab with the latest Proxmox 7 release, and it worked out of the box with the provided documentation.
 
Please share your configurations (/etc/pve/sdn/* and /etc/network/interfaces), plus the output of pveversion -v.
I can't help without them.

(The frr bug from this thread is already fixed in the official repo with the 8.5.2 release.)
 
I have no time right now to analyze it further. I am sorry. Maybe it is working by this time. Just ignore me.
 
I kinda have also encountered this issue with my virtual Proxmox 8 SDN lab. [...]
Except for the frr bug that was affecting VM/LXC communication, where hosts in the same subnet/VXLAN could not ping hosts on different nodes, I had the exact same issue as what you've described here, and I explained it in my original post: after the Proxmox upgrade from v7 to v8, the cluster nodes were reporting, in addition to the default gateway (as reported on Proxmox v7), 2 nexthop default gateways learned via BGP.

Code:
root@labpve2:/etc/frr# ip ro sh
default via 10.2.0.1 dev vmbr0 proto kernel onlink
default nhid 46 proto bgp metric 20
        nexthop via 10.2.0.21 dev vrfbr_evpnzone weight 1 onlink
        nexthop via 10.2.0.23 dev vrfbr_evpnzone weight 1 onlink
...

This issue still persists after the latest frr upgrade, but I've already explained the workaround above for making it work as it did on Proxmox v7.

I applied this permanently by creating a frr.conf.local file on each cluster node under /etc/frr, based on the original frr.conf:

root@labpve1:/etc/frr# cp frr.conf frr.conf.local

and then editing frr.conf.local so that, under "address-family l2vpn evpn" of the VRF, the "default-originate ipv4" and "default-originate ipv6" lines are prefixed with "no":


Code:
root@labpve1:/etc/frr# cat  frr.conf.local
frr version 8.5.1
frr defaults datacenter
hostname labpve1
log syslog informational
service integrated-vtysh-config
!
!
vrf vrf_evpnzone
 vni 10000
exit-vrf
!
router bgp 65000
 bgp router-id 10.2.0.21
 no bgp default ipv4-unicast
 coalesce-time 1000
 neighbor VTEP peer-group
 neighbor VTEP remote-as 65000
 neighbor VTEP bfd
 neighbor 10.2.0.22 peer-group VTEP
 neighbor 10.2.0.23 peer-group VTEP
 !
 address-family ipv4 unicast
  import vrf vrf_evpnzone
 exit-address-family
 !
 address-family ipv6 unicast
  import vrf vrf_evpnzone
 exit-address-family
 !
 address-family l2vpn evpn
  neighbor VTEP route-map MAP_VTEP_IN in
  neighbor VTEP route-map MAP_VTEP_OUT out
  neighbor VTEP activate
  advertise-all-vni
 exit-address-family
exit
!
router bgp 65000 vrf vrf_evpnzone
 bgp router-id 10.2.0.21
 !
 address-family ipv4 unicast
  redistribute connected
 exit-address-family
 !
 address-family ipv6 unicast
  redistribute connected
 exit-address-family
 !
 address-family l2vpn evpn
  no default-originate ipv4
  no default-originate ipv6
 exit-address-family
exit
!
route-map MAP_VTEP_IN deny 1
 match evpn route-type prefix
exit
!
route-map MAP_VTEP_IN permit 2
exit
!
route-map MAP_VTEP_OUT permit 1
exit
!
line vty
!

Once you have prepared this on every cluster node, you just have to apply it from the Datacenter level > SDN > Status > Apply.
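
If you prefer a shell over the GUI for that last step, the apply can (as far as I know) also be triggered through the API with pvesh:

Code:
pvesh set /cluster/sdn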


After this change, the routing table on the v8 cluster nodes looks like it did on v7 again:

Code:
root@labpve1:/etc/frr# ip ro sh
default via 10.2.0.1 dev vmbr0 proto kernel onlink
10.2.0.0/24 dev vmbr0 proto kernel scope link src 10.2.0.21
10.2.3.0/24 dev eth1 proto kernel scope link src 10.2.3.7
...

Now all VMs/LXCs on any cluster node will be able to reach the Internet again.
 
Well, default routes are announced (through default-originate ...) by the exit nodes if you have defined them.
Since you have defined multiple exit nodes, you get 2 nexthops with the same weight, with ECMP balancing.

Simply remove the exit nodes (as you don't seem to need them anyway?).
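
For anyone looking for where that lives: the exit-node selection is part of the EVPN zone definition (Datacenter > SDN > Zones > edit the zone and clear the exit nodes), which on disk corresponds to the exitnodes line in /etc/pve/sdn/zones.cfg. An illustrative excerpt only; the zone name and VNI are taken from this thread, while the controller name is made up:

Code:
evpn: evpnzone
        controller evpnctl
        vrf-vxlan 10000
        exitnodes labpve1,labpve2
        mtu 1450

Removing the exitnodes line (or clearing the field in the GUI) and re-applying the SDN config should stop the default route from being originated.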
 
Except for the frr bug that was affecting VM/LXC communication, I had the exact same issue as what you've described here, and I explained the workaround in my original post. [...]

I observed the exact same issue that you did, on PVE v8.0.3. However, after updating the test cluster to PVE v8.0.4, I no longer need to override frr.conf as you described in order for workloads on the EVPN vnet to be able to reach the Internet (and each other, if on different hosts).

I did notice that several PVE-network related packages were updated as part of the v8.0.4 release:
https://github.com/proxmox/pve-network/commits/master

@spirit do you have visibility into what changed and why? We are still seeing some regressions with PVE SDN on v8.x vs v7.x that are keeping us from deploying v8.x in production.

Thanks,

DC
 
I haven't made any changes to the EVPN plugin since the release of PVE 8 (there was an frr bug, fixed with the frr upgrade).

I'm running v8 in production without any EVPN problems.

What regression do you see? What override config did you need?

Can you share your /etc/pve/sdn/*.cfg?
 
The main issue that we are struggling with appears to be related to the (still unpatched) issue that creates a routing loop when using multiple exit nodes. Do you have any idea why the Proxmox team has not yet released v0.9.6 of libpve-network-perl? We have tested the .deb that you linked @spirit , but are reluctant to upgrade nodes to PVE 8 until the fix has been properly packaged. It has been months since you posted and made your patched .deb available, but the team hasn't yet incorporated your changes. This is most definitely a regression, and renders SDN unusable on PVE8 for anybody using multiple exit nodes.
 
Mmm, indeed, it seems a new official package version has still not been released.
The patch has already been in git for 2 months:
https://git.proxmox.com/?p=pve-network.git;a=commit;h=e614da43f13e3c61f9b78ee9984364495eff91b6

I'll ask on pve-devel for a new package release.
 
