[SOLVED] EVPN SDN not forwarding traffic to host with CT

Oct 18, 2024
Hello!

PVE version 8.4.1

I am trying to set up an EVPN SDN across 4 nodes on a single L2/L3 domain (10.0.32.0/24), with BGP to the router (10.0.32.1) to announce the EVPN networks and do ECMP across all 4 nodes. All nodes are configured as exit nodes, with no primary exit node defined. BGP to the router seems to be working: the sessions establish and I receive prefixes.

This mostly works, except that a test CT on the EVPN VNet (10.0.33.0/24) is unable to ping the gateway (10.0.32.1) of the upstream network. I can ping the anycast gateway (10.0.33.1) fine, and the router sees the ping packet and replies via a different node, but the reply seems to get lost at that point.

My understanding is that the EVPN magic should forward the reply to the correct host. Is that my mistake?

Diagram:
router (10.0.32.1) <-BGP-> PVE nodes (10.0.32.10, 11, 13, 14) <-EVPN vnet-> 10.0.33.0/24
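For context, here is a hypothetical sketch of how such a setup is typically defined in /etc/pve/sdn/*.cfg; all IDs, node names, VXLAN tags and ASNs below are placeholders chosen for illustration, not the actual configuration:

```
# controllers.cfg: one EVPN controller for the fabric, plus a BGP controller
# per node for the session to the upstream router (hypothetical values)
evpn: evpnctl
        asn 65001
        peers 10.0.32.10,10.0.32.11,10.0.32.13,10.0.32.14

bgp: bgppve1
        asn 65001
        node pve1
        peers 10.0.32.1

# zones.cfg: the EVPN zone, with all four nodes as exit nodes
evpn: evpnzone
        controller evpnctl
        vrf-vxlan 10000
        exitnodes pve1,pve2,pve3,pve4

# vnets.cfg / subnets.cfg: the guest network with its anycast gateway
vnet: vnet33
        zone evpnzone
        tag 10033

subnet: evpnzone-10.0.33.0-24
        vnet vnet33
        gateway 10.0.33.1
```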
 
I'm seeing the same on 8.4.5 and unfortunately don't have any answers. I have given SDN a few tries, but I always find myself banging my head against the wall and end up doing something else instead.

Intra-cluster routing works great, but I've never gotten external routing to work even a single time since I started poking around with SDN some releases back.

My understanding is that the EVPN magic should forward the reply to the correct host

In my case (and most likely yours too) this part is working correctly: the router hands the reply to some other node (because of ECMP), but it gets routed over to the correct target node across the VXLAN fabric.

The unfortunate thing is that it chokes right at the finish line, specifically at the hand-off from the zone bridge to the VNet bridge. My tests showed that packets reach the vrfbr_* interface (the zone bridge) on the target node but somehow fail to get routed over to the specific VNet bridge.

The vrfbr_* bridge is enslaved to the zone VRF, and during my tests I confirmed that there is a route to the VNet subnet in the associated routing table (YOUR_VNET_SUBNET dev YOUR_VNET proto kernel scope link src YOUR_VNET_GATEWAY). Alas, no ARP is seen from the guest VM/CT's point of view, which I presume is why the packet is not delivered to the guest.
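For anyone debugging the same hand-off, the checks described above look roughly like this; vrfbr_myzone, vrf_myzone and myvnet are placeholder names for the zone bridge, zone VRF and VNet bridge, and the commands need a node with the SDN actually deployed:

```shell
# 1. Confirm the reply actually arrives on the zone bridge of the target node
tcpdump -eni vrfbr_myzone icmp

# 2. Confirm the kernel route to the VNet subnet exists in the zone's VRF table;
#    expect a line like: 10.0.33.0/24 dev myvnet proto kernel scope link src 10.0.33.1
ip route show vrf vrf_myzone

# 3. Watch for ARP toward the guest on the VNet bridge
#    (in the failing state, no ARP request for the guest IP shows up here)
tcpdump -eni myvnet arp
```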
 
Can you post the output of the following commands (inside vtysh)?

Code:
show bgp l2vpn evpn
show ip route vrf <vrf_name>

Also, on the PVE node:

Code:
ip r show vrf <vrf>
ip neigh show

What does your SDN configuration look like?

Code:
cat /etc/pve/sdn/*.cfg

Do you have an example tcpdump of the packet arriving on the physical interface, as well as on the vrfbr? Preferably as pcap file:

Code:
tcpdump -envi <iface> -w output_<iface>.pcap

edit: the routing table on one of the exit nodes (if any) would also be interesting:

Code:
ip r
 
Not OP, but I noticed that in my case, removing the neighbor entry (ip neigh del VM_IP dev VNET_IF) temporarily allows the reply to go through; for ICMP, I get exactly one reply back, and then it stops again.

Not quite sure what this means, though, since the neighbor entry that comes back is identical to the one that was there before.
 
If you can reproduce this issue, then the output from your node would also be quite valuable, particularly if you can provide it in both the working and the non-working state. There might be an issue with ARP, but it's hard to tell without a reproducer. I have set up multiple EVPN clusters with exit nodes lately and they seemed to work fine, so without a reliable reproducer it's hard for me to say anything. Your SDN configuration might help in trying to reproduce this.
 

I managed to figure out the issue with mine; not exactly sure if it is what the OP was experiencing, but in my case it was reverse path filtering.

Funnily enough, after solving it and knowing what keywords to look for, I checked whether someone else had hit it, and lo and behold: https://forum.proxmox.com/threads/e...ll-drop-packet-with-asymetric-routing.158225/.

I even already had the exact directive setting the sysctl to 0 in a custom conf file under /etc/sysctl.d; unbeknownst to me, it was being overridden by /usr/lib/sysctl.d/pve-firewall.conf.
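For others hitting this: the fix is a sysctl.d drop-in whose filename sorts after pve-firewall.conf, because when several sysctl.d files set the same key, the lexicographically last filename wins. The filename below is my own choice; only the sort order matters:

```
# /etc/sysctl.d/zz-evpn-rp-filter.conf
# Disable strict reverse-path filtering so asymmetric EVPN replies are not dropped.
# The "zz-" prefix makes this file sort after pve-firewall.conf, so it takes effect.
net.ipv4.conf.default.rp_filter = 0
net.ipv4.conf.all.rp_filter = 0
```

Apply it with sysctl --system and verify the effective values with sysctl net.ipv4.conf.all.rp_filter net.ipv4.conf.default.rp_filter.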
 
Huh, that might have been it! I tore down the environment, but what you describe (one ping, then nothing) matches my experience with it.

That being said, I gave EVPN another try with PVE 9.0.5 and a better understanding of what I was doing, and it works!

I am using a Mikrotik router as the exit node/gateway, and everything works perfectly. Manual configuration of the Mikrotik is required to line it up with the PVE SDN controller, but once you do, the magic works as expected.
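In case it helps anyone lining up a Mikrotik the same way: on RouterOS v7 the BGP sessions are defined under /routing bgp connection. The sketch below is only a rough illustration with made-up ASNs and names, not the actual configuration:

```
# One eBGP session from the router (AS 65000) to each PVE node (AS 65001);
# repeat per node (10.0.32.11, .13, .14). All values are illustrative only.
/routing bgp connection
add name=pve1 as=65000 router-id=10.0.32.1 local.role=ebgp \
    remote.address=10.0.32.10 remote.as=65001 address-families=ip
```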
 
Sorry, just saw this. I tore down the environment and am not currently able to reproduce. :(