SDN stopped forwarding

cyruspy

Hello!

What is the SDN logic for enabling forwarding? After an upgrade from 8 to 9, my overlay network stopped working. I just checked, and net.ipv4.conf.all.forwarding is disabled.

I don't recall how it was configured before the upgrade, but I have no evidence that the working configuration was any different.
 
I never added the setting explicitly; it came from the SDN setup somehow.

For the time being I fixed it by adding the option by hand, but I'm not sure that is the correct approach.
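For reference, the manual workaround described above could look roughly like the sketch below. It only reports the current state; the commands that change anything are left as comments. The SYSCTL_ROOT variable and the drop-in file name 99-forwarding.conf are assumptions made for illustration, and this is not necessarily how the SDN stack itself manages the setting.

```shell
#!/bin/sh
# Sketch: report IPv4 forwarding and show how it could be persisted by hand.
# SYSCTL_ROOT is an illustrative assumption so the check can be pointed
# at a different directory; on a real host it is /proc/sys.
SYSCTL_ROOT="${SYSCTL_ROOT:-/proc/sys}"

# Print the current value (0 or 1) of net.ipv4.conf.all.forwarding.
forwarding_state() {
    cat "$SYSCTL_ROOT/net/ipv4/conf/all/forwarding"
}

# Print the sysctl drop-in line that would keep forwarding enabled.
forwarding_dropin() {
    printf 'net.ipv4.conf.all.forwarding = 1\n'
}

state=$(forwarding_state 2>/dev/null || true)
if [ "$state" = "0" ]; then
    echo "IPv4 forwarding is disabled"
    # To enable it now and keep it across reboots (as root):
    #   sysctl -w net.ipv4.conf.all.forwarding=1
    #   forwarding_dropin > /etc/sysctl.d/99-forwarding.conf  # hypothetical file name
    #   sysctl --system
fi
```

Whether a hand-written drop-in is the right long-term fix is exactly the open question here; it may end up masking whatever the SDN setup is supposed to configure on its own.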

Thanks.
 
Do you know the underlying cause from the healthcheck script? Is it always the same? How do you usually go about fixing it?
What I found:
1. Changes in reverse path filter behavior (from 8 to 9 or from 9.0 to 9.1, I couldn't tell which). Fixed once; it shouldn't recur.

2. East west traffic didn't work with some nodes or with specific segments or specific VMs. Happened several times. FRR restart fixed that.

3. With some star alignment (still not identified), broken tap/bridge wiring. Fixed manually or via VM stop/start.

I will monitor for #2 and #3 after upgrading to 9.2.
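For finding #1, a quick way to compare the reverse path filter setting across nodes or versions is to dump it per interface. The sketch below assumes the standard /proc/sys layout; CONF_ROOT is an illustrative variable, not something from the thread. Note that the kernel applies the higher of the conf/all and per-interface values when validating the source address.

```shell
#!/bin/sh
# Sketch: list the reverse path filter setting of every interface.
# Kernel semantics: 0 = off, 1 = strict, 2 = loose.
# CONF_ROOT is an illustrative assumption; the real path is the default below.
CONF_ROOT="${CONF_ROOT:-/proc/sys/net/ipv4/conf}"

# Print one "<interface> <value>" line per interface (includes "all" and "default").
rp_filter_report() {
    for f in "$CONF_ROOT"/*/rp_filter; do
        [ -r "$f" ] || continue
        iface=$(basename "$(dirname "$f")")
        printf '%s %s\n' "$iface" "$(cat "$f")"
    done
}

rp_filter_report
```

Saving this output before and after an upgrade would make it easier to tell which release changed the behavior.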
 
2. East west traffic didn't work with some nodes or with specific segments or specific VMs. Happened several times. FRR restart fixed that.

Would be interesting to see the FRR status before restarting:

Code:
vtysh -c 'show bgp l2vpn evpn route'
vtysh -c 'show bgp neighbor'

+ the corresponding SDN config:

Code:
cat /etc/pve/sdn/zones.cfg
cat /etc/pve/sdn/vnets.cfg
cat /etc/pve/sdn/subnets.cfg
cat /etc/pve/sdn/controllers.cfg

Also, what traffic exactly is failing (VNI, src/dst, protocol, ...)? Are you using the firewall? Can you check via tcpdump on both involved hosts where the traffic is getting dropped? Does migrating the VM fix the problem?
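One possible way to do the tcpdump check is to capture the encapsulated traffic on the underlay interface of both hosts. The sketch below only prints the command to run. It assumes the zone uses the standard VXLAN UDP port 4789; the interface name eno1 and the peer address 192.0.2.11 are placeholders, not values from this thread.

```shell
#!/bin/sh
# Sketch: build a tcpdump command for VXLAN-encapsulated traffic between two nodes.
# Assumes the standard VXLAN UDP port 4789; adjust if the zone uses another port.
vxlan_capture_cmd() {
    iface="$1"   # underlay interface on this host (placeholder)
    peer="$2"    # underlay IP of the other host (placeholder)
    printf 'tcpdump -nn -i %s udp port 4789 and host %s\n' "$iface" "$peer"
}

# Print the command for this host; run the printed command as root on both hosts.
vxlan_capture_cmd eno1 192.0.2.11
```

If packets leave the source host but never show up on the destination, the problem is likely in the underlay; if they arrive but the VM never sees them, the bridge or firewall on the destination is the next place to look.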


3. With some star alignment (still not identified), broken tap/bridge wiring. Fixed manually or via VM stop/start.

Would be interesting to see the corresponding ip a output - what exactly goes wrong? Do all interfaces get created properly? Are they not getting plugged into the bridge? Any output in the journal / VM start log?
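To make it easier to capture all of this before restarting FRR or the VM, the requested outputs could be collected in one go. This is a minimal sketch: the output directory, the file names, and the extra `bridge link show` and `journalctl` calls are assumptions for illustration, and the other commands are the ones already listed in this thread.

```shell
#!/bin/sh
# Sketch: snapshot FRR, SDN, and interface state before restarting anything.
# OUT_DIR and the file names are illustrative assumptions.
OUT_DIR="${OUT_DIR:-/tmp/sdn-evidence-$(date +%Y%m%d-%H%M%S)}"

# Run a command and store its stdout and stderr in $OUT_DIR/<name>.txt.
# A failing command is noted in the file instead of aborting the collection.
collect() {
    name="$1"; shift
    mkdir -p "$OUT_DIR"
    "$@" > "$OUT_DIR/$name.txt" 2>&1 || echo "command failed: $*" >> "$OUT_DIR/$name.txt"
}

collect_all() {
    collect evpn-routes vtysh -c 'show bgp l2vpn evpn route'
    collect bgp-neighbors vtysh -c 'show bgp neighbor'
    for cfg in zones vnets subnets controllers; do
        collect "sdn-$cfg" cat "/etc/pve/sdn/$cfg.cfg"
    done
    collect ip-addr ip a
    collect bridge-link bridge link show
    collect journal journalctl -b --no-pager -n 500
}

# Only collect when invoked with the argument "run" (as root on the affected node).
if [ "${1:-}" = "run" ]; then
    collect_all
    echo "evidence written to $OUT_DIR"
fi
```

Running it on both the source and destination node while the problem is still present gives a consistent set of files to attach.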
 
I will monitor and collect the evidence if and when it happens again.