SDN EVPN issue - routing unstable, all routing stops every few days

kwslavens

Member
Sep 8, 2022
We've got a 5-node Proxmox cluster running two VXLAN zones and an EVPN zone with exit nodes on all 5 servers.
The EVPN zone works as expected, but every few days routing completely stops for all virtual devices in the zone. Devices in the zone can still talk to each other, but no routing out through FRR is possible other than ICMP (pings). We can ping north/south and east/west, yet no other protocols seem to function. Re-applying the SDN configuration makes the problem disappear; I'm assuming that restarting the frr service is what clears it.
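For reference, the workaround looks roughly like this (just a sketch; I believe pvesh set /cluster/sdn is the CLI equivalent of the GUI "Apply" button, but check on your own setup):

pvesh set /cluster/sdn      (re-apply the SDN configuration cluster-wide)
systemctl restart frr       (or restart FRR directly on the affected node)
systemctl status frr        (confirm the daemons came back up)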

We have not found any error messages indicating a problem on the hosts. The frr.log doesn't show any issues.

Has anyone else experienced this issue? We're seeing the loss of routing every few days.
 
I've enabled debugging on all 5 hosts. I'm waiting for the issue to happen again, and I'll post log data then.
 
Hi. It's strange that ICMP works but other protocols don't. EVPN is about routing MAC and IP addresses, not specific protocols.

(I'm running 100 nodes with EVPN, so I'm 100% sure the current kernel and frr don't have a bug.)

How do you route traffic from the exit-nodes to the external world? (A simple static gateway? BGP?)

Do you use a primary exit-node, or do you load-balance between all exit-nodes? (ICMP could be balanced differently than TCP if all exit-nodes are active.)
 
Hi. It's strange that ICMP works but other protocols don't. EVPN is about routing MAC and IP addresses, not specific protocols.

(I'm running 100 nodes with EVPN, so I'm 100% sure the current kernel and frr don't have a bug.)

How do you route traffic from the exit-nodes to the external world? (A simple static gateway? BGP?)

Do you use a primary exit-node, or do you load-balance between all exit-nodes? (ICMP could be balanced differently than TCP if all exit-nodes are active.)

Just a simple default gateway. I have the 3rd node set as the primary exit-node, and all 5 nodes are configured as exit-nodes.
It's set up pretty much directly from the SDN example for EVPN.

It's good to hear that you've got 100 nodes using EVPN with no issues. Gives me hope. It also makes me wonder what I did wrong; if it works fine on a large deployment, then what is different about this 5-node cluster?

The cluster is running on 5 Cisco blades, so I couldn't use the ISO installer. I had to install Debian 11 first and get multipath working correctly for the storage controllers, then installed PVE on top. I wouldn't think that could be the source of the problem.
I attempted to upgrade to the newest stable frr and found out pretty quickly that it doesn't work; I'm assuming there are some changes in the Proxmox frr package to include the PVE config files. I rolled back to the standard PVE package version.

Any advice on how to troubleshoot the issue to track down the cause?
 
Just a simple default gateway. I have the 3rd node set as the primary exit-node, and all 5 nodes are configured as exit-nodes.
It's set up pretty much directly from the SDN example for EVPN.

How did you set up the routes on the other side (on your router)?
You should have done something like "route add <evpn zone subnets> gw xxxx".

So, depending on your router model, how have you implemented that? Are you able to use multiple gateways (ECMP routing)? If not, you need some kind of keepalived VIP on the Proxmox exit-nodes (or use BGP between your routers and the Proxmox exit-nodes).
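For example, on a Linux-based router the ECMP variant would look something like this (just a sketch with placeholder addresses, using iproute2):

ip route add <evpn zone subnet> nexthop via <exit-node-1 ip> nexthop via <exit-node-2 ip>
(or, with a single gateway: ip route add <evpn zone subnet> via <primary exit-node ip>)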



Note that 2 exit-nodes are enough for redundancy. (For example, in my network I'm using 2 Arista switches (supporting EVPN) as exit-nodes, with 100 hypervisors behind them.)


It's good to hear that you've got 100 nodes using EVPN with no issues. Gives me hope. It also makes me wonder what I did wrong; if it works fine on a large deployment, then what is different about this 5-node cluster?

The cluster is running on 5 Cisco blades, so I couldn't use the ISO installer. I had to install Debian 11 first and get multipath working correctly for the storage controllers, then installed PVE on top. I wouldn't think that could be the source of the problem.
I attempted to upgrade to the newest stable frr and found out pretty quickly that it doesn't work; I'm assuming there are some changes in the Proxmox frr package to include the PVE config files. I rolled back to the standard PVE package version.
You just need to use the frr and ifupdown2 packages from the Proxmox repository. (frr had a lot of bugs in the past; I have tested it well with EVPN, and the Proxmox package sometimes includes patches from the stable branch, because frr doesn't always release minor versions and you would otherwise need to compile it yourself. So use the Proxmox package version. ;)
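To check where your installed packages come from, something like this should work (just a sketch):

apt policy frr ifupdown2                (the installed/candidate versions should point at the Proxmox repository)
apt install --reinstall frr ifupdown2   (pull the Proxmox-packaged builds back in if they were replaced)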
Any advice on how to troubleshoot the issue to track down the cause?

If VMs are still able to communicate inside the EVPN network, then EVPN itself is working correctly, and the problem is with the path between the outside world and the EVPN network.
 
I'm not sure I understand completely. We have nothing configured outside of Proxmox. We're not routing to the EVPN zone from external devices; we're basically treating it as a firewalled/NAT network, with outbound-originated traffic only. For anything that needs to access resources in the EVPN zone, I'm using an HAProxy setup, or a VM jump box for SSH access.
 
I'm not sure I understand completely. We have nothing configured outside of Proxmox. We're not routing to the EVPN zone from external devices; we're basically treating it as a firewalled/NAT network, with outbound-originated traffic only. For anything that needs to access resources in the EVPN zone, I'm using an HAProxy setup, or a VM jump box for SSH access.
Oh, OK, so only outbound traffic with NAT. Could you share your /etc/pve/sdn/*.cfg?

I really don't see why it would drop (unless your primary exit-node goes down).
But if you have an exit-node failover, you need to use something like conntrackd to sync conntrack state between the exit-nodes, or currently established connections will be dropped.

Another possibility:

You may also need to increase the conntrack max, because the default is 32000. If the conntrack table is saturated (from a DDoS, for example), no new connections are possible.

To verify:

# apt install conntrack
# conntrack -L
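To see whether the table is anywhere near the limit, and to raise it, something like this (a sketch; pick your own value and persist it in /etc/sysctl.d/ if it helps):

# conntrack -C                                        (current number of tracked connections)
# sysctl net.netfilter.nf_conntrack_max               (current limit)
# sysctl -w net.netfilter.nf_conntrack_max=262144     (raise it)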
 
Config below. Only the vxprod network fails; the others keep working when vxprod stops.
Conntrack indicates fewer than 500 flow entries.
I'll increase the tracking max, but it doesn't look like that is the problem.

evpn: vxprod
    asn 65000
    peers 172.20.98.32, 172.20.98.33, 172.20.98.34, 172.20.98.35, 172.20.98.36

powerdns: pdnsauth1
    key -------------------------------------------------------------------------------
    url http://pdns-auth1.ccst.net:8081/api/v1/servers/localhost
    ttl 300

pve: pve

netbox: NetBox1
    token ----------------------------------------------------------------------------
    url http://netbox1.ccst.net:8000/api

subnet: vxrisky-10.9.0.0-24
    vnet HighRisk
    dnszoneprefix vxrisky.ccst.net
    gateway 10.9.0.1

subnet: vxtstlab-10.10.0.0-21
    vnet TestLab
    dnszoneprefix testlab.ccst.net
    gateway 10.10.0.1

subnet: vxprod-10.11.0.0-20
    vnet vxprod
    gateway 10.11.0.1
    snat 1

subnet: vxprod-192.168.100.0-24
    vnet vxprod
    gateway 192.168.100.1
    snat 1

subnet: vxprod-192.168.122.0-24
    vnet vxprod
    gateway 192.168.122.1
    snat 1

vnet: HighRisk
    zone vxrisky
    alias vxrisky-vnets
    tag 100000

vnet: TestLab
    zone vxtstlab
    alias vxtstlab-testlab
    tag 100001

vnet: vxprod
    zone vxprod
    alias vxprod Primary production
    tag 11000

vxlan: vxrisky
    peers 10.8.0.1,10.8.0.2,10.8.0.3,10.8.0.4,10.8.0.5
    dns pdnsauth1
    dnszone ccst.net
    ipam NetBox1
    mtu 1450
    reversedns pdnsauth1

vxlan: vxtstlab
    peers 10.8.0.1,10.8.0.2,10.8.0.3,10.8.0.4,10.8.0.5
    dns pdnsauth1
    dnszone ccst.net
    ipam NetBox1
    mtu 1450
    reversedns pdnsauth1

evpn: vxprod
    controller vxprod
    vrf-vxlan 10000
    dns pdnsauth1
    dnszone ccst.net
    exitnodes ccst-ostackbbu4,ccst-ostackbbu3
    exitnodes-primary ccst-ostackbbu3
    ipam NetBox1
    mac 32:F4:05:FE:6C:0A
    mtu 1450
    reversedns pdnsauth1
 
This may be nothing, but I can't seem to find the ip_conntrack_max value at all. I checked the path, and it truly isn't there, on all 5 hosts.

sysctl net.ipv4.netfilter.ip_conntrack_max
sysctl: cannot stat /proc/sys/net/ipv4/netfilter/ip_conntrack_max: No such file or directory

I did find this, however:
sysctl net.nf_conntrack_max
net.nf_conntrack_max = 262144



It appears at first glance that values are missing.
I don't have anything like conntrackd installed; I didn't know that I needed it. I'll look into that.
 
This may be nothing, but I can't seem to find the ip_conntrack_max value at all. I checked the path, and it truly isn't there, on all 5 hosts.

sysctl net.ipv4.netfilter.ip_conntrack_max
sysctl: cannot stat /proc/sys/net/ipv4/netfilter/ip_conntrack_max: No such file or directory

I did find this, however:
sysctl net.nf_conntrack_max
net.nf_conntrack_max = 262144
The proc path is /proc/sys/net/nf_conntrack_max
(it should be the same value).

It appears at first glance that values are missing.
I don't have anything like conntrackd installed; I didn't know that I needed it. I'll look into that.
conntrackd is not mandatory, but I recommend it in case of failover, to avoid hangs of currently established connections.
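If you do set it up, the rough shape is as follows (just a sketch; the actual sync section in /etc/conntrackd/conntrackd.conf has to be configured for your pair of exit-nodes, which I'm not showing here, so check conntrackd(8) for your version):

# apt install conntrackd        (on both exit-nodes)
# conntrackd -s                 (once configured, show synchronization statistics)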


Looking at your config, it seems pretty basic; I really don't know why it's hanging after X days.

For debugging, when the problem happens again, it would be interesting to see the output of the following (on each node):

# vtysh -c "sh ip bgp l2vpn evpn"
# vtysh -c "sh ip bgp summary"
# vtysh -c "sh ip route vrf all"


BTW, what is your currently running kernel version?
 
The proc path is /proc/sys/net/nf_conntrack_max
(it should be the same value).


conntrackd is not mandatory, but I recommend it in case of failover, to avoid hangs of currently established connections.


Looking at your config, it seems pretty basic; I really don't know why it's hanging after X days.

For debugging, when the problem happens again, it would be interesting to see the output of the following (on each node):

# vtysh -c "sh ip bgp l2vpn evpn"
# vtysh -c "sh ip bgp summary"
# vtysh -c "sh ip route vrf all"


BTW, what is your currently running kernel version?


I'll gather that output from all nodes as soon as the issue happens again. It last happened on 11/4, so it should be any time now. All 5 nodes are patched and current. Kernel version below.


uname -a

Linux ccst-ostackbbu1 5.15.64-1-pve #1 SMP PVE 5.15.64-1 (Thu, 13 Oct 2022 10:30:34 +0200) x86_64 GNU/Linux
 
Quick update: it's been more than a week now with no issue at all. We've been unable to troubleshoot further since the issue hasn't happened again. This is out of the norm; we'd been seeing the problem every 3-4 days at most for several months.

Only three things have changed:
1) The frr and tools packages were removed and completely reinstalled.
2) The number of exit nodes was reduced from 5 to 2 (one primary, one secondary).
3) All hosts were patched and updated so their package versions match 100%.

That's it. Either it's just random luck, or one of those three things solved the problem.
I will come back in a couple of weeks and update if the problem still has not resurfaced, or sooner if it happens again.

Fingers Crossed!
 