Slow speeds using EVPN

To share some experience: in our case all bonds go to a pair of Nexus switches in vPC, with MTU 9000 on the bonds and MTU 1520 on the vmbr and VXLAN interfaces; that combination turned out to be optimal for us. We get 9.5 Gbit/s.
 
Greetings,

I apologize for posting in an old topic, but I would like to share a solution to a similar problem.

We had a similar issue with one of our five-server clusters communicating through a BGP-EVPN fabric: 20 Gbit/s links were delivering only ~4-5 Gbit/s through a VXLAN tunnel between two servers, measured with iperf3. The reason for this behavior is that each Proxmox node does not learn the MAC addresses of the other nodes, so traffic pushed into the VXLAN tunnel is flooded to every node.

To debug this, you can inspect the bridge forwarding database with: bridge fdb | grep [vxlan interface]
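
For example, with the vxlan666 interface from the configs below (a sketch; adjust the interface name to your own setup):

Code:
bridge fdb show dev vxlan666

In a working EVPN setup the remote nodes' MAC addresses should show up as entries pointing at the remote VTEP (dst [LOOPBACK IP]) with the extern_learn flag; if only the all-zero 00:00:00:00:00:00 flood entries are present, unicast traffic is being flooded to every node.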

To fix this issue, add advertise-svi-ip under the address-family l2vpn evpn section of the BGP configuration (FRR).
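
If FRR is already running, the change can also be applied live with vtysh, roughly like this (the ASN 65002 is taken from the example config below):

Code:
vtysh -c 'configure terminal' \
      -c 'router bgp 65002' \
      -c 'address-family l2vpn evpn' \
      -c 'advertise-svi-ip' \
      -c 'end' \
      -c 'write memory'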

Here are example configs:

/etc/network/interfaces:

Code:
auto br_ceph
iface br_ceph inet manual
        address [SVI IP]
        bridge_stp off
        bridge-ports none
        bridge-fd 0

auto vxlan666
iface vxlan666 inet manual
        pre-up ip link add vxlan666 type vxlan id 666 dstport 4789 local [LOOPBACK IP] nolearning
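# 'nolearning' disables data-plane MAC learning on this VXLAN device;
# with EVPN, FRR installs the remote MACs into the bridge fdb via BGP instead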
        pre-up ip link set dev vxlan666 master br_ceph
        pre-up ip link set up dev vxlan666
        post-up ip link set mtu 9000 dev vxlan666


FRR (/etc/frr/frr.conf):

Code:
router bgp 65002
 bgp router-id [LOOPBACK IP]
 bgp graceful-restart-disable
 neighbor LEAF peer-group
 neighbor LEAF remote-as 65001
 neighbor LEAF capability dynamic
 neighbor [LEAF 1 IP] peer-group LEAF
 neighbor [LEAF 2 IP] peer-group LEAF
 !
 address-family ipv4 unicast
  network [LOOPBACK IP]/32
  neighbor LEAF allowas-in
  maximum-paths 8
 exit-address-family
 !
 address-family l2vpn evpn
  neighbor LEAF activate
  neighbor LEAF allowas-in
  advertise-all-vni
  advertise-svi-ip
  advertise ipv4 unicast
 exit-address-family
exit
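
To verify that the SVI MAC/IP routes are actually advertised and learned after the change, the standard FRR show commands should be enough (VNI 666 from the example above):

Code:
vtysh -c 'show bgp l2vpn evpn route type macip'
vtysh -c 'show evpn mac vni 666'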

In case you have multiple links connected to a node, you probably want per-flow load balancing across the ECMP paths:

Code:
sysctl -w net.ipv4.fib_multipath_hash_policy=1
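
To make this persistent across reboots, a drop-in under /etc/sysctl.d/ works; the filename below is just an example:

Code:
# /etc/sysctl.d/90-ecmp.conf (example filename)
# hash IPv4 ECMP routes on the layer-4 5-tuple instead of source/destination IP only
net.ipv4.fib_multipath_hash_policy = 1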

With this setup we were able to get the full 20 Gbit/s throughput between two nodes through the VXLAN tunnel, using iperf3 with multiple parallel streams.
Interesting. In your use case, is the SVI IP for br_ceph different on each host?


Another possible tuning:
Code:
sysctl -wq net.ipv4.fib_multipath_hash_fields=0x0037
sysctl -wq net.ipv4.fib_multipath_hash_policy=3

The bits of net.ipv4.fib_multipath_hash_fields select which header fields are used for the multipath hash:

Code:
0x0001 Source IP address
0x0002 Destination IP address
0x0004 IP protocol
0x0008 Flow Label
0x0010 Source port
0x0020 Destination port
0x0040 Inner source IP address
0x0080 Inner destination IP address
0x0100 Inner IP protocol
0x0200 Inner Flow Label
0x0400 Inner source port
0x0800 Inner destination port
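
For reference, my reading of the values used above: policy 3 makes the kernel hash on exactly the fields selected in the mask, and 0x0037 selects the outer 5-tuple:

Code:
# 0x0037 = 0x0001 (src IP) + 0x0002 (dst IP) + 0x0004 (IP protocol)
#        + 0x0010 (src port) + 0x0020 (dst port)
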
Yes, a unique IP on each node's br_ceph interface, used for Ceph in this example. The setup gives read/write speeds of ~2000 MB/s in rados bench, practically full network line rate. Each server contains 4x 6.4 TB NVMe SSDs.

The setup is also excellent in terms of redundancy and scaling: if more speed is required, just add more NICs to the servers and add them to the eBGP routing.
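
As a rough sketch of that scale-out in the FRR config above (the neighbor IP is a placeholder): the new uplink simply becomes another member of the LEAF peer-group, and ECMP spreads flows across it.

Code:
router bgp 65002
 neighbor [NEW LEAF IP] peer-group LEAF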
 
