Hi,
On an up-to-date setup with two nodes, the only difference being the kernel version (because of https://forum.proxmox.com/threads/3...ly-slow-after-kernel-5-13.129909/#post-570343, which I haven't found time to bisect). I doubt that can be relevant to the problem I'm seeing, since the vxlan driver shouldn't have changed much between those versions.
Code:
pve-manager/8.0.4/d258a813cfa6b390 (running kernel: 6.2.16-3-pve)
pve-manager/8.0.4/d258a813cfa6b390 (running kernel: 5.13.19-6-pve)
I'm trying to build a VXLAN zone between them to 'join' Linux containers on the same subnet on both sides. I don't use Open vSwitch at all; the rest of the setup is plain Linux bridges for VMs/CTs.
The direct link between the nodes is on the 10.0.254.0/30 subnet (i.e. node openbsd-amd64 has 10.0.254.1, node pve-openbsd has 10.0.254.2). That's also the link used for cluster traffic, not the public IP interface.
Code:
root@pve-openbsd:~# ip a sh dev eth1 |grep brd
link/ether 00:30:48:cd:be:11 brd ff:ff:ff:ff:ff:ff
inet 10.0.254.2/30 brd 10.0.254.3 scope global eth1
root@openbsd-amd64:~# ip a sh dev eth1 |grep brd
link/ether 00:30:48:cd:c2:85 brd ff:ff:ff:ff:ff:ff
inet 10.0.254.1/30 brd 10.0.254.3 scope global eth1
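The underlay itself looks fine; if useful, a quick sanity check like the one below should confirm reachability over eth1 and that a full-size 1500-byte underlay packet (1450 inner MTU + 50 bytes of VXLAN/UDP/IP/Ethernet overhead) passes unfragmented. Just a sketch with my /30 addresses, run from pve-openbsd:
Code:
# basic reachability over the direct link
ping -c 3 -I eth1 10.0.254.1
# 1472-byte ICMP payload + 28 bytes of headers = 1500-byte packet, don't fragment
ping -c 3 -M do -s 1472 10.0.254.1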
Building the VXLAN zone from the web interface generates this config:
Code:
root@pve-openbsd:~# cat /etc/network/interfaces.d/sdn
#version:16
auto vxlan_vxnet5
iface vxlan_vxnet5
vxlan-id 5
vxlan_remoteip 10.0.254.1
mtu 1450
auto vxnet5
iface vxnet5
bridge_ports vxlan_vxnet5
bridge_stp off
bridge_fd 0
mtu 1450
root@openbsd-amd64:~# cat /etc/network/interfaces.d/sdn
#version:16
auto vxlan_vxnet5
iface vxlan_vxnet5
vxlan-id 5
vxlan_remoteip 10.0.254.2
mtu 1450
auto vxnet5
iface vxnet5
bridge_ports vxlan_vxnet5
bridge_stp off
bridge_fd 0
mtu 1450
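For completeness, the zone/vnet definitions behind this live in /etc/pve/sdn/zones.cfg and /etc/pve/sdn/vnets.cfg; I haven't pasted mine here, but they should look roughly like the sketch below (the zone name is just illustrative):
Code:
# /etc/pve/sdn/zones.cfg (sketch, zone name illustrative)
vxlan: vxzone
        peers 10.0.254.1,10.0.254.2
        mtu 1450

# /etc/pve/sdn/vnets.cfg (sketch)
vnet: vxnet5
        zone vxzone
        tag 5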
Containers 106 & 107 are on one node, containers 105 & 108 are on the other node, and all are bridged on vxnet5:
Code:
root@pve-openbsd:~# grep net0 /etc/pve/nodes/*/lxc/*
/etc/pve/nodes/openbsd-amd64/lxc/106.conf:net0: name=eth0,bridge=vxnet5,hwaddr=DE:48:1D:B3:25:DA,ip=10.1.1.2/24,type=veth
/etc/pve/nodes/openbsd-amd64/lxc/107.conf:net0: name=eth0,bridge=vxnet5,hwaddr=E6:1F:3C:53:38:90,ip=10.1.1.3/24,type=veth
/etc/pve/nodes/pve-openbsd/lxc/105.conf:net0: name=eth0,bridge=vxnet5,hwaddr=26:69:E4:88:6F:3D,ip=10.1.1.1/24,type=veth
/etc/pve/nodes/pve-openbsd/lxc/108.conf:net0: name=eth0,bridge=vxnet5,hwaddr=D2:97:E7:2A:C4:B3,ip=10.1.1.4/24,type=veth
AFAICT, the VXLAN interface on both sides seems configured, although the generated config uses vxlan_remoteip (coming from https://github.com/proxmox/pve-network/blame/master/src/PVE/Network/SDN/Zones/VxlanPlugin.pm#L80) instead of the vxlan-remoteip documented at https://manpages.debian.org/stretch/ifupdown2/ifupdown-addons-interfaces.5.en.html. Looking at the ifupdown2 logs, it doesn't seem to complain about that, and the code on the pve-network side has been this way forever.

The bridge fdb table seems correctly configured, with 00:00:00:00:00:00 entries using the remote IP as dst, which as I understand from https://vincent.bernat.ch/en/blog/2017-vxlan-linux#unicast-with-static-flooding are there for BUM traffic:
Code:
root@pve-openbsd:~# bridge fdb show dev vxlan_vxnet5
0a:0b:8c:a4:10:77 vlan 1 master vxnet5 permanent
0a:0b:8c:a4:10:77 master vxnet5 permanent
00:00:00:00:00:00 dst 10.0.254.1 self permanent
root@openbsd-amd64:~# bridge fdb show dev vxlan_vxnet5
4e:2f:d3:20:9d:42 vlan 1 master vxnet5 permanent
4e:2f:d3:20:9d:42 master vxnet5 permanent
00:00:00:00:00:00 dst 10.0.254.2 self permanent
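If I understand that blog post correctly, those all-zeros entries are what ifupdown2 installs from vxlan_remoteip, equivalent to adding them by hand with something like this (a sketch only, I haven't needed to run it since they are already there):
Code:
# on pve-openbsd: flood BUM traffic for this VNI towards the other VTEP
bridge fdb append 00:00:00:00:00:00 dev vxlan_vxnet5 dst 10.0.254.1
# on openbsd-amd64: the same, pointing back
bridge fdb append 00:00:00:00:00:00 dev vxlan_vxnet5 dst 10.0.254.2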
All the containers have IPs in the 10.1.1.0/24 subnet. I haven't configured a subnet in the Proxmox SDN because I was unsure whether it is required/useful outside of the IPAM modules...
If I ping from CT 106 to 107, or to/from CT 105 to 108 (i.e. CTs on the same node), the ping works fine.
If I try pinging a CT on the other side of the VXLAN tunnel, nothing goes through. Tcpdumping on the various interfaces, I see ARP requests being sent:
- from the ping emitter host, on the vxnet5, vxlan_vxnet5 and eth1 interfaces
- only on the eth1 interface on the receiving side (i.e. the remote node hosting the ping target CT); the ARP request never makes it to the vxnet5/vxlan_vxnet5 interfaces there
- and there's never an ARP reply sent, so no ping gets through.
Code:
root@openbsd-amd64:~# tcpdump -i eth1 port 4789
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), snapshot length 262144 bytes
13:12:46.900622 IP 10.0.254.2.50445 > 10.0.254.1.4789: VXLAN, flags [I] (0x08), vni 5
ARP, Request who-has 10.1.1.2 tell 10.1.1.4, length 28
13:12:47.903180 IP 10.0.254.2.50445 > 10.0.254.1.4789: VXLAN, flags [I] (0x08), vni 5
ARP, Request who-has 10.1.1.2 tell 10.1.1.4, length 28
13:12:48.927082 IP 10.0.254.2.50445 > 10.0.254.1.4789: VXLAN, flags [I] (0x08), vni 5
ARP, Request who-has 10.1.1.2 tell 10.1.1.4, length 28
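Since the encapsulated ARP clearly reaches eth1 on the receiving node but never shows up after decapsulation, I could also check on that node whether the kernel actually has a UDP socket bound to the VXLAN port and whether the interface counters show drops; something along these lines (just the commands, I haven't pasted the output here):
Code:
# is there a kernel UDP socket bound to the VXLAN port?
ss -uln | grep 4789
# any RX errors/drops on the VXLAN interface?
ip -s link show dev vxlan_vxnet5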
- I have the default Proxmox firewall setup on the cluster, but I don't think it should matter much for the VXLAN traffic, since I see it on both sides of the eth1 link.
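To really rule the firewall out, I guess I could briefly stop it on both nodes and retest, and check whether bridged frames are pushed through iptables at all; a sketch of what I have in mind:
Code:
pve-firewall status
# are bridged frames filtered by iptables on this host?
sysctl net.bridge.bridge-nf-call-iptables
# temporarily disable the firewall on both nodes, then retry the ping
pve-firewall stop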
I've looked at the details of the VXLAN interface with ip -d, and I've tried various things after reading the ifupdown2 documentation:
- enforcing the remote IP via vxlan-remoteip instead of vxlan_remoteip
- enforcing the local IP via vxlan-local-tunnelip, e.g. adding this to interfaces.d/sdn:
Code:
vxlan-remoteip 10.0.254.1
vxlan-local-tunnelip 10.0.254.2
which results in (after ifreload -a, of course):
Code:
root@pve-openbsd:~# ip -d a sh dev vxlan_vxnet5
57: vxlan_vxnet5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master vxnet5 state UNKNOWN group default qlen 1000
link/ether 0a:0b:8c:a4:10:77 brd ff:ff:ff:ff:ff:ff promiscuity 1 allmulti 1 minmtu 68 maxmtu 65535
vxlan id 5 local 10.0.254.2 srcport 0 0 dstport 4789 ttl auto ageing 300 udpcsum noudp6zerocsumtx noudp6zerocsumrx
bridge_slave state forwarding priority 32 cost 100 hairpin off guard off root_block off fastleave off learning on flood on port_id 0x8001 port_no 0x1 designated_port 32769 designated_cost 0 designated_bridge 8000.a6:f6:b2:59:c8:e9 designated_root 8000.a6:f6:b2:59:c8:e9 hold_timer 0.00 message_age_timer 0.00 forward_delay_timer 0.00 topology_change_ack 0 config_pending 0 proxy_arp off proxy_arp_wifi off mcast_router 2 mcast_fast_leave off mcast_flood on bcast_flood on mcast_to_unicast off neigh_suppress off group_fwd_mask 0 group_fwd_mask_str 0x0 vlan_tunnel off isolated off locked off numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 tso_max_size 65536 tso_max_segs 65535 gro_max_size 65536
But nothing seems to change, i.e. the 'unicast flooding' of the ARP requests still doesn't make it where it should. Since I don't have a working VXLAN setup to compare against, I can't tell what should or shouldn't be there... Should I be able to see the CT MAC addresses somewhere in the ip neighbour table on the hosts?
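(From what I understand, the learned remote MACs should rather show up in the bridge fdb than in ip neigh. On pve-openbsd, on a working setup, I'd expect something like the lines below, using CT 106's MAC from the configs above as an example; the output lines are an assumption on my part, not something I've captured.)
Code:
bridge fdb show dev vxlan_vxnet5
# expected once learning works (assumed, not actual output):
#   de:48:1d:b3:25:da master vxnet5
#   de:48:1d:b3:25:da dst 10.0.254.1 self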
Help and hints welcome; it feels like I'm missing something. I originally just followed the example from https://blog.raspot.in/fr/blog/mise-en-place-du-sdn-sur-promox-7, which seems to say it should just work... Of course, I can provide more details on the setup.