QinQ SDN zone not forwarding VLAN traffic to VM bridge (vnet/vmbr chain issue)

Vishaal Golam

Hi, I'm stuck with a QinQ + Proxmox SDN forwarding issue. Short version: the outer/inner tags arrive on the host bridge, but the VM bridge never learns the remote MAC and never receives inbound frames.

Topology (relevant parts):

VM (tap101i1) <=> vnetQnQ (pr_HexaDonQ) <=> ln_HexaDonQ
^
z_HexaDonQ
^
vmbrQnQ.300 (on vmbrQnQ -> bond1)
^
swD <-> swVDBI <-> end-device (remote MAC 00:50:56:a6:2d:f2)


Proxmox SDN:
zones.cfg (QinQ):

qinq: HexaDonQ
bridge vmbrQnQ
tag 300
vlan-protocol 802.1ad

vnets.cfg:
vnet: vnetQnQ
zone HexaDonQ
vlanaware 1


Linux bridges: vmbrQnQ, vmbrQnQ.300, z_HexaDonQ (bridge), ln/pr veth pair, vnetQnQ (bridge), tap101i1.

What I see (evidence)

vmbrQnQ tcpdump:
00:50:56:a6:2d:f2 > ff:ff:ff, ethertype 802.1Q, vlan 802, ARP request (remote → host)

So the host receives inner VLAN 802 frames (after QinQ decap somewhere on the path).

tap101i1 tcpdump:
only shows ARPs sent by the VM (outbound). No inbound ARP/frames from 00:50:56:a6:2d:f2.


Linux bridge FDB:

bridge fdb show | grep 00:50:56:a6:2d:f2
# at one point showed: 00:50:56:a6:2d:f2 dev bond1 vlan 1 master vmbrQnQ
# currently no entry shown for that MAC on vlan 802
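
A minimal sketch for summarising where a given MAC is currently learned, assuming the `bridge fdb show` line layout quoted above (MAC first, then `dev`, optional `vlan`, optional `master`). The helper name `fdb_where` is hypothetical and not part of iproute2 or Proxmox.

```shell
# fdb_where MAC: read `bridge fdb show` output on stdin and print one
# "dev=<port> vlan=<vid> master=<bridge>" line per entry for MAC.
# Missing keywords are reported as "-". The MAC match is case-insensitive.
fdb_where() {
    awk -v mac="$1" '
        tolower($1) == tolower(mac) {
            dev = "-"; vlan = "-"; master = "-"
            # each keyword is followed by its value in the next field
            for (i = 2; i < NF; i++) {
                if ($i == "dev")    dev = $(i + 1)
                if ($i == "vlan")   vlan = $(i + 1)
                if ($i == "master") master = $(i + 1)
            }
            print "dev=" dev " vlan=" vlan " master=" master
        }
    '
}
```

Usage on the host would look like `bridge fdb show | fdb_where 00:50:56:a6:2d:f2`. An entry on `vlan 1` rather than `vlan 802` would suggest the frame was classified into the bridge's default VLAN.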


Switch configs (actual state)

swD → link to pveD / to swVDBI:

interface Ten-GigabitEthernet2/0/46 (swD)
port link-type trunk
undo port trunk permit vlan 1
port trunk permit vlan 300
port trunk pvid vlan 300
qinq enable


swVDBI → port toward end-device (physical port / BAGG)

interface Ten-GigabitEthernet2/0/51 (swVDBI)
port link-type trunk
undo port trunk permit vlan 1
port trunk permit vlan 802
port trunk pvid vlan 99


Observations on switches:

swVDBI shows MAC 00:50:56:a6:2d:f2 learned on VLAN 802 (BAGG2).

swD shows the same MAC learned on VLAN 300 on XGE2/0/46 (the QinQ SVLAN).

What I tried / changed

Ensured vmbrQnQ.300 and z_HexaDonQ exist and are UP.

Ran bridge vlan add for vid 802 on vmbrQnQ.300, ln_HexaDonQ, pr_HexaDonQ, tap101i1 and vnetQnQ so that VLAN 802 is allowed across the veth chain. bridge vlan show now lists 802 on those interfaces.

Ran tcpdump on ln_HexaDonQ/pr_HexaDonQ: the 802 frames are sometimes visible on vmbrQnQ but not on the veth/tap, and the FDB shows no entry for the remote MAC on vlan 802.

Question:
Given the above, why does the Linux bridging stack / Proxmox SDN not deliver the inbound VLAN-802 frames to the VM/tap even though vmbrQnQ sees the tagged frames and the chain allows VLAN 802? Is there a Proxmox SDN behavior or Linux-bridge/FDB nuance (vlan filtering, learned FDB on wrong interface, qinq interaction) that prevents learning/flooding to the vnet bridge? Any pointers on the exact debug steps or settings I should inspect?

Thanks for any hints. I feel the QinQ path and the switch side are OK, and that the VM-side bridging/learning is what's failing.
 
Can you post the full generated SDN config?

Code:
cat /etc/network/interfaces
cat /etc/network/interfaces.d/sdn

Is vmbrQnQ itself VLAN-aware in the network settings?

How are you handling the inner tagging? You have a VLAN-aware VNet without any tag set, so you'd have to tag the traffic manually inside the VM or on the VM's network device. What does the VM config look like?

Code:
qm config <vmid>

Do you potentially have a full tcpdump across all devices?

Code:
tcpdump -envi any arp or icmp


I'd have to try and replicate the VLAN-aware setup myself tbh, since I'm not 100% sure how it works, but maybe I can find a quick pointer with the output of the commands above on where to start debugging.
 
Thanks for your reply,
Code:
cat /etc/network/interfaces

auto lo
iface lo inet loopback

auto eno1np0
iface eno1np0 inet manual

auto eno2np1
iface eno2np1 inet manual

auto eno3np2
iface eno3np2 inet manual

auto eno4np3
iface eno4np3 inet manual
mtu 9000

auto bond0
iface bond0 inet manual
bond-slaves eno1np0 eno2np1 eno3np2
bond-miimon 100
bond-mode 802.3ad

auto bond1
iface bond1 inet manual
bond-slaves eno4np3
bond-miimon 100
bond-mode 802.3ad
mtu 9000
#QinQ

auto vmbr0
iface vmbr0 inet static
address 10.123.0.47/22
gateway 10.123.0.1
bridge-ports bond0
bridge-stp off
bridge-fd 0
bridge-vlan-aware yes
bridge-vids 2-4094
#MGT

auto vmbrQnQ
iface vmbrQnQ inet manual
bridge-ports bond1
bridge-stp off
bridge-fd 0
bridge-vlan-aware yes
bridge-vids 2-4094
mtu 9000

source /etc/network/interfaces.d/*

Code:
cat /etc/network/interfaces.d/sdn

auto ln_HexaDonQ
iface ln_HexaDonQ
link-type veth
veth-peer-name pr_HexaDonQ

auto pr_HexaDonQ
iface pr_HexaDonQ
link-type veth
veth-peer-name ln_HexaDonQ

auto vmbrQnQ
iface vmbrQnQ
bridge-vlan-protocol 802.1ad

auto vmbrQnQ.300
iface vmbrQnQ.300
vlan-protocol 802.1ad

auto vnetQnQ
iface vnetQnQ
bridge_ports pr_HexaDonQ
bridge_stp off
bridge_fd 0
bridge-vlan-aware yes
bridge-vids 2-4094
mtu 9000

auto z_HexaDonQ
iface z_HexaDonQ
mtu 9000
bridge-stp off
bridge-ports vmbrQnQ.300 ln_HexaDonQ
bridge-fd 0
bridge-vlan-aware yes
bridge-vids 2-4094

Code:
qm config <vmid>

qm config 101
agent: 1
boot: order=virtio0
cores: 2
cpu: kvm64
memory: 4096
meta: creation-qemu=9.2.0,ctime=1752524254
name: vm-test-vdbi
net1: virtio=BC:24:11:C4:25:F9,bridge=vnetQnQ,mtu=9000,tag=802
numa: 0
ostype: l26
rng0: source=/dev/urandom
scsihw: virtio-scsi-single
serial0: socket
smbios1: uuid=b2ec3b24-0747-4684-aff8-7092ffa5351e
sockets: 2
tablet: 0
virtio0: data:vm-101-disk-0,backup=0,format=raw,iothread=1,size=32G
vmgenid: 11acff7a-39e2-4bfa-8ebc-98e97e83e8e2


do you potentially have a full tcpdump across all devices?
Code:
tcpdump -envi any arp or icmp

Code:
14:14:06.924480 eno4np3 B   ifindex 5 00:50:56:a6:2d:f2 ethertype ARP (0x0806), length 66: Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.17 tell 10.125.100.18, length 46
14:14:06.924484 bond1 B   ifindex 11 00:50:56:a6:2d:f2 ethertype ARP (0x0806), length 66: Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.17 tell 10.125.100.18, length 46
14:14:06.924491 vmbrQnQ B   ifindex 12 00:50:56:a6:2d:f2 ethertype ARP (0x0806), length 66: Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.17 tell 10.125.100.18, length 46

14:14:07.948544 eno4np3 B   ifindex 5 00:50:56:a6:2d:f2 ethertype ARP (0x0806), length 66: Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.17 tell 10.125.100.18, length 46
14:14:07.948548 bond1 B   ifindex 11 00:50:56:a6:2d:f2 ethertype ARP (0x0806), length 66: Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.17 tell 10.125.100.18, length 46
14:14:07.948555 vmbrQnQ B   ifindex 12 00:50:56:a6:2d:f2 ethertype ARP (0x0806), length 66: Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.17 tell 10.125.100.18, length 46

14:14:08.972455 eno4np3 B   ifindex 5 00:50:56:a6:2d:f2 ethertype ARP (0x0806), length 66: Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.17 tell 10.125.100.18, length 46
14:14:08.972458 bond1 B   ifindex 11 00:50:56:a6:2d:f2 ethertype ARP (0x0806), length 66: Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.17 tell 10.125.100.18, length 46
14:14:08.972466 vmbrQnQ B   ifindex 12 00:50:56:a6:2d:f2 ethertype ARP (0x0806), length 66: Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.17 tell 10.125.100.18, length 46

Many thanks for your help
Sincerely
 
Sorry, it seems like -i any does not show VLANs, so the tcpdump output is not quite as helpful.
Could you try again, running tcpdump on each of the involved ports separately (you'll have to run multiple tcpdump instances at the same time)?

Code:
tcpdump -envi eno4np3 arp or icmp
tcpdump -envi bond1 arp or icmp
tcpdump -envi vmbrQnQ arp or icmp
tcpdump -envi vnetQnQ arp or icmp
tcpdump -envi tap101i1 arp or icmp

Which of the IPs involved is the VM? 10.125.100.17?
Can you try pinging from both directions?
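
Running the captures at the same time can be scripted. The following is only a sketch, assuming GNU `timeout` is available and that it is run as root on the PVE host. `capture_all` and the `TCPDUMP` override variable are hypothetical names introduced here for illustration.

```shell
# capture_all SECONDS IFACE...: run one tcpdump per interface in parallel for
# SECONDS seconds and write each capture to capture_<iface>.txt in the
# current directory.
# TCPDUMP is a hypothetical override hook so the loop can be dry-run.
capture_all() {
    duration="$1"
    shift
    for ifc in "$@"; do
        # -e prints link-level headers (MACs, VLAN tags), -n skips name
        # resolution, -v is verbose, -l line-buffers the output
        timeout "$duration" "${TCPDUMP:-tcpdump}" -envl -i "$ifc" arp or icmp \
            > "capture_${ifc}.txt" 2>&1 &
    done
    wait    # return once every background capture has exited
}
```

For the interfaces listed above this would be called as `capture_all 30 eno4np3 bond1 vmbrQnQ vnetQnQ tap101i1` while the ping is running. Because all captures cover the same time window, their timestamps can be compared directly.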
 
To recap:

Code:
VM  (tap101i1 - bc:24:11:c4:25:f9
    vlan 802 - 10.125.100.17 ) <=> vnetQnQ (pr_HexaDonQ)
                                ||
                       z_HexaDonQ (ln_HexaDonQ)
                                ||
                         vmbrQnQ.300 (on vmbrQnQ
                                      on bond1
                                      on eno4np3) <=> [qinq] swD [qinq] <-> swVDBI <-trunk-> VMext (00:50:56:a6:2d:f2-vlan802-10.125.100.18)

Ping from the VM inside the Proxmox SDN (bc:24:11:c4:25:f9 - 10.125.100.17):
Code:
tcpdump -envi tap101i1
13:42:59.444417 bc:24:11:c4:25:f9 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.18 tell 10.125.100.17, length 28
13:43:00.468328 bc:24:11:c4:25:f9 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.18 tell 10.125.100.17, length 28
13:43:01.492344 bc:24:11:c4:25:f9 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.18 tell 10.125.100.17, length 28

tcpdump -envi vnetQnQ
13:41:46.739703 bc:24:11:c4:25:f9 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 802, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.18 tell 10.125.100.17, length 28
13:41:47.763719 bc:24:11:c4:25:f9 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 802, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.18 tell 10.125.100.17, length 28
13:41:48.788004 bc:24:11:c4:25:f9 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 802, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.18 tell 10.125.100.17, length 28

tcpdump -envi vmbrQnQ
13:40:44.275455 bc:24:11:c4:25:f9 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q-QinQ (0x88a8), length 50: vlan 300, p 0, ethertype 802.1Q (0x8100), vlan 802, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.18 tell 10.125.100.17, length 28
13:40:45.299258 bc:24:11:c4:25:f9 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q-QinQ (0x88a8), length 50: vlan 300, p 0, ethertype 802.1Q (0x8100), vlan 802, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.18 tell 10.125.100.17, length 28
13:40:46.323270 bc:24:11:c4:25:f9 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q-QinQ (0x88a8), length 50: vlan 300, p 0, ethertype 802.1Q (0x8100), vlan 802, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.18 tell 10.125.100.17, length 28

tcpdump -envi bond1
13:39:55.123208 bc:24:11:c4:25:f9 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q-QinQ (0x88a8), length 50: vlan 300, p 0, ethertype 802.1Q (0x8100), vlan 802, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.18 tell 10.125.100.17, length 28
13:39:56.146882 bc:24:11:c4:25:f9 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q-QinQ (0x88a8), length 50: vlan 300, p 0, ethertype 802.1Q (0x8100), vlan 802, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.18 tell 10.125.100.17, length 28
13:39:57.170939 bc:24:11:c4:25:f9 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q-QinQ (0x88a8), length 50: vlan 300, p 0, ethertype 802.1Q (0x8100), vlan 802, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.18 tell 10.125.100.17, length 28

tcpdump -envi eno4np3
13:38:54.706512 bc:24:11:c4:25:f9 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q-QinQ (0x88a8), length 50: vlan 300, p 0, ethertype 802.1Q (0x8100), vlan 802, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.18 tell 10.125.100.17, length 28
13:38:55.730538 bc:24:11:c4:25:f9 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q-QinQ (0x88a8), length 50: vlan 300, p 0, ethertype 802.1Q (0x8100), vlan 802, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.18 tell 10.125.100.17, length 28
13:38:56.754648 bc:24:11:c4:25:f9 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q-QinQ (0x88a8), length 50: vlan 300, p 0, ethertype 802.1Q (0x8100), vlan 802, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.18 tell 10.125.100.17, length 28

Ping from the end device, another VM (00:50:56:a6:2d:f2 - 10.125.100.18):
Code:
tcpdump -envi eno4np3
13:45:59.009684 00:50:56:a6:2d:f2 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 64: vlan 802, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.17 tell 10.125.100.18, length 46
13:46:00.033641 00:50:56:a6:2d:f2 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 64: vlan 802, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.17 tell 10.125.100.18, length 46
13:46:01.057639 00:50:56:a6:2d:f2 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 64: vlan 802, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.17 tell 10.125.100.18, length 46

tcpdump -envi bond1
13:48:28.513814 00:50:56:a6:2d:f2 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 64: vlan 802, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.17 tell 10.125.100.18, length 46
13:48:29.537885 00:50:56:a6:2d:f2 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 64: vlan 802, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.17 tell 10.125.100.18, length 46
13:48:30.561822 00:50:56:a6:2d:f2 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 64: vlan 802, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.17 tell 10.125.100.18, length 46

tcpdump -envi vmbrQnQ
13:50:11.937901 00:50:56:a6:2d:f2 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 64: vlan 802, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.17 tell 10.125.100.18, length 46
13:50:12.961930 00:50:56:a6:2d:f2 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 64: vlan 802, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.17 tell 10.125.100.18, length 46
13:50:13.985957 00:50:56:a6:2d:f2 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 64: vlan 802, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.125.100.17 tell 10.125.100.18, length 46
But for tcpdump -envi vnetQnQ or tcpdump -envi tap101i1, no frames from the end device show up:
Code:
tcpdump: listening on vnetQnQ, link-type EN10MB (Ethernet), snapshot length 262144 bytes
^C195 packets captured
207 packets received by filter
0 packets dropped by kernel

tcpdump: listening on tap101i1, link-type EN10MB (Ethernet), snapshot length 262144 bytes
^C59 packets captured
60 packets received by filter
0 packets dropped by kernel

I hope it was readable
Thanks
 
The request from outside seems to carry only VLAN tag 802 when it arrives at the PVE host. Since you're using a QinQ zone, it would need two VLAN tags (300 and 802). That's why the traffic cannot reach the VM.

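You can see in the first set of tcpdump output (the ping from the VM) what the VLAN tags are expected to look like: an outer 802.1ad tag (0x88a8, vlan 300) followed by an inner 802.1Q tag (0x8100, vlan 802). If you only need one VLAN tag because you are already stripping the service VLAN on the switch port, then you'd need to use a VLAN zone / VLAN-aware bridge instead.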
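
The single-tag versus double-tag difference can also be checked mechanically on saved `tcpdump -e` output. This is a rough sketch, assuming the textual format seen in the captures earlier in this thread; `tag_stack` is a hypothetical helper name. It only looks at the TPIDs tcpdump prints, so a double tag that uses 0x8100 for both tags would be reported as `c-tag-only`.

```shell
# tag_stack: read `tcpdump -e` lines on stdin and print, per line, which
# VLAN tags the frame carries, judged by the ethertypes tcpdump prints:
#   0x88a8 -> 802.1ad service tag (S-tag), 0x8100 -> 802.1Q customer tag (C-tag)
tag_stack() {
    awk '
        {
            s = ($0 ~ /ethertype 802\.1Q-QinQ \(0x88a8\)/)
            c = ($0 ~ /ethertype 802\.1Q \(0x8100\)/)
            if (s && c)  print "s-tag+c-tag"
            else if (s)  print "s-tag-only"
            else if (c)  print "c-tag-only"
            else         print "untagged"
        }
    '
}
```

For example, `tcpdump -enl -i eno4np3 arp | tag_stack` on the host would print one label per captured frame. With a QinQ zone, inbound frames would be expected to show `s-tag+c-tag` on the physical port.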
 
So the problem is probably in the swD/swVDBI switch configuration.
Even though this is not a Proxmox issue, what would you suggest for the port configuration on those switches (HPE/Comware)?
My current config:

Code:
         interface Ten-GigabitEthernet2/0/2   interface Ten-GigabitEthernet2/0/46    interface Ten-GigabitEthernet2/0/51   interface Bridge-Aggregation2
          port link-type trunk                 port link-type trunk                   port link-type trunk                  port link-type trunk
eno4np3 = port trunk permit vlan 300   =swD=   port trunk permit vlan 300     ====    port trunk permit vlan 802  =swVDBI=  port trunk permit vlan 802   = end device
          port trunk pvid vlan 300             port trunk pvid vlan 300               port trunk pvid vlan 99               port trunk pvid vlan 99
          qinq enable                          qinq enable

Thanks
 