Node accessibility differs from VM accessibility with VLAN tags

harry0

Aug 9, 2021
I've recently set up a Proxmox node with a pretty straightforward networking setup - 2x LACP bonded 10GbE interfaces with a handful of VLANs. I have set up my interfaces as below:

Code:
auto lo
iface lo inet loopback

auto eno1
iface eno1 inet manual

auto eno2
iface eno2 inet manual

auto bond0
iface bond0 inet manual
        bond-slaves eno1 eno2
        bond-miimon 100
        bond-mode 802.3ad

auto vmbr0
iface vmbr0 inet static
        address 192.168.5.205/24
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094

auto vmbr0.10
iface vmbr0.10 inet static
        address 192.168.5.206/24
        gateway 192.168.5.254

I also have a VM with a network interface using the vmbr0 bridge with a VLAN tag:
Code:
virtio=[MAC],bridge=vmbr0,tag=10

The strange behaviour is that if I set the ports on the switch to be untagged members of VLAN 10, the node is accessible via the web GUI or SSH, but the VM is inaccessible. When I set the ports to be tagged members of VLAN 10, the node itself becomes inaccessible via GUI or SSH, but the VM becomes accessible.

This is rather unexpected, since I would have expected the node to behave exactly like the VM once the vmbr0.10 VLAN interface was created. Any ideas why it doesn't, and how I can correct this so that the node is accessible via a tagged VLAN?
 
I did a little more experimenting and changed the VLAN ID to 11 (which is unused, with no other devices on it). I can still access the GUI and SSH from VLAN 10 via the IP assigned to the VLAN (192.168.5.206) while the port is an untagged member of VLAN 10, which I would not have expected to be possible.

It seems like perhaps the VLAN tags are not being passed across the bridge from the VLAN created in the network settings on the node, but they do get passed from the VM? What would cause this to happen?
 
As additional info, tcpdump shows packets getting to the bridge when I activate tagged mode on the relevant switch ports:

Code:
root@pve:~# tcpdump -i vmbr0 -n -e vlan 10
13:56:11.044406 18:3e:ef:c9:16:ff > ec:b1:d7:8b:1e:30, ethertype 802.1Q (0x8100), length 66: vlan 10, p 0, ethertype IPv4 (0x0800), 192.168.5.130.63931 > 192.168.5.206.8006: Flags [S], seq 2540212532, win 65535, options [mss 1460,sackOK,eol], length 0

and they seem to get to the VLAN interface (note that I removed the vlan 10 option, as I assume the bridge will remove the tags before passing):

Code:
root@pve:~# tcpdump -i vmbr0.10 -n -e
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on vmbr0.10, link-type EN10MB (Ethernet), snapshot length 262144 bytes
14:10:42.872628 18:3e:ef:c9:16:ff > ec:b1:d7:8b:1e:30, ethertype IPv4 (0x0800), length 1135: 192.168.5.130.64535 > 192.168.5.206.8006: Flags [P.], seq 2496117057:2496118126, ack 1780371973, win 2048, options [nop,nop,TS val 671140303 ecr 2879051291], length 1069

So it seems like the traffic is getting through... do I need to tell the server to use this interface rather than the bridge?
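Since vmbr0 and vmbr0.10 both have addresses in 192.168.5.0/24, one thing worth checking is which interface the kernel picks for the replies. A minimal sketch using standard iproute2 commands; the function name check_reply_path is only illustrative, and 192.168.5.130 is the client seen in the captures above:

```shell
# Sketch only: run as root on the Proxmox node.
# check_reply_path is an illustrative name, not a Proxmox tool.
check_reply_path() {
    client="$1"                      # e.g. the client from the captures
    subnet="${2:-192.168.5.0/24}"    # subnet shared by vmbr0 and vmbr0.10

    # List every connected route for the subnet; two entries (one per
    # interface) mean the kernel has to pick one of them for replies.
    ip -4 route show "$subnet"

    # Show the interface and source address chosen for traffic to the client.
    ip -4 route get "$client"
}
```

Called as `check_reply_path 192.168.5.130`. If the second command reports `dev vmbr0` rather than `dev vmbr0.10`, the replies leave the bridge untagged, which would match the symptoms described above.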
 
OK, I worked this out eventually. For anyone coming here with a similar problem: I had to remove the IP address from the bridge to get the node to use the VLAN interface. I don't think this is intuitive behaviour when the two interfaces have different IPs. Ideally, I would have liked to be able to access the bridge directly on .205 and the VLAN on .206, but it seems that's not possible.
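For reference, a sketch of the two relevant stanzas after the change, assuming the rest of /etc/network/interfaces stays exactly as posted at the top of the thread. The only difference is that vmbr0 becomes `inet manual` with no address, which leaves 192.168.5.206 on vmbr0.10 as the node's only address in that subnet. The likely reason this helps is that .205 and .206 both sit in 192.168.5.0/24, so with both configured the kernel had two connected routes for the same subnet and could send replies out of the untagged bridge.

```
auto vmbr0
iface vmbr0 inet manual
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094

auto vmbr0.10
iface vmbr0.10 inet static
        address 192.168.5.206/24
        gateway 192.168.5.254
```

If the node uses ifupdown2 (an assumption about this install), `ifreload -a` should apply the change without a reboot. Make sure the switch ports are already tagged members of VLAN 10 first, otherwise the node will drop off the network.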