Hello!
I'm having a hard time getting VLANs to work properly on my new Proxmox cluster. Hopefully someone has some suggestions to get this working. In short, it seems that Proxmox isn't properly passing tagged packets to my VMs.
Example VLAN 542 (10.0.42.0/24):
I have a secondary 10GbE NIC (enp65s0) attached to a trunk port on my switch, and a VLAN-aware bridge (vmbr1) created on top of it. Next, I have a Windows 10 VM that is set to use vmbr1 with a VLAN tag of 542. The guest OS NIC is configured with the IP 10.0.42.8.
A separate router has the IP 10.0.42.1 configured.
If I try to ping .1 from .8, I get no reply, and the same goes for pinging .8 from .1. However, if I run tcpdump on the Proxmox node against enp65s0, I can see packets come in (ARP requests at least) but never go out.
tcpdump output while pinging from .1 to .8:
Bash:
root@hv2:~# tcpdump -envi enp65s0 -e '(vlan 542)'
-snip-
10:53:16.578688 18:fd:74:c1:4d:0a > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 60: vlan 542, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.42.8 tell 10.0.42.1, length 42
10:53:17.635885 18:fd:74:c1:4d:0a > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 60: vlan 542, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.42.8 tell 10.0.42.1, length 42
10:53:18.675923 18:fd:74:c1:4d:0a > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 60: vlan 542, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.42.8 tell 10.0.42.1, length 42
10:53:20.603052 18:fd:74:c1:4d:0a > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 60: vlan 542, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.42.8 tell 10.0.42.1, length 42
10:53:21.635981 18:fd:74:c1:4d:0a > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 60: vlan 542, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.42.8 tell 10.0.42.1, length 42
10:53:22.675995 18:fd:74:c1:4d:0a > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 60: vlan 542, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.42.8 tell 10.0.42.1, length 42
-snip-
Network config on node:
Bash:
root@hv2:~# cat /etc/network/interfaces
# network interface settings; autogenerated
# Please do NOT modify this file directly, unless you know what
# you're doing.
#
# If you want to manage parts of the network configuration manually,
# please utilize the 'source' or 'source-directory' directives to do
# so.
# PVE will preserve these directives, but will NOT read its network
# configuration from sourced files, so do not attempt to move any of
# the PVE managed interfaces into external files!
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno3 inet manual

iface eno4 inet manual

iface eno2 inet manual

auto enp65s0
iface enp65s0 inet manual
#10G

auto vmbr0
iface vmbr0 inet static
        address 10.10.44.36/24
        gateway 10.10.44.1
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0

auto vmbr1
iface vmbr1 inet manual
        bridge-ports enp65s0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
#10GB Bridge

auto vlan88
iface vlan88 inet static
        address 10.10.88.36/24
        vlan-raw-device vmbr1
#CEPH VLAN 88

auto vlan542
iface vlan542 inet static
        address 10.0.42.9/24
        vlan-raw-device vmbr1
#Test VLAN 542

auto vlan42
iface vlan42 inet static
        address 10.10.42.242/24
        vlan-raw-device vmbr1
#Test VLAN 42
All of the above was generated via the GUI. As you can see, I have a few other VLANs set up that I've been using for testing. The 'Linux VLAN' interfaces with IPs on the hypervisor don't work either... with the exception of VLAN 88!?
That last bit is what really makes me scratch my head: VLAN 88 works just fine from the router, to the node, to the guest VM (if I switch its tag from 542 to 88). Initially this made me question the switch's configuration, but I'm using that switch to handle trunking to several other switches without issue, and this port is configured the same way. Additionally, tcpdump shows the packets coming in on the correct VLAN.
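As a sanity check on the config side, here is a rough sketch that reads the bridge-vids line for a bridge out of an interfaces file and tests whether a given VLAN ID falls inside it. The function names (vid_in_bridge_vids, bridge_vids_for) are made up for illustration, and the parsing assumes the simple one-stanza-per-iface layout shown above; it only checks what the file declares, not what the kernel actually applied.

```bash
#!/usr/bin/env bash

# Hypothetical helper: succeed if VLAN ID $1 is covered by any of the
# remaining arguments, each a single ID ("88") or a range ("2-4094").
vid_in_bridge_vids() {
  local vid=$1 entry lo hi
  shift
  for entry in "$@"; do
    lo=${entry%-*}   # part before "-" (whole entry if there is no "-")
    hi=${entry#*-}   # part after "-"  (whole entry if there is no "-")
    if [ "$vid" -ge "$lo" ] && [ "$vid" -le "$hi" ]; then
      return 0
    fi
  done
  return 1
}

# Hypothetical helper: print the bridge-vids list declared in the stanza
# of bridge $1, read from file $2 (default /etc/network/interfaces).
bridge_vids_for() {
  local bridge=$1 file=${2:-/etc/network/interfaces}
  awk -v br="$bridge" '
    $1 == "iface" { in_stanza = ($2 == br) }
    in_stanza && $1 == "bridge-vids" { $1 = ""; print; exit }
  ' "$file"
}
```

With the config above, `vid_in_bridge_vids 542 $(bridge_vids_for vmbr1)` succeeds, and so do the same checks for 42 and 88, since all three IDs sit inside 2-4094. So on paper the bridge-vids setting doesn't distinguish the working VLAN from the broken ones. The kernel's live view of the filter can be listed with `bridge vlan show dev enp65s0`.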
If I run the packet capture on the bridge, I can see the VM sending out ARP requests, but the ARP requests from the router on .1 never make it to the bridge.
Bash:
root@hv2:~# tcpdump -envi vmbr1 -e '(vlan 542)'
tcpdump: listening on vmbr1, link-type EN10MB (Ethernet), snapshot length 262144 bytes
11:06:17.091353 5e:87:93:37:21:39 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 542, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.42.1 tell 10.0.42.8, length 28
11:06:18.079245 5e:87:93:37:21:39 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 542, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.42.1 tell 10.0.42.8, length 28
11:06:19.079239 5e:87:93:37:21:39 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 542, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.42.1 tell 10.0.42.8, length 28
11:06:20.090636 5e:87:93:37:21:39 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 542, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.42.1 tell 10.0.42.8, length 28
11:06:21.079295 5e:87:93:37:21:39 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 542, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.42.1 tell 10.0.42.8, length 28
11:06:22.079316 5e:87:93:37:21:39 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 542, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.42.1 tell 10.0.42.8, length 28
11:06:23.092117 5e:87:93:37:21:39 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 542, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.42.1 tell 10.0.42.8, length 28
^C
7 packets captured
7 packets received by filter
0 packets dropped by kernel
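To make the two captures easier to compare side by side, here is a small sketch that condenses saved `tcpdump -e` text output into one line per (source MAC, VLAN, ARP target) with a count. The function name summarize_arp is made up for illustration, and it assumes the single-line ARP request format shown in the captures above.

```bash
#!/usr/bin/env bash

# Hypothetical helper: count ARP requests per source MAC, VLAN and target IP
# in tcpdump -e text output, read from the given files or from stdin.
summarize_arp() {
  awk '/Request who-has/ {
         src = $2                       # source MAC is the second field
         vlan = ""; target = ""
         for (i = 1; i <= NF; i++) {
           if ($i == "vlan")    { vlan = $(i + 1); sub(/,$/, "", vlan) }
           if ($i == "who-has") { target = $(i + 1) }
         }
         count[src " vlan=" vlan " who-has=" target]++
       }
       END { for (k in count) print count[k], k }' "$@"
}
```

Feeding it the enp65s0 capture shows only the router's MAC (18:fd:74:c1:4d:0a) asking for 10.0.42.8, while the vmbr1 capture shows only the VM's MAC (5e:87:93:37:21:39) asking for 10.0.42.1. Each side's requests show up on one interface and not the other.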
It's entirely possible that there's something super simple that I'm overlooking. I appreciate any suggestions in advance!
Thanks!