VLAN tagging stopped working after PVE 6.x -> 7.x upgrade: pfSense VM, plus OVS errors in logs despite only using Linux bridges

jsalas424

Member
Jul 5, 2020
124
1
18
32
PVE 7.1-10

I had been successfully running a pfSense VM with VLAN tagging through the GUI/OVS for ~2 years before upgrading to PVE 7.x. Now none of my tagged VMs can reach their gateways. VLAN routing still works correctly in the rest of my network stack; only the VLAN-tagged VMs have stopped working.

I tried to simplify the problem by switching to VLAN-aware Linux bridges, but that hasn't worked either. Although the problem started with the PVE upgrade, I considered that the NIC might be faulty, so I installed a brand-new Intel NIC. Still nothing.

[Screenshot: pfSense VM config]

Here is what my logs look like when I tag and untag a VM:

Tagging:
Code:
Jan 28 21:28:49 TracheNodeB pvedaemon[531415]: <root@pam> update VM 420: -net0 virtio=26:77:A8:6E:5E:4F,bridge=vmbr0,tag=3
Jan 28 21:28:49 TracheNodeB kernel: vmbr0: port 3(tap420i0) entered disabled state
Jan 28 21:28:49 TracheNodeB ovs-vsctl[3084464]: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl del-port fwln420i0
Jan 28 21:28:49 TracheNodeB ovs-vsctl[3084464]: ovs|00002|db_ctl_base|ERR|no port named fwln420i0
Jan 28 21:28:49 TracheNodeB ovs-vsctl[3084465]: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl del-port tap420i0
Jan 28 21:28:49 TracheNodeB ovs-vsctl[3084465]: ovs|00002|db_ctl_base|ERR|no port named tap420i0
Jan 28 21:28:49 TracheNodeB ovs-vsctl[3084466]: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl del-port tap420i0
Jan 28 21:28:49 TracheNodeB ovs-vsctl[3084466]: ovs|00002|db_ctl_base|ERR|no port named tap420i0
Jan 28 21:28:49 TracheNodeB ovs-vsctl[3084467]: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl del-port fwln420i0
Jan 28 21:28:49 TracheNodeB ovs-vsctl[3084467]: ovs|00002|db_ctl_base|ERR|no port named fwln420i0
Jan 28 21:28:49 TracheNodeB kernel: vmbr0: port 3(tap420i0) entered blocking state
Jan 28 21:28:49 TracheNodeB kernel: vmbr0: port 3(tap420i0) entered disabled state
Jan 28 21:28:49 TracheNodeB kernel: vmbr0: port 3(tap420i0) entered blocking state
Jan 28 21:28:49 TracheNodeB kernel: vmbr0: port 3(tap420i0) entered forwarding state
Jan 28 21:29:42 TracheNodeB pmxcfs[1952]: [dcdb] notice: data verification successful

Untagging:
Code:
Jan 28 21:29:50 TracheNodeB pvedaemon[531417]: <root@pam> update VM 420: -net0 virtio=26:77:A8:6E:5E:4F,bridge=vmbr0
Jan 28 21:29:50 TracheNodeB kernel: vmbr0: port 3(tap420i0) entered disabled state
Jan 28 21:29:50 TracheNodeB ovs-vsctl[3086020]: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl del-port fwln420i0
Jan 28 21:29:50 TracheNodeB ovs-vsctl[3086020]: ovs|00002|db_ctl_base|ERR|no port named fwln420i0
Jan 28 21:29:50 TracheNodeB ovs-vsctl[3086021]: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl del-port tap420i0
Jan 28 21:29:50 TracheNodeB ovs-vsctl[3086021]: ovs|00002|db_ctl_base|ERR|no port named tap420i0
Jan 28 21:29:50 TracheNodeB ovs-vsctl[3086022]: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl del-port tap420i0
Jan 28 21:29:50 TracheNodeB ovs-vsctl[3086022]: ovs|00002|db_ctl_base|ERR|no port named tap420i0
Jan 28 21:29:50 TracheNodeB ovs-vsctl[3086023]: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl del-port fwln420i0
Jan 28 21:29:50 TracheNodeB ovs-vsctl[3086023]: ovs|00002|db_ctl_base|ERR|no port named fwln420i0
Jan 28 21:29:50 TracheNodeB kernel: vmbr0: port 3(tap420i0) entered blocking state
Jan 28 21:29:50 TracheNodeB kernel: vmbr0: port 3(tap420i0) entered disabled state
Jan 28 21:29:50 TracheNodeB kernel: vmbr0: port 3(tap420i0) entered blocking state
Jan 28 21:29:50 TracheNodeB kernel: vmbr0: port 3(tap420i0) entered forwarding state

I'm not sure why OVS errors appear in the logs, since no node in my cluster is using OVS at the moment.
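
To see exactly which ports the failed del-port calls refer to, the log excerpts above can be summarized with a small shell helper. This is only a sketch: the function name ovs_missing_ports is made up for illustration, and it assumes the log lines have the same "|ERR|no port named <port>" shape as the ones pasted above.

```shell
# Hypothetical helper (not part of PVE or OVS): summarize the
# "no port named ..." errors that ovs-vsctl wrote to the log.
# Reads log text on stdin and prints "<count> <port>" per distinct port.
ovs_missing_ports() {
    grep 'ovs-vsctl' \
        | sed -n 's/.*|ERR|no port named \([^ ]*\).*/\1/p' \
        | sort \
        | uniq -c \
        | awk '{ print $1, $2 }'
}
```

For example, "journalctl -b | ovs_missing_ports" would feed it the journal of the current boot. Separately, "ovs-vsctl list-br" lists any OVS bridges that are still defined on a node, which is a quick way to confirm that nothing is actually on OVS.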

Thanks for the advice!
 

jsalas424

The plot thickens!

I have restored VLAN-tagged routing on one host: the one that runs pfSense. I did this by leaving the gateway on the bridge blank. Since pfSense provides a different gateway for each VLAN, this made sense to me.

This is the network config on the PVE node that is both running pfsense AND successfully passing VLAN-tagged traffic.
Code:
root@TracheNodeB:~# cat  /etc/network/interfaces
# network interface settings; autogenerated
# Please do NOT modify this file directly, unless you know what
# you're doing.
#
# If you want to manage parts of the network configuration manually,
# please utilize the 'source' or 'source-directory' directives to do
# so.
# PVE will preserve these directives, but will NOT read its network
# configuration from sourced files, so do not attempt to move any of
# the PVE managed interfaces into external files!

auto lo
iface lo inet loopback

auto eno1
iface eno1 inet static
        address 10.0.0.4/24
#Cluster Port - Integrated NIC

auto enp3s0f0
iface enp3s0f0 inet manual
#WAN Port

auto enp3s0f1
iface enp3s0f1 inet manual
#LAN Port

auto vmbr0
iface vmbr0 inet static
        address 192.168.1.25/24
        bridge-ports enp3s0f1
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
#LAN Bridge - Static

auto vmbr1
iface vmbr1 inet manual
        bridge-ports enp3s0f0
        bridge-stp off
        bridge-fd 0
#WAN Bridge

But neither of the other two nodes in the cluster is working! One node has the gateway declared, and the other has a blank gateway.

Config that doesn't work:

Code:
root@TracheNodeA:~# cat  /etc/network/interfaces
# network interface settings; autogenerated
# Please do NOT modify this file directly, unless you know what
# you're doing.
#
# If you want to manage parts of the network configuration manually,
# please utilize the 'source' or 'source-directory' directives to do
# so.
# PVE will preserve these directives, but will NOT read its network
# configuration from sourced files, so do not attempt to move any of
# the PVE managed interfaces into external files!

auto lo
iface lo inet loopback

auto enp2s0
iface enp2s0 inet static
        address 10.0.0.3/24
#cluster NIC

auto eno1
iface eno1 inet manual
#LAN NIC

auto vmbr0
iface vmbr0 inet static
        address 192.168.1.24/24
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
#LAN Bridge
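
Since the working and non-working files look almost identical, a compact side-by-side view of just the bridge settings can make any difference easier to spot. The following is a minimal sketch, assuming the ifupdown-style layout shown above; the function name bridge_summary is hypothetical and it only looks at the bridge-ports, bridge-vlan-aware and bridge-vids keywords.

```shell
# Hypothetical helper: print one line per bridge stanza found in an
# ifupdown-style interfaces file, in the form
#   <iface> ports=<ports> vlan-aware=<yes|no> vids=<range or ->
# Stanzas without a bridge-ports line are skipped.
bridge_summary() {
    awk '
        function emit() {
            if (name != "" && ports != "")
                printf "%s ports=%s vlan-aware=%s vids=%s\n", name, ports, aware, vids
        }
        $1 == "iface" { emit(); name = $2; ports = ""; aware = "no"; vids = "-" }
        $1 == "bridge-ports" {
            ports = $2
            for (i = 3; i <= NF; i++) ports = ports "," $i
        }
        $1 == "bridge-vlan-aware" { aware = $2 }
        $1 == "bridge-vids"       { vids = $2 }
        END { emit() }
    ' "$1"
}
```

Running "bridge_summary /etc/network/interfaces" on each node and comparing the output shows whether the stanzas really match. The file only describes the intended state, though; "bridge vlan show" (from iproute2) reports the VLAN membership the kernel has actually applied to each bridge port, including the tap interface of a tagged VM.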
 

jsalas424

I also tried another network configuration per the Open vSwitch documentation.

1) Create an OVS bridge with no gateway or IP address
2) Tag the physical interface with VLAN 1 and use the option "vlan_mode=native-untagged"
3) Create an IntPort for the Proxmox GUI with IP and gateway, and tag it VLAN 1
4) Create an IntPort for VLAN 3, and tag it VLAN 3
5) Add the physical interface and both IntPorts to the bridge
6) Reboot

I still can't tag a VM with VLAN 3. This is the interfaces file:
Code:
root@TracheNodeA:~# cat  /etc/network/interfaces
# network interface settings; autogenerated
# Please do NOT modify this file directly, unless you know what
# you're doing.
#
# If you want to manage parts of the network configuration manually,
# please utilize the 'source' or 'source-directory' directives to do
# so.
# PVE will preserve these directives, but will NOT read its network
# configuration from sourced files, so do not attempt to move any of
# the PVE managed interfaces into external files!

auto lo
iface lo inet loopback

auto enp2s0
iface enp2s0 inet static
        address 10.0.0.3/24
#cluster NIC

auto eno1
iface eno1 inet manual
        ovs_type OVSPort
        ovs_bridge vmbr0
        ovs_options tag=1 vlan_mode=native-untagged
#LAN NIC

auto admin
iface admin inet static
        address 192.168.1.24/24
        gateway 192.168.1.1
        ovs_type OVSIntPort
        ovs_bridge vmbr0
        ovs_options tag=1

auto vlan3
iface vlan3 inet static
        address 192.168.1.24/24
        ovs_type OVSIntPort
        ovs_bridge vmbr0
        ovs_options tag=3

auto vmbr0
iface vmbr0 inet manual
        ovs_type OVSBridge
        ovs_ports eno1 admin vlan3
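
One detail that stands out in the config above is that the admin and vlan3 IntPorts are both assigned 192.168.1.24/24. Whether that has anything to do with the tagging failure is not established here, but it is easy to check for mechanically. The sketch below assumes the same ifupdown-style file layout; the function name duplicate_addresses is made up for illustration.

```shell
# Hypothetical helper: report any "address" value that is assigned to
# more than one iface stanza in an ifupdown-style interfaces file.
# Prints "<address> <iface1>,<iface2>,..." for each duplicate.
duplicate_addresses() {
    awk '
        $1 == "iface" { name = $2 }
        $1 == "address" {
            if ($2 in owners) owners[$2] = owners[$2] "," name
            else owners[$2] = name
        }
        END {
            for (a in owners)
                if (index(owners[a], ",") > 0) print a, owners[a]
        }
    ' "$1"
}
```

Run as "duplicate_addresses /etc/network/interfaces"; no output means every address in the file is used by a single interface.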
 
