ifupdown2 + OVS trouble with Fake Bridges

Jun 30, 2020
Our PVE-Cluster has been running on legacy ifupdown with a complex OVS-Setup that we want to continue using with ifupdown2.

Therefore, we have set up test hardware mimicking the NIC setup of our production Cluster with a fresh PVE 7.2 installation.

Our setup uses a parent OVS bridge with RSTP configured and attaches "fake bridges" to it (as described in the ovs-vsctl manpage) in order to hard-code different VLAN tags in the vmbr<vlan> bridges presented to VMs.

This has worked well with legacy ifupdown - however, using ifupdown2 we keep receiving the following error - and the vmbr "fake bridges" are not set up at all:

error: cmd '/usr/bin/ovs-vsctl -- --may-exist add-br vmbr1005 -- set bridge vmbr1005 br_mgmt 0' failed: returned 1 (ovs-vsctl: Bridge does not contain a column whose name matches "br_mgmt"
)
error: cmd '/usr/bin/ovs-vsctl -- --may-exist add-br vmbr4 -- set bridge vmbr4 br_mgmt 4' failed: returned 1 (ovs-vsctl: Bridge does not contain a column whose name matches "br_mgmt"
)
error: cmd '/usr/bin/ovs-vsctl -- --may-exist add-br vmbr6 -- set bridge vmbr6 br_mgmt 6' failed: returned 1 (ovs-vsctl: Bridge does not contain a column whose name matches "br_mgmt"
)

The same error appears in the boot log and is also displayed when running ifreload -a.

This is the /etc/network/interfaces we are currently using for testing. It is shortened to show just the error using only a few bridges for clarity. The original config contains more bridges and interfaces overall:

Code:
## Loopback Device
auto lo
iface lo inet loopback

## Fallback Single-Port NIC ##
auto enp65s0
iface enp65s0 inet static
        address 192.168.0.89/20

## Physical Interfaces ##

auto enp2s0f0
allow-br_mgmt enp2s0f0
iface enp2s0f0 inet manual
  ovs_bridge   br_mgmt
  ovs_type     OVSPort
  ovs_options  other_config:rstp-enable=true other_config:rstp-port-admin-edge=false other_config:rstp-port-auto-edge=false other_config:rstp-port-mcheck=true

auto enp2s0f1
allow-br_mgmt enp2s0f1
iface enp2s0f1 inet manual
  ovs_bridge   br_mgmt
  ovs_type     OVSPort
  ovs_options  other_config:rstp-enable=true other_config:rstp-port-admin-edge=false other_config:rstp-port-auto-edge=false other_config:rstp-port-mcheck=true

## Open vSwitch Bridges ##

# Proxmox Mgmt Network + VM-VLANs

auto br_mgmt
allow-ovs br_mgmt
iface br_mgmt inet manual
  ovs_type  OVSBridge
  ovs_ports if_mgmt enp2s0f0 enp2s0f1
  up        ovs-vsctl set Bridge ${IFACE} rstp_enable=true other_config:rstp-priority=61440
  # wait for spanning-tree convergence
  post-up   sleep 10

auto if_mgmt
allow-br_mgmt if_mgmt
iface if_mgmt inet static
  ovs_type   OVSIntPort
  ovs_bridge br_mgmt
  ovs_extra  set interface ${IFACE} external-ids:iface-id=$(hostname -s)-${IFACE}-vif
  address    10.5.5.106
  netmask    255.255.255.0
  gateway    10.5.5.1

## Additional internal Interfaces ##

auto if_backup
allow-br_mgmt if_backup
iface if_backup inet static
  ovs_type     OVSIntPort
  ovs_bridge   br_mgmt
  ovs_options  tag=1070
  ovs_extra    set interface ${IFACE} external-ids:iface-id=$(hostname -s)-${IFACE}-vif
  address      10.5.70.106
  netmask      255.255.255.0

## Fake Bridges for VM-VLANs ##

auto vmbr1005
allow-br_mgmt vmbr1005
iface vmbr1005 inet manual
  ovs_type    OVSBridge
  ovs_options br_mgmt 0

auto vmbr4
allow-br_mgmt vmbr4
iface vmbr4 inet manual
  ovs_type    OVSBridge
  ovs_options br_mgmt 4

auto vmbr6
allow-br_mgmt vmbr6
iface vmbr6 inet manual
  ovs_type    OVSBridge
  ovs_options br_mgmt 6

We have spent several hours Googling and scanning documentation, but have not found the reason for this problem. Strangely, ifreload -s -a complains about a wrong ovs_type, although we can see nothing wrong there - and everything seems to work except the fake bridges.

root@pvetest01:~# ifreload -s -a
warning: enp2s0f0: ovs-type: invalid value "OVSPort": valid attribute values: ['OVSBridge']
warning: enp2s0f1: ovs-type: invalid value "OVSPort": valid attribute values: ['OVSBridge']
warning: br_mgmt: ovs-type: invalid value "OVSBridge": valid attribute values: ['OVSPort', 'OVSIntPort', 'OVSBond', 'OVSTunnel', 'OVSPatchPort']
warning: if_mgmt: ovs-type: invalid value "OVSIntPort": valid attribute values: ['OVSBridge']
warning: if_backup: ovs-type: invalid value "OVSIntPort": valid attribute values: ['OVSBridge']
warning: vmbr1005: ovs-type: invalid value "OVSBridge": valid attribute values: ['OVSPort', 'OVSIntPort', 'OVSBond', 'OVSTunnel', 'OVSPatchPort']
warning: vmbr4: ovs-type: invalid value "OVSBridge": valid attribute values: ['OVSPort', 'OVSIntPort', 'OVSBond', 'OVSTunnel', 'OVSPatchPort']
warning: vmbr6: ovs-type: invalid value "OVSBridge": valid attribute values: ['OVSPort', 'OVSIntPort', 'OVSBond', 'OVSTunnel', 'OVSPatchPort']

Can anyone help with this problem?
 
ok, found in documentation:

"ovs-vsctl add-br <fake bridge> <parent bridge> <VLAN>"

This is not going to work with ovs_options; support for it needs to be added to ifupdown2. (I don't think it's possible in ifupdown1.)

Have you tried a simple "post-up ovs-vsctl add-br <fake bridge> <parent bridge> <VLAN>" in br_mgmt?
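
For comparison with the failing command in the error log, here is a minimal sketch of the fake-bridge form of add-br quoted from the documentation. The helper name fake_bridge_cmd is hypothetical; it only prints the command and does not run it. --may-exist is the same ovs-vsctl option already visible in the error output.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not execute) the ovs-vsctl call
# that creates a fake bridge on top of a parent bridge for one VLAN.
#   $1 = fake bridge name, $2 = parent bridge, $3 = VLAN ID
fake_bridge_cmd() {
    printf 'ovs-vsctl --may-exist add-br %s %s %s\n' "$1" "$2" "$3"
}

# Bridge names taken from the configuration posted above.
fake_bridge_cmd vmbr4 br_mgmt 4
```

The difference from the logged command is that parent and VLAN are positional arguments of add-br itself, rather than arguments to a following "set bridge".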
 
Actually, in ifupdown1 this is exactly how it works. The interfaces configuration on our production PVE-Cluster still running ifupdown1 is set up with the same ovs_options.

Initially, we wanted to migrate this to ifupdown2 without too many changes at one time.

Thanks for the idea with post-up - we're going to try this asap and report back.
 
Another way could be to use the new SDN feature of Proxmox for your VM VLAN networks.
https://pve.proxmox.com/pve-docs/chapter-pvesdn.html

(It creates one Linux bridge per VLAN, plugged into a tagged port on your OVS switch.)
We have thought about the SDN-feature as well. For two reasons we're not pursuing it at this time, though:
  1. We don't want to change too many things in one step - tends to be more error-prone.
  2. SDN is still marked experimental. We're certainly going to have a look at it soonish on our test server - but for production we're still waiting.
 
OK, we tried using post-up on the parent bridge br_mgmt as you suggested:

Code:
auto br_mgmt
iface br_mgmt inet manual
  ovs_type  OVSBridge
  ovs_ports if_mgmt enp2s0f0 enp2s0f1
  up        ovs-vsctl set Bridge ${IFACE} rstp_enable=true other_config:rstp-priority=61440
  # wait for spanning-tree convergence
  post-up   sleep 10; \
            ovs-vsctl add-br vmbr1005 br_mgmt 0; \
            ovs-vsctl add-br vmbr4    br_mgmt 4; \
            ovs-vsctl add-br vmbr6    br_mgmt 6

This does set up the vmbr<VLAN> "Fake Bridges" as the output of ovs-vsctl show confirms.
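
Besides reading ovs-vsctl show, the parent and VLAN of a fake bridge can be queried directly with the br-to-parent and br-to-vlan subcommands of ovs-vsctl. The following is a minimal sketch only; the function name check_fake_bridge and the OVS_VSCTL override variable are assumptions made for illustration.

```shell
#!/bin/sh
# Assumption: OVS_VSCTL may be overridden (e.g. for dry runs);
# by default the real ovs-vsctl binary is used.
OVS_VSCTL="${OVS_VSCTL:-ovs-vsctl}"

# check_fake_bridge NAME PARENT VLAN
# Returns 0 if NAME reports PARENT as its parent bridge and VLAN as its
# VLAN ID, and 1 otherwise (including when a query fails).
check_fake_bridge() {
    parent="$($OVS_VSCTL br-to-parent "$1")" || return 1
    vlan="$($OVS_VSCTL br-to-vlan "$1")" || return 1
    [ "$parent" = "$2" ] && [ "$vlan" = "$3" ]
}

# Example call on a host with Open vSwitch running:
#   check_fake_bridge vmbr4 br_mgmt 4 && echo "vmbr4 is a fake bridge on br_mgmt, VLAN 4"
```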

However, PVE does not recognize the vmbr<VLAN> Interfaces. And in VM Setup there are just no interfaces available at all.

Seemingly, PVE only relies on parsing /etc/network/interfaces to detect which interfaces are available in the system. It does not understand the commands used in post-up and relies on the (now missing) iface-lines.

-----

So we tried combining both:
  • Using post-up commands to set up the fake bridges AND
  • adding iface-entries for them at the same time.
To avoid errors, the ovs_options were left out in the iface-entries for vmbrs.

Code:
auto br_mgmt
iface br_mgmt inet manual
  ovs_type  OVSBridge
  ovs_ports if_mgmt enp2s0f0 enp2s0f1
  up        ovs-vsctl set Bridge ${IFACE} rstp_enable=true other_config:rstp-priority=61440
  # wait for spanning-tree convergence
  post-up   sleep 10; \
            ovs-vsctl add-br vmbr1005 br_mgmt 0; \
            ovs-vsctl add-br vmbr4    br_mgmt 4; \
            ovs-vsctl add-br vmbr6    br_mgmt 6
...
auto vmbr1005
iface vmbr1005 inet manual
  ovs_type    OVSBridge

auto vmbr4
iface vmbr4 inet manual
  ovs_type    OVSBridge

auto vmbr6
iface vmbr6 inet manual
  ovs_type    OVSBridge

This seems to work at first glance.

At least all the vmbr<VLAN> fake bridges are set up AND PVE is actually showing them in its interfaces list.

We will run functional tests tomorrow with a test VM setup (too late for today).

I do wonder, though, if there is any risk of a race condition or other problems due to undefined execution orders/states by specifying the same vmbrs twice in /etc/network/interfaces.

And I still wish there were a better way to do this than using post-up. With this approach, we cannot use the PVE WebUI to change the vmbrs and click "Apply Configuration".
 
Another way, used in the SDN, is to create an OVSIntPort with a VLAN tag and plug it into a vmbr:

Code:
auto br_mgmt
iface br_mgmt inet manual
  ovs_type  OVSBridge
  ovs_ports if_mgmt enp2s0f0 enp2s0f1


auto ln_vlan4
iface ln_vlan4
    ovs_type OVSIntPort
    ovs_bridge br_mgmt
    ovs_options tag=4

auto vmbr4
iface vmbr4
    bridge_ports ln_vlan4
    bridge_stp off
    bridge_fd 0

In this example, vmbr4 is a Linux bridge, but I think it should work with an OVS bridge too.
And you should be able to do it with the GUI without problems.
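
Since the original config contains more VLANs than the three shown, the stanza pair from this example could be generated per VLAN. This is only a sketch: the function name vlan_bridge_stanzas is hypothetical, the bridge port is assumed to reference the internal port by its own name (ln_vlan<N>), and the result is untested against PVE.

```shell
#!/bin/sh
# Hypothetical generator for the tagged-port + Linux-bridge pattern above.
#   $1 = parent OVS bridge, $2 = VLAN ID
# Prints one OVSIntPort stanza and one Linux bridge stanza to stdout.
vlan_bridge_stanzas() {
    cat <<EOF
auto ln_vlan$2
iface ln_vlan$2
    ovs_type OVSIntPort
    ovs_bridge $1
    ovs_options tag=$2

auto vmbr$2
iface vmbr$2
    bridge_ports ln_vlan$2
    bridge_stp off
    bridge_fd 0
EOF
}

# Print stanzas for the VLANs used in this thread's example config.
for vlan in 4 6; do
    vlan_bridge_stanzas br_mgmt "$vlan"
    echo
done
```

The output is meant to be reviewed by hand before anything is pasted into /etc/network/interfaces.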
 
what is

"
ovs_options br_mgmt 4
ovs_options br_mgmt 6
ovs_options br_mgmt 0
"

on your fake bridges?

(Also note that with ifupdown2, you don't need "allow-X" in your config, only "auto ...".)

We revisited this question and had a deeper look into the source code of ifupdown vs source code of ifupdown2.

We found that ifupdown2 handles ovs_options significantly differently. Comparing this with the description of the options in the help text, we think this is a bug in ifupdown2.

Fixing this deviation is now our preferred approach, as it eliminates the need for workarounds. An issue has been opened on GitHub accordingly:
https://github.com/CumulusNetworks/ifupdown2/issues/245
 
Interesting approach. I'm surprised that SDN is supposed to do this, despite the Proxmox Wiki explicitly warning about mixing Linux Bridges and OVS Bridges.

(https://pve.proxmox.com/wiki/Open_vSwitch)
You shouldn't plug bonds or physical VLAN-tagged interfaces into OVS, because OVS already manages this. But it's really not a problem to plug a Linux bridge into an OVS switch. (When you enable the Proxmox firewall on a NIC, for example, a new fwbr bridge is created on top of OVS.)
 
I'm the author of the ifupdown2 OVS plugin, so yes, I know where the problem is. I'll try to look at it next week.
 
Has this been fixed in newer versions of ifupdown2? I see the issue was merged in 2022, but I am using Debian 12 and seeing this issue as well. Has it actually been fixed, or will it be in Debian 13?

EDIT: Nvm, I had allow-* configs still present everywhere.
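
For anyone hitting the same thing: leftover allow-<class> lines can be listed with a small grep wrapper. This is a sketch under assumptions; the function name list_allow_lines is hypothetical, and allow-hotplug and allow-auto are deliberately skipped because they are not the OVS-specific allow-<bridge> lines discussed in this thread.

```shell
#!/bin/sh
# list_allow_lines FILE
# Prints line-numbered "allow-<class>" lines from FILE, skipping
# allow-hotplug and allow-auto. Exit status is non-zero if none are found.
list_allow_lines() {
    grep -nE '^[[:space:]]*allow-' "$1" | grep -vE 'allow-(hotplug|auto)[[:space:]]'
}

# Example call:
#   list_allow_lines /etc/network/interfaces
```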
 