[SOLVED] So is OpenVSwitch bonding just broken on PVE 6? What's going on?

reckless

New Member
Feb 5, 2019
15
1
3
I had no problem using OVS to bond 2 Mellanox interfaces on PVE 5. Now I'm on 6 and I'm having a lot of issues, and I'm not the only one. Other recent threads are here:

https://forum.proxmox.com/threads/proxmox-6-network-wont-start.56362/
https://forum.proxmox.com/threads/pve-6-and-mellanox-4-x-drivers.56553/


I followed the tutorial but it's just not working for me. I have the latest Proxmox version as of today, with everything updated. I have 2 Mellanox interfaces I want to bond, on the same NIC: the Mellanox ConnectX-3 (MCX312A-XCBT, 2x SFP+ ports). Using ip a I get this:

Code:
4: enp132s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 00:02:c9:3b:61:10 brd ff:ff:ff:ff:ff:ff
5: enp132s0d1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 00:02:c9:3b:61:11 brd ff:ff:ff:ff:ff:ff
These are the two interfaces I want to bond together: enp132s0 + enp132s0d1. This is currently my /etc/network/interfaces (I followed the Proxmox OVS manual on the website):

Code:
allow-vmbr2 bond0
iface bond0 inet manual
    ovs_bonds enp132s0 enp132s0d1
    ovs_type OVSBond
    ovs_bridge vmbr2
    mtu 9000
    ovs_options bond_mode=balance-tcp other_config:lacp-time=fast lacp=active
    pre-up ( ifconfig enp132s0 mtu 9000 && ifconfig enp132s0d1 mtu 9000 )
# Force the MTU of the physical interfaces to be jumbo-frame capable.
# This doesn't mean that any OVSIntPorts must be jumbo-capable.
# We cannot, however set up definitions for eth0 and eth1 directly due
# to what appear to be bugs in the initialization process.

auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

iface enp132s0 inet manual

iface enp132s0d1 inet manual

auto vmbr0
iface vmbr0 inet static
    address  192.168.1.23
    netmask  255.255.255.0
    gateway  192.168.1.1
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0

allow-ovs vmbr2

auto vmbr2
iface vmbr2 inet manual
    ovs_type OVSBridge
    ovs_ports bond0
    mtu 9000
# NOTE: we MUST mention bond0, vlan50, and vlan55 even though each
#       of them lists ovs_bridge vmbr0!  Not sure why it needs this
#       kind of cross-referencing but it won't work without it!
Note that the tutorial uses ifconfig even though it's deprecated. Surely this can't be correct? What should I replace it with?

The above simply doesn't work. When I type dmesg | grep -i enp132s0 I get this output:
Code:
root@proxmox:~# dmesg | grep -i enp132s0
[   12.432376] mlx4_core 0000:84:00.0 enp132s0: renamed from eth0
[   12.473761] mlx4_core 0000:84:00.0 enp132s0d1: renamed from eth0
[   15.459747] mlx4_en: enp132s0: Link Up
[   15.564489] mlx4_en: enp132s0d1: Link Up
[ 1073.864057] mlx4_en: enp132s0: Link Down
[ 1082.575190] mlx4_en: enp132s0d1: Link Down
[ 1095.603476] mlx4_en: enp132s0: Link Up
[ 1099.024206] mlx4_en: enp132s0d1: Link Up
[ 1113.176462] mlx4_en: enp132s0: Link Down
[ 1116.346555] mlx4_en: enp132s0d1: Link Down
[ 1121.184069] mlx4_en: enp132s0: Link Up
[ 1122.601733] mlx4_en: enp132s0d1: Link Up

And from the syslog:

Code:
Sep 14 14:19:48 proxmox systemd[1]: Starting Open vSwitch...
Sep 14 14:19:48 proxmox zed[2215]: eid=2 class=config_sync pool_guid=0x713F0C2B1A688975
Sep 14 14:19:48 proxmox openvswitch-switch[2209]: ovsdb-server is already running.
Sep 14 14:19:48 proxmox openvswitch-switch[2209]: ovs-vswitchd is already running.
Sep 14 14:19:48 proxmox ovs-vsctl[2261]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --no-wait set Open_vSwitch . external-ids:hostname=proxmox.vice
Sep 14 14:19:48 proxmox openvswitch-switch[2209]: Enabling remote OVSDB managers.
Sep 14 14:19:48 proxmox zed[2272]: eid=3 class=pool_import pool_guid=0x713F0C2B1A688975
Sep 14 14:19:48 proxmox ovs-vsctl[2277]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=5 -- --may-exist add-br vmbr2 --
Sep 14 14:19:48 proxmox systemd-udevd[920]: Using default interface naming scheme 'v240'.
Sep 14 14:19:48 proxmox systemd-udevd[920]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Sep 14 14:19:48 proxmox systemd-udevd[920]: Could not generate persistent MAC address for ovs-system: No such file or directory
Sep 14 14:19:48 proxmox kernel: device ovs-system entered promiscuous mode
Sep 14 14:19:48 proxmox kernel: netlink: 'ovs-vswitchd': attribute type 5 has an invalid length.
Sep 14 14:19:48 proxmox zed[2320]: eid=4 class=history_event pool_guid=0x713F0C2B1A688975
Sep 14 14:19:48 proxmox zed[2343]: eid=5 class=config_sync pool_guid=0x713F0C2B1A688975
Sep 14 14:19:48 proxmox systemd-udevd[920]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Sep 14 14:19:48 proxmox systemd-udevd[920]: Could not generate persistent MAC address for vmbr2: No such file or directory
Sep 14 14:19:48 proxmox kernel: netlink: 'ovs-vswitchd': attribute type 5 has an invalid length.
Sep 14 14:19:48 proxmox kernel: device vmbr2 entered promiscuous mode
Sep 14 14:19:48 proxmox openvswitch-switch[2209]: /bin/sh: 1: ifconfig: not found
Sep 14 14:19:48 proxmox openvswitch-switch[2209]: ifup: failed to bring up bond0
Sep 14 14:19:48 proxmox systemd[1]: Started Open vSwitch.
Sep 14 14:19:48 proxmox systemd[1]: Starting Raise network interfaces...
Sep 14 14:19:48 proxmox systemd-udevd[920]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Sep 14 14:19:48 proxmox systemd-udevd[920]: Could not generate persistent MAC address for vmbr0: No such file or directory

How do I get this to work? And why is ifconfig still used when it's deprecated? Shouldn't the tutorial be updated?
 

reckless

Hi spirit, so like this?

Code:
allow-vmbr2 bond0
iface bond0 inet manual
    ovs_bonds enp132s0 enp132s0d1
    ovs_type OVSBond
    ovs_bridge vmbr2
    mtu 9000
    ovs_options bond_mode=balance-tcp other_config:lacp-time=fast lacp=active
    pre-up ( ifconfig enp132s0 mtu 9000 && ifconfig enp132s0d1 mtu 9000 )
# Force the MTU of the physical interfaces to be jumbo-frame capable.
# This doesn't mean that any OVSIntPorts must be jumbo-capable.
# We cannot, however set up definitions for eth0 and eth1 directly due
# to what appear to be bugs in the initialization process.

auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

iface enp132s0 inet manual

iface enp132s0d1 inet manual

auto vmbr0
iface vmbr0 inet static
    address  192.168.1.23
    netmask  255.255.255.0
    gateway  192.168.1.1
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0

allow-ovs vmbr2

iface vmbr2 inet manual
    ovs_type OVSBridge
    ovs_ports bond0
    mtu 9000
# NOTE: we MUST mention bond0, vlan50, and vlan55 even though each
#       of them lists ovs_bridge vmbr0!  Not sure why it needs this
#       kind of cross-referencing but it won't work without it!
 

reckless

@spirit I tried the above and it still doesn't work. This is the output of dmesg | grep -i mlx
Code:
root@machine:~# dmesg | grep -i mlx
[    4.953091] mlx4_core: Mellanox ConnectX core driver v4.0-0
[    4.953109] mlx4_core: Initializing 0000:84:00.0
[   12.173721] mlx4_core 0000:84:00.0: DMFS high rate steer mode is: disabled performance optimized steering
[   12.174069] mlx4_core 0000:84:00.0: 63.008 Gb/s available PCIe bandwidth (8 GT/s x8 link)
[   12.422921] mlx4_en: Mellanox ConnectX HCA Ethernet driver v4.0-0
[   12.423137] mlx4_en 0000:84:00.0: Activating port:1
[   12.426692] mlx4_en: 0000:84:00.0: Port 1: Using 32 TX rings
[   12.426693] mlx4_en: 0000:84:00.0: Port 1: Using 16 RX rings
[   12.426990] mlx4_en: 0000:84:00.0: Port 1: Initializing port
[   12.427441] mlx4_en 0000:84:00.0: registered PHC clock
[   12.427907] mlx4_en 0000:84:00.0: Activating port:2
[   12.428508] mlx4_core 0000:84:00.0 enp132s0: renamed from eth0
[   12.429085] mlx4_en: 0000:84:00.0: Port 2: Using 32 TX rings
[   12.429086] mlx4_en: 0000:84:00.0: Port 2: Using 16 RX rings
[   12.429307] mlx4_en: 0000:84:00.0: Port 2: Initializing port
[   12.454384] <mlx4_ib> mlx4_ib_add: mlx4_ib: Mellanox ConnectX InfiniBand driver v4.0-0
[   12.456181] mlx4_core 0000:84:00.0 enp132s0d1: renamed from eth0
[   12.457955] <mlx4_ib> mlx4_ib_add: counter index 2 for port 1 allocated 1
[   12.457956] <mlx4_ib> mlx4_ib_add: counter index 3 for port 2 allocated 1
[   15.412408] mlx4_en: enp132s0: Link Up
[   15.567100] mlx4_en: enp132s0d1: Link Up
That looks fine to me, but now from the syslog I see these entries:

Code:
Sep 15 13:26:42 proxmox ovs-ctl[2865]: Starting ovsdb-server.
Sep 15 13:26:42 proxmox ovs-vsctl[2920]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --no-wait -- init -- set Open_vSwitch . db-version=7.16.1
Sep 15 13:26:42 proxmox ovs-vsctl[2926]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --no-wait set Open_vSwitch . ovs-version=2.10.1 "external-ids:system-id=\"28943f0e-ee58-4523-9f0a-9077beb02629\"" "external-ids:rundir=\"/var/run/openvswitch\"" "system-type=\"debian\"" "system-version=\"10\""
Sep 15 13:26:42 proxmox ovs-ctl[2865]: Configuring Open vSwitch system IDs.
Sep 15 13:26:42 proxmox ovs-ctl[2865]: Inserting openvswitch module.
Sep 15 13:26:42 proxmox kernel: openvswitch: Open vSwitch switching datapath
Sep 15 13:26:42 proxmox ovs-ctl[2865]: Starting ovs-vswitchd.
Sep 15 13:26:42 proxmox ovs-vsctl[2944]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --no-wait set Open_vSwitch . external-ids:hostname=proxmox.machine
Sep 15 13:26:42 proxmox ovs-ctl[2865]: Enabling remote OVSDB managers.
Sep 15 13:26:42 proxmox systemd[1]: Started Open vSwitch Internal Unit.
Sep 15 13:26:42 proxmox systemd[1]: Reached target Network (Pre).
Sep 15 13:26:42 proxmox systemd[1]: Starting Open vSwitch...
Sep 15 13:26:42 proxmox systemd[1]: Started Proxmox VE Login Banner.
Sep 15 13:26:42 proxmox openvswitch-switch[2948]: ovsdb-server is already running.
Sep 15 13:26:42 proxmox openvswitch-switch[2948]: ovs-vswitchd is already running.
Sep 15 13:26:42 proxmox ovs-vsctl[2999]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --no-wait set Open_vSwitch . external-ids:hostname=proxmox.machine
Sep 15 13:26:42 proxmox openvswitch-switch[2948]: Enabling remote OVSDB managers.
Sep 15 13:26:42 proxmox ovs-vsctl[3015]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=5 -- --may-exist add-br vmbr2 --
Sep 15 13:26:42 proxmox systemd-udevd[889]: Using default interface naming scheme 'v240'.
Sep 15 13:26:42 proxmox systemd-udevd[889]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Sep 15 13:26:42 proxmox systemd-udevd[889]: Could not generate persistent MAC address for ovs-system: No such file or directory
Sep 15 13:26:42 proxmox kernel: device ovs-system entered promiscuous mode
Sep 15 13:26:42 proxmox kernel: netlink: 'ovs-vswitchd': attribute type 5 has an invalid length.
Sep 15 13:26:42 proxmox systemd-udevd[889]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Sep 15 13:26:42 proxmox systemd-udevd[889]: Could not generate persistent MAC address for vmbr2: No such file or directory
Sep 15 13:26:42 proxmox kernel: netlink: 'ovs-vswitchd': attribute type 5 has an invalid length.
Sep 15 13:26:42 proxmox kernel: device vmbr2 entered promiscuous mode
Sep 15 13:26:43 proxmox openvswitch-switch[2948]: /bin/sh: 1: ifconfig: not found
Sep 15 13:26:43 proxmox openvswitch-switch[2948]: ifup: failed to bring up bond0
Sep 15 13:26:43 proxmox systemd[1]: Started Open vSwitch.
Sep 15 13:26:43 proxmox systemd[1]: Starting Raise network interfaces...
Sep 15 13:26:43 proxmox systemd-udevd[889]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Sep 15 13:26:43 proxmox systemd-udevd[889]: Could not generate persistent MAC address for vmbr0: No such file or directory
Sep 15 13:26:43 proxmox kernel: vmbr0: port 1(eno1) entered blocking state
Sep 15 13:26:43 proxmox kernel: vmbr0: port 1(eno1) entered disabled state
Sep 15 13:26:43 proxmox kernel: device eno1 entered promiscuous mode
Sep 15 13:26:43 proxmox ifup[3179]: Waiting for vmbr0 to get ready (MAXWAIT is 2 seconds).
Sep 15 13:26:43 proxmox systemd-udevd[889]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Sep 15 13:26:43 proxmox systemd-udevd[889]: Could not generate persistent MAC address for vmbr1: No such file or directory
Sep 15 13:26:43 proxmox kernel: vmbr1: port 1(eno2) entered blocking state
Sep 15 13:26:43 proxmox kernel: vmbr1: port 1(eno2) entered disabled state
Sep 15 13:26:43 proxmox kernel: device eno2 entered promiscuous mode
Sep 15 13:26:43 proxmox ifup[3179]: Waiting for vmbr1 to get ready (MAXWAIT is 2 seconds).
Sep 15 13:26:43 proxmox systemd[1]: Started Raise network interfaces.
Sep 15 13:26:43 proxmox systemd[1]: Reached target Network.
Sep 15 13:26:43 proxmox systemd[1]: Condition check resulted in fast remote file copy program daemon being skipped.
Sep 15 13:26:43 proxmox systemd[1]: Started LXC Container Monitoring Daemon.
Sep 15 13:26:43 proxmox systemd[1]: Starting OpenBSD Secure Shell server...
Sep 15 13:26:43 proxmox systemd[1]: Reached target Network is Online.
Sep 15 13:26:43 proxmox systemd[1]: Starting LXC network bridge setup...
Note this one in particular:

Code:
Sep 15 13:26:42 proxmox kernel: device vmbr2 entered promiscuous mode
Sep 15 13:26:43 proxmox openvswitch-switch[2948]: /bin/sh: 1: ifconfig: not found
Sep 15 13:26:43 proxmox openvswitch-switch[2948]: ifup: failed to bring up bond0
The ifconfig command is no longer installed by default on modern Debian-based systems, including Proxmox, which is why the pre-up line fails. So why is it still used in the Proxmox tutorial?

How do I replace this line (straight from the official Proxmox OVS documentation pages) with one that works:

ifconfig enp132s0 mtu 9000 && ifconfig enp132s0d1 mtu 9000

Any ideas? Perhaps the documentation needs an update...




 

reckless

Making progress now! I replaced this: ifconfig enp132s0 mtu 9000 && ifconfig enp132s0d1 mtu 9000
With this: ip link set mtu 9000 dev enp132s0 && ip link set mtu 9000 dev enp132s0d1

And that seems to have done the trick, together with spirit's edit of removing the auto vmbr2 line. This is how it looks currently:

Code:
4: enp132s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq master ovs-system state UP group default qlen 1000
    link/ether 00:02:c9:3b:61:10 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::202:c9ff:fe3b:6110/64 scope link
       valid_lft forever preferred_lft forever
5: enp132s0d1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq master ovs-system state UP group default qlen 1000
    link/ether 00:02:c9:3b:61:11 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::202:c9ff:fe3b:6111/64 scope link
       valid_lft forever preferred_lft forever
6: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether e6:5a:2e:18:cd:ed brd ff:ff:ff:ff:ff:ff
7: vmbr2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 00:02:c9:3b:61:10 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::202:c9ff:fe3b:6110/64 scope link
       valid_lft forever preferred_lft forever
8: bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 02:08:50:1b:99:95 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::8:50ff:fe1b:9995/64 scope link
       valid_lft forever preferred_lft forever
I'm not sure why vmbr2 and bond0 both show their state as UNKNOWN, and I also don't know why ovs-system has its MTU set to 1500. Does anybody know?

It does work currently, but I need to try and see if it's stable in the upcoming week.
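
For anyone wanting to double-check a setup like this, here is a rough sketch of how the result could be verified. As far as I know, a state of UNKNOWN is common for virtual devices whose drivers don't report an operational state, so it may not indicate a fault by itself; asking Open vSwitch directly is probably more telling. The check_mtu helper is made up for illustration and is not part of any package; the interface names and the MTU of 9000 are taken from the config above.

```shell
#!/bin/sh
# Hypothetical helper: compare an interface's MTU (read from sysfs)
# with the expected value.
# $1 = interface name, $2 = expected MTU,
# $3 = sysfs root (optional, defaults to /sys/class/net)
check_mtu() {
    iface=$1
    expected=$2
    root=${3:-/sys/class/net}
    if [ ! -r "$root/$iface/mtu" ]; then
        echo "$iface: not present"
        return 1
    fi
    actual=$(cat "$root/$iface/mtu")
    if [ "$actual" = "$expected" ]; then
        echo "$iface: mtu $actual ok"
    else
        echo "$iface: mtu $actual, expected $expected"
        return 1
    fi
}

# Interface names from this thread; adjust for other hardware.
for dev in enp132s0 enp132s0d1 bond0 vmbr2; do
    check_mtu "$dev" 9000 || true
done

# Ask Open vSwitch itself about bond members and LACP negotiation
# (skipped when the OVS tools are not installed).
if command -v ovs-appctl >/dev/null 2>&1; then
    ovs-appctl bond/show bond0
    ovs-appctl lacp/show bond0
fi
```

If bond/show lists both members as enabled and lacp/show reports the negotiation as completed, the bond is most likely fine regardless of what ip a prints for the state.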
 

spirit

Well-Known Member
Apr 2, 2010
3,507
155
63
www.odiso.com
Can you look inside

/etc/network/if-pre-up.d/openvswitch
and
/etc/network/if-post-down.d/openvswitch

and check whether they contain any reference to ifconfig?
Normally they should not (on a clean install of the openvswitch package 2.10.0+2018.08.28+git.8ca7c82b7d+ds1-12 on Proxmox 6).
If they do reference ifconfig, something may have gone wrong during the upgrade.
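
A quick way to run that check might look like the sketch below. The find_ifconfig_refs function is just an illustrative name; the two hook script paths are the ones mentioned above, and /etc/network/interfaces is added on the assumption that a pre-up line there could be the culprit as well.

```shell
#!/bin/sh
# Hypothetical helper: list every line that mentions ifconfig in the
# given files. Prints "file:line:text" per match, or a short note when
# nothing matches. Missing or unreadable files are silently skipped.
find_ifconfig_refs() {
    out=$(grep -Hn 'ifconfig' "$@" 2>/dev/null)
    if [ -n "$out" ]; then
        printf '%s\n' "$out"
    else
        echo "no ifconfig references found"
    fi
}

find_ifconfig_refs \
    /etc/network/if-pre-up.d/openvswitch \
    /etc/network/if-post-down.d/openvswitch \
    /etc/network/interfaces
```

Note that this also reports matches inside comments, so each hit still needs a look by hand.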
 

reckless

I think I have it working now; it's been stable for the last few days. The key was to replace the ifconfig command with the ip-based command I listed above. Additionally, I removed the "auto ..." line for the Open vSwitch bridge (remove "auto vmbr2" and keep "allow-ovs vmbr2"), just like spirit suggested.

Proxmox docs should be updated to reflect this.
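
For anyone hitting the same thing, the two changes can be sketched as a small sed filter. This is only an illustration under the assumptions of this thread: migrate_interfaces is a made-up name, the pre-up line is assumed to use the exact "ifconfig IFACE mtu N" form shown earlier, and vmbr2 is the OVS bridge. It prints the result instead of editing the file, so nothing is changed until you copy it over yourself.

```shell
#!/bin/sh
# Hypothetical helper: print a copy of an interfaces file with the two
# changes from this thread applied:
#   1. "ifconfig IFACE mtu N" becomes "ip link set mtu N dev IFACE"
#   2. the "auto BRIDGE" line for the OVS bridge is dropped
#      (the "allow-ovs BRIDGE" line is left alone)
# $1 = path to the interfaces file, $2 = OVS bridge name (e.g. vmbr2)
migrate_interfaces() {
    sed -e 's/ifconfig \([^ ][^ ]*\) mtu \([0-9][0-9]*\)/ip link set mtu \2 dev \1/g' \
        -e "/^auto[[:space:]][[:space:]]*${2}[[:space:]]*\$/d" \
        "$1"
}

# Show what would change without touching the real file.
if [ -r /etc/network/interfaces ]; then
    migrate_interfaces /etc/network/interfaces vmbr2 \
        | diff -u /etc/network/interfaces - || true
fi
```

Lines such as "auto vmbr0" for the regular Linux bridge are kept, since only the bridge name passed as the second argument is matched.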
 

