Help configuring open-vswitch on Proxmox 6.1-3?

surfrock66

Member
Feb 10, 2020
We are trying to get Open vSwitch working on 10 hosts running Proxmox 6.1-3. We think we have a configuration that should work, but it doesn't: we can't ping the gateway. Each host has 4 NICs; we want to bond them with LACP, then establish a bridge on which we can have virtual NICs on different VLANs. We don't have much experience with Proxmox (I've used single hosts at home; this is an experiment in our lab) and we have no experience with Open vSwitch.

We're using this page, and Example 2 is the template for what we're doing: https://pve.proxmox.com/wiki/Open_vSwitch

For our test, even though we're going for LACP, we have 3 of the 4 ports disabled to simplify troubleshooting. With the configuration below, we cannot ping the gateway. I believe we have a problem with the OVS config, specifically that the interface for port vmbr0 is listed as vmbr0 itself and not bond0, but we don't see how to correct that. Neither systemctl status networking nor systemctl status ovs-* shows any errors. Below is the configuration for /etc/network/interfaces (please excuse any typos; I re-typed this from a photo of the console):

Code:
# Loopback interface
auto lo
iface lo inet loopback

allow-vmbr0 bond0
iface bond0 inet manual
    ovs_bridge vmbr0
    ovs_type OVSBond
    ovs_bonds eno1 eno2 eno3 eno4
    ovs_options bond-mode=balance-tcp lacp=active other_config:lacp-time=fast
    ovs_mtu 9000

allow-ovs vmbr0
iface vmbr0 inet manual
    ovs_type OVSBridge
    ovs_ports bond0 vlan10
    ovs_mtu 9000

allow-vmbr vlan10
iface vlan10 inet static
    ovs_type OVSIntPort
    ovs_bridge vmbr0
    ovs_options tag=10
    ovs_extra set interface ${IFACE} external-ids:iface-id=$(hostname -s)-${IFACE}-vif
    address 10.1.10.75
    netmask 255.255.255.0
    gateway 10.1.10.253
    ovs_mtu 9000

With this configuration, "ip a" shows the following:

Code:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
        valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
        valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP group default qlen 1000
    link/ether 18:03:73:f5:7e:a5 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::1a03:73ff:fef5:7ea5/64 scope link
        valid_lft forever preferred_lft forever
3: eno2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq master ovs-system state DOWN group default qlen 1000
    link/ether 18:03:73:f5:7e:a7 brd ff:ff:ff:ff:ff:ff
4: eno3: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq master ovs-system state DOWN group default qlen 1000
    link/ether 18:03:73:f5:7e:a9 brd ff:ff:ff:ff:ff:ff
5: eno4: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq master ovs-system state DOWN group default qlen 1000
    link/ether 18:03:73:f5:7e:ab brd ff:ff:ff:ff:ff:ff
6: enp4s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 90:e2:ba:2b:06:24 brd ff:ff:ff:ff:ff:ff
7: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 66:9a:fb:4a:4f:b0 brd ff:ff:ff:ff:ff:ff
8: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 18:03:73:f5:7e:a5 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::1a03:73ff:fef5:7ea5/64 scope link
        valid_lft forever preferred_lft forever
9: vlan10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 76:1d:a2:b8:63:09 brd ff:ff:ff:ff:ff:ff
    inet 10.1.10.75/24 scope global vlan10
        valid_lft forever preferred_lft forever
    inet6 fe80::741d:a2ff:feb8:6309/64 scope link
        valid_lft forever preferred_lft forever
10: bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 22:05:b7:42:a3:18 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::2005:b7ff:fe42:a318/64 scope link
        valid_lft forever preferred_lft forever

If I run "ovs-vsctl show" we see this:

Code:
f1733867-7e12-4d9a-bcf2-364bf38cd161
    Bridge "vmbr0"
        Port "bond0"
            Interface "eno2"
            Interface "eno3"
            Interface "eno4"
            Interface "eno1"
        Port "vmbr0"
            Interface "vmbr0"
                type: internal
        Port "vlan10"
            tag: 10
            Interface "vlan10"
                type: internal
    ovs_version: "2.10.1"
 
Hmmm... I'm not so sure about that `allow-vmbr vlan10` line. The wiki examples use the bridge name there, so I'd expect `allow-vmbr0 vlan10`.
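
One way to spot that kind of mismatch is to compare each `allow-<name>` line against the `ovs_bridge` of the same stanza. This is only a sketch: `check_allow_lines` is a made-up helper name, and it assumes the wiki's convention that a port is declared with `allow-<bridge name>` while the bridge itself uses `allow-ovs`.

```shell
# check_allow_lines FILE
# Warn about every "allow-<name> <iface>" line whose <name> is neither
# "ovs" nor the ovs_bridge declared in that interface's stanza.
check_allow_lines() {
    awk '
        /^allow-/ { tag[$2] = substr($1, 7) }   # text after "allow-", keyed by iface
        /^iface/  { cur = $2 }                  # remember which stanza we are in
        $1 == "ovs_bridge" && (cur in tag) {
            if (tag[cur] != "ovs" && tag[cur] != $2)
                printf "%s: allow-%s does not match ovs_bridge %s\n", cur, tag[cur], $2
        }
    ' "$1"
}
```

Called as `check_allow_lines /etc/network/interfaces` on the config posted above, it should flag only the vlan10 stanza, since `allow-vmbr` does not name the bridge `vmbr0`.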

I deployed LACP this past month (though passive switch), and my config looks like this:


Code:
allow-vmbr0 bond0
iface bond0 inet manual
    ovs_bonds enp181s0f1 eno0
    ovs_type OVSBond
    ovs_bridge vmbr0
    ovs_options lacp=passive bond_mode=balance-slb other_config=lacp-fallback-ab=true

auto lo
iface lo inet loopback

iface enp181s0f1 inet manual

iface eno0 inet manual

allow-ovs vmbr0
iface vmbr0 inet static
    address  192.168.0.12
    netmask  24
    gateway  192.168.0.254
    ovs_type OVSBridge
    ovs_ports bond0

What I would advise is to run the ping with only one interface connected and, while it is running, capture on that interface with tcpdump. For example, for the eno1 interface:
tcpdump -i eno1 -e
The -e flag prints the link-level header, so the output looks like:
Code:
18:25:45.179312 c0:ff:ee:ca:fe:31 (oui Unknown) > c0:ff:ee:ca:fe:33 (oui Unknown), ethertype 802.1Q (0x8100), length 105: vlan 4004, p 0, ethertype IPv4, 10.20.234.40.36400 > 10.20.234.41.postgresql: Flags [P.], seq 173471:173518, ack 95558, win 7276, length 47
Notice the vlan field, which shows you what is actually sent out on the interface. From there you can start sniffing on the gateway, or set up a span/mirror port on the switch and sniff that to confirm the traffic.
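
If the capture is busy, it can help to boil it down to just the VLAN IDs seen on the wire. A rough sketch, assuming tcpdump prints 802.1Q frames in the same `vlan <id>,` form as the sample line above; `vlan_ids` is a helper name I made up:

```shell
# vlan_ids: read `tcpdump -e` output on stdin and print the VLAN ID of
# every 802.1Q-tagged frame, one per line. Untagged frames print nothing.
vlan_ids() {
    sed -n 's/.*: vlan \([0-9][0-9]*\),.*/\1/p'
}
```

For example, `tcpdump -c 200 -e -n -i eno1 2>/dev/null | vlan_ids | sort -n | uniq -c` would count frames per VLAN over 200 packets. If the ping is running and VLAN 10 never shows up in that list, the tagged traffic is not leaving on eno1.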
 
I apologize for waking up an old thread; we got distracted on something else and are now revisiting this. The only real change we made is to disable ipv6.

I am concerned by the output of ovs-vsctl show; from my reading of it, it's saying the bridge is its own port?

Code:
f1733867-7e12-4d9a-bcf2-364bf38cd161
    Bridge "vmbr0"
        Port "bond0"
            Interface "eno2"
            Interface "eno3"
            Interface "eno4"
            Interface "eno1"
        Port "vmbr0"
            Interface "vmbr0"
                type: internal
        Port "vlan10"
            tag: 10
            Interface "vlan10"
                type: internal
    ovs_version: "2.10.1"

Is that correct? Shouldn't the interface for vmbr0 be bond0?
 
Is that correct?

Yes, the bridge is a port in its own right too.

Shouldn't the interface for vmbr0 be bond0?

bond0 is *one* of the ports of vmbr0, just as vlan10 is a port of vmbr0.
*port* vmbr0 is a port that "sees" all the traffic for *bridge* vmbr0, and its native (untagged) VLAN is where my config above applied the IP 192.168.0.12 for the host itself. (Usually, to avoid confusion, I add a separate OVSIntPort for the local host.)

You could also give bond0 an IP, BUT the idea is not for it to be an interface with an IP on it; it is a port that is just a mechanism to switch traffic from the vmbr0 *bridge* to the outside world and back. It's a layer 2 device, not L3 ;)
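
To make the port/interface relationship easier to read, the `ovs-vsctl show` output can be flattened to one line per port. A small sketch (the `port_map` name is made up, and it ignores bridge boundaries, so it is only meant for a single-bridge host like this one):

```shell
# port_map: read `ovs-vsctl show` output on stdin and print each port
# followed by the interfaces that belong to it, e.g. "bond0: eno1 eno2".
port_map() {
    awk '
        $1 == "Port" {
            if (p != "") print p ":" l      # flush the previous port
            gsub(/"/, "", $2); p = $2; l = ""
        }
        $1 == "Interface" { gsub(/"/, "", $2); l = l " " $2 }
        END { if (p != "") print p ":" l }  # flush the last port
    '
}
```

Used as `ovs-vsctl show | port_map`, the output posted above would come out as three ports: bond0 with the four eno interfaces, plus vmbr0 and vlan10 each with a single internal interface of the same name. `ovs-vsctl list-ports vmbr0` gives the port names directly if that is all you need.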
 
We'll work on this more this week. I'm the sysadmin and my networking admin is the only one with access to the switch configuration, but she's quarantined due to the coronavirus so we're having trouble working on this straight through. If anything else looks wrong from the system side I'm happy to investigate.

I was reading here, but didn't see anything that related to what we have: https://forum.proxmox.com/threads/s...ng-just-broken-on-pve-6-whats-going-on.58020/
 
