OVS bond with VLANs and custom MTU

ioo

Renowned Member
Oct 1, 2011
Hi!

I apologize for opening yet another much-discussed topic. At the moment everything appears to be working well, but I would still like confirmation that my reasoning and this specific configuration are sound.

My setup and goals are
  1. the computer has four physical interfaces: two are 10G and connected, and two are 1G and not connected
  2. the two 10G interfaces need to form an LACP bond
  3. the bond needs to carry traffic belonging to different VLANs
  4. the VLANs have different MTU settings (1500 and 9000)
  5. the physical switches the 10G interfaces are connected to are, so to say, well managed and support this setup
  6. I am using Proxmox 7.4 (upgraded in place from 6.4 etc., so it has ifupdown, not ifupdown2)
So I built upon my previous OVS experience (which is actually much more trivial: one physical interface carrying multiple VLANs, i.e. no bond and no MTU customization) and on reading entries from this forum and also
and came up with this configuration

Code:
auto lo
iface lo inet loopback

# enp59s0f0
# enp59s0f1

auto g10
iface g10 inet manual
  mtu 9000

auto h10
iface h10 inet manual
  mtu 9000

auto vmbr0
allow-ovs vmbr0
iface vmbr0 inet manual
  ovs_type OVSBridge
  ovs_ports vlan10 vlan15 vlan88 bond0
  ovs_mtu 9000

allow-vmbr0 bond0
iface bond0 inet manual
  ovs_bonds g10 h10
  ovs_type OVSBond
  ovs_bridge vmbr0
  ovs_options bond_mode=balance-tcp other_config:lacp-time=fast lacp=active

allow-vmbr0 vlan10
iface vlan10 inet static
  address 10.40.0.180/19
  ovs_type OVSIntPort
  ovs_options tag=10
  ovs_extra set interface ${IFACE} external-ids:iface-id=$(hostname -s)-${IFACE}-vif
  ovs_bridge vmbr0
  ovs_mtu 1500

allow-vmbr0 vlan15
iface vlan15 inet static
  address 192.168.15.180/24
  ovs_type OVSIntPort
  ovs_options tag=15
  ovs_extra set interface ${IFACE} external-ids:iface-id=$(hostname -s)-${IFACE}-vif
  ovs_bridge vmbr0
  ovs_mtu 1500

allow-vmbr0 vlan88
iface vlan88 inet static
  address 10.8.8.180/22
  ovs_type OVSIntPort
  ovs_options tag=88
  ovs_extra set interface ${IFACE} external-ids:iface-id=$(hostname -s)-${IFACE}-vif
  ovs_bridge vmbr0
  ovs_mtu 9000

And I can comment on these choices as follows
  • 'auto' finds and activates the physical interface and also sets its MTU; this happens outside the OVS world, so there I use plain 'mtu', not 'ovs_mtu'
  • according to the manual at https://github.com/openvswitch/ovs/blob/master/debian/openvswitch-switch.README.Debian 'auto vmbr0' should not be needed, and the manual even warns against using it; but without it vmbr0 is not created at all (and something to that effect is logged), so I still use it
  • the 'allow-xxx' construct triggers some inner workings of ifupdown and configures the respective 'iface xxx', so I have not used 'auto' there
  • bond0 is a 'port' in OVS parlance, and ports do not get an MTU set (OVS interfaces do), so I do not set an MTU here
  • the VLANs have an IP address configured and also an MTU set because they are interfaces (I see that with my current config there are different OVS objects, a port and an interface, carrying the same 'vlan15'-style name)
  • ovs_extra is something I inherited; I think it is a way of attaching an informational label (I can see it afterwards with 'ovs-vsctl --columns=name,external-ids list Interface'); it is not actually needed
And as a result, after a reboot I get
Code:
$ ifconfig -a
bond0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::f40a:cdff:feb1:5fff  prefixlen 64  scopeid 0x20<link>
        ether f6:0a:cd:b1:5f:ff  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 56  bytes 4016 (3.9 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp59s0f0: flags=4098<BROADCAST,MULTICAST>  mtu 1500
        ether ac:1f:6b:24:66:40  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device memory 0xb8400000-b84fffff

enp59s0f1: flags=4098<BROADCAST,MULTICAST>  mtu 1500
        ether ac:1f:6b:24:66:41  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device memory 0xb8300000-b83fffff

g10: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 9000
        inet6 fe80::ec4:7aff:fed2:7c06  prefixlen 64  scopeid 0x20<link>
        ether 0c:c4:7a:d2:7c:06  txqueuelen 1000  (Ethernet)
        RX packets 38774204  bytes 142197613665 (132.4 GiB)
        RX errors 0  dropped 3  overruns 0  frame 0
        TX packets 18604581  bytes 9513158592 (8.8 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

h10: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 9000
        inet6 fe80::ec4:7aff:fed2:7c07  prefixlen 64  scopeid 0x20<link>
        ether 0c:c4:7a:d2:7c:07  txqueuelen 1000  (Ethernet)
        RX packets 45906205  bytes 31899476797 (29.7 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 27614505  bytes 19666800332 (18.3 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 201930  bytes 41906810 (39.9 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 201930  bytes 41906810 (39.9 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

ovs-system: flags=4098<BROADCAST,MULTICAST>  mtu 1500
        ether 2e:85:95:ad:98:43  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

vlan10: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.40.0.180  netmask 255.255.224.0  broadcast 10.40.31.255
        inet6 fe80::9ca4:d1ff:feed:6709  prefixlen 64  scopeid 0x20<link>
        ether 9e:a4:d1:ed:67:09  txqueuelen 1000  (Ethernet)
        RX packets 4085952  bytes 714423720 (681.3 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 2932581  bytes 5732046879 (5.3 GiB)
        TX errors 2  dropped 0 overruns 0  carrier 0  collisions 0

vlan15: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.15.180  netmask 255.255.255.0  broadcast 192.168.15.255
        inet6 fe80::cca6:cff:fee5:9d26  prefixlen 64  scopeid 0x20<link>
        ether ce:a6:0c:e5:9d:26  txqueuelen 1000  (Ethernet)
        RX packets 30979695  bytes 27515034822 (25.6 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 26129667  bytes 17116605382 (15.9 GiB)
        TX errors 8  dropped 0 overruns 0  carrier 0  collisions 0

vlan80: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 9000
        inet 10.80.0.180  netmask 255.255.255.0  broadcast 10.80.0.255
        inet6 fe80::c0c1:33ff:fe70:cee2  prefixlen 64  scopeid 0x20<link>
        ether c2:c1:33:70:ce:e2  txqueuelen 1000  (Ethernet)
        RX packets 998277  bytes 984583303 (938.9 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 775007  bytes 107368344 (102.3 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

vmbr0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 9000
        inet6 fe80::ec4:7aff:fed2:7c06  prefixlen 64  scopeid 0x20<link>
        ether 0c:c4:7a:d2:7c:06  txqueuelen 1000  (Ethernet)
        RX packets 9461545  bytes 754987309 (720.0 MiB)
        RX errors 0  dropped 30  overruns 0  frame 0
        TX packets 56  bytes 4016 (3.9 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

And I can comment on this
  • enp59s0f0 and enp59s0f1 are not activated and are not UP, as expected
  • the bond0 entry in the ifconfig output is the most perplexing because it has MTU 1500 and shows practically no TX or RX traffic; I believe this is fine
  • g10 and h10 are up and have MTU 9000 configured, as expected
  • the VLAN entries have an IP address and MTU configured, are UP, show TX and RX counters etc., as expected
  • vmbr0 also has MTU 9000 as configured (I am actually not sure how relevant that is; judging by RX/TX not much happens there)
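
The MTU observations above can also be checked with a small script. This is only a sketch: it reads the kernel's view from /sys/class/net/<iface>/mtu, and the function name check_mtu and its optional third argument (an alternative sysfs root, handy for trying it out without real interfaces) are made up for illustration.

```shell
#!/bin/sh
# check_mtu IFACE EXPECTED [SYSFS_ROOT]
# Compares the kernel-reported MTU of IFACE with EXPECTED.
# SYSFS_ROOT is a hypothetical override for testing; default is /sys/class/net.
check_mtu() {
    iface=$1
    expected=$2
    root=${3:-/sys/class/net}
    # Return 2 if the interface (or its mtu file) does not exist.
    actual=$(cat "$root/$iface/mtu" 2>/dev/null) || return 2
    if [ "$actual" -eq "$expected" ]; then
        echo "$iface ok $actual"
    else
        echo "$iface MISMATCH expected=$expected actual=$actual"
        return 1
    fi
}
```

On the host from this post one would call it as 'check_mtu g10 9000', 'check_mtu vlan10 1500' and so on.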
And I get this ovs-vsctl output

Code:
$ ovs-vsctl show
81d8181f-54af-472b-844b-198668a3d3cc
    Bridge vmbr0
        Port bond0
            Interface h10
            Interface g10
        Port vlan15
            tag: 15
            Interface vlan15
                type: internal
        Port vmbr0
            Interface vmbr0
                type: internal
        Port vlan10
            tag: 10
            Interface vlan10
                type: internal
        Port vlan80
            tag: 80
            Interface vlan80
                type: internal
    ovs_version: "2.15.0"

My comments on this
  • vmbr0 is used for three things: bridge name, port name and interface name. I think it can be treated as something internal to OVS that is better left untouched and taken as it is
  • I think the output of ifconfig, ip and similar tools for OVS objects (I am resisting saying ports or interfaces) should be read with some caution (like using the df utility to inspect ZFS)
  • bond0 is not really a network interface for the Linux operating system, although ifconfig presents it; it is rather an OVS-internal construct whose duty is to contain h10 and g10 and make bonding happen. This also explains why 'ifconfig bond0' shows MTU 1500 while big packets actually pass fine (without fragmentation)
Code:
$ ping -c 2 -s 8000 10.80.0.156
PING 10.80.0.156 (10.80.0.156) 8000(8028) bytes of data.
8008 bytes from 10.80.0.156: icmp_seq=1 ttl=64 time=1.36 ms
8008 bytes from 10.80.0.156: icmp_seq=2 ttl=64 time=0.241 ms

--- 10.80.0.156 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, t

$ tcpdump -nei vlan80 icmp and host 10.80.0.156
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on vlan80, link-type EN10MB (Ethernet), snapshot length 262144 bytes
22:50:38.952056 c2:c1:33:70:ce:e2 > 92:a1:f2:31:67:f3, ethertype IPv4 (0x0800), length 8042: 10.80.0.180 > 10.80.0.156: ICMP echo request, id 51956, seq 1, length 8008
22:50:38.953400 92:a1:f2:31:67:f3 > c2:c1:33:70:ce:e2, ethertype IPv4 (0x0800), length 8042: 10.80.0.156 > 10.80.0.180: ICMP echo reply, id 51956, seq 1, length 8008
22:50:39.953690 c2:c1:33:70:ce:e2 > 92:a1:f2:31:67:f3, ethertype IPv4 (0x0800), length 8042: 10.80.0.180 > 10.80.0.156: ICMP echo request, id 51956, seq 2, length 8008
22:50:39.953892 92:a1:f2:31:67:f3 > c2:c1:33:70:ce:e2, ethertype IPv4 (0x0800), length 8042: 10.80.0.156 > 10.80.0.180: ICMP echo reply, id 51956, seq 2, length 8008
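
The ping above uses an 8000-byte payload, which ping reports as 8028 bytes with the 20-byte IPv4 header and 8-byte ICMP header, and tcpdump as 8042 with the 14-byte Ethernet header. To test right at the MTU limit, the payload size can be derived from the MTU the same way. A minimal sketch, assuming IPv4 without options; the function name icmp_payload_for_mtu is made up.

```shell
#!/bin/sh
# Largest ICMP echo payload that fits into one IPv4 packet of the given MTU:
# MTU minus 20 bytes of IPv4 header (no options) minus 8 bytes of ICMP header.
icmp_payload_for_mtu() {
    echo $(( $1 - 20 - 8 ))
}

# Example, not executed here: with iputils ping, '-M do' sets the
# don't-fragment flag, so an oversized packet fails instead of being fragmented.
#   ping -M do -c 2 -s "$(icmp_payload_for_mtu 9000)" 10.80.0.156
```

If that ping succeeds with the payload computed for 9000 and fails with one byte more, the path on that VLAN really carries 9000-byte packets.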

One more comment: when the 'allow-vmbr0 bond0' section contained 'ovs_mtu 9000', I got this in the logs

Code:
...
Sep 21 01:40:14 od3 kernel: [  125.945903] device h10 entered promiscuous mode
Sep 21 01:40:14 od3 kernel: [  125.951008] device g10 entered promiscuous mode
Sep 21 01:40:14 od3 systemd-udevd[1321]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Sep 21 01:40:14 od3 systemd-udevd[1321]: Could not generate persistent MAC address for bond0: No such file or directory
Sep 21 01:40:14 od3 kernel: [  125.956552] device bond0 entered promiscuous mode
Sep 21 01:40:14 od3 kernel: [  126.182662] ixgbe 0000:18:00.0: registered PHC device on g10
Sep 21 01:40:14 od3 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl set Interface g10 mtu_request=9000
Sep 21 01:40:14 od3 kernel: [  126.296976] ixgbe 0000:18:00.0 g10: changing MTU from 1500 to 9000
Sep 21 01:40:14 od3 kernel: [  126.353803] ixgbe 0000:18:00.0 g10: detected SFP+: 3
Sep 21 01:40:15 od3 kernel: [  126.822209] ixgbe 0000:18:00.0 g10: detected SFP+: 3
Sep 21 01:40:15 od3 kernel: [  126.969543] ixgbe 0000:18:00.0 g10: NIC Link is Up 10 Gbps, Flow Control: RX/TX
Sep 21 01:40:15 od3 kernel: [  126.979078] ixgbe 0000:18:00.1: registered PHC device on h10
Sep 21 01:40:15 od3 kernel: [  127.087305] IPv6: ADDRCONF(NETDEV_CHANGE): g10: link becomes ready
Sep 21 01:40:15 od3 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl set Interface h10 mtu_request=9000
Sep 21 01:40:15 od3 kernel: [  127.092852] ixgbe 0000:18:00.1 h10: changing MTU from 1500 to 9000
Sep 21 01:40:15 od3 kernel: [  127.154247] ixgbe 0000:18:00.1 h10: detected SFP+: 4
Sep 21 01:40:15 od3 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl set Interface bond0 mtu_request=9000
Sep 21 01:40:15 od3 ovs-vsctl: ovs|00002|db_ctl_base|ERR|no row "bond0" in table Interface
Sep 21 01:40:15 od3 openvswitch-switch[1337]: ovs-vsctl: no row "bond0" in table Interface
...

So, judging by 'no row "bond0" in table Interface', it tries to configure an interface named bond0 but finds no such interface (while at the same time all is well with bond0 being an OVS port).
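
As the log shows, mtu_request is a setting on rows of the Interface table, so it applies to the bond members and not to the bond port itself. The sketch below only prints the commands for the members (a dry run); the function name mtu_request_cmds is made up, and the printed command form is copied from the log lines above.

```shell
#!/bin/sh
# mtu_request_cmds MTU IFACE...
# Prints (does not run) one 'ovs-vsctl set Interface' command per member
# interface, in the same form as seen in the log.
mtu_request_cmds() {
    mtu=$1
    shift
    for member in "$@"; do
        echo "ovs-vsctl set Interface $member mtu_request=$mtu"
    done
}
```

Calling 'mtu_request_cmds 9000 g10 h10' prints the two commands for g10 and h10 and nothing for bond0.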

I do not have a good grasp of the distinction between an OVS port and an OVS interface, but I imagine a port as being closer to the OVS internals and a strictly logical thing, while an interface sits at the outer perimeter from the OVS point of view. An OVS switch usually has at least one physical network interface (as an OVS interface), and of course there can be internal-type interfaces too; and when Proxmox starts a virtual machine, it mostly connects it to the OVS switch using a tap or similar interface. Those OVS interfaces are the things that are visible to the operating system's network tools (like ip, ifconfig, tcpdump etc.).
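
One way to look at the port/interface relation is to flatten the 'ovs-vsctl show' output into port/interface pairs. A small sketch, assuming the output layout shown earlier in this post; the function name ovs_port_ifaces is made up.

```shell
#!/bin/sh
# Reads 'ovs-vsctl show' output on stdin and prints one "PORT INTERFACE"
# pair per line. Relies on each "Port" line being followed by its
# "Interface" lines, as in the output above.
ovs_port_ifaces() {
    awk '
        $1 == "Port"      { gsub(/"/, "", $2); port = $2 }
        $1 == "Interface" { gsub(/"/, "", $2); print port, $2 }
    '
}

# Usage on a live host (not executed here):
#   ovs-vsctl show | ovs_port_ifaces
```

For the output in this post it would list bond0 twice (once with h10, once with g10), while each VLAN port appears once with an interface of the same name.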

Well, this is a long post, and I thank anybody who made it this far :) I tried to be consistent, and I very much welcome your comments and confirmation on whether things around OVS and Proxmox work more or less as I presented them here.


Best regards,

Imre
 