Thanks for the hint about Bufferbloat. There were occasionally a few retries with OVS as well, but far fewer than with the Linux bond. I always thought the fewer the better. Some examples with retries on OVS are included below. Very high retry counts occurred when the CPUs maxed out on the target servers.
The tests were done using two VLANs. For convenience, I used /24 networks in which the third octet of each IP address indicates the VLAN.
Interface config:
Code:
auto enp2s0f0np0
iface enp2s0f0np0 inet manual
    mtu 9000

auto enp2s0f1np1
iface enp2s0f1np1 inet manual
    mtu 9000

auto bond0
iface bond0 inet manual
    ovs_bonds enp2s0f0np0 enp2s0f1np1
    ovs_type OVSBond
    ovs_bridge vmbr0
    ovs_mtu 9000
    ovs_options lacp=active bond_mode=balance-slb other-config:bond-rebalance-interval=0

auto vmbr0
iface vmbr0 inet manual
    ovs_type OVSBridge
    ovs_ports bond0 v114 v115
    ovs_mtu 9000

auto v114
iface v114 inet static
    address 10.10.114.{{ node_number }}/24
    ovs_type OVSIntPort
    ovs_bridge vmbr0
    ovs_mtu 9000
    ovs_options tag=114

auto v115
iface v115 inet static
    address 10.10.115.{{ node_number }}/24
    ovs_type OVSIntPort
    ovs_bridge vmbr0
    ovs_mtu 9000
    ovs_options tag=115
___
Some more testing regarding OVS bond_modes with VLANs and bond-rebalance-interval:
bond_mode balance-slb (default bond-rebalance-interval, i.e. 10 seconds)
Client on single VLAN to two servers
Code:
iperf3 --client 10.10.114.12 --title 114-12 --time 30 & iperf3 --client 10.10.114.13 --title 114-13 --time 30
iplink: [ ID] Interval Transfer Bitrate Retr
114-13: [ 5] 0.00-30.00 sec 43.2 GBytes 12.4 Gbits/sec 0 sender
114-12: [ 5] 0.00-30.00 sec 43.2 GBytes 12.4 Gbits/sec 0 sender
As expected, only a single link is used, because source MAC + source VLAN is the same for both connections. This is where bond_mode balance-tcp has a benefit (see the bond_mode balance-tcp section below).
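To illustrate why both connections share one link here, the following is a toy model of hashing on source MAC + VLAN. It is only a sketch: the CRC32 hash, the bucket count of 256, the static bucket-to-link mapping, and the example MAC address are assumptions made for illustration and are not taken from the OVS implementation.

```python
import zlib

NUM_BUCKETS = 256  # assumed bucket count, for illustration only
LINKS = ["enp2s0f0np0", "enp2s0f1np1"]


def slb_bucket(src_mac: str, vlan: int) -> int:
    """Map (source MAC, VLAN) to a bucket; CRC32 stands in for the real hash."""
    key = f"{src_mac.lower()}|{vlan}".encode()
    return zlib.crc32(key) % NUM_BUCKETS


def slb_link(src_mac: str, vlan: int) -> str:
    """Pick a link from the bucket (simplified static bucket-to-link mapping)."""
    return LINKS[slb_bucket(src_mac, vlan) % len(LINKS)]


# Hypothetical MAC address of the client's internal port.
CLIENT_MAC = "aa:bb:cc:dd:ee:01"

# The destination is not part of the hash input, so both flows from VLAN 114
# (to 10.10.114.12 and to 10.10.114.13) resolve to the same link.
flows = [("10.10.114.12", 114), ("10.10.114.13", 114)]
chosen = {dst: slb_link(CLIENT_MAC, vlan) for dst, vlan in flows}
print(chosen)
```

In this model a different VLAN changes the hash input, so such a flow may or may not land on the other link; that depends entirely on the hash values.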
Client on two VLANs to single server
Code:
iperf3 --client 10.10.114.12 --title 114-12 --time 30 & iperf3 --client 10.10.115.12 --title 115-12 --time 30
iplink: [ ID] Interval Transfer Bitrate Retr
114-12: [ 5] 0.00-30.00 sec 72.9 GBytes 20.9 Gbits/sec 1837 sender
115-12: [ 5] 0.00-30.00 sec 73.2 GBytes 20.9 Gbits/sec 1549 sender
Sometimes the connections don't get distributed evenly right away, as was the case in this test. At around second 8 you can see the rebalancing kick in and do its job. As expected, the connections end up distributed across both links because the source VLAN differs. The CPU on the server maxed out once both links were in use (retries went up). That said, moving connections from one link to the other in the middle of a transfer seems to me a bit like asking for trouble.
Connections got rebalanced after approx. 8 seconds:
114-12: [ ID] Interval Transfer Bitrate Retr Cwnd
114-12: [ 5] 0.00-1.00 sec 1.43 GBytes 12.3 Gbits/sec 0 3.12 MBytes
115-12: [ 5] 0.00-1.00 sec 1.43 GBytes 12.3 Gbits/sec 0 3.05 MBytes
115-12: [ 5] 1.00-2.00 sec 1.43 GBytes 12.3 Gbits/sec 0 3.05 MBytes
114-12: [ 5] 1.00-2.00 sec 1.44 GBytes 12.4 Gbits/sec 0 3.12 MBytes
115-12: [ 5] 2.00-3.00 sec 1.44 GBytes 12.3 Gbits/sec 0 3.05 MBytes
114-12: [ 5] 2.00-3.00 sec 1.44 GBytes 12.3 Gbits/sec 0 3.12 MBytes
114-12: [ 5] 3.00-4.00 sec 1.44 GBytes 12.3 Gbits/sec 0 3.12 MBytes
115-12: [ 5] 3.00-4.00 sec 1.44 GBytes 12.4 Gbits/sec 0 3.05 MBytes
115-12: [ 5] 4.00-5.00 sec 1.44 GBytes 12.4 Gbits/sec 0 3.05 MBytes
114-12: [ 5] 4.00-5.00 sec 1.44 GBytes 12.3 Gbits/sec 0 3.12 MBytes
114-12: [ 5] 5.00-6.00 sec 1.43 GBytes 12.3 Gbits/sec 0 3.12 MBytes
115-12: [ 5] 5.00-6.00 sec 1.43 GBytes 12.3 Gbits/sec 0 3.05 MBytes
114-12: [ 5] 6.00-7.00 sec 1.44 GBytes 12.4 Gbits/sec 0 3.12 MBytes
115-12: [ 5] 6.00-7.00 sec 1.44 GBytes 12.3 Gbits/sec 0 3.05 MBytes
114-12: [ 5] 7.00-8.00 sec 1.44 GBytes 12.3 Gbits/sec 0 3.12 MBytes
115-12: [ 5] 7.00-8.00 sec 1.44 GBytes 12.3 Gbits/sec 0 3.05 MBytes
114-12: [ 5] 8.00-9.00 sec 1.69 GBytes 14.5 Gbits/sec 0 3.12 MBytes
115-12: [ 5] 8.00-9.00 sec 1.69 GBytes 14.5 Gbits/sec 148 3.05 MBytes
114-12: [ 5] 9.00-10.00 sec 2.86 GBytes 24.6 Gbits/sec 41 2.17 MBytes
115-12: [ 5] 9.00-10.00 sec 2.87 GBytes 24.7 Gbits/sec 8 2.10 MBytes
115-12: [ 5] 10.00-11.00 sec 2.88 GBytes 24.7 Gbits/sec 15 3.17 MBytes
114-12: [ 5] 10.00-11.00 sec 2.87 GBytes 24.6 Gbits/sec 25 2.17 MBytes
115-12: [ 5] 11.00-12.00 sec 2.87 GBytes 24.6 Gbits/sec 81 2.28 MBytes
114-12: [ 5] 11.00-12.00 sec 2.84 GBytes 24.4 Gbits/sec 77 2.59 MBytes
115-12: [ 5] 12.00-13.00 sec 2.88 GBytes 24.7 Gbits/sec 2 2.52 MBytes
114-12: [ 5] 12.00-13.00 sec 2.87 GBytes 24.7 Gbits/sec 34 3.07 MBytes
115-12: [ 5] 13.00-14.00 sec 2.87 GBytes 24.6 Gbits/sec 39 2.53 MBytes
114-12: [ 5] 13.00-14.00 sec 2.84 GBytes 24.4 Gbits/sec 104 2.53 MBytes
115-12: [ 5] 14.00-15.00 sec 2.88 GBytes 24.7 Gbits/sec 14 2.18 MBytes
114-12: [ 5] 14.00-15.00 sec 2.85 GBytes 24.5 Gbits/sec 31 3.10 MBytes
114-12: [ 5] 15.00-16.00 sec 2.87 GBytes 24.6 Gbits/sec 23 2.65 MBytes
115-12: [ 5] 15.00-16.00 sec 2.88 GBytes 24.7 Gbits/sec 0 2.68 MBytes
114-12: [ 5] 16.00-17.00 sec 2.82 GBytes 24.2 Gbits/sec 237 1.48 MBytes
115-12: [ 5] 16.00-17.00 sec 2.83 GBytes 24.3 Gbits/sec 181 944 KBytes
115-12: [ 5] 17.00-18.00 sec 2.77 GBytes 23.8 Gbits/sec 144 2.67 MBytes
114-12: [ 5] 17.00-18.00 sec 2.82 GBytes 24.2 Gbits/sec 180 2.59 MBytes
114-12: [ 5] 18.00-19.00 sec 2.86 GBytes 24.6 Gbits/sec 66 2.58 MBytes
115-12: [ 5] 18.00-19.00 sec 2.86 GBytes 24.6 Gbits/sec 62 2.43 MBytes
114-12: [ 5] 19.00-20.00 sec 2.86 GBytes 24.6 Gbits/sec 128 2.83 MBytes
115-12: [ 5] 19.00-20.00 sec 2.86 GBytes 24.6 Gbits/sec 59 2.30 MBytes
114-12: [ 5] 20.00-21.00 sec 2.84 GBytes 24.4 Gbits/sec 84 2.34 MBytes
115-12: [ 5] 20.00-21.00 sec 2.86 GBytes 24.6 Gbits/sec 82 2.19 MBytes
114-12: [ 5] 21.00-22.00 sec 2.84 GBytes 24.4 Gbits/sec 63 2.15 MBytes
115-12: [ 5] 21.00-22.00 sec 2.86 GBytes 24.5 Gbits/sec 77 2.65 MBytes
114-12: [ 5] 22.00-23.00 sec 2.82 GBytes 24.2 Gbits/sec 154 1.55 MBytes
115-12: [ 5] 22.00-23.00 sec 2.81 GBytes 24.1 Gbits/sec 112 1.32 MBytes
115-12: [ 5] 23.00-24.00 sec 2.82 GBytes 24.2 Gbits/sec 112 1.99 MBytes
114-12: [ 5] 23.00-24.00 sec 2.81 GBytes 24.1 Gbits/sec 175 2.34 MBytes
114-12: [ 5] 24.00-25.00 sec 2.86 GBytes 24.6 Gbits/sec 62 2.09 MBytes
115-12: [ 5] 24.00-25.00 sec 2.88 GBytes 24.7 Gbits/sec 20 2.89 MBytes
115-12: [ 5] 25.00-26.00 sec 2.87 GBytes 24.7 Gbits/sec 46 2.70 MBytes
114-12: [ 5] 25.00-26.00 sec 2.85 GBytes 24.5 Gbits/sec 82 2.63 MBytes
115-12: [ 5] 26.00-27.00 sec 2.85 GBytes 24.5 Gbits/sec 63 2.62 MBytes
114-12: [ 5] 26.00-27.00 sec 2.82 GBytes 24.2 Gbits/sec 69 2.41 MBytes
115-12: [ 5] 27.00-28.00 sec 2.84 GBytes 24.4 Gbits/sec 106 952 KBytes
114-12: [ 5] 27.00-28.00 sec 2.84 GBytes 24.4 Gbits/sec 144 848 KBytes
115-12: [ 5] 28.00-29.00 sec 2.87 GBytes 24.6 Gbits/sec 49 2.22 MBytes
114-12: [ 5] 28.00-29.00 sec 2.86 GBytes 24.5 Gbits/sec 13 2.97 MBytes
114-12: [ 5] 29.00-30.00 sec 2.85 GBytes 24.5 Gbits/sec 45 2.76 MBytes
115-12: [ 5] 29.00-30.00 sec 2.87 GBytes 24.7 Gbits/sec 129 2.10 MBytes
114-12: - - - - - - - - - - - - - - - - - - - - - - - - -
115-12: - - - - - - - - - - - - - - - - - - - - - - - - -
114-12: [ ID] Interval Transfer Bitrate Retr
115-12: [ ID] Interval Transfer Bitrate Retr
114-12: [ 5] 0.00-30.00 sec 72.9 GBytes 20.9 Gbits/sec 1837 sender
115-12: [ 5] 0.00-30.00 sec 73.2 GBytes 20.9 Gbits/sec 1549 sender
114-12: [ 5] 0.00-30.04 sec 72.9 GBytes 20.8 Gbits/sec receiver
115-12: [ 5] 0.00-30.04 sec 73.1 GBytes 20.9 Gbits/sec receiver
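The rebalancing point can also be picked out of such output automatically. The sketch below assumes the line layout shown above (iperf3 with --title, bitrates in Gbits/sec); the function name, the 10 % threshold, and the interval-length filter are arbitrary choices for illustration.

```python
import re

# Matches per-interval lines as printed above, e.g.
# "114-12: [ 5] 8.00-9.00 sec 1.69 GBytes 14.5 Gbits/sec 0 3.12 MBytes"
INTERVAL_RE = re.compile(
    r"^(?P<title>\S+):\s+\[\s*\d+\]\s+"
    r"(?P<start>\d+\.\d+)-(?P<end>\d+\.\d+)\s+sec\s+"
    r"[\d.]+\s+\w+\s+(?P<gbps>[\d.]+)\s+Gbits/sec"
)


def first_jump(lines, factor=1.1, max_interval=1.5):
    """Return {title: start second} of the first interval whose bitrate
    exceeds `factor` times the bitrate of that flow's first interval."""
    baseline, jump = {}, {}
    for line in lines:
        m = INTERVAL_RE.match(line.strip())
        if not m:
            continue  # header and separator lines
        start, end = float(m["start"]), float(m["end"])
        if end - start > max_interval:
            continue  # skip the 0.00-30.00 summary rows
        title, gbps = m["title"], float(m["gbps"])
        baseline.setdefault(title, gbps)
        if title not in jump and gbps > factor * baseline[title]:
            jump[title] = start
    return jump


# A few lines copied from the output above.
sample = """\
114-12: [ 5] 0.00-1.00 sec 1.43 GBytes 12.3 Gbits/sec 0 3.12 MBytes
115-12: [ 5] 0.00-1.00 sec 1.43 GBytes 12.3 Gbits/sec 0 3.05 MBytes
114-12: [ 5] 7.00-8.00 sec 1.44 GBytes 12.3 Gbits/sec 0 3.12 MBytes
115-12: [ 5] 7.00-8.00 sec 1.44 GBytes 12.3 Gbits/sec 0 3.05 MBytes
114-12: [ 5] 8.00-9.00 sec 1.69 GBytes 14.5 Gbits/sec 0 3.12 MBytes
115-12: [ 5] 8.00-9.00 sec 1.69 GBytes 14.5 Gbits/sec 148 3.05 MBytes
114-12: [ 5] 9.00-10.00 sec 2.86 GBytes 24.6 Gbits/sec 41 2.17 MBytes
114-12: [ 5] 0.00-30.00 sec 72.9 GBytes 20.9 Gbits/sec 1837 sender
""".splitlines()

print(first_jump(sample))
```

On these sample lines, both flows are reported as jumping in the interval starting at second 8, which matches the observation above.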
Client on two VLANs to two servers
Code:
iperf3 --client 10.10.114.12 --title 114-12 --time 30 & iperf3 --client 10.10.115.13 --title 115-13 --time 30
iplink: [ ID] Interval Transfer Bitrate Retr
115-13: [ 5] 0.00-30.00 sec 86.2 GBytes 24.7 Gbits/sec 117 sender
114-12: [ 5] 0.00-30.00 sec 85.9 GBytes 24.6 Gbits/sec 0 sender
As expected, the connections got distributed across both links. There are far fewer retries than in the "Client on two VLANs to single server" scenario because two servers are used.
Client on two VLANs to three servers
Code:
iperf3 --client 10.10.114.12 --title 114-12 --time 30 & iperf3 --client 10.10.115.13 --title 115-13 --time 30 & iperf3 --client 10.10.115.14 --title 115-14 --time 30
iplink: [ ID] Interval Transfer Bitrate Retr
115-13: [ 5] 0.00-30.00 sec 43.1 GBytes 12.3 Gbits/sec 0 sender
114-12: [ 5] 0.00-30.00 sec 85.7 GBytes 24.6 Gbits/sec 0 sender
115-14: [ 5] 0.00-30.00 sec 43.1 GBytes 12.3 Gbits/sec 0 sender
As expected, one connection has its own link, the other two connections share the other link.
___
When setting bond-rebalance-interval=0 on bond_mode balance-slb, the connections never got rebalanced during transfer. Unfortunately, they ended up on just a single link most of the time (see tests below; total throughput always adds up to about 25 Gbits/sec). If bond_mode balance-slb uses source MAC + source VLAN as hash input, I'd expect the traffic to get distributed across both links. Yet somehow only one link is used. Maybe because both connections were started at pretty much the same time? In this case, rebalancing might be beneficial.
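The "adds up to about 25 Gbits/sec" reasoning can be written down as a small check. This is a rough sketch: the per-link speed of 25 Gbits/sec is an assumption inferred from the totals, and the result is only a lower bound on the number of links carrying traffic. The bitrates are taken from the sender summaries in this post.

```python
import math

LINK_GBPS = 25.0  # assumed per-link speed, inferred from the ~25 Gbits/sec totals


def min_links_used(bitrates_gbps, link_gbps=LINK_GBPS):
    """Return (total bitrate, smallest number of links that could carry it)."""
    total = sum(bitrates_gbps)
    return total, math.ceil(total / link_gbps)


# Sender bitrates in Gbits/sec as reported by iperf3 in the tests.
results = {
    "slb, default interval, two VLANs to two servers": [24.7, 24.6],
    "slb, interval=0, two VLANs to two servers": [12.4, 12.4],
    "slb, interval=0, two VLANs to three servers": [8.24, 8.24, 8.24],
}

for name, rates in results.items():
    total, links = min_links_used(rates)
    print(f"{name}: {total:.1f} Gbits/sec total, at least {links} link(s)")
```

A total just under 25 Gbits/sec is consistent with all flows sharing one link, while a total near 50 Gbits/sec requires both.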
bond_mode balance-slb bond-rebalance-interval=0
Client on single VLAN to two servers
Code:
iperf3 --client 10.10.114.12 --title 114-12 --time 30 & iperf3 --client 10.10.114.13 --title 114-13 --time 30
iplink: [ ID] Interval Transfer Bitrate Retr
114-13: [ 5] 0.00-30.00 sec 43.3 GBytes 12.4 Gbits/sec 0 sender
114-12: [ 5] 0.00-30.00 sec 43.0 GBytes 12.3 Gbits/sec 0 sender
Client on two VLANs to single server
Code:
iperf3 --client 10.10.114.12 --title 114-12 --time 30 & iperf3 --client 10.10.115.12 --title 115-12 --time 30
iplink: [ ID] Interval Transfer Bitrate Retr
115-12: [ 5] 0.00-30.00 sec 43.2 GBytes 12.4 Gbits/sec 0 sender
114-12: [ 5] 0.00-30.00 sec 43.2 GBytes 12.4 Gbits/sec 0 sender
Client on two VLANs to two servers
Code:
iperf3 --client 10.10.114.12 --title 114-12 --time 30 & iperf3 --client 10.10.115.13 --title 115-13 --time 30
iplink: [ ID] Interval Transfer Bitrate Retr
115-13: [ 5] 0.00-30.00 sec 43.2 GBytes 12.4 Gbits/sec 0 sender
114-12: [ 5] 0.00-30.00 sec 43.2 GBytes 12.4 Gbits/sec 0 sender
Client on two VLANs to three servers
Code:
iperf3 --client 10.10.114.12 --title 114-12 --time 30 & iperf3 --client 10.10.115.13 --title 115-13 --time 30 & iperf3 --client 10.10.115.14 --title 115-14 --time 30
iplink: [ ID] Interval Transfer Bitrate Retr
115-13: [ 5] 0.00-30.00 sec 28.8 GBytes 8.24 Gbits/sec 0 sender
115-14: [ 5] 0.00-30.00 sec 28.8 GBytes 8.24 Gbits/sec 0 sender
114-12: [ 5] 0.00-30.00 sec 28.8 GBytes 8.24 Gbits/sec 0 sender
___
For comparison, bond_mode balance-tcp distributes the connections quite evenly across the links. Although there were test cases where the distribution was not as even as shown below, I'd rather leave bond-rebalance-interval=0 in order to avoid connections switching links. The risk/reward ratio of rebalancing seems unfavorable, at least in our environment and use case (R&D Proxmox cluster).
bond_mode balance-tcp bond-rebalance-interval=0
Client on single VLAN to two servers
Code:
iperf3 --client 10.10.114.12 --title 114-12 --time 30 & iperf3 --client 10.10.114.13 --title 114-13 --time 30
iplink: [ ID] Interval Transfer Bitrate Retr
114-13: [ 5] 0.00-30.00 sec 86.3 GBytes 24.7 Gbits/sec 7 sender
114-12: [ 5] 0.00-30.00 sec 86.3 GBytes 24.7 Gbits/sec 0 sender
Client on two VLANs to single server
Code:
iperf3 --client 10.10.114.12 --title 114-12 --time 30 & iperf3 --client 10.10.115.12 --title 115-12 --time 30
iplink: [ ID] Interval Transfer Bitrate Retr
114-12: [ 5] 0.00-30.00 sec 86.0 GBytes 24.6 Gbits/sec 1116 sender
115-12: [ 5] 0.00-30.00 sec 86.0 GBytes 24.6 Gbits/sec 1334 sender
Note: CPU maxed out on server
Client on two VLANs to two servers
Code:
iperf3 --client 10.10.114.12 --title 114-12 --time 30 & iperf3 --client 10.10.115.13 --title 115-13 --time 30
iplink: [ ID] Interval Transfer Bitrate Retr
114-12: [ 5] 0.00-30.00 sec 86.1 GBytes 24.6 Gbits/sec 0 sender
115-13: [ 5] 0.00-30.00 sec 86.1 GBytes 24.7 Gbits/sec 11 sender
Client on two VLANs to three servers
Code:
iperf3 --client 10.10.114.12 --title 114-12 --time 30 & iperf3 --client 10.10.115.13 --title 115-13 --time 30 & iperf3 --client 10.10.115.14 --title 115-14 --time 30
iplink: [ ID] Interval Transfer Bitrate Retr
114-12: [ 5] 0.00-30.00 sec 43.1 GBytes 12.4 Gbits/sec 0 sender
115-13: [ 5] 0.00-30.00 sec 43.0 GBytes 12.3 Gbits/sec 0 sender
115-14: [ 5] 0.00-30.00 sec 86.0 GBytes 24.6 Gbits/sec 0 sender
Client on two VLANs each to three servers
Code:
iperf3 --client 10.10.114.12 --title 114-12 --time 30 & iperf3 --client 10.10.115.12 --title 115-12 --time 30 & iperf3 --client 10.10.114.13 --title 114-13 --time 30 & iperf3 --client 10.10.115.13 --title 115-13 --time 30 & iperf3 --client 10.10.114.14 --title 114-14 --time 30 & iperf3 --client 10.10.115.14 --title 115-14 --time 30
iplink: [ ID] Interval Transfer Bitrate Retr
115-12: [ 5] 0.00-30.00 sec 28.8 GBytes 8.24 Gbits/sec 8 sender
114-12: [ 5] 0.00-30.00 sec 28.8 GBytes 8.24 Gbits/sec 0 sender
114-14: [ 5] 0.00-30.00 sec 28.8 GBytes 8.25 Gbits/sec 18 sender
115-14: [ 5] 0.00-30.00 sec 28.8 GBytes 8.25 Gbits/sec 0 sender
114-13: [ 5] 0.00-30.00 sec 28.8 GBytes 8.25 Gbits/sec 0 sender
115-13: [ 5] 0.00-30.00 sec 28.8 GBytes 8.24 Gbits/sec 8 sender
I'll leave it there for now.