We have two clusters in which we host virtual routers and firewalls. Heavy network traffic causes jitter and sometimes even packet loss with the default LACP OvS configuration so we run a sort of hybrid. The root cause is that Intel X520 network cards support receive side steering where they compute a hash and then pass return traffic that is part of an identified stream back to the same queue that sent the packet.
With the default 'balance-tcp' mode the packets get transmitted and then recirculate OvS to be sent out the balanced interface which causes problems. In the environment with allot of virtual routers and firewalls we run things slightly differently, namely that we use balance-slb together with LACP to periodically rebalance the streams.
Normal OvS LACP:
Code:
ovs_options bond_mode=balance-tcp lacp=active other_config:lacp-time=fast tag=200 vlan_mode=native-untagged
No recirculation:
Code:
ovs_options bond_mode=balance-slb lacp=active other_config:lacp-time=fast other_config:bond-rebalance-interval=60000 tag=200 vlan_mode=native-untagged
NB: This is not the case in the cluster with the 2 x 1 GbE links where we regularly observe corosync having problems...
@spirit: The problem cluster with the 2 x 1 GbE links was running a ping between all three hosts and we observe no network connectivity issues with any other application, Ceph or diagnostic pings.