3x bonded interface slower than 2x bonded???

proxwolfe

Hi,

I am experimenting with bonding in preparation for my next cluster build.

I benchmarked a 1gbe connection between the nodes, and iperf showed close to 1gbit/s, which is what I expected.

Then I bonded two 1gbe NICs together (balance-rr) and benchmarked again. Iperf only showed 1.5gbit/s. That is substantially slower than expected.

The big surprise came when I bonded three 1gbe NICs together. Iperf showed only 1.4gbit/s, i.e. less than for two NICs bonded.
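
(For context, the bond under test is declared in /etc/network/interfaces roughly like the sketch below; interface names and the address are placeholders, not my exact config:)

    auto bond0
    iface bond0 inet static
        # three ports of the quad-port card (names are placeholders)
        bond-slaves enp1s0f0 enp1s0f1 enp1s0f2
        bond-mode balance-rr
        bond-miimon 100
        address 10.0.0.1/24

and then benchmarked from the other node with something like "iperf -c 10.0.0.1 -t 30".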

Is that normal? Does bonding create that much overhead?

The 1gbe NICs are all on the same quad port card in each host. I am using CAT7 cables and identical switches.

But: the length of the cables is not the same (because I don't have enough of the same length). One connection uses only 1m cables, one uses only 2m cables, and one uses a mix of 1m and 2m cables. Might that have something to do with the problem? In other words: would I get (near) doubled and (near) tripled speeds if I used only cables of the same length?

Thanks!
 
Have you done some aggregation config on your switch (Cisco port-channel, ...)?

https://www.kernel.org/doc/Documentation/networking/bonding.txt

"
The balance-rr, balance-xor and broadcast modes generally
require that the switch have the appropriate ports grouped together.
The nomenclature for such a group differs between switches, it may be
called an "etherchannel" (as in the Cisco example, above), a "trunk
group" or some other similar variation. For these modes, each switch
will also have its own configuration options for the switch's transmit
policy to the bond. Typical choices include XOR of either the MAC or
IP addresses. The transmit policy of the two peers does not need to
match. For these three modes, the bonding mode really selects a
transmit policy for an EtherChannel group; all three will interoperate
with another EtherChannel group.
"

If not, you'll see a lot of retransmits because of out-of-order packet reassembly. I wouldn't be surprised if more links gave you more retransmits and lower performance.
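
A quick way to check whether that's what is happening (addresses are placeholders, and counter names can vary a bit between kernels):

    # iperf3 reports retransmissions per interval in its "Retr" column
    iperf3 -c 10.0.0.2 -t 30

    # compare the kernel's TCP retransmission counters before and after the run
    netstat -s | grep -i retrans

If the retransmit counters climb much faster over the bond than over a single link, out-of-order delivery on the balance-rr bond is the likely culprit.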
 
Have you done some aggregation config on your switch (Cisco port-channel, ...)?
No aggregation config here; my 'el Cheapo' switches don't have that. Actually, I hooked up one NIC per node via each switch, so there are two (three) separate switches carrying the bond, and they provide no particular support. So I take it 1.5gbit/s isn't that bad, given my setup?

The strange thing is, I also tried the balance-tlb and balance-alb bonding modes, for which the docs say no switch support is required. But with these modes, iperf couldn't connect at all...
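
(Side note: whether such a bond came up at all should be visible in the kernel's bonding status; the commands below are generic, not output from my machine:)

    # active bonding mode plus MII/link status of each slave
    cat /proc/net/bonding/bond0

    # bond details as seen by iproute2
    ip -d link show bond0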

So what kind of switch are we talking about here? If it has to be a fancy one, I might as well invest in 10gbe cards and do without bonding...
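
From the bonding docs linked above, the "fancy" feature would be LACP (802.3ad), i.e. a managed switch that supports the kind of port grouping described in the quote. A bond stanza for that would presumably look roughly like this (interface names and address are placeholders, and all ports would have to sit on the same LACP-capable switch):

    auto bond0
    iface bond0 inet static
        bond-slaves enp1s0f0 enp1s0f1 enp1s0f2
        bond-mode 802.3ad
        bond-miimon 100
        bond-xmit-hash-policy layer3+4
        address 10.0.0.1/24

Though as far as I understand, with 802.3ad a single TCP stream still only ever uses one physical link, so a single iperf stream wouldn't get past ~1gbit/s either, which is another argument for just going 10gbe.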
 
