General bonding question for performance's sake

BuffaloBrent

Member
Jan 14, 2016
I currently have two 1Gb Ethernet ports in a single bond and bridge that serves all 10 of the VMs that are running. I need to physically move the server, which gives me an opportunity to add four more ports in anticipation of future needs.

In general, is it just as well to have all six ports bonded to one bridge and assign that bridge to all VMs, rather than creating separate bridges for specific server(s)?

I've got one physical server that I'm getting ready to virtualize. The database services on this machine receive a lot of network traffic in and out, but nothing that has yet saturated the single 1Gb Ethernet connection it currently has. It is certainly the busiest server in our entire system as far as network traffic is concerned, though. When I do virtualize this server, would it be better to create its own bridge with two bonded ports, or just use the single bridge setup I mentioned previously?

Is there a way to test these two kinds of configurations to determine which would give the better performance?
 
I currently have two 1Gb Ethernet ports in a single bond and bridge that serves all 10 of the VMs that are running. I need to physically move the server, which gives me an opportunity to add four more ports in anticipation of future needs.

In general, is it just as well to have all six ports bonded to one bridge and assign that bridge to all VMs, rather than creating separate bridges for specific server(s)?

I'd rather separate them into multiple bridges - bonding is mainly for failover; you cannot simply multiply the bandwidth by the number of bond members.

When I do virtualize this server, would it be better to create its own bridge with two bonded ports, or just use the single bridge setup I mentioned previously?

As mentioned above, I'd prefer a dedicated bridge of its own.

Is there a way to test these two kinds of configurations to determine which would give the better performance?

Use iperf (if you already have Proxmox VE 4.x, there is iperf3, an improved version).


Code:
#install the iperf package
apt-get install iperf

#have a look at the man page for the available options
man iperf
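
For a quick baseline between two machines, a minimal run could look like this (the IP address and duration are placeholders for your environment; with iperf3 the syntax is essentially the same):

Code:
#on the receiving machine, start a listening server
iperf -s

#on the sending machine, run a 30-second test against it
iperf -c 192.168.1.10 -t 30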
 
I'd rather separate them into multiple bridges - bonding is mainly for failover; you cannot simply multiply the bandwidth by the number of bond members.

Yes and no.
You use bonding for failover and for load-balancing, which in turn can increase your parallel throughput.
When I do virtualize this server, would it be better to create its own bridge with two bonded ports, or just use the single bridge setup I mentioned previously?
In general, is it just as well to have all six ports bonded to one bridge and assign that bridge to all VMs, rather than creating separate bridges for specific server(s)?

You'd have to read up on the available modes for linux native bridges and/or openvswitch.


Some examples:
  • With balance-rr mode (via a Linux native bond and bridge) on a dual 1G link you can expect roughly a 1.4x increase to a single source via TCP; the shortfall is due to the overhead of packets occasionally arriving out of order (see the sketch after this list).
  • With balance-tcp (via openvswitch) on a multi-1G link you will never get more than 1G to/from a single source/destination. However, if you have multiple combinations of MAC, IP and TCP port, you can expect all of your links to be fully utilised, as long as the number of such combinations is at least the number of links.
  • There are a bunch more modes you will want to read up on. It's probably also worth reading up on openvswitch vs. the Linux native bridge, and when to use a bridge based on each.
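
To make the first bullet concrete, here is a minimal sketch of a balance-rr bond underneath a Linux native bridge in /etc/network/interfaces; the NIC names and the bridge number are assumptions to adapt to your own setup:

Code:
#bond two physical NICs in round-robin mode
auto bond0
iface bond0 inet manual
        bond-slaves eno1 eno2
        bond-mode balance-rr
        bond-miimon 100

#attach the bond to the bridge the VMs use
auto vmbr1
iface vmbr1 inet manual
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0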



Regarding single or multiple vmbrX bridges:
Assuming you do not have the ability to use an SDN controller network-wide, or a fairly feature-rich switch, you need to take QoS into account manually.

1 Bridge for your Proxmox cluster network
  • Use a single link (should be sufficient in most cases, unless you do ZFS sync or migrations without remote storage) or a bond in balance-tcp mode (the more you do in parallel via the cluster network, the more links you want)

1 Bridge for your external storage solution, or use Gluster/Ceph/DRBD
  • Single link or bond in balance-rr (if the throughput of a single link is insufficient) for NFS / iSCSI / etc.
  • Single link or bond using balance-tcp for Gluster, but especially for Ceph (since all MONs/OSDs listen on separate TCP ports on both the public and the cluster network)
1 Bridge for your VMs
  • Single link or bond in balance-tcp mode. The more VMs and clients you have, the better you will utilise your assigned links and the more of an increase in throughput you will see when doing things in parallel (a configuration sketch follows below).
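
For that VM bridge, an openvswitch bond in balance-tcp could be declared roughly like this in /etc/network/interfaces; the interface and bridge names are placeholders, and the physical switch ports need LACP configured for this to work:

Code:
#openvswitch bond over four NICs, hashing connections on MAC/IP/TCP-port
auto bond0
iface bond0 inet manual
        ovs_type OVSBond
        ovs_bridge vmbr0
        ovs_bonds eno1 eno2 eno3 eno4
        ovs_options bond_mode=balance-tcp lacp=active

#openvswitch bridge that the VM NICs attach to
auto vmbr0
iface vmbr0 inet manual
        ovs_type OVSBridge
        ovs_ports bond0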





I personally tend to put ALL 1G, 10G or 40G links into a single openvswitch-based bridge in balance-tcp mode and do the load-balancing on the router / SDN-controller side, since that leaves no "unused" capacity while maintaining minimum bandwidth levels for specific subnets / VLANs.
That works for me because I have boatloads of nodes and/or clients accessing these nodes, plus the hardware to manage QoS, so I do not have to rely on manually created borders.
 
Yes and no.
You use bonding for failover and for load-balancing, which in turn can increase your parallel throughput.


Precisely: load-balancing distributes the traffic across the different physical bond members, but it is not proportional. In the case of LACP, the decision which physical NIC to use is based on the source-destination relation of each packet; i.e. when the majority of the traffic runs between only a few different endpoints, it can happen that one bond member carries 90% or more of the traffic (even if you have, say, 4 members in the bond). Of course, when establishing different networks for different purposes (as an alternative to bonding) you have a similar problem, but there you decide where which traffic goes (rather than the complex algorithm of LACP).

Finally, if you have e.g. 6 NICs available, I would form 3 bonds with 2 devices each.
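
To illustrate the hashing point, an LACP bond whose member selection also takes IPs and TCP/UDP ports into account (rather than only MAC addresses) could be sketched like this; the NIC names are placeholders and the switch side must speak LACP as well:

Code:
#802.3ad (LACP) bond, hashing on layer 3+4 so more endpoint/port
#combinations get spread across the members
auto bond0
iface bond0 inet manual
        bond-slaves eno1 eno2
        bond-mode 802.3ad
        bond-xmit-hash-policy layer3+4
        bond-miimon 100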
 
Thank you to everyone contributing to my better understanding of how all this works.

After performing some network load testing using the suggested iperf and other tools, it made a negligible difference whether I bonded all ports together, created individual ones, or any combination of the two. Now, this isn't the end-all answer, as I was only using four outside machines to push network traffic to my VM. But based on this, I'm just going to make one bond out of all the ports; it'll make things easier for maintenance. I was mostly worried that the virtual network layer was going to cause noticeable performance/slowness issues.
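
For reference, a test along these lines can be driven with iperf3 by running one server instance per outside machine, each on its own port (the IP, ports and stream count below are placeholders):

Code:
#on the VM under test, one server instance per outside machine
iperf3 -s -p 5201 &
iperf3 -s -p 5202 &

#from each outside machine, point at its own port with a few parallel streams
iperf3 -c 10.0.0.50 -p 5201 -t 60 -P 4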
 
I'm just going to make one bond out of all the ports; it'll make things easier for maintenance. I was mostly worried that the virtual network layer was going to cause noticeable performance/slowness issues.

If you have a small number of "VMs x clients" combinations, you might want to stick to a Linux native bridge in balance-rr.
If you have a bigger set of combos (e.g. 3 VMs with 2 clients each pushing bandwidth in the 125 MB/s area), you definitely want to use openvswitch + balance-tcp. It saves you CPU cycles and you will most likely see an increase in parallel throughput, due to OVS load-balancing connections based on the source + destination + TCP-port combination.
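
If you do go the openvswitch route, one way to check how connections actually land on the individual bond members is to ask OVS directly (assuming the bond and bridge are named bond0 and vmbr0):

Code:
#show per-member load and how hashes are assigned within the bond
ovs-appctl bond/show bond0

#dump traffic counters for the ports on the bridge
ovs-ofctl dump-ports vmbr0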

Wiki is here:
https://pve.proxmox.com/wiki/Open_vSwitch
 
