Bonding doesn't appear to be working

rossd

New Member
Nov 16, 2023
I have an Intel NUC running Proxmox with two Ethernet ports (one via USB). I've switched the bridge over to them and verified with iperf3 that they both work as expected.

I have configured my network as follows:
Code:
auto eno1
iface eno1 inet manual

auto enx00e04c9f525e
iface enx00e04c9f525e inet manual
#usb 2.5gb

auto bond0
iface bond0 inet manual
        bond-slaves eno1 enx00e04c9f525e
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer3+4

auto vmbr0
iface vmbr0 inet static
        address 192.168.86.11/24
        gateway 192.168.86.1
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0

In my UniFi router I have configured both ports as Aggregate.


On the NUC the output of "cat /proc/net/bonding/bond0" is:

Code:
root@pve1:~# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v6.5.11-8-pve

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer3+4 (1)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0

802.3ad info
LACP active: on
LACP rate: slow
Min links: 0
Aggregator selection policy (ad_select): stable
System priority: 65535
System MAC address: 1c:69:7a:68:1c:61
Active Aggregator Info:
        Aggregator ID: 1
        Number of ports: 2
        Actor Key: 9
        Partner Key: 1001
        Partner Mac Address: e4:38:83:95:57:66

Slave Interface: eno1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: 1c:69:7a:68:1c:61
Slave queue ID: 0
Aggregator ID: 1
Actor Churn State: none
Partner Churn State: none
Actor Churned Count: 0
Partner Churned Count: 0
details actor lacp pdu:
    system priority: 65535
    system mac address: 1c:69:7a:68:1c:61
    port key: 9
    port priority: 255
    port number: 1
    port state: 61
details partner lacp pdu:
    system priority: 32768
    system mac address: e4:38:83:95:57:66
    oper key: 1001
    port priority: 1
    port number: 8
    port state: 61

Slave Interface: enx00e04c9f525e
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: 00:e0:4c:9f:52:5e
Slave queue ID: 0
Aggregator ID: 1
Actor Churn State: none
Partner Churn State: none
Actor Churned Count: 0
Partner Churned Count: 0
details actor lacp pdu:
    system priority: 65535
    system mac address: 1c:69:7a:68:1c:61
    port key: 9
    port priority: 255
    port number: 2
    port state: 61
details partner lacp pdu:
    system priority: 32768
    system mac address: e4:38:83:95:57:66
    oper key: 1001
    port priority: 1
    port number: 7
    port state: 61

I have a VM running on the host, on which I've run iperf3 as a server, and I also have iperf3 installed directly on the host.

From another NUC (directly connected to the same switch) I'm connecting to the bonded host, and from my desktop PC I'm running iperf3 against the VM.

This is the output from the NUC; you can see the impact when I start testing from my desktop PC:


Code:
root@pve0:~# iperf3 -c 192.168.86.11 -t 10000
Connecting to host 192.168.86.11, port 5201
[  5] local 192.168.86.10 port 55118 connected to 192.168.86.11 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   113 MBytes   949 Mbits/sec    0    264 KBytes
[  5]   1.00-2.00   sec   112 MBytes   942 Mbits/sec    0    264 KBytes
[  5]   2.00-3.00   sec   112 MBytes   942 Mbits/sec    0    279 KBytes
[  5]   3.00-4.00   sec   112 MBytes   941 Mbits/sec    0    294 KBytes
[  5]   4.00-5.00   sec   112 MBytes   941 Mbits/sec    0    310 KBytes
[  5]   5.00-6.00   sec   113 MBytes   944 Mbits/sec    0    310 KBytes
[  5]   6.00-7.00   sec   112 MBytes   940 Mbits/sec    0    310 KBytes
[  5]   7.00-8.00   sec  65.4 MBytes   549 Mbits/sec  249   96.2 KBytes
[  5]   8.00-9.00   sec  66.3 MBytes   556 Mbits/sec  204   97.6 KBytes
[  5]   9.00-10.00  sec  65.5 MBytes   549 Mbits/sec  178    109 KBytes
[  5]  10.00-11.00  sec  64.3 MBytes   539 Mbits/sec  165    148 KBytes
[  5]  11.00-12.00  sec  64.9 MBytes   544 Mbits/sec  203    109 KBytes
[  5]  12.00-13.00  sec  65.7 MBytes   551 Mbits/sec  183   99.0 KBytes
[  5]  13.00-14.00  sec  69.3 MBytes   581 Mbits/sec  188    180 KBytes
[  5]  14.00-15.00  sec  73.0 MBytes   613 Mbits/sec  204   79.2 KBytes
[  5]  15.00-16.00  sec  69.0 MBytes   579 Mbits/sec  199    115 KBytes
[  5]  16.00-17.00  sec  60.7 MBytes   509 Mbits/sec  200    103 KBytes
[  5]  17.00-18.00  sec  60.5 MBytes   507 Mbits/sec  232   96.2 KBytes
[  5]  18.00-19.00  sec  60.0 MBytes   504 Mbits/sec  181   97.6 KBytes
[  5]  19.00-20.00  sec  63.3 MBytes   531 Mbits/sec  198    102 KBytes
[  5]  20.00-21.00  sec  61.2 MBytes   513 Mbits/sec  176   96.2 KBytes
[  5]  21.00-22.00  sec  62.4 MBytes   524 Mbits/sec  195    112 KBytes
[  5]  22.00-23.00  sec  63.8 MBytes   535 Mbits/sec  209   99.0 KBytes

I think everything is set up correctly, but I wouldn't have expected the bandwidth to halve when I'm hitting iperf3 twice.

Is this evidence that I've not set it up correctly, or am I misunderstanding what's going on here?
 
LACP only balances a single TCP connection/UDP stream onto one link.

So you need multiple connections (iperf3 -P x) to use multiple links.

And on your switch side you also need layer3+4 hashing, to balance on IP+port, as you are only testing between 2 IPs.
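To make the idea concrete, here is a toy sketch of a layer3+4 hash in Python. This is illustrative only, not the kernel's actual xmit_hash_policy arithmetic: it just shows that when the ports feed the hash, two streams between the same pair of hosts can land on different links.

```python
import ipaddress

def pick_link(src_ip, dst_ip, src_port, dst_port, n_links):
    """Toy layer3+4 hash: XOR the ports with the low bits of the
    XORed IPs, then reduce modulo the link count. Illustrative
    only -- the real kernel computation differs in detail."""
    ip_bits = int(ipaddress.ip_address(src_ip)) ^ int(ipaddress.ip_address(dst_ip))
    return ((src_port ^ dst_port) ^ (ip_bits & 0xFFFF)) % n_links

# Two iperf3 streams between the same two hosts, differing only in
# the client's source port, can hash onto different links:
print(pick_link("192.168.86.10", "192.168.86.11", 55118, 5201, 2))
print(pick_link("192.168.86.10", "192.168.86.11", 55119, 5201, 2))
```

A single stream, by contrast, always produces the same hash value, so it can never use more than one link regardless of policy.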
 
There aren't really any settings on my UniFi switch other than enabling aggregation.

I've run with -P 2 and it shows this:

Code:
- - - - - - - - - - - - - - - - - - - - - - - -
[  5]  42.00-43.00  sec  58.4 MBytes   490 Mbits/sec    0    337 KBytes       
[  7]  42.00-43.00  sec  53.5 MBytes   449 Mbits/sec    0    305 KBytes       
[SUM]  42.00-43.00  sec   112 MBytes   939 Mbits/sec    0

So it really does look broken :-(
 
I switched to layer 2 hashing, and now I can have iperf3 running at full speed from host A -> host B and from a VM on host A -> a VM on host B at the same time.

iperf3 -P 2 still shows the bandwidth halved, but I'd expect that with layer 2 hashing.
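That halving is consistent with how a layer 2 policy behaves: only the MAC addresses feed the hash, so every flow between the same two hosts maps to the same link no matter how many parallel streams you open. Again as a toy sketch, not the kernel's exact formula:

```python
def pick_link_l2(src_mac, dst_mac, n_links):
    """Toy layer2 hash: XOR the two MAC addresses and reduce modulo
    the link count. Ports don't enter the hash at all, so parallel
    streams between the same two hosts always share one link."""
    s = int(src_mac.replace(":", ""), 16)
    d = int(dst_mac.replace(":", ""), 16)
    return (s ^ d) % n_links

# Every connection between this host pair picks the same link,
# which is why `iperf3 -P 2` still tops out at one link's speed:
print(pick_link_l2("1c:69:7a:68:1c:61", "e4:38:83:95:57:66", 2))
```

With layer 2 hashing, extra capacity only shows up across *different* host (or VM MAC) pairs, which matches the host-to-host plus VM-to-VM test above running at full speed simultaneously.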
 
