Network problem bond+lacp

ninad

New Member
May 12, 2023
We have a few servers, each with two dual-port 25Gbps cards (Intel E810-XXV). All of these ports are connected to an Extreme X695 switch.
Currently, I am running Proxmox VE 7.3-3 on all the servers. The strange thing is that on every server one of the bonds does not come up or pass traffic. After a reboot of the server or the switch, a different bond stops working, and the bond that was not working before the reboot starts working.

According to the switch-side diagnostics, the affected bond (at the switch end) does not receive any LACP packets from the server.
Below is my configuration on the Proxmox end.
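
A quick way to confirm whether LACPDUs are actually leaving (and arriving on) each bond member is to capture the Slow Protocols ethertype (0x8809, used by LACP) directly on the slave interfaces. A minimal sketch, assuming the interface names used below:

Code:
# capture LACP frames (ethertype 0x8809) on each bond0 member;
# with the default slow rate, expect roughly one PDU every 30 seconds per direction
tcpdump -i eth0 -nn -e ether proto 0x8809
tcpdump -i eth2 -nn -e ether proto 0x8809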

Network config:

auto lo
iface lo inet loopback

auto eth0
iface eth0 inet manual

auto eth1
iface eth1 inet manual

auto eth2
iface eth2 inet manual

auto eth3
iface eth3 inet manual

iface eth4 inet manual

iface eth5 inet manual

auto bond0
iface bond0 inet manual
        bond-slaves eth0 eth2
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer3+4
#dataNW

auto bond1
iface bond1 inet static
        address 10.205.1.1/24
        bond-slaves eth1 eth3
        bond-miimon 100
        bond-mode 802.3ad
#pxclusterNW

auto bond0.200
iface bond0.200 inet manual
#200NW VLAN

auto vmbr0
iface vmbr0 inet static
        address 10.200.61.1/16
        gateway 10.200.250.1
        bridge-ports bond0.200
        bridge-stp off
        bridge-fd 0
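
To confirm that the running bonds actually picked up the 802.3ad mode, hash policy and LACP rate from this file, the kernel's view can be dumped with ip. A minimal sketch, assuming the bond names above:

Code:
# print runtime bond details (bond mode, ad_lacp_rate, xmit_hash_policy) as the kernel sees them
ip -d link show bond0
ip -d link show bond1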


pveversion -v
proxmox-ve: 7.3-1 (running kernel: 5.15.74-1-pve)
pve-manager: 7.3-3 (running version: 7.3-3/c3928077)
pve-kernel-5.15: 7.2-14
pve-kernel-helper: 7.2-14
pve-kernel-5.15.74-1-pve: 5.15.74-1
ceph-fuse: 15.2.17-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.2
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.2-5
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.2-8
libpve-guest-common-perl: 4.2-3
libpve-http-server-perl: 4.1-5
libpve-storage-perl: 7.2-12
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.2.7-1
proxmox-backup-file-restore: 2.2.7-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.5.3
pve-cluster: 7.3-1
pve-container: 4.4-2
pve-docs: 7.3-1
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-7
pve-firmware: 3.5-6
pve-ha-manager: 3.5.1
pve-i18n: 2.8-1
pve-qemu-kvm: 7.1.0-4
pve-xtermjs: 4.16.0-1
qemu-server: 7.3-1
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+2
vncterm: 1.7-1
zfsutils-linux: 2.1.6-pve1

ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 00:62:0b:54:6c:10 brd ff:ff:ff:ff:ff:ff
    altname enp99s0f0np0
3: eth5: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 00:62:0b:54:6c:11 brd ff:ff:ff:ff:ff:ff
    altname enp99s0f1np1
4: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP group default qlen 1000
    link/ether b4:83:51:02:77:c0 brd ff:ff:ff:ff:ff:ff
5: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond1 state UP group default qlen 1000
    link/ether b4:83:51:02:77:c1 brd ff:ff:ff:ff:ff:ff
6: eth2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP group default qlen 1000
    link/ether b4:83:51:02:77:c0 brd ff:ff:ff:ff:ff:ff permaddr b4:83:51:02:77:98
7: eth3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond1 state UP group default qlen 1000
    link/ether b4:83:51:02:77:c1 brd ff:ff:ff:ff:ff:ff permaddr b4:83:51:02:77:99
8: bond1: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether b4:83:51:02:77:c1 brd ff:ff:ff:ff:ff:ff
    inet 10.205.1.1/24 scope global bond1
       valid_lft forever preferred_lft forever
    inet6 fe80::b683:51ff:fe02:77c1/64 scope link
       valid_lft forever preferred_lft forever
9: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether b4:83:51:02:77:c0 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::b683:51ff:fe02:77c0/64 scope link
       valid_lft forever preferred_lft forever
10: bond0.200@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP group default qlen 1000
    link/ether b4:83:51:02:77:c0 brd ff:ff:ff:ff:ff:ff
11: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether b4:83:51:02:77:c0 brd ff:ff:ff:ff:ff:ff
    inet 10.200.61.1/16 scope global vmbr0
       valid_lft forever preferred_lft forever
    inet6 fe80::b683:51ff:fe02:77c0/64 scope link
       valid_lft forever preferred_lft forever

Extreme switch side:

[screenshot of the switch-side LACP status]

Switch side config:
enable sharing 7:1 grouping 6:17,7:1 algorithm address-based L3_L4 lacp
enable sharing 7:2 grouping 6:18,7:2 algorithm address-based L3_L4 lacp
enable sharing 2:17 grouping 2:17,4:17 algorithm address-based L3_L4 lacp
enable sharing 2:1 grouping 2:1,3:17 algorithm address-based L3_L4 lacp
enable sharing 6:1 grouping 6:1,7:17 algorithm address-based L3_L4 lacp
enable sharing 6:2 grouping 6:2,7:18 algorithm address-based L3_L4 lacp
enable sharing 4:1 grouping 3:33,4:1 algorithm address-based L3_L4 lacp
enable sharing 4:2 grouping 3:34,4:2 algorithm address-based L3_L4 lacp
enable sharing 2:18 grouping 2:18,4:18 algorithm address-based L3_L4 lacp
enable sharing 3:18 grouping 2:2,3:18 algorithm address-based L3_L4 lacp
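
For comparison with the Linux side, the per-LAG LACP state can also be inspected on the X695 itself. A rough sketch for EXOS; the exact command forms are an assumption and may vary by firmware version:

Code:
# summary of all link aggregation groups and their member ports
show sharing
# detailed actor/partner LACP state for one LAG, e.g. the group on port 7:1
show lacp lag 7:1 detail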
 
Just a suggestion: run cat /proc/net/bonding/bond0 (and the same for each bond) to see if the additional information gives any hints as to why it's not working...
 
Thank you so much for your response.
Below is the output of the suggested command. I am not sure what is wrong with it.

Code:
cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v5.15.74-1-pve

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer3+4 (1)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0

802.3ad info
LACP active: on
LACP rate: slow
Min links: 0
Aggregator selection policy (ad_select): stable
System priority: 65535
System MAC address: 8e:24:5c:b3:3c:73
Active Aggregator Info:
        Aggregator ID: 1
        Number of ports: 1
        Actor Key: 21
        Partner Key: 2001
        Partner Mac Address: f6:ce:48:f6:59:14

Slave Interface: eth0
MII Status: up
Speed: 25000 Mbps
Duplex: full
Link Failure Count: 2
Permanent HW addr: b4:83:51:02:77:c0
Slave queue ID: 0
Aggregator ID: 1
Actor Churn State: none
Partner Churn State: none
Actor Churned Count: 1
Partner Churned Count: 0
details actor lacp pdu:
    system priority: 65535
    system mac address: 8e:24:5c:b3:3c:73
    port key: 21
    port priority: 255
    port number: 1
    port state: 61
details partner lacp pdu:
    system priority: 0
    system mac address: f6:ce:48:f6:59:14
    oper key: 2001
    port priority: 0
    port number: 2001
    port state: 61

Slave Interface: eth2
MII Status: up
Speed: 25000 Mbps
Duplex: full
Link Failure Count: 3
Permanent HW addr: b4:83:51:02:77:98
Slave queue ID: 0
Aggregator ID: 2
Actor Churn State: churned
Partner Churn State: none
Actor Churned Count: 3
Partner Churned Count: 0
details actor lacp pdu:
    system priority: 65535
    system mac address: 8e:24:5c:b3:3c:73
    port key: 21
    port priority: 255
    port number: 2
    port state: 5
details partner lacp pdu:
    system priority: 0
    system mac address: f6:ce:48:f6:59:14
    oper key: 3018
    port priority: 0
    port number: 3018
    port state: 13

Code:
cat /proc/net/bonding/bond1
Ethernet Channel Bonding Driver: v5.15.74-1-pve

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0

802.3ad info
LACP active: on
LACP rate: slow
Min links: 0
Aggregator selection policy (ad_select): stable
System priority: 65535
System MAC address: 96:e2:f8:bb:f6:d3
Active Aggregator Info:
        Aggregator ID: 1
        Number of ports: 1
        Actor Key: 21
        Partner Key: 3018
        Partner Mac Address: f6:ce:48:f6:59:14

Slave Interface: eth1
MII Status: up
Speed: 25000 Mbps
Duplex: full
Link Failure Count: 3
Permanent HW addr: b4:83:51:02:77:c1
Slave queue ID: 0
Aggregator ID: 1
Actor Churn State: none
Partner Churn State: churned
Actor Churned Count: 3
Partner Churned Count: 4
details actor lacp pdu:
    system priority: 65535
    system mac address: 96:e2:f8:bb:f6:d3
    port key: 21
    port priority: 255
    port number: 1
    port state: 13
details partner lacp pdu:
    system priority: 0
    system mac address: f6:ce:48:f6:59:14
    oper key: 3018
    port priority: 0
    port number: 2002
    port state: 5

Slave Interface: eth3
MII Status: up
Speed: 25000 Mbps
Duplex: full
Link Failure Count: 2
Permanent HW addr: b4:83:51:02:77:99
Slave queue ID: 0
Aggregator ID: 2
Actor Churn State: churned
Partner Churn State: churned
Actor Churned Count: 2
Partner Churned Count: 3
details actor lacp pdu:
    system priority: 65535
    system mac address: 96:e2:f8:bb:f6:d3
    port key: 21
    port priority: 255
    port number: 2
    port state: 5
details partner lacp pdu:
    system priority: 0
    system mac address: f6:ce:48:f6:59:14
    oper key: 2001
    port priority: 0
    port number: 3017
    port state: 5
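
One thing that stands out in both outputs is that the two slaves of each bond report different aggregator IDs and different partner oper keys, i.e. the switch appears to treat them as members of two different LAGs. For a quick per-slave summary without reading the whole /proc file, the bonding driver exposes the same values via sysfs; a minimal sketch, assuming the interface names above:

Code:
# aggregator ID and LACP port states per slave (same values as in /proc/net/bonding/*)
grep . /sys/class/net/eth{0..3}/bonding_slave/ad_aggregator_id
grep . /sys/class/net/eth{0..3}/bonding_slave/ad_actor_oper_port_state
grep . /sys/class/net/eth{0..3}/bonding_slave/ad_partner_oper_port_state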
 
In addition to @spirit's suggestion, I would also double-check that the cables actually go to the intended ports on the switch. Assuming the partner port number is the encoded switch port, you have probably connected to the wrong ports. I would start with one bond, confirm it is working fine, and then connect the next bond...
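
If LLDP is enabled on the switch, the cabling can also be verified from the Proxmox host without tracing cables by hand. A minimal sketch, assuming the lldpd package is installed; note that on some E810/ice firmware an on-card LLDP agent may intercept LLDP frames and may need to be disabled first, if the driver exposes that option:

Code:
apt install lldpd
# show the switch name and port ID seen on each NIC, to compare against the intended 'grouping' ports
lldpcli show neighbors
# only if the ice firmware LLDP agent swallows the frames (check which private flags are supported first):
ethtool --show-priv-flags eth0
# ethtool --set-priv-flags eth0 fw-lldp-agent off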
 
Maybe this will help somebody else.

I wasn't able to get both links active. My problem was that I mixed NICs. LACP seems to depend on the driver, so in most cases it is probably not possible to mix ports from different network cards.
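
One way to check whether the members of a bond actually share the same driver and firmware is to compare the driver info per port; a small sketch, assuming the port names used earlier in this thread:

Code:
# driver, firmware version and PCI address per bond member
ethtool -i eth0
ethtool -i eth2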
 
