Network problem bond+vlan+bridge

Same issue for me with 2 PCI cards with Broadcom Limited BCM57412 NetXtreme-E 10Gb in a Dell R740xd. A bond mixing interfaces from both PCI cards does not come up at boot. The problem seems to be that bond0.vlan is not up.
The stranger part is that I can configure and run the same network config on a live system: I add the third port to the bond, restart networking, and everything is OK, traffic is shared across the 3 links. But there is no network after a reboot!
 
Can you send your /etc/network/interfaces file?
 
ens6xxxx: 2 x 10G on one PCI card
eno1np0, eno2np1, eno3, eno4: the other card with 2 x 10G + 2 x 1000BaseT (only 10G tested)

my interfaces file:

Code:
auto lo
iface lo inet loopback

auto eno1np0
iface eno1np0 inet manual

iface eno3 inet manual

iface eno4 inet manual

auto ens6f0np0
iface ens6f0np0 inet manual

auto ens6f1np1
iface ens6f1np1 inet manual

iface eno2np1 inet manual

auto bond0
iface bond0 inet manual
    bond-slaves ens6f0np0 ens6f1np1
    bond-miimon 100
    bond-mode 802.3ad
    bond-xmit-hash-policy layer2+3
    mtu 9000
#Full connection on 2 x 10G

auto bond0.2601
iface bond0.2601 inet manual
    vlan-id 2601
#HyperAdm

auto bond0.505
iface bond0.505 inet manual
    mtu 9000
    vlan-id 505
#Ceph Cluster

auto bond0.2
iface bond0.2 inet manual
    vlan-id 2
#for migration test

auto bond0.500
iface bond0.500 inet manual
    mtu 9000
    vlan-id 500
#EQL iSCSI

auto bond0.510
iface bond0.510 inet manual
    mtu 9000
    vlan-id 510
#Ceph Public

auto vmbr2601
iface vmbr2601 inet static
    address 172.27.1.91/24
    gateway 172.27.1.1
    bridge-ports bond0.2601
    bridge-stp off
    bridge-fd 0
#HyperAdm Bridge

auto vmbr505
iface vmbr505 inet static
    address 10.1.5.91/24
    bridge-ports bond0.505
    bridge-stp off
    bridge-fd 0
    mtu 9000
#Ceph Cluster

auto vmbr2
iface vmbr2 inet manual
    bridge-ports bond0.2
    bridge-stp off
    bridge-fd 0
#cf bond0.2

auto vmbr500
iface vmbr500 inet static
    address 10.1.1.191/24
    bridge-ports bond0.500
    bridge-stp off
    bridge-fd 0
    mtu 9000
#iSCSI ceph1

auto vmbr510
iface vmbr510 inet static
    address 10.1.10.91/24
    bridge-ports bond0.510
    bridge-stp off
    bridge-fd 0
    mtu 8998
#Ceph Public
 
One note:

You don't need to add "vlan-id X" if the interface is named like "bond0.X".

"vlan-id X" is only needed if you give the interface a custom name (and it's supported only by ifupdown2).
I'm not sure about the behaviour with this config.
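
For illustration, the two styles look roughly like this (a sketch; the VLAN id and parent come from the config above, while the custom interface name "vlan-hyperadm" is hypothetical):

```
# Implicit: the VLAN id (2601) is derived from the "bond0.2601" name
auto bond0.2601
iface bond0.2601 inet manual

# Explicit (ifupdown2 only): a freely named interface needs vlan-id,
# plus vlan-raw-device to state which parent device carries the VLAN
auto vlan-hyperadm
iface vlan-hyperadm inet manual
    vlan-id 2601
    vlan-raw-device bond0
```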


Do you use ifupdown2?
If yes, what is the output of "systemctl status networking"?

I would like to see if service is correctly enabled.
 
One note:

You don't need to add "vlan-id X" if the interface is named like "bond0.X".

"vlan-id X" is only needed if you give the interface a custom name (and it's supported only by ifupdown2).
I'm not sure about the behaviour with this config.
Yes, and MTU is not needed in the vmbr confs either ... but it works.
Do you use ifupdown2 ?
Yes ... that's the last test I wanted to do: reinstall plain ifupdown.
If yes, what is the output of "systemctl status networking"?

I would like to see if service is correctly enabled.
Seems OK:

Code:
root@ceph1:~# systemctl status networking
● networking.service - Network initialization
   Loaded: loaded (/lib/systemd/system/networking.service; enabled; vendor preset: enabled)
   Active: active (exited) since Wed 2020-09-30 08:29:01 CEST; 1 weeks 0 days ago
     Docs: man:interfaces(5) man:ifup(8) man:ifdown(8)
 Main PID: 1339 (code=exited, status=0/SUCCESS)
    Tasks: 0 (limit: 4915)
   Memory: 0B
   CGroup: /system.slice/networking.service

Sep 30 08:28:59 ceph1 systemd[1]: Starting Network initialization...
Sep 30 08:28:59 ceph1 networking[1339]: networking: Configuring network interfaces
Sep 30 08:29:01 ceph1 systemd[1]: Started Network initialization.
 
Workaround found ...

I've opened a case at Dell to see if the problem was known, but there's nothing in their databases. Debian / Proxmox is not really supported, but they suggested a really clever idea: add the slaves later.

So in interfaces, bond0 keeps only the 2 interfaces on one PCI card, with no modification there.

CLI command to add the slave interface:
ip link set eno1np0 down && ip link set eno1np0 master bond0

First try: I added these commands as a post-up script on one vmbr interface (which should run after the problematic part) ==> failed

Second try: a last command run via systemd (I'm not really familiar with systemd customization; I followed https://superuser.com/questions/544399/how-do-you-make-a-systemd-service-as-the-last-service-on-boot) ==> OK

Because of all the Ceph stuff on my servers, add_slave.sh is executed a few minutes after the network comes up with 2 interfaces in the bond and everything OK. The server then has 3 x 10G ports in the bond and no connection is lost.
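
For reference, a minimal sketch of what the late-running unit and add_slave.sh could look like. The unit name, file paths, and ordering dependencies are assumptions; only the ip link commands come from the workaround above:

```
# /root/add_slave.sh  (path is an assumption)
#!/bin/sh
# Enslave the port on the other PCI card once bond0 is already up
ip link set eno1np0 down && ip link set eno1np0 master bond0

# /etc/systemd/system/add-slave.service  (unit name is an assumption)
[Unit]
Description=Add third NIC to bond0 late in boot
After=network-online.target multi-user.target
Wants=network-online.target

[Service]
Type=oneshot
ExecStart=/root/add_slave.sh

[Install]
WantedBy=multi-user.target
```

It would then be enabled with "systemctl enable add-slave.service" so it runs at every boot.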

I can live with this because these are 4 servers dedicated to Ceph only in my Proxmox cluster, and the network config should not change often.
 
We had problems with active-backup bonding with BCM57412 NetXtreme-E 10Gb in a Dell R740xd and VLANs.

pve-manager/7.0-11/63d82f4e
Ethernet Channel Bonding Driver: v5.11.22-4-pve

Active-backup bonding didn't work with VLANs if the bond spanned different network interface cards. After disconnecting the active port there was no switchover to the backup.

It worked if VLANs were not used, and it worked if the VLANs only involved ports on the same network card.

Finally we got it working by commenting out the "bond-primary" line in interfaces (with both a VLAN-aware Linux bridge and a traditional Linux bridge).
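
As a sketch, the working stanza looked roughly like this; the two slave names are examples, not our actual ports:

```
auto bond0
iface bond0 inet manual
    bond-slaves eno1np0 ens6f0np0
    bond-miimon 100
    bond-mode active-backup
#   bond-primary eno1np0
```

With bond-primary commented out, the kernel picks the active slave itself instead of always preferring the named port.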

Could the bond-primary directive in the web GUI be made optional instead of compulsory?

Thanks
 
did some tests, active-backup bond with 2 ports:

a) bond/bridge across 2 Broadcom network cards - error
b) bond/bridge on a single dual-port Broadcom network card - works
c) bond/bridge across 2 Intel network cards - works
d) bond/bridge across Broadcom and QLogic network cards - works

It would require some additional testing, as I have different switches on the other end, but it does look like an issue with Broadcom.
I have the same with Broadcom BCM57412 NetXtreme-E 10Gb Ethernet:

creating a bond across 2 different NICs = error
creating a bond using 1 dual-port NIC = OK
 
