[SOLVED] MAC address flapping on a Cisco Catalyst 2960-S

Oct 13, 2017
Hello,

I've installed Proxmox VE on a server with four Broadcom BCM5709 NICs and configured bonding via the web interface. The /etc/network/interfaces file looks like this:
Code:
auto lo
iface lo inet loopback

iface enp2s0f0 inet manual

iface enp2s0f1 inet manual

iface enp3s0f0 inet manual

iface enp3s0f1 inet manual

auto bond0
iface bond0 inet manual
        slaves enp2s0f0 enp2s0f1 enp3s0f0 enp3s0f1
        bond_miimon 100
        bond_mode 802.3ad

auto vmbr0
iface vmbr0 inet static
        address  192.168.120.34
        netmask  255.255.255.0
        gateway  192.168.120.254
        bridge_ports bond0.120
        bridge_stp off
        bridge_fd 0
#Hypervisor VLAN

auto vmbr1
iface vmbr1 inet manual
        bridge_ports bond0.110
        bridge_stp off
        bridge_fd 0
#LAB & Test VLAN

On the switch I have this configuration:
Code:
port-channel load-balance src-dst-mac
!
...
!
interface Port-channel2
 description Testserver / Proxmox Hypervisor
 switchport mode trunk
 spanning-tree portfast trunk
 spanning-tree bpduguard enable
!
...
!
interface GigabitEthernet1/0/21
 switchport mode trunk
 channel-group 2 mode active
!
interface GigabitEthernet1/0/22
 switchport mode trunk
 channel-group 2 mode active
!
interface GigabitEthernet1/0/23
 switchport mode trunk
 channel-group 2 mode active
!
interface GigabitEthernet1/0/24
 switchport mode trunk
 channel-group 2 mode active
!

On the switch I'm getting these messages:
Code:
Oct 13 12:36:43.000: %SW_MATM-4-MACFLAP_NOTIF: Host 0026.5555.78fa in vlan 120 is flapping between port Gi1/0/22 and port Gi1/0/21
Oct 13 12:37:33.009: %SW_MATM-4-MACFLAP_NOTIF: Host 0026.5555.78fa in vlan 120 is flapping between port Gi1/0/21 and port Gi1/0/22
Oct 13 12:37:51.339: %SW_MATM-4-MACFLAP_NOTIF: Host 0026.5555.78fa in vlan 120 is flapping between port Gi1/0/22 and port Gi1/0/23
Oct 13 12:37:19.408: %SW_MATM-4-MACFLAP_NOTIF: Host 0026.5555.78fa in vlan 120 is flapping between port Gi1/0/23 and port Gi1/0/24

And in the syslog on the server I have these lines:
Code:
Oct 13 14:21:39 proxmox-bettembourg-02 kernel: vmbr0: received packet on bond0.120 with own address as source address (addr:00:26:55:55:78:fa, vlan:0)
Oct 13 14:21:39 proxmox-bettembourg-02 kernel: vmbr0: received packet on bond0.120 with own address as source address (addr:00:26:55:55:78:fa, vlan:0)
Oct 13 14:21:39 proxmox-bettembourg-02 kernel: vmbr0: received packet on bond0.120 with own address as source address (addr:00:26:55:55:78:fa, vlan:0)

What's wrong? Have I missed something?

Gilles
 
The etherchannel looks OK on the switch:
Code:
show etherchannel 2 summary
Flags:  D - down        P - bundled in port-channel
        I - stand-alone s - suspended
        H - Hot-standby (LACP only)
        R - Layer3      S - Layer2
        U - in use      f - failed to allocate aggregator

        M - not in use, minimum links not met
        u - unsuitable for bundling
        w - waiting to be aggregated
        d - default port


Number of channel-groups in use: 2
Number of aggregators:           2

Group  Port-channel  Protocol    Ports
------+-------------+-----------+-----------------------------------------------
2      Po2(SD)         LACP      Gi1/0/21(I) Gi1/0/22(I) Gi1/0/23(I)
                                 Gi1/0/24(I)

But I found something strange on the hypervisor:
Code:
cat /sys/class/net/bond0/bonding/mode
balance-rr 0

Shouldn't this be 802.3ad?

Regards,
Gilles
 
The etherchannel does NOT look OK on the switch. Have a look at the flags: Po2(SD) means Layer 2 and down, and every member port shows (I), stand-alone.
The LAG doesn't work: the port-channel is down, and each physical port acts independently.

The fault is on the Linux side, as the LAG must be in LACP/802.3ad mode.
So, in fact, "it doesn't work at all" :)
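
For anyone who wants to check those flags without reading the table by eye, here is a rough sketch. It assumes the output of "show etherchannel summary" has been saved to a text file, and the function name unbundled_ports is made up for this example. It lists every member port whose flag is anything other than (P), so empty output means all ports are bundled.

```shell
# Print each member port from saved "show etherchannel summary" output
# whose flag is not (P), i.e. not bundled in the port-channel.
# Reads the switch output on stdin. Requires a grep that supports -o.
unbundled_ports() {
    # Member ports look like Gi1/0/21(I): letters, digits, one or more
    # "/digits" groups, then the flag in parentheses. Po2(SD) has no
    # slash, so the port-channel itself is not matched.
    grep -oE '[A-Za-z]+[0-9]+(/[0-9]+)+\([A-Za-z]+\)' | grep -v '(P)$'
}
```

With the output from the first post this would print all four ports with (I); with a healthy LAG it prints nothing. Usage would be something like: unbundled_ports < etherchannel.txt (the file name is only an example).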

The only difference I see on my side is an
auto enp2s0f0
line for each member interface...
 
What I meant is that the config looks OK. In the meantime I made a small change on the Linux machine: I replaced the line
Code:
bond_mode 802.3ad
to
Code:
bond_mode 4
and after restarting the network I have:
Code:
# cat /sys/class/net/bond0/bonding/mode
802.3ad 4
and on the switch:
Code:
#show etherchannel 2 summary
Flags:  D - down        P - bundled in port-channel
        I - stand-alone s - suspended
        H - Hot-standby (LACP only)
        R - Layer3      S - Layer2
        U - in use      f - failed to allocate aggregator

        M - not in use, minimum links not met
        u - unsuitable for bundling
        w - waiting to be aggregated
        d - default port


Number of channel-groups in use: 2
Number of aggregators:           2

Group  Port-channel  Protocol    Ports
------+-------------+-----------+-----------------------------------------------
2      Po2(SU)         LACP      Gi1/0/21(P) Gi1/0/22(P) Gi1/0/23(P)
                                 Gi1/0/24(P)

So my problem is solved but I do not understand why the hypervisor does not accept the line:
Code:
bond_mode 802.3ad

I'm using exactly the same config on Debian jessie without any problem.
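
For reference, the Linux bonding driver accepts each mode either by name or by number, which is why "802.3ad" and "4" are meant to be equivalent. The small lookup below is only an illustration of that mapping as listed in the kernel bonding documentation; the function name bond_mode_number is invented for this sketch.

```shell
# Map a Linux bonding mode name to its numeric value.
# Prints the number, or returns 1 for an unknown name.
bond_mode_number() {
    case "$1" in
        balance-rr)    echo 0 ;;
        active-backup) echo 1 ;;
        balance-xor)   echo 2 ;;
        broadcast)     echo 3 ;;
        802.3ad)       echo 4 ;;
        balance-tlb)   echo 5 ;;
        balance-alb)   echo 6 ;;
        *) echo "unknown bonding mode: $1" >&2; return 1 ;;
    esac
}
```

This also explains the earlier "balance-rr 0" reading: mode 0 is the driver default, so it is what you would expect to see if the configured mode was never applied.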
 
Yes, your problem is solved. Why wasn't the config accepted? Good question, as "it works here"...
Code:
root@hv-co-01-pareq6:~# grep mode /etc/network/interfaces
bond_mode 802.3ad

root@hv-co-01-pareq6:~# cat /sys/class/net/bond0/bonding/mode
802.3ad 4

root@hv-co-01-pareq6:~# cat /etc/debian_version 
9.2
 
Right now I'm using the No-Subscription repository. Next week, as soon as I receive a subscription key for this machine, I'll try with the packages from the enterprise repository.
 
I've now activated the subscription for the machine, updated all the packages and rebooted. With this line in the /etc/network/interfaces file:
Code:
bond_mode 802.3ad
The interfaces weren't bonded together after the reboot. I simply restarted the network with:
Code:
/etc/init.d/networking restart
without replacing the 802.3ad with a 4 and it worked:
Code:
#show etherchannel 2 summary
Flags:  D - down        P - bundled in port-channel
        I - stand-alone s - suspended
        H - Hot-standby (LACP only)
        R - Layer3      S - Layer2
        U - in use      f - failed to allocate aggregator

        M - not in use, minimum links not met
        u - unsuitable for bundling
        w - waiting to be aggregated
        d - default port


Number of channel-groups in use: 2
Number of aggregators:           2

Group  Port-channel  Protocol    Ports
------+-------------+-----------+-----------------------------------------------
2      Po2(SU)         LACP      Gi1/0/21(P) Gi1/0/22(P) Gi1/0/23(P)
                                 Gi1/0/24(P)
So I have a simple workaround but I'd like to know why it doesn't work automatically.
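
Until the root cause is known, a quick check after each reboot can show whether the bond came up in the intended mode. The following is a minimal sketch, not a fix: the function name check_bond_mode is made up, and it assumes the bond is called bond0 unless another mode file is passed as the second argument.

```shell
# Usage: check_bond_mode [expected-mode] [path-to-mode-file]
# Compares the mode name reported by the bonding driver with the
# expected one. Returns 0 on match, 1 on mismatch, 2 if unreadable.
check_bond_mode() {
    expected=${1:-802.3ad}
    mode_file=${2:-/sys/class/net/bond0/bonding/mode}
    [ -r "$mode_file" ] || { echo "cannot read $mode_file" >&2; return 2; }
    # The file holds "<name> <number>", e.g. "802.3ad 4"; keep the name.
    actual=$(awk 'NR == 1 { print $1 }' "$mode_file")
    if [ "$actual" = "$expected" ]; then
        echo "OK: bond mode is $actual"
    else
        echo "MISMATCH: expected $expected, got $actual" >&2
        return 1
    fi
}
```

Run as check_bond_mode right after boot; a mismatch means the network still needs the manual restart described above.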
 
Tonight I upgraded the IOS from the very old 12.2.55-SE3 to the suggested 15.0. This change didn't solve the problem. Could it be a problem with the Broadcom NICs I'm using?
Code:
# lspci | grep -i ethernet
02:00.0 Ethernet controller: Broadcom Limited NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
02:00.1 Ethernet controller: Broadcom Limited NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
03:00.0 Ethernet controller: Broadcom Limited NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
03:00.1 Ethernet controller: Broadcom Limited NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
 
There's nothing wrong on the Cisco side; the issue is on the Linux side: you configure mode X and it chooses to use mode Y.
Why? That's the question :p
But I'm not sure it's a Proxmox issue; it may well be a more generic Debian issue, maybe a race in the networking scripts...
 
I tried different bonding modes, but after every reboot I ended up with round-robin.
After googling a bit I found this advisory from Hewlett Packard:
https://support.hpe.com/hpsc/doc/public/display?docId=emr_na-c03001127
It looks like the modules must be loaded in a particular order, so I adapted the recommendation and my /etc/modules now looks like this:
Code:
:~# cat /etc/modules
# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.
bnx2
bonding
-> problem solved :)
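
If you want to verify the ordering on another machine, something like the following could be used. It is only a sketch: the function name module_listed_before is invented here, and it merely checks that one module name appears on an earlier line than another in /etc/modules (or in a file given as the third argument). It says nothing about whether the modules actually loaded.

```shell
# Usage: module_listed_before FIRST SECOND [file]
# Succeeds only if both module names are listed in the file and
# FIRST appears on an earlier line than SECOND.
module_listed_before() {
    first=$1
    second=$2
    file=${3:-/etc/modules}
    awk -v a="$first" -v b="$second" '
        /^[[:space:]]*#/ { next }              # skip comment lines
        $1 == a && !seen_a { seen_a = NR }     # first line naming FIRST
        $1 == b && !seen_b { seen_b = NR }     # first line naming SECOND
        END { exit !(seen_a && seen_b && seen_a < seen_b) }
    ' "$file"
}
```

For the file above, module_listed_before bnx2 bonding would succeed, since the NIC driver is listed before the bonding module.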
 