Duplicate MAC Addresses Generated for Bonded Interfaces on Identical Server Hardware

May 30, 2020
Greetings, fellow Proxmox users and moderators. First I want to thank the Proxmox team and developers for creating an amazing product. I have been a Proxmox user for only about 2 years, but a Linux admin for about 15. Regrettably, I had not created a forum account until now, but I appreciate how the moderators and participants always try to be helpful no matter the level of proficiency.

Recently I encountered an issue for which, though I have found a workaround, I have been unable to find the root cause. I believe this issue likely lies with Debian Buster rather than with Proxmox specifically, but since I have encountered it on Proxmox 6.1 and 6.2 systems I figured I would post here first.

In particular, I upgraded two identical, clustered servers (svr-lf-pve1 and svr-lf-pve2) from Proxmox 5.4 to 6.1 and subsequently to 6.2. Specifically, they are two Supermicro X10DRT-P servers in a single SuperServer 1028TP-DC0R (SYS-1028TP-DC0R) chassis. Each server has two onboard gigabit interfaces (eno1 and eno2) as well as a PCIe Ethernet card (AOC-SGP-i2) with dual gigabit interfaces (ens1f0 and ens1f1), for a total of 4 gigabit Ethernet interfaces per server. All network interfaces have unique hardware MAC addresses set by the manufacturer. Below are the relevant PVE and package versions installed:


root@svr-lf-pve1:/tmp# pveversion
pve-manager/6.2-4/9824574a (running kernel: 5.4.41-1-pve)
root@svr-lf-pve1:/tmp# dpkg -l | grep openvswitch
ii openvswitch-common 2.12.0-1 amd64 Open vSwitch common components
ii openvswitch-switch 2.12.0-1 amd64 Open vSwitch switch implementations
root@svr-lf-pve1:/tmp# dpkg -l | grep ifupdown
ii ifupdown 0.8.35+pve1 amd64 high level tools to configure network interfaces
root@svr-lf-pve1:/tmp# dpkg -l | grep ifenslave
ii ifenslave 2.9 all configure network interfaces for parallel routing (bonding)
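
For what it's worth, the burned-in (permanent) MAC of each NIC can be confirmed with ethtool, which is one way to verify the manufacturer-assigned addresses really are unique (sketch; eno1 here is just an example interface):

Code:
# show the permanent (burned-in) MAC, independent of what is currently set
ethtool -P eno1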


I am bonding the two onboard interfaces with Open vSwitch to provide redundant connectivity to virtual machines, and bonding the two PCIe interfaces with Linux kernel bonding for redundant NFS storage connectivity. This configuration worked flawlessly on Proxmox 5.x for about 2 years. Attached are the /etc/network/interfaces configs for each server.
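
For reference, both bonds can be inspected at runtime; the kernel bond via /proc and the OVS bond via ovs-appctl (example commands using my bond names):

Code:
# LACP/slave status of the kernel bond used for NFS storage
cat /proc/net/bonding/bond1
# status of the OVS bond used for VM traffic
ovs-appctl bond/show bond0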

After upgrading both host servers to Proxmox 6.1 I noticed severe network connectivity issues, and (in a truly WTF moment) I eventually discovered that the VLAN interfaces on the two servers had been assigned the same MAC address, and that the bond interfaces on the two servers had likewise been assigned the same MAC address:


root@svr-lf-pve1:/tmp# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000
    link/ether ac:1f:6b:1a:a9:a8 brd ff:ff:ff:ff:ff:ff
3: eno2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000
    link/ether ac:1f:6b:1a:a9:a9 brd ff:ff:ff:ff:ff:ff
4: ens1f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond1 state UP mode DEFAULT group default qlen 1000
    link/ether 32:b2:10:f6:36:5f brd ff:ff:ff:ff:ff:ff
5: ens1f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond1 state UP mode DEFAULT group default qlen 1000
    link/ether 32:b2:10:f6:36:5f brd ff:ff:ff:ff:ff:ff
6: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether e6:32:13:5f:f6:0a brd ff:ff:ff:ff:ff:ff
7: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether ac:1f:6b:1a:a9:a8 brd ff:ff:ff:ff:ff:ff
8: bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether 9a:9e:1b:c8:4a:26 brd ff:ff:ff:ff:ff:ff
9: vlan3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether e2:ca:20:cf:12:e0 brd ff:ff:ff:ff:ff:ff
10: bond1: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 32:b2:10:f6:36:5f brd ff:ff:ff:ff:ff:ff
11: vlan20@bond1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 32:b2:10:f6:36:5f brd ff:ff:ff:ff:ff:ff

root@svr-lf-pve2:/tmp# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000
    link/ether ac:1f:6b:1a:a7:c0 brd ff:ff:ff:ff:ff:ff
3: eno2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000
    link/ether ac:1f:6b:1a:a7:c1 brd ff:ff:ff:ff:ff:ff
4: ens1f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond1 state UP mode DEFAULT group default qlen 1000
    link/ether 32:b2:10:f6:36:5f brd ff:ff:ff:ff:ff:ff
5: ens1f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond1 state UP mode DEFAULT group default qlen 1000
    link/ether 32:b2:10:f6:36:5f brd ff:ff:ff:ff:ff:ff
6: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether e6:32:13:5f:f6:0a brd ff:ff:ff:ff:ff:ff
7: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether ac:1f:6b:1a:a7:c0 brd ff:ff:ff:ff:ff:ff
8: bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether 9a:9e:1b:c8:4a:26 brd ff:ff:ff:ff:ff:ff
9: vlan3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether e2:ca:20:cf:12:e0 brd ff:ff:ff:ff:ff:ff
10: bond1: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 32:b2:10:f6:36:5f brd ff:ff:ff:ff:ff:ff
11: vlan20@bond1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 32:b2:10:f6:36:5f brd ff:ff:ff:ff:ff:ff
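
A quick way to line the two hosts up side by side is the brief output of ip link (sketch, assuming passwordless root SSH between the nodes):

Code:
# print interface, state and MAC for both nodes to eyeball the duplicates
for h in svr-lf-pve1 svr-lf-pve2; do
    echo "== $h =="; ssh "$h" ip -br link
done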


Digging a bit deeper to see how the MAC addresses for the bond interfaces are being assigned:

root@svr-lf-pve1:/tmp# cat /sys/class/net/bond0/addr_assign_type
3
root@svr-lf-pve1:/tmp# cat /sys/class/net/bond1/addr_assign_type
3


According to the following Linux kernel documentation, addr_assign_type 3 means the address was "set using dev_set_mac_address":

https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-class-net
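
Per that document, 0 means a permanent (burned-in) address, 1 randomly generated, 2 stolen from another device, and 3 set via dev_set_mac_address. A quick loop to dump the value for every interface:

Code:
# addr_assign_type: 0=permanent, 1=random, 2=stolen, 3=set via dev_set_mac_address
for i in /sys/class/net/*; do
    printf '%s: %s\n' "${i##*/}" "$(cat "$i/addr_assign_type")"
done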

As you can see from my /etc/network/interfaces configs, the workaround I have implemented so far is to assign a unique hwaddress to each bond interface and to the vlan3 OVS port/interface. However, I am a bit dumbfounded as to how Debian Buster automatically assigns the exact same MAC address to bonds on completely different servers/hardware. I also could not find where or how the address passed to dev_set_mac_address is generated. Shouldn't this be a randomly generated address? Could this be a kernel bug that fails to generate unique randomized MAC addresses when the hardware is exactly the same between two servers?


Any insight would be greatly appreciated.

Thank you in advance for your consideration,
Ryan Covietz
Meta Krypt LLC
 

Attachments

Below is the /etc/network/interfaces config for each server (for those who don't want to download the attachments):


root@svr-lf-pve1:/tmp# cat /etc/network/interfaces
auto lo
iface lo inet loopback

auto eno1
iface eno1 inet manual

auto eno2
iface eno2 inet manual

allow-vmbr0 bond0
iface bond0 inet manual
    ovs_bonds eno1 eno2
    ovs_type OVSBond
    ovs_bridge vmbr0
    ovs_options other_config:lacp-time=fast lacp=active bond_mode=balance-tcp vlan_mode=trunk trunks=2,3,6,15
    # hwaddress fa:ca:de:00:00:01 (workaround)

allow-ovs vmbr0
iface vmbr0 inet manual
    ovs_type OVSBridge
    ovs_mtu 1500
    ovs_ports bond0 vlan2 vlan3 vlan6 vlan15

allow-vmbr0 vlan3
iface vlan3 inet static
    ovs_type OVSIntPort
    ovs_bridge vmbr0
    ovs_options tag=3
    ovs_extra set interface ${IFACE} external-ids:iface-id=$(hostname -s)-${IFACE}-vif
    ovs_mtu 1500
    # hwaddress fa:ca:de:00:03:01 (workaround)
    address 10.183.3.21
    netmask 255.255.255.0
    gateway 10.183.3.5

auto ens1f0
iface ens1f0 inet manual

auto ens1f1
iface ens1f1 inet manual

auto bond1
iface bond1 inet manual
    bond-mode 802.3ad
    bond-miimon 100
    bond-xmit_hash_policy layer3+4
    bond-lacp_rate fast
    bond-slaves ens1f0 ens1f1
    # hwaddress fa:ca:de:01:00:01 (workaround)
    mtu 9000

auto vlan20
iface vlan20 inet static
    vlan-raw-device bond1
    address 10.183.20.21
    netmask 255.255.255.0
    mtu 9000


root@svr-lf-pve2:/tmp# cat /etc/network/interfaces
auto lo
iface lo inet loopback

auto eno1
iface eno1 inet manual

auto eno2
iface eno2 inet manual

allow-vmbr0 bond0
iface bond0 inet manual
    ovs_bonds eno1 eno2
    ovs_type OVSBond
    ovs_bridge vmbr0
    ovs_options other_config:lacp-time=fast lacp=active bond_mode=balance-tcp vlan_mode=trunk trunks=2,3,6,15
    # hwaddress fa:ca:de:00:00:02 (workaround)

allow-ovs vmbr0
iface vmbr0 inet manual
    ovs_type OVSBridge
    ovs_mtu 1500
    ovs_ports bond0 vlan2 vlan3 vlan6 vlan15

allow-vmbr0 vlan3
iface vlan3 inet static
    ovs_type OVSIntPort
    ovs_bridge vmbr0
    ovs_options tag=3
    ovs_extra set interface ${IFACE} external-ids:iface-id=$(hostname -s)-${IFACE}-vif
    ovs_mtu 1500
    # hwaddress fa:ca:de:00:03:02 (workaround)
    address 10.183.3.23
    netmask 255.255.255.0
    gateway 10.183.3.5

auto ens1f0
iface ens1f0 inet manual

auto ens1f1
iface ens1f1 inet manual

auto bond1
iface bond1 inet manual
    bond-mode 802.3ad
    bond-miimon 100
    bond-xmit_hash_policy layer3+4
    bond-lacp_rate fast
    bond-slaves ens1f0 ens1f1
    # hwaddress fa:ca:de:01:00:02 (workaround)
    mtu 9000

auto vlan20
iface vlan20 inet static
    vlan-raw-device bond1
    address 10.183.20.23
    netmask 255.255.255.0
    mtu 9000
 
This is strange;

my Linux 802.3ad bond MAC addresses are always the MAC address of one of the physical interfaces in the bond (then both interfaces and the bond share this MAC address).

If you are testing, maybe you can try installing the "ifupdown2" package? (I'm using it currently.)
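
Something like this should do it if you want to test (ifupdown2 is in the Proxmox repositories and replaces ifupdown when installed):

Code:
apt update && apt install ifupdown2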
 
Not sure if this was ever solved, but we found that Bullseye changed the "addr_assign_type" for bonded interfaces from 2 to 3 (see https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-class-net), and that MAC addresses were being generated from /etc/machine-id, which was identical on our servers because they were built from a golden image. Simply running
Code:
rm -f /etc/machine-id && dbus-uuidgen --ensure=/etc/machine-id
and rebooting solved this for us.
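
To check whether you are affected before changing anything, just compare the machine IDs across the nodes (sketch, assuming SSH access; hostnames taken from the original post):

Code:
# identical output on every node means cloned machine IDs
for h in svr-lf-pve1 svr-lf-pve2; do ssh "$h" cat /etc/machine-id; done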
 
Debian Bullseye has the machine-id in two places; it's probably a good idea to keep them in sync.

When cloning servers, I do something like
Code:
rm /etc/machine-id /var/lib/dbus/machine-id
/usr/bin/dbus-uuidgen --ensure
/bin/systemd-machine-id-setup
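
Afterwards you can confirm the two copies agree (cmp is silent when the files match):

Code:
cmp /etc/machine-id /var/lib/dbus/machine-id && echo "machine-id files in sync"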
 
