Greetings fellow Proxmox users and moderators. First I want to thank the Proxmox team and developers for creating an amazing product. I have been a Proxmox user for only about 2 years but a Linux admin for about 15 years. Regrettably I have not created a forum account until now, however I appreciate how the moderators and participants always attempt to be helpful no matter the level of proficiency.
Recently, I have encountered an issue for which (though I have found a workaround) I have been unable to determine the root cause. I believe this issue likely lies with Debian Buster rather than Proxmox specifically; however, since I have encountered it on both Proxmox 6.1 and 6.2 systems, I figured I would post here first.
In particular, I have upgraded two identical, clustered servers (svr-lf-pve1 and svr-lf-pve2) from Proxmox 5.4 to 6.1, and thereafter to 6.2. Specifically, they are two SuperMicro Super X10DRT-P (X10DRT-P) servers in a single SuperServer 1028TP-DC0R (SYS-1028TP-DC0R) chassis. Each server has two onboard gigabit interfaces (eno1 and eno2) as well as a PCIe Ethernet card (AOC-SGP-i2) with dual gigabit interfaces (ens1f0 and ens1f1), for a total of 4 gigabit Ethernet interfaces per server. All network interfaces have unique hardware MAC addresses set by the manufacturer. Below are the related PVE and package versions installed:
root@svr-lf-pve1:/tmp# pveversion
pve-manager/6.2-4/9824574a (running kernel: 5.4.41-1-pve)
root@svr-lf-pve1:/tmp# dpkg -l | grep openvswitch
ii openvswitch-common 2.12.0-1 amd64 Open vSwitch common components
ii openvswitch-switch 2.12.0-1 amd64 Open vSwitch switch implementations
root@svr-lf-pve1:/tmp# dpkg -l | grep ifupdown
ii ifupdown 0.8.35+pve1 amd64 high level tools to configure network interfaces
root@svr-lf-pve1:/tmp# dpkg -l | grep ifenslave
ii ifenslave 2.9 all configure network interfaces for parallel routing (bonding)
I am bonding the two onboard interfaces with Open vSwitch to provide redundant connectivity to virtual machines, and am bonding the two PCIe interfaces with Linux kernel bonding for redundant NFS storage connectivity. This configuration worked flawlessly on Proxmox 5.x for about 2 years. Attached are the /etc/network/interfaces configs for each server.
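For context, the relevant stanzas look roughly like the following. This is a sketch rather than my exact attached configs; the bond options shown (active-backup, miimon) are illustrative:

```
# Kernel bond for NFS storage traffic (illustrative options)
auto bond1
iface bond1 inet manual
    bond-slaves ens1f0 ens1f1
    bond-miimon 100
    bond-mode active-backup
    mtu 9000

# OVS bond for VM traffic, attached to the OVS bridge vmbr0
auto vmbr0
iface vmbr0 inet manual
    ovs_type OVSBridge
    ovs_ports bond0

auto bond0
iface bond0 inet manual
    ovs_type OVSBond
    ovs_bridge vmbr0
    ovs_bonds eno1 eno2
    ovs_options bond_mode=active-backup
```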
So after upgrading both host servers to Proxmox 6.1, I noticed severe network connectivity issues, and (in a truly WTF moment) I eventually discovered that the VLAN interfaces on both servers had been assigned the same MAC address, and that the bond interfaces on both servers had likewise been assigned the same MAC address:
root@svr-lf-pve1:/tmp# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000
link/ether ac:1f:6b:1a:a9:a8 brd ff:ff:ff:ff:ff:ff
3: eno2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000
link/ether ac:1f:6b:1a:a9:a9 brd ff:ff:ff:ff:ff:ff
4: ens1f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond1 state UP mode DEFAULT group default qlen 1000
link/ether 32:b2:10:f6:36:5f brd ff:ff:ff:ff:ff:ff
5: ens1f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond1 state UP mode DEFAULT group default qlen 1000
link/ether 32:b2:10:f6:36:5f brd ff:ff:ff:ff:ff:ff
6: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether e6:32:13:5f:f6:0a brd ff:ff:ff:ff:ff:ff
7: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/ether ac:1f:6b:1a:a9:a8 brd ff:ff:ff:ff:ff:ff
8: bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/ether 9a:9e:1b:c8:4a:26 brd ff:ff:ff:ff:ff:ff
9: vlan3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/ether e2:ca:20:cf:12:e0 brd ff:ff:ff:ff:ff:ff
10: bond1: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 32:b2:10:f6:36:5f brd ff:ff:ff:ff:ff:ff
11: vlan20@bond1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 32:b2:10:f6:36:5f brd ff:ff:ff:ff:ff:ff
root@svr-lf-pve2:/tmp# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000
link/ether ac:1f:6b:1a:a7:c0 brd ff:ff:ff:ff:ff:ff
3: eno2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000
link/ether ac:1f:6b:1a:a7:c1 brd ff:ff:ff:ff:ff:ff
4: ens1f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond1 state UP mode DEFAULT group default qlen 1000
link/ether 32:b2:10:f6:36:5f brd ff:ff:ff:ff:ff:ff
5: ens1f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond1 state UP mode DEFAULT group default qlen 1000
link/ether 32:b2:10:f6:36:5f brd ff:ff:ff:ff:ff:ff
6: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether e6:32:13:5f:f6:0a brd ff:ff:ff:ff:ff:ff
7: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/ether ac:1f:6b:1a:a7:c0 brd ff:ff:ff:ff:ff:ff
8: bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/ether 9a:9e:1b:c8:4a:26 brd ff:ff:ff:ff:ff:ff
9: vlan3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/ether e2:ca:20:cf:12:e0 brd ff:ff:ff:ff:ff:ff
10: bond1: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 32:b2:10:f6:36:5f brd ff:ff:ff:ff:ff:ff
11: vlan20@bond1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 32:b2:10:f6:36:5f brd ff:ff:ff:ff:ff:ff
Digging a bit deeper to find out how the MAC addresses for the bond interfaces are being assigned:
root@svr-lf-pve1:/tmp# cat /sys/class/net/bond0/addr_assign_type
3
root@svr-lf-pve1:/tmp# cat /sys/class/net/bond1/addr_assign_type
3
According to the following Linux kernel documentation, addr_assign_type 3 means "set using dev_set_mac_address":
https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-class-net
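For anyone checking their own hosts, this quick loop prints the assignment type for every interface at once (values per the kernel ABI doc above):

```shell
# Print each interface alongside its addr_assign_type:
#   0 = permanent (burned-in), 1 = randomly generated,
#   2 = stolen from another device, 3 = set using dev_set_mac_address
for dev in /sys/class/net/*; do
    printf '%-12s %s\n' "$(basename "$dev")" "$(cat "$dev/addr_assign_type")"
done
```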
As you can see from my /etc/network/interfaces configs, the workaround I have implemented thus far is to assign a unique hwaddress to each bond interface and to the VLAN3 OVS port/interface. However, I am a bit dumbfounded as to how Debian Buster ends up assigning the exact same MAC address to bonds on completely different servers/hardware. I also could not find where and/or how the address passed to dev_set_mac_address is being generated. Shouldn't this be a randomly generated address? Could this be a kernel bug that fails to generate unique randomized MAC addresses when the hardware is identical between the two servers?
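For anyone hitting the same thing, the workaround amounts to pinning a locally-administered MAC in the bond stanza on each host. The address below is illustrative, not one from my attached configs:

```
auto bond1
iface bond1 inet manual
    bond-slaves ens1f0 ens1f1
    bond-miimon 100
    bond-mode active-backup
    mtu 9000
    # Locally-administered address (bit 0x02 set in the first octet);
    # must differ between the two hosts
    hwaddress 02:00:00:00:01:01
```

On the OVS side, the equivalent (as far as I can tell) is an ovs_extra line on the vlan3 OVSIntPort stanza that sets the interface's mac option to a similarly unique address.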
Any insight would be greatly appreciated.
Thank you in advance for your consideration,
Ryan Covietz
Meta Krypt LLC