Hey all,
I just upgraded a three-node cluster (a fresh setup) and the cluster links (2x 10G LACP) stopped working.
I can see that the MAC addresses of all cluster-link interfaces (the two 10G ports, the LACP bond, and the bridge) are now identical across the three servers. A fresh installation gives the same result.
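For anyone who wants to compare on their own nodes, here is a minimal sketch of one way to read the addresses (not necessarily how they were checked here). It assumes the interface names from the configuration further down (ens1f0, ens1f1, bond2, vmbr2) and reads the standard sysfs file /sys/class/net/<interface>/address. The SYSFS_NET variable is a hypothetical override added only so the function can be pointed at a different directory.

```shell
#!/bin/sh
# Print "<interface> <MAC>" for each interface named on the command line.
# SYSFS_NET is a hypothetical override; by default the real sysfs path is used.
show_macs() {
    base="${SYSFS_NET:-/sys/class/net}"
    for dev in "$@"; do
        if [ -r "$base/$dev/address" ]; then
            printf '%s %s\n' "$dev" "$(cat "$base/$dev/address")"
        else
            printf '%s (not found)\n' "$dev"
        fi
    done
}

# Interface names assumed from the /etc/network/interfaces shown in this post.
show_macs ens1f0 ens1f1 bond2 vmbr2
```

Running this on each node and comparing the output side by side should show whether the bond and bridge addresses really collide between servers.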
Only the cluster links on the Intel 10G cards show this behavior. I also have 2x 10G LACP bonds and bridges for Management and Storage on Broadcom cards, and those still work fine.
Is this a bug?
What is the best way to fix it?
@netzwerkcluster-server01:~# pveversion -v
proxmox-ve: 8.0.2 (running kernel: 6.2.16-19-pve)
pve-manager: 8.0.4 (running version: 8.0.4/d258a813cfa6b390)
pve-kernel-6.2: 8.0.5
proxmox-kernel-helper: 8.0.3
proxmox-kernel-6.2.16-19-pve: 6.2.16-19
proxmox-kernel-6.2: 6.2.16-19
pve-kernel-6.2.16-3-pve: 6.2.16-3
ceph: 17.2.7-pve1
ceph-fuse: 17.2.7-pve1
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx5
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.4.6
libproxmox-backup-qemu0: 1.4.0
libproxmox-rs-perl: 0.3.1
libpve-access-control: 8.0.5
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.0.9
libpve-guest-common-perl: 5.0.5
libpve-http-server-perl: 5.0.4
libpve-rs-perl: 0.8.5
libpve-storage-perl: 8.0.2
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve3
novnc-pve: 1.4.0-2
proxmox-backup-client: 3.0.4-1
proxmox-backup-file-restore: 3.0.4-1
proxmox-kernel-helper: 8.0.3
proxmox-mail-forward: 0.2.0
proxmox-mini-journalreader: 1.4.0
proxmox-widget-toolkit: 4.0.9
pve-cluster: 8.0.4
pve-container: 5.0.5
pve-docs: 8.0.5
pve-edk2-firmware: 3.20230228-4
pve-firewall: 5.0.3
pve-firmware: 3.8-3
pve-ha-manager: 4.0.2
pve-i18n: 3.0.7
pve-qemu-kvm: 8.0.2-7
pve-xtermjs: 4.16.0-3
qemu-server: 8.0.7
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.1.13-pve1
@netzwerkcluster-server01:~# cat /proc/net/bonding/bond2
Ethernet Channel Bonding Driver: v6.2.16-19-pve
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer3+4 (1)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0
802.3ad info
LACP active: on
LACP rate: slow
Min links: 0
Aggregator selection policy (ad_select): stable
System priority: 65535
System MAC address: 98:b7:85:55:22:11 <- same on all nodes
Active Aggregator Info:
Aggregator ID: 1
Number of ports: 2
Actor Key: 15
Partner Key: 11
Partner Mac Address: 80:db:17:3b:49:00
Slave Interface: ens1f0
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 98:b7:85:55:22:11
Slave queue ID: 0
Aggregator ID: 1
Actor Churn State: none
Partner Churn State: none
Actor Churned Count: 0
Partner Churned Count: 0
details actor lacp pdu:
system priority: 65535
system mac address: 98:b7:85:55:22:11
port key: 15
port priority: 255
port number: 1
port state: 61
details partner lacp pdu:
system priority: 127
system mac address: 80:db:17:3b:49:00
oper key: 11
port priority: 127
port number: 3
port state: 63
Slave Interface: ens1f1
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 98:b7:85:55:22:12
Slave queue ID: 0
Aggregator ID: 1
Actor Churn State: none
Partner Churn State: none
Actor Churned Count: 0
Partner Churned Count: 0
details actor lacp pdu:
system priority: 65535
system mac address: 98:b7:85:55:22:11
port key: 15
port priority: 255
port number: 2
port state: 61
details partner lacp pdu:
system priority: 127
system mac address: 80:db:17:3b:49:00
oper key: 11
port priority: 127
port number: 6
port state: 63
@netzwerkcluster-server01:~# cat /etc/network/interfaces
auto bond0
iface bond0 inet manual
bond-slaves eno5 eno6
bond-miimon 100
bond-mode 802.3ad
bond-xmit-hash-policy layer3+4
#Management
auto bond1
iface bond1 inet manual
bond-slaves eno3 eno4
bond-miimon 100
bond-mode 802.3ad
bond-xmit-hash-policy layer3+4
#VM-Netzwerk
auto bond2
iface bond2 inet manual
bond-slaves ens1f0 ens1f1
bond-miimon 100
bond-mode 802.3ad
bond-xmit-hash-policy layer3+4
#Cluster-Link
auto vmbr0
iface vmbr0 inet static
address 10.10.4.41/24
gateway 10.10.4.1
bridge-ports bond0
bridge-stp off
bridge-fd 0
#Management
auto vmbr1
iface vmbr1 inet manual
bridge-ports bond1
bridge-stp off
bridge-fd 0
bridge-vlan-aware yes
bridge-vids 2-4094
#VM-Netzwerk
auto vmbr2
iface vmbr2 inet static
address 172.31.0.1/24
bridge-ports bond2
bridge-stp off
bridge-fd 0
#Cluster-Link