I've been working with one of our vendors to help speed up a cluster deployment, but we've run into a wall getting the LAGs to form. I left Corosync out of the details below since there is no issue there, seeing as it's in Active/Standby.

After setting up the Ceph and VM networks, we found that only one node would form the LAG at a time, whereas the other Proxmox nodes would not send LACP PDUs. This was confirmed via the Cisco switch logs, with the ports going into a suspended state because of the lack of response from the other three nodes. From what I saw during the session with the vendor, the Linux configuration matches on every Proxmox node. An example of how they have it configured, which was pulled from the admin guide, is below.

Now for the LAG formation behavior we've seen:
Node 1 forms the bundle, nodes 2-4 do not.
Restart the network services on all four nodes in Linux: node 1 stops, node 2 forms, nodes 3-4 never form.
As another test, I bounced the port channel instead: node 4 forms, yet nodes 1-3 never form.

I've never seen anything like this and I am at a loss.
Layer 1:
Cisco
Te1/0/1, Te1/0/2, Te1/0/5, Te1/0/6, Te1/0/9, Te1/0/10, Te1/0/13, Te1/0/14 - Po100 (VM Network)
Te1/0/3, Te1/0/4, Te1/0/7, Te1/0/8, Te1/0/11, Te1/0/12, Te1/0/15, Te1/0/16 - Po103 (Ceph/Storage Network)
Proxmox 4-Node Cluster
Node 1
NIC 1 Port 1, NIC 2 Port 1 - Bond 0 (VM Network)
NIC 1 Port 2, NIC 2 Port 2 - Bond 1 (Ceph/Storage Network)
Node 2
NIC 1 Port 1, NIC 2 Port 1 - Bond 0 (VM Network)
NIC 1 Port 2, NIC 2 Port 2 - Bond 1 (Ceph/Storage Network)
Node 3
NIC 1 Port 1, NIC 2 Port 1 - Bond 0 (VM Network)
NIC 1 Port 2, NIC 2 Port 2 - Bond 1 (Ceph/Storage Network)
Node 4
NIC 1 Port 1, NIC 2 Port 1 - Bond 0 (VM Network)
NIC 1 Port 2, NIC 2 Port 2 - Bond 1 (Ceph/Storage Network)
Layer 2:
Cisco
Port-Channel100
  description BLAH
  switchport mode trunk
  switchport trunk allowed vlan add 100,199
Port-Channel103
  description BLAH
  switchport mode access
  switchport access vlan 103
interface TenGigEthernet1/0/1,2,5,6,9,10,13,14
  description BLAH
  switchport mode trunk
  switchport trunk allowed vlan add 100,199
  channel-group 100 mode active
interface TenGigEthernet1/0/3,4,7,8,11,12,15,16
  description BLAH
  switchport mode access
  switchport access vlan 103
  channel-group 103 mode active
Proxmox 4-Node Cluster (mind you, I can't get into it currently due to the credential handoff from the vendor, but this is how I know it's configured)
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

auto bond0
iface bond0 inet manual
    bond-slaves eno1 eno2
    bond-miimon 100
    bond-mode 802.3ad
    bond-xmit-hash-policy layer2+3

auto vmbr0
iface vmbr0 inet static
    address 10.10.10.2/24
    gateway 10.10.10.1
    bridge-ports bond0
    bridge-stp off
    bridge-fd 0
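
Once I have credentials again, something along these lines could be run on each node to see what the bond itself reports, instead of relying only on the switch logs. It is a rough sketch, not a tested tool: it parses text in the shape the Linux bonding driver normally exposes under /proc/net/bonding/bond0 and lists any slave that sits in a different aggregator than the active one. The sample text, the MAC address, and the function names `summarize_bond` and `slaves_outside_active_aggregator` are made up for illustration; none of it was captured from this cluster.

```python
# Hypothetical sample, shaped like typical /proc/net/bonding/bond0 output
# for an 802.3ad bond. It was NOT captured from the cluster in this post.
SAMPLE = """\
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2+3 (2)
MII Status: up

802.3ad info
LACP rate: slow
Active Aggregator Info:
\tAggregator ID: 1
\tNumber of ports: 1
\tPartner Mac Address: 00:11:22:33:44:55

Slave Interface: eno1
MII Status: up
Aggregator ID: 1

Slave Interface: eno2
MII Status: up
Aggregator ID: 2
"""


def summarize_bond(text):
    """Collect the active aggregator, its partner MAC, and per-slave state."""
    summary = {"active_aggregator": None, "partner_mac": None, "slaves": {}}
    current_slave = None  # None while we are still in the bond-level header
    for raw in text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue
        # Split on the first colon only, so MAC addresses stay intact.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current_slave = value
            summary["slaves"][current_slave] = {"mii": None, "aggregator": None}
        elif key == "Partner Mac Address" and current_slave is None:
            summary["partner_mac"] = value
        elif key == "Aggregator ID":
            if current_slave is None:
                summary["active_aggregator"] = int(value)
            else:
                summary["slaves"][current_slave]["aggregator"] = int(value)
        elif key == "MII Status" and current_slave is not None:
            summary["slaves"][current_slave]["mii"] = value
    return summary


def slaves_outside_active_aggregator(summary):
    """List slaves whose aggregator ID differs from the bond's active one."""
    active = summary["active_aggregator"]
    return sorted(
        name
        for name, info in summary["slaves"].items()
        if info["aggregator"] != active
    )


if __name__ == "__main__":
    result = summarize_bond(SAMPLE)
    print(result)
    print(slaves_outside_active_aggregator(result))
```

On a real node the input would be the contents of /proc/net/bonding/bond0 instead of `SAMPLE`. Comparing the reported partner MAC and the aggregator IDs across all four nodes should at least show whether each bond sees a LACP partner at all.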