[SOLVED] Inconsistent issue with multiple vlans on LACP bond with Catalyst 3650

alexinspokane

New Member
Sep 27, 2024
Turned out it was a problem on the Cisco end. I don't see where in running-config this is defined, but evidently, besides defining each vlan interface, you also have to run "vlan XX" in global configuration to activate layer 2 vlan support for that ID.
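
A minimal sketch of that fix on the Catalyst, assuming VLANs 101 and 1000 (the two that failed here) were the ones missing from the switch's VLAN database. The VLAN names are placeholders and do not come from the original config. If the switch runs in VTP server or client mode, normal-range VLANs are typically stored in vlan.dat on flash rather than in running-config, which would explain why nothing shows up there.

```
! Global configuration: create the layer 2 VLANs that the SVIs and trunks refer to.
! The names below are placeholders, not taken from the original config.
configure terminal
 vlan 101
  name ceph-cluster
 vlan 1000
  name ceph-public
 end
```

The VLAN is only committed when config-vlan mode is left, which the final "end" takes care of.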

I think this should be a pretty simple thing to configure, but I've been beating my head against it for a full day now. I think the issue is on the Proxmox side rather than the switch side, since I've done similar labs with VMware in the past and don't recall having such an issue.
  • I have 3 (new, v8.2.7) proxmox nodes in a cluster named px1, px2, px3
  • I'm using a Cisco Catalyst 3650 switch
  • Each Proxmox node uplinks eno1 to an access-mode port in vlan 1001 (for management)
  • Each Proxmox node uplinks ens2f0,1,2,3 to a separate LACP channel-group on the Catalyst
  • The LACP groups are trunks that tag every vlan
    • 100 = proxmox clustering
    • 17 = office lab uplink for internet
    • 1000,101 = ceph pub/cluster

On the Proxmox side I configured it as follows:

Code:
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

auto ens2f0
iface ens2f0 inet manual

auto ens2f1
iface ens2f1 inet manual

auto ens2f2
iface ens2f2 inet manual

auto ens2f3
iface ens2f3 inet manual

auto bond0
iface bond0 inet manual
        bond-slaves ens2f0 ens2f1 ens2f2 ens2f3
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2+3
#uplink

auto vmbr0
iface vmbr0 inet static
        address 192.168.1.1/24
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
#oob mgmt

auto vlan17
iface vlan17 inet static
        address 10.0.17.40/24
        gateway 10.0.17.252
        vlan-raw-device bond0
#office

auto vlan100
iface vlan100 inet static
        address 192.168.3.1/24
        vlan-raw-device bond0
#pxcluster

auto vlan101
iface vlan101 inet static
        address 192.168.4.1/24
        vlan-raw-device bond0
#cphclst

auto vlan1000
iface vlan1000 inet static
        address 192.168.2.1/24
        vlan-raw-device bond0
#cphpub

source /etc/network/interfaces.d/*

  • Vlan 17 works. I can get to the internet and hosts can ping each other.
  • Vlan 100 works. Proxmox cluster is set up and hosts can ping each other.
  • Vlan 1001 (native eno1) works. I can manage the hosts with my laptop cabled to another access port on the catalyst.
  • Vlan 101 and 1000 don't work. I can't ping the switch or other hosts.

The ARP table on a Proxmox node looks like this; it's not resolving MAC addresses on vlan 101 and 1000:

Code:
Address                  HWtype  HWaddress           Flags Mask            Iface
192.168.1.2              ether   4c:d9:8f:ab:4b:1e   C                     vmbr0            #px2 
192.168.1.3              ether   4c:d9:8f:ab:4e:0e   C                     vmbr0            #px3
192.168.1.123            ether   98:e7:43:36:1f:32   C                     vmbr0            #laptop
192.168.1.250            ether   00:5d:73:d0:ea:f5   C                     vmbr0            #switch
10.0.17.41               ether   b0:26:28:95:8f:88   C                     vlan17            #px2
10.0.17.42               ether   b0:26:28:93:84:94   C                     vlan17            #px3
10.0.17.43               ether   00:5d:73:d0:ea:e0   C                     vlan17            #switch
10.0.17.252              ether   00:50:56:b4:16:ac   C                     vlan17            #gateway
192.168.3.2              ether   b0:26:28:95:8f:88   C                     vlan100            #px2
192.168.3.3              ether   b0:26:28:93:84:94   C                     vlan100            #px3
192.168.3.250            ether   00:5d:73:d0:ea:d1   C                     vlan100            #switch
192.168.2.2                      (incomplete)                              vlan1000            #px2
192.168.2.3                      (incomplete)                              vlan1000            #px3
192.168.2.250                    (incomplete)                              vlan1000            #switch
192.168.4.2                      (incomplete)                              vlan101            #px2
192.168.4.3                      (incomplete)                              vlan101            #px3
192.168.4.250                    (incomplete)                              vlan101            #switch

Bizarrely, from the switch, I can ping all the hosts on every vlan, but it always shows the vlan 101/1000 addresses as being on vlan 100:

Code:
Protocol Address Age (min) Hardware Addr Type Interface
Internet  10.0.17.40              0   b026.2893.88a8  ARPA   Vlan17
Internet  10.0.17.41             53   b026.2895.8f88  ARPA   Vlan17
Internet  10.0.17.42             53   b026.2893.8494  ARPA   Vlan17
Internet  10.0.17.43              -   005d.73d0.eae0  ARPA   Vlan17
Internet  192.168.1.1             0   d094.6664.0bec  ARPA   Vlan1001
Internet  192.168.1.2             0   4cd9.8fab.4b1e  ARPA   Vlan1001
Internet  192.168.1.3             0   4cd9.8fab.4e0e  ARPA   Vlan1001
Internet  192.168.1.250           -   005d.73d0.eaf5  ARPA   Vlan1001
Internet  192.168.2.1           195   b026.2893.88a8  ARPA   Vlan100
Internet  192.168.2.2           208   b026.2895.8f88  ARPA   Vlan100
Internet  192.168.2.3           203   b026.2893.8494  ARPA   Vlan100
Internet  192.168.2.250           -   005d.73d0.eadf  ARPA   Vlan1000
Internet  192.168.3.1             0   b026.2893.88a8  ARPA   Vlan100
Internet  192.168.3.2             0   b026.2895.8f88  ARPA   Vlan100
Internet  192.168.3.3             0   b026.2893.8494  ARPA   Vlan100
Internet  192.168.3.250           -   005d.73d0.ead1  ARPA   Vlan100
Internet  192.168.4.1           212   b026.2893.88a8  ARPA   Vlan100
Internet  192.168.4.2           227   b026.2895.8f88  ARPA   Vlan100
Internet  192.168.4.3           191   b026.2893.8494  ARPA   Vlan100
Internet  192.168.4.250           -   005d.73d0.eac1  ARPA   Vlan101

On the switch the config is pretty simple:
Code:
interface Port-channel1
 description Proxmox01
 switchport trunk allowed vlan 17,100,101,1000,1001
 switchport mode trunk
!
interface Port-channel2
 description Proxmox02
 switchport trunk allowed vlan 17,100,101,1000,1001
 switchport mode trunk
!
interface Port-channel3
 description Proxmox03
 switchport trunk allowed vlan 17,100,101,1000,1001
 switchport mode trunk
!
interface GigabitEthernet1/0/1
 description LACP-Proxmox01
 switchport trunk allowed vlan 17,100,101,1000,1001
 switchport mode trunk
 channel-group 1 mode active
!
interface GigabitEthernet1/0/2
 description LACP-Proxmox01
 switchport trunk allowed vlan 17,100,101,1000,1001
 switchport mode trunk
 channel-group 1 mode active
!
interface GigabitEthernet1/0/3
 description LACP-Proxmox01-Ceph
 switchport trunk allowed vlan 17,100,101,1000,1001
 switchport mode trunk
 channel-group 1 mode active
!
interface GigabitEthernet1/0/4
 description LACP-Proxmox01-Ceph
 switchport trunk allowed vlan 17,100,101,1000,1001
 switchport mode trunk
 channel-group 1 mode active
!
interface GigabitEthernet1/0/5
 description LACP-Proxmox02
 switchport trunk allowed vlan 17,100,101,1000,1001
 switchport mode trunk
 channel-group 2 mode active
!
interface GigabitEthernet1/0/6
 description LACP-Proxmox02
 switchport trunk allowed vlan 17,100,101,1000,1001
 switchport mode trunk
 channel-group 2 mode active
!
interface GigabitEthernet1/0/7
 description LACP-Proxmox02-CEPH
 switchport trunk allowed vlan 17,100,101,1000,1001
 switchport mode trunk
 channel-group 2 mode active
!
interface GigabitEthernet1/0/8
 description LACP-Proxmox02-CEPH
 switchport trunk allowed vlan 17,100,101,1000,1001
 switchport mode trunk
 channel-group 2 mode active
!
interface GigabitEthernet1/0/9
 description LACP-Proxmox03
 switchport trunk allowed vlan 17,100,101,1000,1001
 switchport mode trunk
 channel-group 3 mode active
!
interface GigabitEthernet1/0/10
 description LACP-Proxmox03
 switchport trunk allowed vlan 17,100,101,1000,1001
 switchport mode trunk
 channel-group 3 mode active
!
interface GigabitEthernet1/0/11
 description LACP-Proxmox03-CEPH
 switchport trunk allowed vlan 17,100,101,1000,1001
 switchport mode trunk
 channel-group 3 mode active
!
interface GigabitEthernet1/0/12
 description LACP-Proxmox03-CEPH
 switchport trunk allowed vlan 17,100,101,1000,1001
 switchport mode trunk
 channel-group 3 mode active
!
interface GigabitEthernet1/0/13
 switchport access vlan 1001
 switchport mode access
!
interface GigabitEthernet1/0/14
 switchport access vlan 1001
 switchport mode access
!
interface GigabitEthernet1/0/15
 switchport access vlan 1001
 switchport mode access
!
interface GigabitEthernet1/0/16
 switchport access vlan 1001
 switchport mode access
!
interface GigabitEthernet1/0/25
 switchport access vlan 17
 switchport mode access
!
interface Vlan1
 no ip address
 shutdown
!
interface Vlan17
 description office net
 ip address 10.0.17.43 255.255.255.0
!
interface Vlan100
 ip address 192.168.3.250 255.255.255.0
!
interface Vlan101
 description ceph cluster
 ip address 192.168.4.250 255.255.255.0
!
interface Vlan1000
 description Ceph
 ip address 192.168.2.250 255.255.255.0
!
interface Vlan1001
 description OOB-MGMT
 ip address 192.168.1.250 255.255.255.0
!

Besides rebooting everything, I've tried a lot of things to get it to work: using bridges with dot notation instead of defining vlanXX interfaces, adding a vlan-aware bridge on bond0 and defining the vlans on that, and using a separate LACP bond just for 1000 and 101. Same behavior every time: the switch can always reach the 1000 and 101 addresses (apparently on vlan 100), but the hosts can't reach the switch or each other. Vlan 17 and 100 seem to work fine otherwise.
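
A check on the switch side that could narrow this down: a short sketch using standard IOS show commands, on the assumption that a missing layer 2 VLAN is the cause. The comments say what to look for. Exact output differs between IOS versions, so none is shown.

```
! VLANs 101 and 1000 should be listed here as active. If they are absent,
! the layer 2 VLAN does not exist even though "interface Vlan101" is configured.
show vlan brief
! Compare "Vlans allowed on trunk" with "Vlans allowed and active in management domain".
! A VLAN that is allowed but was never created appears only in the first list.
show interfaces trunk
! An SVI whose VLAN does not exist normally does not reach up/up.
show ip interface brief | include Vlan
```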

The one thing I notice is that the MAC address for each vlan interface is the same, and maybe that's causing an issue? I tried setting individual MAC addresses for each vlan in /etc/network/interfaces, but that didn't seem to matter.
 