PVE + Cumulus Linux network configuration

RobFantini

Hello, we have a pair of Mellanox SN2700 switches with Cumulus Linux on order. They will be replacing a stack of Netgear M5300s. In advance I am researching how to configure PVE + the switches.
We have a 7-node cluster. Each node:
- runs 4 Ceph NVMe OSDs, 5-6 VMs and 2-3 containers (pct).
- has a ConnectX-4 and a ConnectX-5 NIC, one for Ceph and the other for vmbr.

The switches will be set up as an MLAG pair.
One of the switches is already here and the Cumulus license should be sent soon.
---------------------------------------------------------------

I am working on network interfaces and have come up with this to try:

on switch:
Code:
### Cumulus: add test VLANs
net add vlan 130 ip address 10.1.130.1/24
net add vlan 131 ip address 10.1.131.1/24
net add vlan 132 ip address 10.1.132.1/24
 
### Cumulus /etc/network/interfaces
auto bridge
iface bridge
    # ifupdown2 uses the 'glob' keyword for port ranges
    bridge-ports glob swp1-8
    bridge-vids 130-132
    bridge-pvid 1
    bridge-vlan-aware yes

at pve:
Code:
### PVE    /etc/network/interfaces
auto bond3
iface bond3 inet manual
        bond-slaves enp4s0f0 enp4s0f1
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2+3
        mtu 9000

auto vmbr3
iface vmbr3 inet static
        address 10.1.130.3/24
        bridge-ports bond3
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 130-132
        mtu 9000
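
Since vmbr3 is VLAN-aware, I think a VLAN could also be terminated on the host with a sub-interface like the sketch below if needed, while VM NICs just set their tag on vmbr3 (the address is only a placeholder):
Code:
### PVE /etc/network/interfaces - optional sketch, host address on tagged VLAN 131
auto vmbr3.131
iface vmbr3.131 inet static
        address 10.1.131.3/24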

-------------------------------------------

I doubt that is close to a perfect configuration. If anyone has suggestions to improve it, please reply.

Also, any general advice on using Cumulus will be appreciated. Their documentation is very good and easy to understand.
 
My guess (I haven't explicitly tried it) is that an LACP bond would have to be configured on the switch for each LACP bond on the PVE side - just like with any other switch (Cisco, HP, ...).

You can have more than 2 links per LACP bond, but each pair of partners (switch + PVE node) needs its own bond between them.

I hope this helps!
 
I thought so.
So 8 bonds, one per PVE node, then use those bonds at the bridge:
Code:
### Cumulus /etc/network/interfaces
..
auto bridge
iface bridge
    bridge-ports bond1 bond2 bond3 bond4 bond5 bond6 bond7 bond8
    bridge-vids 130-132
    bridge-pvid 1
    bridge-vlan-aware yes
..
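
For reference, each of those per-node bonds on the switch side would presumably look something like this (port names and IDs are placeholders):
Code:
### Cumulus /etc/network/interfaces - sketch of one per-node bond
auto bond3
iface bond3
    # single-switch test: put both of the node's links here;
    # under MLAG each switch carries one link and clag-id pairs the halves
    bond-slaves swp3 swp4
    clag-id 3
    # bond-mode defaults to 802.3ad (LACP) on Cumulus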
 
Yes, MLAG will be done when the 2nd switch gets here in a couple of weeks.
So our first tests will bond to switch ports on the same switch.
 
Regarding routing:
at the MLAG link there is a mention of 'VRR', and I've also seen mention of FRR in forums here and other places.

Could someone suggest a safe/best direction to start off with for routing?
 
Also - we are really just starting out setting up the two switches. The static routing link was good for connecting to other switches.

Now we are working on connections for the Ceph storage network and VMs, so we are using these docs plus guidance from Nvidia tech support: MLAG https://docs.cumulusnetworks.com/cumulus-linux-42/Layer-2/Multi-Chassis-Link-Aggregation-MLAG/ and VRR, which I did not know existed: https://docs.cumulusnetworks.com/cumulus-linux-42/Layer-2/Virtual-Router-Redundancy-VRR-and-VRRP/ .
In the end we should have a list of CLI commands to type in at the switch, and probably some adjustments to the PVE interfaces file.
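
Going by the VRR doc, I think the switch-side gateway for one of the test VLANs will end up as something like this (addresses and the virtual MAC are placeholders; each switch keeps its own real address, the virtual one is shared):
Code:
### Cumulus NCLU sketch - VRR gateway on VLAN 130, run on both MLAG peers
net add vlan 130 ip address 10.1.130.252/24
net add vlan 130 ip address-virtual 00:00:5e:00:01:01 10.1.130.1/24
net pending
net commit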
 
I come from an Arista/Juniper background, and the part about bridges messes with my head.
We have 2x switches; each has an uplink from the DC and a downlink to my Juniper Ethernet switch, so I can use ipm and so on.
Additionally, I have 3x Proxmox nodes where VMs will run one day.
Running Cumulus 4.4 because in 5.1 the net commands did not work properly...
As far as I understood, you have to have the "bridge" where you add all the bonds and the peerlink... so far so done. But I don't see any traffic, even though it states that the uplink is up.
I like the idea of Debian on my switch, but I did not know it would be such a headache.
I'll paste the configs soon.
 
I also liked the idea of Debian on a switch - until Nvidia bought Mellanox and dropped software upgrades [stuck on Debian 10] for many switches, including our two very expensive ones. They are 40G, not 100G, so they were simply dropped.

I could post our working interface files so you have an example of what works.
 
Here are the /etc/network/interfaces files for our two MLAG'd switches.
 

Attachments

  • interfaces-mel1-2022-09-16.txt
  • interfaces-mel2-2022-09-16.txt
Ahh, I see.
You did it the "traditional" way.
I guess I need to try that too.
The config is the same on both.

Code:
source /etc/network/interfaces.d/*.intf

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
iface eth0
    address 192.168.16.11/24
    gateway 192.168.16.1
    vrf mgmt

auto swp1
iface swp1

auto swp2
iface swp2

auto swp3
iface swp3

auto swp4
iface swp4

auto swp17
iface swp17

auto swp18
iface swp18

auto swp21
iface swp21

auto swp22
iface swp22

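# one single-member LACP bond per downstream device;
# clag-id pairs it with the matching bond on the MLAG peer switch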
auto bond1
iface bond1
    bond-slaves swp18
    bridge-access 100
    clag-id 1

auto bond2
iface bond2
    bond-slaves swp17
    clag-id 2

auto bond3
iface bond3
    bond-slaves swp1
    clag-id 3

auto bond4
iface bond4
    bond-slaves swp2
    clag-id 4

auto bond5
iface bond5
    bond-slaves swp3
    clag-id 5

auto bond6
iface bond6
    bond-slaves swp4
    clag-id 6

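# VLAN-aware bridge: the host-facing bonds and the peer link are all members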
auto bridge
iface bridge
    bridge-ports bond1 bond2 bond3 bond4 bond5 peerlink bond6
    bridge-vids 0-1000
    bridge-vlan-aware yes

auto mgmt
iface mgmt
    address 127.0.0.1/8
    address ::1/128
    vrf-table auto

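# inter-switch peer link; the clagd control session runs on the .4093 subinterface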
auto peerlink
iface peerlink
    bond-slaves swp21 swp22

auto peerlink.4093
iface peerlink.4093
    address 172.16.239.249/29
    clagd-backup-ip 172.16.238.250
    clagd-peer-ip 172.16.239.250
    clagd-priority 2048
    clagd-sys-mac 44:38:39:FF:40:A1

 
So what do you recommend?
Behind bond2 I have an EX3300 with a trunk bond configured.
Behind that, I have 2x pfSenses.
 

Attachments

  • logdump.txt
  • swschurr.txt
  • swhohmann.txt
