Proxmox + Ceph Cluster Network Settings

judexzhu

Member
Aug 22, 2018
Hi everyone,

I'm working on building a new Proxmox cluster with Ceph for production.

Each node has 4 x 10G NICs. Unfortunately, the switches do not support stacking, so I've chosen balance-alb (mode 6) for the bonding. All the 10G ports on the switches have been set to trunk mode only.

Currently I have the following networks for the cluster:

172.25.5.0/24 for cluster management, VLAN ID 5 (native VLAN)
172.25.7.0/24 for the Ceph cluster network, VLAN ID 7
172.25.9.0/24 for the Ceph public network, VLAN ID 9

So on each node, below is the network configuration from /etc/network/interfaces:

Code:
# network interface settings; autogenerated
# Please do NOT modify this file directly, unless you know what
# you're doing.
#
# If you want to manage part of the network configuration manually,
# please utilize the 'source' or 'source-directory' directives to do
# so.
# PVE will preserve these directives, but will NOT read its network
# configuration from sourced files, so do not attempt to move any of
# the PVE managed interfaces into external files!

auto lo
iface lo inet loopback

iface enp5s0f0 inet manual

iface enp5s0f1 inet manual

auto enp130s0f0
iface enp130s0f0 inet manual

auto enp130s0f1
iface enp130s0f1 inet manual

auto enp131s0f0
iface enp131s0f0 inet manual

auto enp131s0f1
iface enp131s0f1 inet manual

auto enp130s0f0.9
iface enp130s0f0.9 inet manual

auto enp130s0f1.9
iface enp130s0f1.9 inet manual

auto enp131s0f0.9
iface enp131s0f0.9 inet manual

auto enp131s0f1.9
iface enp131s0f1.9 inet manual

auto enp130s0f0.7
iface enp130s0f0.7 inet manual

auto enp130s0f1.7
iface enp130s0f1.7 inet manual

auto enp131s0f0.7
iface enp131s0f0.7 inet manual

auto enp131s0f1.7
iface enp131s0f1.7 inet manual

auto bond0
iface bond0 inet manual
    slaves enp130s0f0 enp130s0f1 enp131s0f0 enp131s0f1
    bond_miimon 100
    bond_mode balance-alb

auto bond1
iface bond1 inet static
    address  172.25.9.21
    netmask  255.255.255.0
    gateway  172.25.9.1
    slaves enp130s0f0.9 enp130s0f1.9 enp131s0f0.9 enp131s0f1.9
    bond_miimon 100
    bond_mode balance-alb

auto bond2
iface bond2 inet static
    address  172.25.7.21
    netmask  255.255.255.0
    slaves enp130s0f0.7 enp130s0f1.7 enp131s0f0.7 enp131s0f1.7
    bond_miimon 100
    bond_mode balance-alb

auto vmbr0
iface vmbr0 inet static
    address  172.25.5.21
    netmask  255.255.255.0
    gateway  172.25.5.1
    bridge_ports bond0
    bridge_stp off
    bridge_fd 0


My questions:

1. Is this configuration good enough?
2. Do I still need to separate the corosync network with ringX_addr? (a sketch of what I mean is below)
3. Do I need multicast, or should I run "echo 0 > /sys/class/net/vmbr0/bridge/multicast_snooping"?
4. Is there anything else I should improve in the network configuration?
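
For question 2, this is roughly what I mean by ringX_addr. It is just a sketch with placeholder node names and a hypothetical dedicated corosync subnet, not taken from my current config:

Code:
# /etc/pve/corosync.conf (fragment): placeholder node names and a
# hypothetical dedicated corosync subnet, for illustration only
nodelist {
  node {
    name: node01
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 172.25.8.21
  }
  node {
    name: node02
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 172.25.8.22
  }
}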

Any advice and suggestions will be greatly appreciated! Thank you in advance!
 
With non-stacking switches I have always had bad experiences with bonding modes other than active-backup (mode 1). Maybe you will have more luck with mode 6.

But I'm not sure that bonding the VLANs together will work; you should bond the underlying interfaces and then define the VLANs on the bond. You already have bond0, so remove bond1 and bond2 and do this:

auto bond0.7
iface bond0.7 inet manual

and so on for the other VLANs.
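
As a sketch (reusing the addresses you already assigned to bond1 and bond2; adjust to your setup):

Code:
# VLANs defined on top of bond0 instead of bonding VLAN subinterfaces
# (addresses copied from your bond1/bond2; keep only the one gateway on vmbr0)

# Ceph Public Network (VLAN 9)
auto bond0.9
iface bond0.9 inet static
    address  172.25.9.21
    netmask  255.255.255.0

# Ceph Cluster Network (VLAN 7)
auto bond0.7
iface bond0.7 inet static
    address  172.25.7.21
    netmask  255.255.255.0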
 
@Klaus Steinberger

You're absolutely right! Without LACP, the only bond mode choice for the cluster network is active-backup.

I found the following information in the admin guide:

If your switch support the LACP (IEEE 802.3ad) protocol then we recommend using the corresponding bonding mode (802.3ad). Otherwise you should generally use the active-backup mode.
If you intend to run your cluster network on the bonding interfaces, then you have to use active-passive mode on the bonding interfaces, other modes are unsupported.

So I've split the 4 x 10G NICs into 2 bonds, then separated the Ceph networks from the Proxmox cluster network.

Code:
auto lo
iface lo inet loopback

iface enp5s0f0 inet manual

iface enp5s0f1 inet manual

iface enp130s0f0 inet manual

iface enp130s0f1 inet manual

iface enp131s0f0 inet manual

iface enp131s0f1 inet manual

auto bond0
iface bond0 inet manual
    slaves enp131s0f0 enp131s0f1
    bond_miimon 100
    bond_mode active-backup

auto bond1
iface bond1 inet manual
    slaves enp130s0f0 enp130s0f1
    bond_miimon 100
    bond_mode active-backup

# Ceph Public Network
auto bond0.9
iface bond0.9 inet static
    address  172.25.9.21
    netmask  255.255.255.0

# Ceph Cluster Network
auto bond0.7
iface bond0.7 inet static
    address  172.25.7.21
    netmask  255.255.255.0

# Corosync Cluster Network
auto bond1.8
iface bond1.8 inet static
    address  172.25.8.21
    netmask  255.255.255.0

# Proxmox Management Network
auto vmbr0
iface vmbr0 inet static
    address  172.25.5.21
    netmask  255.255.255.0
    gateway  172.25.5.1
    bridge_ports bond1
    bridge_stp off
    bridge_fd 0

Though I have 2 more questions:

1. I'd like the outside network to be able to reach both the cluster management IP and the Ceph cluster from various VLANs, but I can only set one gateway. What should I do? (see the sketch below)

2. Can I set the Ceph network bond to mode 6 (balance-alb) to increase speed?
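
For question 1, would adding per-VLAN static routes via post-up to the existing bond0.9 stanza be the right direction? Just a sketch: 192.0.2.0/24 is only a placeholder for the external client network, and 172.25.9.1 is the VLAN 9 router address from my first configuration:

Code:
# keep the single default gateway on vmbr0 and add specific routes for
# clients that come in over the other VLANs
auto bond0.9
iface bond0.9 inet static
    address  172.25.9.21
    netmask  255.255.255.0
    # 192.0.2.0/24 is a placeholder for the external client network
    post-up   ip route add 192.0.2.0/24 via 172.25.9.1 dev bond0.9
    post-down ip route del 192.0.2.0/24 via 172.25.9.1 dev bond0.9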


Anyway, this was a huge help, really appreciated. Thank you so much!
 
