Question about adding an additional network to Ceph

sidereus

Member
Jul 25, 2019
I have a spare 10G interface on all cluster nodes and am planning to add it to Ceph. What is the best way to do it?
I see the following choices:
1) Add an additional network to Ceph. That would mean stopping the service, deleting all monitors and recreating them, right?
2) Just create an LACP (802.3ad) link aggregation using two 10G interfaces on each cluster node. The question is: will LACP increase Ceph performance or not?
 
What is your current network setup for Ceph? I'm not sure, but it sounds like you have both Ceph and Corosync running on the same network if Ceph isn't already separated onto another one.

I'm not completely sure about LACP, but if you have Corosync and Ceph running on the same network and interface, you should see a fair amount of improvement when separating Ceph onto its own network, because Corosync is constantly making requests to ensure quorum.
 
No, everything is already separated. The question is whether LACP will increase Ceph performance or not.
Current throughput looks like this:
Bash:
root@asr4:~# dd if=/dev/zero of=/mnt/pve/cephfs/aaaa bs=1M status=progress count=500000
524121276416 bytes (524 GB, 488 GiB) copied, 1230 s, 426 MB/s
500000+0 records in
500000+0 records out
There are 26 HDD OSDs altogether across the 4 cluster nodes.
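If it helps, raw Ceph throughput can also be checked directly at the RADOS layer, which leaves CephFS and the client out of the picture (the pool name below is just a placeholder):
Bash:
# 60-second write benchmark against a test pool, keeping the objects for a read test
rados bench -p testpool 60 write --no-cleanup
# sequential read benchmark against the same objects
rados bench -p testpool 60 seq
# remove the benchmark objects afterwards
rados -p testpool cleanup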
Bash:
root@asr4:~# cat /etc/network/interfaces
# network interface settings; autogenerated
# Please do NOT modify this file directly, unless you know what
# you're doing.
#
# If you want to manage parts of the network configuration manually,
# please utilize the 'source' or 'source-directory' directives to do
# so.
# PVE will preserve these directives, but will NOT read its network
# configuration from sourced files, so do not attempt to move any of
# the PVE managed interfaces into external files!

auto lo
iface lo inet loopback

iface ens2f0 inet manual
    mtu 1500

auto eno1
iface eno1 inet static
    address 192.168.120.4/24
#cluster network

iface eno2 inet manual

auto ens2f1
iface ens2f1 inet static
    address 192.168.121.4/24
    mtu 9000
#Ceph

auto ens2f2
iface ens2f2 inet static
    address 192.168.122.4/24
    mtu 9000
#migration

iface ens2f3 inet manual

auto vmbr0
iface vmbr0 inet static
    address 172.16.104.224/24
    gateway 172.16.104.1
    bridge-ports ens2f0
    bridge-stp off
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 2-4094
    mtu 1500
Bash:
root@asr4:~# cat /etc/pve/datacenter.cfg
keyboard: en-us
migration: network=192.168.122.1/24,type=insecure
Bash:
root@asr4:~# cat /etc/pve/corosync.conf
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: asr1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.168.120.1
    ring1_addr: 192.168.122.1
  }
  node {
    name: asr2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 192.168.120.2
    ring1_addr: 192.168.122.2
  }
  node {
    name: asr3
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 192.168.120.3
    ring1_addr: 192.168.122.3
  }
  node {
    name: asr4
    nodeid: 4
    quorum_votes: 1
    ring0_addr: 192.168.120.4
    ring1_addr: 192.168.122.4
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: asr
  config_version: 4
  interface {
    knet_link_priority: 50
    linknumber: 0
  }
  interface {
    knet_link_priority: 20
    linknumber: 1
  }
  ip_version: ipv4-6
  link_mode: passive
  secauth: on
  version: 2
}
Bash:
root@asr4:~# cat /etc/ceph/ceph.conf
[global]
     auth_client_required = cephx
     auth_cluster_required = cephx
     auth_service_required = cephx
     cluster_network = 192.168.121.0/24
     fsid = abcd1234-1234-4321-a1b2-a1b2c3d4e5f6
     mon_allow_pool_delete = true
     mon_host = 192.168.121.1 192.168.121.2 192.168.121.4 192.168.121.3
     osd_pool_default_min_size = 2
     osd_pool_default_size = 3
     public_network = 192.168.121.0/24

[client]
     keyring = /etc/pve/priv/$cluster.$name.keyring

[mds]
     keyring = /var/lib/ceph/mds/ceph-$id/keyring

[mds.asr1]
     host = asr1
     mds_standby_for_name = pve

[mds.asr2]
     host = asr2
     mds_standby_for_name = pve

[mds.asr3]
     host = asr3
     mds_standby_for_name = pve

[mds.asr4]
     host = asr4
     mds_standby_for_name = pve

[mon.asr1]
     public_addr = 192.168.121.1

[mon.asr2]
     public_addr = 192.168.121.2

[mon.asr3]
     public_addr = 192.168.121.3

[mon.asr4]
     public_addr = 192.168.121.4
 
My understanding is that OSDs communicate directly with each other, so bonding should lead to an increase in actual bandwidth.

I note, though, that you don't have separate public and cluster networks. You have two corosync networks, an external network which I assume is used by the VMs, a separate migration network, and just one Ceph network. If you don't want the Ceph public traffic to be on the Proxmox public network, you could also use the 10G interface to separate the Ceph front and back networks, i.e. one network for serving clients (in this case QEMU) and one network for the communication between OSDs (write distribution, recovery, scrubbing, etc.). A possible split is sketched below.
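A minimal ceph.conf sketch of such a split, assuming the existing 192.168.121.0/24 stays as the public (front) network and the spare 10G interfaces get a new, hypothetical 192.168.123.0/24 subnet as the cluster (back) network:
Code:
[global]
     # front network: monitors and clients (QEMU, CephFS mounts) talk here
     public_network = 192.168.121.0/24
     # back network: OSD-to-OSD traffic (replication, recovery, scrubbing)
     cluster_network = 192.168.123.0/24
As far as I know, cluster_network only affects OSD-to-OSD traffic; the monitors stay on the public network, so the OSDs just need the new subnet reachable and a restart to pick it up.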
 
Thank you for the advice. I have now configured a 2x10G bond for both the public and the private Ceph network. We will see how it goes.
Code:
iface bond0 inet manual
        bond-slaves ens4f1 ens4f2
        bond-miimon 100
        bond-mode balance-tlb
        mtu 9000
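For comparison, the LACP (802.3ad) variant discussed above would look roughly like this; it requires a matching LACP configuration on the switch, and the hash policy shown is just one common choice:
Code:
iface bond0 inet manual
        bond-slaves ens4f1 ens4f2
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer3+4
        mtu 9000
With either mode a single TCP connection still only uses one physical link, so any gain comes from the many parallel OSD and client connections being spread across both links.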
I also created a bridge on top of the bond to make Ceph reachable inside VMs, for example to mount CephFS in a VM:
Code:
iface vmbr1 inet static
        address 192.168.121.5/24
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
        mtu 9000
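Inside a VM attached to vmbr1 with an address in 192.168.121.0/24, a CephFS kernel mount would then look roughly like this (the client name and secret file are placeholders and need a matching CephX key):
Code:
# mount CephFS via the kernel client, pointing at the monitors on the Ceph network
mount -t ceph 192.168.121.1,192.168.121.2,192.168.121.3:/ /mnt/cephfs \
    -o name=admin,secretfile=/etc/ceph/admin.secret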
 
