Proxmox cluster scaling best practices

Discussion in 'Proxmox VE: Installation and configuration' started by Paspao, Apr 25, 2019.

  1. Paspao

    Paspao Member

    Joined:
    Aug 1, 2017
    Messages:
    32
    Likes Received:
    1
    Hello,

    I have a 5-node PVE cluster with Ceph (2 OSDs per node on Intel DC SSDs, on a dedicated 10 Gbit network).

    I am currently using it for LXC containers, and my idea is to always keep one node empty so I can move containers there in case of a node failure.

    I will probably need to scale up in the near future, to a total of 21 servers.

    What are the best practices and bottlenecks to consider while scaling?

    Will my 10 Gbit network become the bottleneck as I add nodes? (I have dual 10 Gbit adapters, so I could use bonding to get 20 Gbit.)
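    For reference, an LACP bond of the two 10 Gbit ports would look roughly like this in /etc/network/interfaces (interface names eno1/eno2 and the address are placeholders, and 802.3ad mode needs matching configuration on the switch):

    ```
    auto bond0
    iface bond0 inet static
        address 10.10.10.11/24
        bond-slaves eno1 eno2
        bond-mode 802.3ad
        bond-miimon 100
        bond-xmit-hash-policy layer3+4
    ```

    Note that a single TCP stream is still capped at 10 Gbit; bonding only helps aggregate throughput across many OSD connections.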

    I could create separate smaller clusters (3 clusters of 7 nodes), but then I would need more hot-standby servers.

    Any suggestion is appreciated.

    Thank you.
    P.
     
  2. wolfgang

    wolfgang Proxmox Staff Member
    Staff Member

    Joined:
    Oct 1, 2014
    Messages:
    4,683
    Likes Received:
    309
    Hi,

    Generally, if you scale up to 21 nodes I would split into two clusters:
    about 7 nodes for storage (Ceph) and the rest for computation (LXC/KVM).

    Resource management is easier to handle that way.
    Also, you can use cheaper CPUs for the Ceph cluster.

    Note that 21 nodes with 2 OSDs per node hold the same number of OSDs as 7 nodes with 6 OSDs each.
    With 6 OSDs per node, this means you need a CPU with about 8 threads.
    Ceph will also benefit from a CPU with a high clock speed.
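    As a quick sanity check on that arithmetic (assuming the common rule of thumb of roughly one CPU thread per OSD plus a couple for the OS and monitor duties; this is a rule of thumb, not an official formula):

    ```python
    # Rough CPU sizing sketch for a dedicated Ceph cluster.
    # Assumption: ~1 thread per OSD plus 2 for OS/monitor work (rule of thumb).
    total_osds = 21 * 2                 # 21 hyper-converged nodes, 2 OSDs each
    storage_nodes = 7                   # proposed dedicated Ceph nodes
    osds_per_node = total_osds // storage_nodes
    threads_per_node = osds_per_node + 2

    print(osds_per_node)      # 6 OSDs per storage node
    print(threads_per_node)   # ~8 threads per node
    ```
    
    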

    The next thing is that you should separate the Proxmox cluster (corosync) network from the private Ceph network.
    I would use 40 Gbit NICs within the Ceph cluster and 10 Gbit links to the computation cluster.
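    In ceph.conf that separation might look like this (the subnets here are placeholders):

    ```
    [global]
        # front-side traffic: clients and monitors
        public_network = 10.10.10.0/24
        # back-side traffic: OSD replication and recovery
        cluster_network = 10.10.20.0/24
    ```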
     
  3. Paspao

    Paspao Member

    Hi Wolfgang,

    thank you for your reply.

    My plan was to always use hyper-converged clusters and not split Ceph from compute, partly for easier and more gradual growth.

    In my 5 nodes I have seen peaks of 20 MB/s and 2000 IOPS on the Ceph network, so I think there is room to grow.

    One thing that is not very clear to me is whether adding nodes will increase traffic on the Ceph network, and at what rate, since adding nodes and OSDs helps spread peaks of load.

    Thank you.
    P.
     
  4. wolfgang

    wolfgang Proxmox Staff Member
    Staff Member

    Generally, traffic stays the same.
    Most of the traffic comes from the data, not the management.
    The traffic normally just gets more distributed, so a single node sees less network traffic.
    Monitors produce a lot of packets, but even on a huge cluster the recommendation is a maximum of 5 mons.
    The OSD daemons themselves produce minimal traffic; most traffic at the OSD level comes from reads and writes, and that stays the same.
     
  5. Paspao

    Paspao Member

    Thanks a lot !

    So I can plan to scale from 5 to 9 nodes (and from 10 to 18 OSDs) without any specific worries.

    Best regards
    P.
     
  6. wolfgang

    wolfgang Proxmox Staff Member
    Staff Member

    Adding the new OSDs will produce extra network load because of rebalancing.
    So do this on a weekend or during non-production hours.
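    One way to limit the impact is to throttle backfill while adding the OSDs (flag and option names as in current Ceph releases; verify them against your version first):

    ```
    # pause data movement while the new OSDs are created
    ceph osd set norebalance
    ceph osd set nobackfill

    # ...add the OSDs, then re-enable and let data move slowly
    ceph osd unset nobackfill
    ceph osd unset norebalance
    ceph tell 'osd.*' injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'
    ```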
     
    Paspao likes this.