Proxmox/Ceph hardware and setup

jslanier

Hey everyone,
I am getting away from VMware/Veeam/Compellent SAN and moving to Proxmox/PBS/Ceph. The plan is for 5 hosts, each with 2x Xeon Gold 6338 (64 threads each), 768 GB DDR4-3200, 3x 6.4 TB Micron 9300 MAX U.2 NVMe, a 100 Gbit network for Ceph (2x 100G LACP), and 25 Gbit for the Proxmox cluster (4x 25G LACP).

I tried Ceph with consumer SATA SSDs recently and performance was abysmal. What kind of performance should I expect with this setup? I plan on using 3 replicas. Also, what other performance recommendations do you have? I plan to use the pg autoscaler as well.

Thanks,
Stan
 
What kind of performance should I expect with this setup?
Have you seen the Ceph Benchmark paper from 2020? That might give you some indication.

In the end it is always an "it depends" ;)

If you plan to do some benchmarks before the cluster goes into production, start early and benchmark at every step: the SSDs directly, then with Ceph, then inside VMs, and so on.
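
A rough sketch of the first two layers, for example (device path and pool name are placeholders; the fio run writes straight to the device and destroys its data, so only use it on an empty disk):

    # raw NVMe, 4k random writes
    fio --name=rawtest --filename=/dev/nvme0n1 --ioengine=libaio --direct=1 \
        --rw=randwrite --bs=4k --iodepth=32 --numjobs=4 --runtime=60 \
        --time_based --group_reporting

    # Ceph layer: 4M writes into a throwaway pool, then random reads, then cleanup
    rados bench -p testpool 60 write -b 4M -t 16 --no-cleanup
    rados bench -p testpool 60 rand -t 16
    rados -p testpool cleanup

Inside the VMs you can then run the same fio job against a virtual disk and compare the layers against each other.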

Some hints: use a large MTU for the Ceph network if you can. Give the PVE cluster (Corosync) more than one link, as it can switch between them by itself, and if possible give it at least one dedicated physical network of its own; 1 Gbit is plenty for that. Stable PVE cluster communication is essential, especially if you want to use the PVE HA functionality.
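
For the MTU part, a minimal sketch of what the Ceph bond could look like in /etc/network/interfaces (interface names and addresses are invented; the switches have to allow jumbo frames as well):

    auto bond0
    iface bond0 inet static
        address 10.10.10.11/24
        bond-slaves enp65s0f0 enp65s0f1
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer3+4
        # jumbo frames for the Ceph network
        mtu 9000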

The pg autoscaler is on by default in recent Ceph versions. To help it, set the "target_ratio" setting for your pools. If you only plan on one pool, it can consume all the space in the cluster, and that value can be "1". If you have multiple pools, estimate their shares and set the ratios accordingly. Otherwise, the autoscaler will only look at the current space usage of each pool and scale pg_num later on as needed, causing some rebalancing.
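
To check what the autoscaler currently thinks and to set the ratio (pool name is an example; on the CLI the setting is called target_size_ratio):

    ceph osd pool autoscale-status
    ceph osd pool set my-vm-pool target_size_ratio 1.0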

Also be aware that the autoscaler will only change the pg_num once the current and optimal values are off by a factor of 3. For anything below that you will have to change it manually.
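
Changing it manually is a single command, e.g. (pool name and value are placeholders):

    ceph osd pool set my-vm-pool pg_num 512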


I tried Ceph with consumer SATA SSDs recently and performance was abysmal.
Not surprising: especially in the consumer area, the SSD market ranges from decent to absolute cheap crap, and it is a science of its own to figure out which is which. Getting SSDs with power loss protection (PLP) is a good filter to avoid the worst.
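
You can test this yourself: sync writes at queue depth 1 are roughly the pattern Ceph puts on its WAL/DB, and consumer drives without PLP usually collapse there. A sketch with fio (placeholder device path; this also destroys data on the target):

    fio --name=synctest --filename=/dev/sdX --ioengine=libaio --direct=1 \
        --sync=1 --rw=write --bs=4k --iodepth=1 --numjobs=1 --runtime=60 \
        --time_based

Drives with PLP can acknowledge sync writes from their protected cache, which is why they hold up so much better under Ceph.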
 
...a 100 Gbit network for Ceph (2x 100G LACP), and 25 Gbit for the Proxmox cluster (4x 25G LACP).
It's considered best practice to have 2 physically separate cluster (Corosync) links, obviously connected to 2 different switches. Corosync wants low latency, not bandwidth, so 2x 1 GbE is enough.
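
In Corosync terms that means two links per node. A minimal sketch of the relevant parts of /etc/pve/corosync.conf (names and addresses invented; remember to bump config_version whenever you edit it):

    nodelist {
      node {
        name: pve1
        nodeid: 1
        quorum_votes: 1
        ring0_addr: 10.0.0.11
        ring1_addr: 10.1.0.11
      }
      # ... one entry per node
    }

    totem {
      interface {
        linknumber: 0
      }
      interface {
        linknumber: 1
      }
      # cluster_name, config_version, etc. unchanged
    }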
 
