Proxmox + Ceph drive configuration

philliphs

New Member
May 22, 2021
Hi everyone. I'm a newbie to both Proxmox and Ceph.
I'm building a home lab out of some old hardware, consisting of 3 identical nodes:
HP Z420
E5-2630L
32GB RAM
1 NVMe 500GB (standard WD Blue)
1 SATA SSD 500GB (Samsung 870 EVO)
1 SATA SSD 120GB (cheap boot disk)

I'm planning to implement HA across the 3 nodes.
All data will live on a Ceph pool. I will be running a Windows Server Essentials VM that hosts a MySQL database and also acts as AD and a file server.
My question is whether to:
1. Create 2 pools on Ceph: 1 NVMe pool (possibly for the OS vdisks) and 1 SATA SSD pool (possibly for the SQL database and file storage).
2. Create 1 big pool of all 6 drives and let Ceph manage the mixed OSDs. Would Ceph be able to recognize which drives are faster and utilize them more?

Thanks in advance.

EDIT: I'm only running a 1 gigabit network, but will use 3 separate NICs on each node to separate the normal network, the Ceph public network, and the Ceph cluster network.
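For reference, the split itself is just two settings in ceph.conf (/etc/pve/ceph.conf on Proxmox); the subnets below are only placeholders for what I have in mind:

Code:
# /etc/pve/ceph.conf fragment - example subnets only
[global]
    public_network  = 192.168.10.0/24   # monitors + client/VM traffic
    cluster_network = 192.168.20.0/24   # OSD replication and recovery traffic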
 
Considering your setup, you have a 120GB boot disk per server (which I believe is used for deploying Proxmox + Ceph), plus:
3 NVMe disks of 500GB each in total
3 SATA SSD disks of 500GB each in total
Now if you hand them all to Ceph, that results in 6 OSDs (3 NVMe + 3 SSD).
Ceph allows mixing different disk types in one cluster and tells them apart by device class (nvme, ssd, hdd).
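You can see how each OSD was classified with ceph osd tree, and fix the class by hand if an NVMe drive was auto-detected as plain ssd. A quick sketch (OSD IDs are just examples):

Code:
ceph osd tree                               # CLASS column shows hdd/ssd/nvme per OSD
ceph osd crush rm-device-class osd.0        # clear a wrong auto-detected class
ceph osd crush set-device-class nvme osd.0  # and set the correct one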

You have two options. One is a pool exclusively of NVMe: 3 x 500GB raw capacity, and with a replication factor of 3 (size=3) you end up with around 500GB of usable capacity; the same applies to a separate SSD pool.
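Setting that up is one CRUSH rule per device class plus one pool per rule, roughly like this (rule/pool names and PG counts are only examples):

Code:
ceph osd crush rule create-replicated rule-nvme default host nvme
ceph osd crush rule create-replicated rule-ssd  default host ssd
ceph osd pool create vm-nvme 64
ceph osd pool set vm-nvme crush_rule rule-nvme
ceph osd pool create vm-ssd 64
ceph osd pool set vm-ssd crush_rule rule-ssd

Then add each pool as RBD storage in Proxmox and place the OS disks on the NVMe pool and the SQL/file-server disks on the SSD pool.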

The other option is to use both NVMe and SSD OSDs in one pool. In that case the raw capacity is 6 x 500GB = 3000GB, which at the same replication factor of 3 comes to roughly 1000GB usable.
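Whichever layout you choose, you can verify the usable figure afterwards; MAX AVAIL in ceph df already accounts for the replica count:

Code:
ceph df      # per-pool MAX AVAIL (replication already factored in)
ceph osd df  # per-OSD size, weight and fill level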


Now coming to the network: 1Gbps is a low-bandwidth setup with a maximum throughput of about 120MB/s, and since each node has 2 OSDs (one SSD + one NVMe) sharing that link, under full load each disk sees at most around 60MB/s of writes.
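Once the cluster is up you can measure what it really delivers with rados bench (the pool name is just an example):

Code:
rados bench -p vm-nvme 30 write --no-cleanup   # 30s of 4MB writes
rados bench -p vm-nvme 30 seq                  # sequential reads of those objects
rados -p vm-nvme cleanup                       # remove the benchmark objects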

If you are OK with this performance, go ahead. If you can combine the 3 NICs in a LAG (LACP bond), that would be better.
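On Proxmox the bond goes into /etc/network/interfaces (or the GUI). A minimal LACP sketch; interface names and addresses are placeholders and the switch must support 802.3ad:

Code:
auto bond0
iface bond0 inet manual
    bond-slaves eno1 eno2 eno3
    bond-mode 802.3ad
    bond-miimon 100
    bond-xmit-hash-policy layer3+4

auto vmbr0
iface vmbr0 inet static
    address 192.168.10.11/24
    gateway 192.168.10.1
    bridge-ports bond0
    bridge-stp off
    bridge-fd 0

Keep in mind a single TCP stream is still capped at 1Gbps; the bond only helps because Ceph opens many OSD connections in parallel.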
 
Thanks for your explanation. Does that mean that if I combine the 6 OSDs into one pool, the performance of the NVMes won't be bottlenecked by the SSDs? Can Ceph automatically optimize placement across the OSDs based on their class?

I am aware of the 1Gbps limitation. I plan to try it first and see if the performance is usable. Combining the 3 NICs definitely is interesting, though; I hadn't thought about that before.
Would it be better to combine the 3 NICs, or to keep them on separate functions as the Proxmox network, the Ceph public network, and the Ceph cluster network?
 
It won't cause any problem, just make sure proper weights are assigned to the NVMe OSDs.
The weight of the NVMe OSDs must be greater than that of the SSDs.
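For example (OSD IDs and values below are only to illustrate the idea):

Code:
ceph osd crush reweight osd.0 0.60   # NVMe OSD, biased up
ceph osd crush reweight osd.3 0.35   # SATA SSD OSD, biased down
ceph osd tree                        # check the WEIGHT column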
 
One more question. If I increase the NVMe weight, does that mean the NVMe drives will reach the nearfull (or full) ratio sooner, causing the whole pool to get stuck even though the SSDs are still at, for example, 50% capacity?
 
In Ceph, weights are normally assigned based on the size of the disk. For example, if your disk is 1.92TB and the OSD size after right-sizing (as shown in ceph osd tree) is 1.75TiB, you will see a weight of 1.75.

Now in your case both the SSD and the NVMe are 500GB, so after right-sizing let's say you get roughly 465GiB of usable space; you will then see a weight of about 0.45 for each OSD. In this scenario Ceph treats all the disks as equal capacity.

Now, once you assign a higher weight to the NVMe OSDs because NVMe has higher IOPS and throughput, Ceph will place a larger number of PGs (and therefore more data) on the NVMe. The catch is that the disks have the same capacity, so this gives very good performance until you approach the nearfull ratio, and because of the higher weight the NVMe OSDs are also the most likely to fill up early.


So in such scenarios, having two different pools, or two different CRUSH rules (one per device class), usually helps.

There is always a tradeoff between performance and capacity

Please note that enabling the balancer will even out data placement across the OSDs, again with the same tradeoff between performance and capacity.
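If you enable it, the balancer is controlled from the CLI; upmap mode needs all clients to be Luminous or newer:

Code:
ceph balancer mode upmap
ceph balancer on
ceph balancer status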
 
Right, that's what I was confused about. It means that if I start to manually adjust the weights, I won't get the full capacity. So I'm stuck with either two different pools, or one pool where the NVMes are bottlenecked.

I thought that with the introduction of device classes, Ceph had gained the ability to fill up the faster drives first and the slower drives later.
 
