Ceph PG #

brucexx

Renowned Member
Mar 19, 2015
Proxmox VE 7.4-16.

I am getting confused by all the numbers. I have 24 OSDs (1.46 TB SSDs) across 4 nodes with 3 replicas; the total size of the pool is 12 TB and it is going to be 80-85% full. The Ceph PG calculator gives me 800, rounded up to 1024 PGs, which is also the number Ceph "recommends" for 10-50 OSDs, but the documentation also says an OSD should carry anywhere between 50 and 100 PGs. When I looked at the number of PGs per OSD in the web interface under OSD/PGs, I got between 117 and 130 per OSD, which is above the recommended 100. I edited the pool, set it to 512, and I can see the number of PGs per OSD getting smaller. Could anybody recommend whether I should use 512 or 1024 with the 24 OSDs? I would prefer not to use the autoscaler.
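For reference, this is the arithmetic I ran, as a rough sketch (the 100 PGs-per-OSD target is just the documented rule of thumb, and the inputs simply mirror my setup):

```python
# Rough sketch of the usual PG-calculator arithmetic; the inputs below mirror
# the cluster described above, they are not measured values.
import math

def suggested_pg_num(num_osds: int, replica_size: int, target_pgs_per_osd: int = 100) -> int:
    """(OSDs * target PGs per OSD) / replicas, rounded up to the next power of two."""
    raw = num_osds * target_pgs_per_osd / replica_size
    return 2 ** math.ceil(math.log2(raw))

print(suggested_pg_num(num_osds=24, replica_size=3))  # 24 * 100 / 3 = 800 -> 1024
# With 1024 PGs: 1024 * 3 replicas / 24 OSDs = 128 PGs per OSD on average.
```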

Thank you
 
Hello,

117-130 PGs per OSD is absolutely fine.

The autoscaler does a good job of computing an optimal number of PGs; it will only change the pool's number of PGs if the current and optimal values differ by a factor of 3 or more. Is there any reason you don't want to use it?

By the way, running out of space is the number one thing you want to avoid with Ceph. When an OSD is ~85% full you will get a Ceph warning. At ~95% all I/O on the entire pool will be blocked to ensure your data is protected. Do note that should a single OSD fail, its data will have to be recovered onto the other OSDs, and this can quickly push the used % above one of the thresholds above.
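As a rough sketch of why that matters (the 0.85/0.95 ratios below are the Ceph defaults for mon_osd_nearfull_ratio and mon_osd_full_ratio, and the even-spread assumption is optimistic compared to real CRUSH placement):

```python
# Rough sketch: average OSD utilisation after one OSD fails and its data is
# recovered onto the remaining OSDs. Assumes an even spread, which real CRUSH
# placement will not give you exactly.
NEARFULL_RATIO = 0.85  # Ceph default mon_osd_nearfull_ratio (health warning)
FULL_RATIO = 0.95      # Ceph default mon_osd_full_ratio (I/O blocked)

def utilisation_after_failure(num_osds: int, avg_util: float, failed: int = 1) -> float:
    """Same amount of data, fewer OSDs left to hold it."""
    return avg_util * num_osds / (num_osds - failed)

for avg in (0.80, 0.85):
    after = utilisation_after_failure(24, avg)
    print(f"avg {avg:.0%} -> ~{after:.1%} after losing one of 24 OSDs")
```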
 
Thank you. So 1024 PGs would be preferred, being the value calculated with the Ceph PG calculator?

I am warming up to the autoscaler; I have it running on a smaller cluster and it just works, I guess. I am just not sure how it makes its adjustments and how they affect the usability and efficiency of the cluster (during business hours, for example). Could you elaborate on this? I try to avoid any piece of automation that can go crazy on the main production cluster.

Thank you for the advice on the 85% usage; that would be the maximum if I need to move things around, otherwise it is not going to exceed 80%.
 
Thank you. So 1024 PGs would be preferred, being the value calculated with the Ceph PG calculator?
Yes.
I am warming up to the autoscaler; I have it running on a smaller cluster and it just works, I guess. I am just not sure how it makes its adjustments and how they affect the usability and efficiency of the cluster (during business hours, for example). Could you elaborate on this? I try to avoid any piece of automation that can go crazy on the main production cluster.
As I said, it will only set the number of PGs to the optimal number if one of them is at least three times bigger than the other, e.g. if the optimal number of PGs is 64 and the current number is 16, or the other way around.
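In code terms, roughly this check (a sketch of the rule described above, not the actual pg_autoscaler implementation):

```python
# Rough sketch of the "factor of 3" rule described above -- not the actual
# pg_autoscaler code, just the decision it is described as making.
def autoscaler_would_change(current_pg_num: int, optimal_pg_num: int, factor: float = 3.0) -> bool:
    """Only act when the current and optimal PG counts differ by at least `factor`."""
    ratio = max(current_pg_num, optimal_pg_num) / min(current_pg_num, optimal_pg_num)
    return ratio >= factor

print(autoscaler_would_change(16, 64))     # True:  64 / 16   = 4 >= 3
print(autoscaler_would_change(512, 1024))  # False: 1024 / 512 = 2 <  3
```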

Thank you for the advice on the 85% usage; that would be the maximum if I need to move things around, otherwise it is not going to exceed 80%.
Do note that not all OSDs are going to be at exactly 80% if you are using 80% of the raw pool capacity. It is entirely possible to have one or two OSDs at 90% usage. A single VM clone or an OSD failure could bring the entire cluster down at that point.
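To illustrate the kind of skew I mean, a sketch with made-up per-OSD utilisations (purely hypothetical numbers, not taken from any real cluster):

```python
# Hypothetical per-OSD utilisations (made up for illustration): the pool-wide
# average can look comfortable while individual OSDs are already at or past
# the 85% warning threshold.
NEARFULL_RATIO = 0.85

osd_util = [0.72, 0.75, 0.76, 0.78, 0.79, 0.80, 0.80, 0.81,
            0.82, 0.83, 0.84, 0.86, 0.88, 0.90]  # shortened list; 24 OSDs in reality

avg = sum(osd_util) / len(osd_util)
print(f"average: {avg:.0%}, fullest OSD: {max(osd_util):.0%}")
print("OSDs at or above nearfull:",
      [f"{u:.0%}" for u in osd_util if u >= NEARFULL_RATIO])
```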
 
