Using the autoscaler for the first time

brucexx

I tested the autoscaling option on a test system and now I am using it in production. I have 5 nodes and 30 OSDs (all SSDs). I set up the target size to 80% of the total size of the pool.

Ceph shows the pool has 512 PGs and that the optimal number of PGs is 1024. The autoscaler is on; I checked the status with the pool autoscale-status command and it shows autoscale "on", but the number of pool PGs is not increasing and stays at 512. The pool is only 27% full at the moment. Is this why the number of PGs stays at 512, and will it increase as the pool gets more full?
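For reference, this is roughly the check I mean:

    # lists PG_NUM, NEW PG_NUM and the AUTOSCALE mode per pool
    ceph osd pool autoscale-status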

These numbers seem a little misleading. Can anybody elaborate on whether this is intended behavior?

Thank you
 
The autoscaler will only start changing the pg_num automatically if the current and optimal differ by a factor of 3 or more. With just 2x like you have, you'll have to set it manually.
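If you do set it manually, it is a single command, roughly like this (replace <pool> with your pool's name):

    # set the final pg_num directly; the cluster applies the change gradually
    ceph osd pool set <pool> pg_num 1024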
 
So I should turn it off and set it manually to 1024? At 27% full, should I do this gradually: increase by 128 (or perhaps 64), wait, then increase by another 64 or 128?

Thank you
 
How many pools have you got? If you only have one pool (we can ignore .mgr), then set the target_ratio to any value -> this tells the autoscaler the pool is expected to consume all the space.
Then either leave the autoscaler set to on or warn. If the discrepancy gets too large, it will change the pg_num by itself if set to on. If it is set to warn instead, it will only inform you that another pg_num would be better.

This way, it can react to situations where you add more OSDs or more pools. For the latter, consider setting the target_ratio accordingly so the autoscaler knows what the expectation is.
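Roughly, on the CLI, that would look something like this (pool name is just a placeholder):

    # for a single pool the exact value does not matter, it is relative to other pools
    ceph osd pool set <pool> target_size_ratio 1
    # let the autoscaler act on its own ...
    ceph osd pool set <pool> pg_autoscale_mode on
    # ... or have it only warn about a better pg_num
    ceph osd pool set <pool> pg_autoscale_mode warn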
 
I have 1 pool and will have just one pool in this cluster: a 5-node cluster with 30 OSDs (1.6 TB drives). I might add one node with 6 additional OSDs within 2 years, but that is not 100% certain. The cluster will be about 75-80% full. The cluster I just decommissioned had PGs assigned statically: 24 OSDs, one pool, 1024 PGs. I think I will turn off the autoscaler or set it to warn, and use 1024 or 2048.

Is the 100 PGs per OSD rule something I should consider? My cluster is not full; I was at about 27% and the offloading I just started has brought it to 24%. I have hard drive space elsewhere, so I can offload and then increase the number of PGs. BTW, can I just incrementally add the PGs in the Proxmox GUI in the pool edit window? Is it that easy?

thank you
 
The ~100 PGs/OSD is still a good rule of thumb. In some cases more might improve performance, but you would need to benchmark that.
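For what it's worth, that ~100 figure is also the default target the autoscaler itself works toward, via a config option (the default should be 100, if I remember correctly):

    # per-OSD PG target the autoscaler aims for
    ceph config get mon mon_target_pg_per_osd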

Each change of the pg_num will cause a rebalance, so I would just set it to the target number you want instead of stepping it up manually in increments.
 
Thank you Aaron, this is all good advice. Should the target ratio be increased gradually, let's say from 0.0 now to 0.2, then 0.5, then 0.7, etc., up to 1? I assume that 1 is the final ratio in my case, as this is going to be the only pool in this cluster. Is that a correct assumption?

Thank you
 
After exhaustive reading about PGs and how to calculate them, I decided to turn off the autoscaler (mainly because it can start rebalancing during business hours). I set the PGs to 2048 (I was tempted to use 4096 per the 100 PGs per OSD rule). I have enough CPU/RAM resources to handle the OSDs and the metadata overhead, enterprise SAS SSDs with 10 DWPD, and a 50 Gbps network for the Ceph private and public networks. Now it tells me that there are too many PGs per OSD, as the cluster is empty. Still, considering 4096 per the Ceph 100 PGs per OSD recommendation - Aaron, what do you think about 4096 PGs in this hardware scenario?
 
I think a few things need to be cleared up here.
The target_ratio is a ratio between all the pools that have it set. For example, for a single pool, any value will be considered 100%. Two pools, both set to 1 -> the autoscaler will calculate the pg_num per pool with the assumption that each pool will consume roughly 50% of the space.
That's why, with multiple pools, I recommend setting the target ratios between 0.0 and 1.0, or 0 - 100, to make it easier for us humans to think in percentages.
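A quick sketch of the two-pool case (pool names made up here):

    # equal split: each pool is expected to take ~50% of the raw capacity
    ceph osd pool set pool-a target_size_ratio 1
    ceph osd pool set pool-b target_size_ratio 1
    # or think in percentages, e.g. an 80/20 split
    ceph osd pool set pool-a target_size_ratio 80
    ceph osd pool set pool-b target_size_ratio 20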

If you have a replicated pool with a size of 3, you will have pg_num * 3 replicas in total.
So for 30 OSDs and with ~100 PGs per OSD -> 30 * 100 / 3 = 1000 -> 1024, since a pg_num needs to be a power of two.

If the autoscaler starts to rebalance, it will do that slowly, as in, you will see that the pg_num will slowly go up or down to not affect production.
If that is bringing your cluster to the limit already, well... maybe it should be improved. Because what if a disk fails and Ceph needs to recover? Will that impact production too much as well?

I hope that helps :)
 
Thank you again. I did not think about that:

If you have a replicated pool with a size of 3, you will have pg_num * 3 replicas in total.
30 OSDs and with ~100 PGs per OSD -> 30 * 100 / 3 = 1000 -> 1024

That makes more sense now; in other words, I forgot about the * 3 replicas.

Fortunately my pool is empty, so I can change it at will now. I will re-enable the autoscaler and watch what happens as I fill it up.

Will report back.

Thank you
 
So I set it back to "on" from "warn", and now the "too many PGs per OSD" warning has disappeared and the health status is OK. The number of PGs is still showing 2048. Is this because of what you wrote in your first post: "The autoscaler will only start changing the pg_num automatically if the current and optimal differ by a factor of 3 or more"?

I set it to 1024 manually and the autoscaler is slowly bringing it down to 1024.
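I am keeping an eye on it with something like this (pool name replaced with a placeholder):

    # current value while it drifts down toward the target
    ceph osd pool get <pool> pg_num
    # overall PG merge/backfill activity
    ceph -s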

Thank you
 
If the optimal is 1024 and the current is 2048, then yeah, you will have to set it manually. Depending on how much data there already is, it will take more or less time, but you should see the pg_num go down gradually. :)
 
