[SOLVED] Some ceph related questions (autoscale)

Tobbe
Member · Oct 4, 2021
Hello.

I have a small PVE cluster set up and I'm testing out Ceph to see whether it is something I want to use or not.
My question is about the pg_autoscaler.

In the list of pools, the column "Optimal # of PGs" shows n/a, and hovering over it gives the tooltip: "Needs pg_autoscale module enabled".
So I checked whether that is actually the case:
Code:
# ceph mgr module enable pg_autoscaler
module 'pg_autoscaler' is already enabled (always-on)
According to Proxmox, autoscale mode is already set to "on" for the pools, so I should be fine,
yet "ceph osd pool autoscale-status" produces no output.

Any idea why?
I must have missed something.
The tooltip message is clearly wrong, since the module is always enabled, so it is not that helpful.
Target size and target ratio I understood to be optional; it is probably good to set them to something, but I think the autoscaler should work without them, or shouldn't it? (See the example below.)
At least it should produce some kind of output.
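For reference, those optional hints can be set per pool roughly like this; the pool name and values are only examples:
Code:
# hint the autoscaler at the fraction of total capacity this pool will use
ceph osd pool set mypool target_size_ratio 0.2

# or hint at an absolute expected size
ceph osd pool set mypool target_size_bytes 100G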
 
Do you by any chance have several device classes and rules that force a pool to only use one class? If so, do you still have at least one pool using the default "replicated_rule", which does not make any distinction between device classes? That would be an explanation, and you should assign this pool a rule that limits it to a device class as well.
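You can check this on the CLI, a quick sketch:
Code:
# show each pool together with the crush_rule it uses
ceph osd pool ls detail

# dump all CRUSH rules; a rule restricted to a device class shows a
# "take" step on something like "default~ssd" instead of plain "default"
ceph osd crush rule dump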
 
That was exactly the problem, thanks.
I had the built-in pool device_health_metrics set to use replicated_rule while the other pools were using device-class-based rules.
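For anyone else hitting this, the fix can be done from the CLI roughly like this; the rule name "replicated_ssd" and the device class "ssd" are only examples for this sketch:
Code:
# create a replicated rule restricted to the "ssd" device class
ceph osd crush rule create-replicated replicated_ssd default host ssd

# move the built-in pool onto that rule
ceph osd pool set device_health_metrics crush_rule replicated_ssd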
 
Cool. I went ahead and marked the thread as solved.

The problem is that in that situation, with one pool still spanning all device classes, the autoscaler cannot determine how many PGs that pool will have on each device class, and therefore how that affects the pg_num of the other pools in that device class.
 
Old thread, but great advice. Thank you @aaron. This was my issue also. Could this be added to the Proxmox Wiki? I searched the Ceph documentation and didn't see mention of this issue.
 
Hello, what class should the .mgr pool be assigned to remove it from replicated_rule?
Whichever you feel fine with. It is small, won't take up a lot of data, and also won't see a lot of load.
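A sketch of how that could look on the CLI, assuming an example device-class rule named "replicated_ssd" already exists:
Code:
# assign the .mgr pool to a device-class rule
ceph osd pool set .mgr crush_rule replicated_ssd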
 
Hello Aaron, do you have a section in the wiki for that? I don't find the correct command or option to edit.
We have the default replicated rule, and since we added a new class for a specific pool, we have the same issue.
 
If you have a somewhat recent Proxmox VE version, you can change the rule for a pool by editing it: Node -> Ceph -> Pools, then either double-click the pool or select it and hit the edit button. Make sure that the "Advanced" checkbox next to the OK button is enabled.

If that is not what you meant, please explain :)
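If you prefer the CLI, the same change can be made with (pool and rule names are placeholders):
Code:
# change the CRUSH rule of an existing pool
ceph osd pool set <pool-name> crush_rule <rule-name>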
 
I have a new pool in place with a specific class and a new replication rule.

The autoscaler was not working until I removed my initial pool, which was using the default ssd class and the default replication rule.

If I recreate a second pool with another new class and another replication rule that I will create, will the autoscaler stop working again?

Or does this issue happen only when a pool is created with the default replication rule?
 
The "issue" of the autoscaler not being able to calculate the optimal number of PGs is as soon as you have at least one pool using a device specific rule, while there is another pool using a rule that does not select a device class.

So once you use device classes, you need to assign each and every pool a rule that will select a device class. You can do that with existing pools as well. Ceph will then move the data to the OSDs with the selected device class.
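A cleanup could look roughly like this sketch, assuming everything should end up on one example rule called "replicated_ssd" (adapt the rule per pool if pools belong on different classes):
Code:
# check which rule each pool currently uses
ceph osd pool ls detail

# assign every pool a device-class rule (example rule name)
for pool in $(ceph osd pool ls); do
    ceph osd pool set "$pool" crush_rule replicated_ssd
done

# afterwards the autoscaler should report again
ceph osd pool autoscale-status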
 
The "issue" of the autoscaler not being able to calculate the optimal number of PGs is as soon as you have at least one pool using a device specific rule, while there is another pool using a rule that does not select a device class.

So once you use device classes, you need to assign each and every pool a rule that will select a device class. You can do that with existing pools as well. Ceph will then move the data to the OSDs with the selected device class.
OK, so as long as we define a new class and a new rule for each pool, and do not use the default replication rule that has no specific class associated, we should not face this situation again, right?
 
Yep.
 
