[SOLVED] Ceph crush_rules, device_health_metrics pool

Hi,

On my 3-node cluster I set up Ceph with a custom device class (sas900, to identify my 900 GB SAS devices) and put all of those devices in one single pool. More pools will be created when devices with different classes are added to the nodes. I created a custom crush rule (replicated_sas900), associated the pool with the rule and renamed the pool. Everything went smoothly.
I will create new dedicated crush rules as new device types/sizes are added to the nodes.
The device_health_metrics pool is still on the default replicated_rule crush rule.

Now I am trying to figure out some behaviours:
1) If I set the autoscaler to "on" for the sas900 pool, it never seems to finish recalculating the PGs, which creates a high load on the storage.
2) I cannot remove the default replicated_rule crush rule because it is used by the device_health_metrics pool. I'd like to have only dedicated crush rules.

So: is it normal for the autoscaler to keep working without end? Will it find a stable PG number? And can I (should I) change the crush rule for the device_health_metrics pool? If so, to which one of the three or four dedicated crush rules?
 
As you saw, the autoscaler changes the number of PGs slowly so that it does not cause one large rebalance, which could have an impact on performance.
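To follow what the autoscaler is doing, something along these lines may help. It is only a sketch: the pool name `mypool` is a placeholder (the thread does not give the pool's new name), and the `run` wrapper is a hypothetical helper that just prints the commands unless `APPLY=1` is set. Switching the mode to `warn` is optional and not something suggested above; it makes the autoscaler report a recommended pg_num instead of applying it.

```shell
#!/bin/sh
# Sketch only: prints the commands unless APPLY=1 is set.
# "mypool" is a placeholder; substitute the real pool name.
run() {
    if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

check_autoscaler() {
    pool="$1"
    # Current and proposed PG counts for every pool.
    run ceph osd pool autoscale-status
    # Per-pool detail, including pg_num and pgp_num.
    run ceph osd pool ls detail
    # Optional: only warn about a better pg_num instead of applying it.
    run ceph osd pool set "$pool" pg_autoscale_mode warn
}

check_autoscaler "${POOL:-mypool}"
```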

If I understand the situation correctly, you only have one kind of device class in use now. In that situation, it really doesn't matter which rule the device_health_metrics pool gets.

But once you have more device classes in use, each pool needs to be assigned to one. If the device_health_metrics pool still has the default "replicated_rule" assigned, the autoscaler won't be able to determine the pg_num. This is because at least one pool would span multiple device classes. Without a clear distinction of which pools share a device class, it is impossible for the autoscaler to come to a result.

Therefore you should assign a device-class-specific rule to all pools. For the device_health_metrics pool it shouldn't matter which device class you assign it to.
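As a rough sketch of that change, using the replicated_sas900 rule from the first post: the `run` wrapper is a hypothetical helper that only prints the commands unless `APPLY=1` is set, so nothing touches the cluster by accident. Removing replicated_rule should only work once no pool references it any more.

```shell
#!/bin/sh
# Sketch only: prints the commands unless APPLY=1 is set.
# "replicated_sas900" is the rule name from the first post.
run() {
    if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

migrate_health_pool() {
    rule="$1"
    # Show which rule the pool uses now.
    run ceph osd pool get device_health_metrics crush_rule
    # Point the pool at the class-specific rule.
    run ceph osd pool set device_health_metrics crush_rule "$rule"
    # List the rules, then remove the default one that is no longer used.
    run ceph osd crush rule ls
    run ceph osd crush rule rm replicated_rule
    # Check that the autoscaler now reports a result for every pool.
    run ceph osd pool autoscale-status
}

migrate_health_pool replicated_sas900
```

For a device class added later, a matching rule could be created with `ceph osd crush rule create-replicated <rule-name> default host <class>`, where the root `default` and the failure domain `host` are assumptions about your CRUSH map.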
 