Ceph OSD rebalancing

Spiros Pap

Well-Known Member
Aug 1, 2017
Hi all,

I have a setup of 3 Proxmox servers (7.3.6) running Ceph 16.2.9. I have 21 SSD OSDs: 12 × 1.75 TB and 9 × 0.83 TB. On these OSDs I have one pool with replication 1 (one copy).
I have set the pg_autoscale_mode to 'on' and the resulting pg_num for the pool is 32.
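
For reference, this is roughly how I checked the pool settings ("ssd-pool" below is just a placeholder for the actual pool name):
Code:
# show all pools with size, min_size, pg_num and autoscale settings
ceph osd pool ls detail
# or query the individual settings of the pool
ceph osd pool get ssd-pool size
ceph osd pool get ssd-pool pg_num
ceph osd pool get ssd-pool pg_autoscale_mode
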
My problem is that the OSDs are very badly imbalanced:
Code:
ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP     META     AVAIL    %USE   VAR   PGS  STATUS
 0    ssd  1.74599   0.90002  1.7 TiB  1.1 TiB  1.1 TiB  3.2 MiB  2.7 GiB  647 GiB  63.80  1.77   13      up
 5    ssd  0.82999   0.90002  850 GiB  757 GiB  756 GiB   37 MiB  1.2 GiB   93 GiB  89.05  2.47    7      up
 6    ssd  0.82999   1.00000  850 GiB  8.2 GiB  7.8 GiB  9.5 MiB  434 MiB  842 GiB   0.96  0.03    4      up
 7    ssd  0.82999   1.00000  850 GiB  381 GiB  381 GiB   20 MiB  625 MiB  469 GiB  44.87  1.24    5      up
16    ssd  1.74599   0.90002  1.7 TiB  1.3 TiB  1.3 TiB  3.6 MiB  2.7 GiB  483 GiB  72.99  2.02   12      up

As you can see, I have one OSD at 89% full while another OSD on the same kind of disk is at only 0.96%. Apart from these two extremes, the other OSDs are not well balanced either.

What can I do about this? Is it dangerous if an OSD fills up? Isn't Ceph going to take space from another OSD that has space available?

Thanx,
Spiros
 
On these OSDs I have one pool with replication 1 (one copy).
WHY? Unless you really don't care about your data, setting size/min_size to the default values 3/2 is what it should be.
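
If you do change it, something along these lines should do it ("ssd-pool" is just a placeholder for your pool name; note that raising the number of copies will cause a fair amount of backfill traffic):
Code:
# keep 3 copies, require at least 2 to be available for I/O
ceph osd pool set ssd-pool size 3
ceph osd pool set ssd-pool min_size 2
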

What is the output of ceph osd df tree?

Did you set a target ratio for the pool? Otherwise, the autoscaler can only base its calculation on the currently used space, which will be rather low. With a target_ratio set, the autoscaler knows how much space the different pools are expected to take up in the end, and with that it can calculate the pg_num for the pools better. If you only have one pool, any target_ratio will effectively be treated as 100%, since there are no other target_ratios to weigh it against.
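
As a rough sketch (again, "ssd-pool" is just a placeholder), setting a target ratio and checking what the autoscaler makes of it could look like this:
Code:
# tell the autoscaler that this pool is expected to use (effectively) all of the capacity
ceph osd pool set ssd-pool target_size_ratio 1.0
# show what the autoscaler calculates for each pool
ceph osd pool autoscale-status
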
 
Well, I chose 2/1 because I usually see RAID5 for SSDs on enterprise storage arrays, but OK, your comment about the 3/2 size/min_size is noted.

I have only one pool on the SSDs, so the target_ratio should really be 100%.

The output in my post was an excerpt from the "ceph osd df tree" command.
Now, is there a good explanation for why one disk is barely used? And is there a way to force Ceph to rebalance the data?

One more thing: yesterday I moved a few workloads onto this Ceph SSD pool, and then I noticed the attached graph, which shows the Ceph pool size.
ceph.png
How can you explain that the total size went from 12 TB to just below 8 TB? How is it possible for the total size to change while the pool fills up? Then, at 15:00, I deleted a 1.4 TB disk of a VM and the total went back to 12 TB... Magic (to me)...
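
If it helps, this is roughly how I have been looking at the raw and per-pool numbers; the pool's MAX AVAIL column is the part that seems to move:
Code:
# cluster-wide usage and per-pool STORED / USED / MAX AVAIL
ceph df
# same, with some extra per-pool details
ceph df detail
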

Regards,
Sp
 
Well, this is getting funny:
I ran "ceph osd set-require-min-compat-client luminous", which allows the OSDs to be rebalanced, and then "ceph osd reweight-by-utilization" to make Ceph rebalance the OSDs. The result was:
Before:
Code:
ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP     META     AVAIL    %USE   VAR   PGS  STATUS
 5    ssd  0.82999   0.90002  850 GiB  437 GiB  436 GiB   20 MiB  758 MiB  413 GiB  51.44  1.73    6      up
 6    ssd  0.82999   1.00000  850 GiB  8.2 GiB  7.8 GiB  9.5 MiB  435 MiB  842 GiB   0.96  0.03    4      up

After:
Code:
ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP     META      AVAIL    %USE   VAR   PGS  STATUS
 5    ssd  0.82999   0.85004  850 GiB  581 GiB  580 GiB   25 MiB   1.1 GiB  269 GiB  68.40  2.29    7      up
 6    ssd  0.82999   1.00000  850 GiB  8.2 GiB  7.8 GiB  9.5 MiB   435 MiB  842 GiB   0.97  0.03    4      up



Instead of osd.5 freeing up space and osd.6 filling up, the opposite happened! Is this a Ceph bug, or what?
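
For completeness, the sequence was roughly the following; the test- variant is the dry-run form that only prints what it would change, which would probably have been the smarter first step:
Code:
# dry run: show which OSDs would get a new reweight value, without changing anything
ceph osd test-reweight-by-utilization
# what I actually ran:
ceph osd set-require-min-compat-client luminous
ceph osd reweight-by-utilization
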
 
What is the output of ceph osd df tree?
And ceph -s please.

Without a full picture, it is hard to say what is going on.
 
