Ceph available space

NetUser

Member
Sep 20, 2021
Hi everyone,

sorry to bother you with a question that gets asked over and over (yes, I've read a ton of threads, but none of them cleared up my doubt), so here it goes:

'ceph osd df' reports 5.1 TiB of available raw space (roughly 1.7 TiB usable because of the 3x replication), but when I try to move a disk into Ceph, the hypervisor only shows about 780 GB of available space.
Where did the missing ~1 TB (1.7 TiB - 0.78 TiB) go?

My cluster consists of 5 nodes with 15 OSDs of 1.6 TB each, using 3x replication.

user@mycluster:~# ceph osd df
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
0 ssd 1.45549 1.00000 1.5 TiB 1.0 TiB 1.0 TiB 3.2 MiB 2.2 GiB 417 GiB 72.00 0.94 6 up
1 ssd 1.45549 1.00000 1.5 TiB 1.0 TiB 1.0 TiB 3.9 MiB 2.2 GiB 423 GiB 71.62 0.93 6 up
2 ssd 2.14999 1.00000 1.5 TiB 894 GiB 892 GiB 5.3 MiB 1.9 GiB 596 GiB 59.99 0.78 5 up
3 ssd 1.75000 1.00000 1.5 TiB 1.2 TiB 1.2 TiB 6.9 MiB 2.6 GiB 244 GiB 83.65 1.09 8 up
4 ssd 1.45549 1.00000 1.5 TiB 1.2 TiB 1.2 TiB 3.9 MiB 2.5 GiB 238 GiB 84.06 1.10 7 up
5 ssd 1.14999 1.00000 1.5 TiB 1.0 TiB 1.0 TiB 4.2 MiB 2.2 GiB 423 GiB 71.61 0.93 6 up
6 ssd 1.45549 1.00000 1.5 TiB 1.2 TiB 1.2 TiB 3.3 MiB 2.5 GiB 241 GiB 83.81 1.09 7 up
7 ssd 1.29999 1.00000 1.5 TiB 1.2 TiB 1.2 TiB 8.2 MiB 2.5 GiB 242 GiB 83.79 1.09 7 up
8 ssd 1.09999 1.00000 1.5 TiB 1.2 TiB 1.2 TiB 6.5 MiB 2.7 GiB 241 GiB 83.81 1.09 8 up
9 ssd 1.84999 1.00000 1.5 TiB 1.1 TiB 1.0 TiB 4.1 MiB 2.2 GiB 413 GiB 72.26 0.94 6 up
10 ssd 1.20000 1.00000 1.5 TiB 1.0 TiB 1.0 TiB 6.0 MiB 2.2 GiB 420 GiB 71.80 0.94 6 up
11 ssd 1.25000 1.00000 1.5 TiB 1.2 TiB 1.2 TiB 3.7 MiB 2.8 GiB 242 GiB 83.77 1.09 7 up
12 ssd 2.09999 1.00000 1.5 TiB 1.0 TiB 1.0 TiB 6.3 MiB 2.2 GiB 419 GiB 71.89 0.94 7 up
13 ssd 1.45549 1.00000 1.5 TiB 1.2 TiB 1.2 TiB 3.3 MiB 2.8 GiB 240 GiB 83.91 1.09 7 up
14 ssd 1.45549 1.00000 1.5 TiB 1.0 TiB 1.0 TiB 5.3 MiB 2.2 GiB 421 GiB 71.77 0.94 6 up
TOTAL 22 TiB 17 TiB 17 TiB 74 MiB 36 GiB 5.1 TiB 76.65
MIN/MAX VAR: 0.78/1.10 STDDEV: 7.30

So, here are the questions:
1. Is it possible to calculate, somehow, from the 'ceph osd df' output, where that 780 GB of available space comes from?
2. Are the misplaced PGs (I have too few) causing this? How is it possible that misplaced PGs eat into the 5.1 TiB / 3 (replicas) ≈ 1.7 TiB?
3. I know I have too few PGs; right now it's 32. If I want to slowly increase it to 256, how will that impact storage and network performance on the production pool?
4. Is it possible to increase the PGs like this: 32-64-96-128-160-192-224-256, or do I have to go 32-64-128-256?
 
How are the OSDs distributed across your failure domain (host, I assume)?

Does each host have 3 OSDs? If the distribution is imbalanced, the total capacity may not be available after replication.
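As far as I know, the free space the hypervisor shows comes from the pool's MAX AVAIL, which Ceph derives from the OSD that will fill up first relative to its CRUSH weight share, not from the sum of all free space. A rough back-of-the-envelope sketch from your output, assuming data is spread proportionally to CRUSH weight and the default full_ratio of 0.95 (check yours with 'ceph osd dump | grep full_ratio'):

0.95 x 1536 GiB - ~1300 GiB already used on osd.4 (the fullest one) ≈ 160 GiB of headroom
1.45549 / ~22.6 total CRUSH weight ≈ 6.4 % share of the data going to osd.4
160 GiB / 0.064 ≈ 2.4 TiB of raw writes until osd.4 hits the full ratio
2.4 TiB / 3 replicas ≈ 0.8 TiB usable

which is in the same ballpark as the ~780 GB you see. 'ceph df' shows MAX AVAIL per pool, and 'ceph osd df tree' shows how the OSDs and their usage are spread across the hosts.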

The number of PGs in a pool should always be a power of two. You can increase it to 256. The number of PGs does not have an effect on performance, but on data distribution: the more PGs, the more finely the data can be distributed. Just monitor the number of PGs per OSD; there should be between 100 and 200 PGs per OSD.
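A quick sanity check on those numbers for your cluster (assuming the pool keeps size=3 across all 15 OSDs): 32 PGs x 3 replicas / 15 OSDs ≈ 6 PG copies per OSD, which matches the PGS column in your output; 256 x 3 / 15 ≈ 51 per OSD, still on the low side; 512 x 3 / 15 ≈ 102 would be the first power of two inside the 100-200 range.

The increase itself is a single pool setting; '<your-pool>' below is just a placeholder for your production pool name:

ceph osd pool set <your-pool> pg_num 256

On Nautilus and newer the cluster raises pgp_num along with it and splits the PGs gradually; on older releases you also have to raise pgp_num yourself:

ceph osd pool set <your-pool> pgp_num 256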
 
Sorry, I don't understand what you mean by failure domain.
Each host has 3 OSDs, correct. What do you mean by "imbalanced"?
Thanks for the answers; about the PGs, that's what I was thinking too. My concern is that increasing the PGs will make Ceph move data around (disk I/O and network transfers), and I'd like to know how those movements will affect the performance of the VMs running on the pool (which is only one, the production one).
 
Right now I'm more focused on the performance issues I could have during the PG increase, in particular:

1. Is it possible to limit the disk I/O my SSDs spend on the PG increase, reserving some I/O for production?
2. Is it possible to STOP or PAUSE the PG increase if I see that it hurts my production VMs' performance too much?

Many thanks!
 
  1. The recovery operations already have a lower priority within the Ceph cluster than the client operations.
  2. You can pause it by setting the flags norebalance and nobackfill with "ceph osd set".
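For reference, a minimal command sketch; the flags are the ones mentioned above, and the throttle options are standard OSD settings (exact defaults differ per release):

ceph osd set norebalance
ceph osd set nobackfill

and later, to resume the data movement:

ceph osd unset nobackfill
ceph osd unset norebalance

If you only want to slow it down instead of pausing it, something like this should work on Nautilus or newer:

ceph config set osd osd_max_backfills 1
ceph config set osd osd_recovery_max_active 1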

PS: Please do learn a little more about Ceph before using it in production.
 
I agree with you; unfortunately, we relied on a supplier who was not as capable as we thought.

Thanks for the info tho, have a great day!
 
