[SOLVED] Ceph: delta in reported total space

Altinea

Hello,
We now have a well-working Ceph cluster with 5 nodes (PVE 5.4, Ceph Luminous).

OSDs are distributed like this:
* host1 :
osd.12 : 1750 GB
osd.13 : 1750 GB
* host2 :
osd.0 : 894 GB
osd.1 : 1750 GB
* host3 :
osd.5 : 1750 GB
* host4 :
osd.4 : 1750 GB
osd.5 : 894 GB
* host5 :
osd.14 : 1750 GB
osd.6 : 1750 GB

So the raw storage would be 14038 GB. With size=2 on the pool, I expected a total 'usable' storage of around 7000 GB.

However, ceph df is currently reporting this:
Code:
    NAME              ID     USED        %USED     MAX AVAIL     OBJECTS
    ceph-ssd-fast     1      4.46TiB     71.01       1.82TiB     1169441
So USED + MAX AVAIL comes to 'only' 6280 GB. Of course, I'll keep enough spare space for data rebalancing in case of a node failure (as always, NEVER go to 100% disk usage in a Ceph cluster).
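To make the comparison explicit, here is the back-of-the-envelope arithmetic (plain bash, GB figures taken from the OSD list above, mixing GB and TiB as loosely as ceph df does):
Code:
# expected usable space: sum of all OSD sizes, divided by the pool's size=2
raw=$(( 7 * 1750 + 2 * 894 ))    # 7x 1750 GB + 2x 894 GB = 14038 GB raw
expected=$(( raw / 2 ))          # ~7019 GB usable with 2 replicas

# what ceph df reports for the pool is USED + MAX AVAIL:
# 4.46 TiB + 1.82 TiB = 6.28 TiB, i.e. roughly 6280 GB
echo "expected: ${expected} GB / reported: 6280 GB / delta: $(( expected - 6280 )) GB"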

Does anyone have a clue about the missing 800 GB of RAW storage?
Could it be because host3 has only one OSD and a lower total storage?

I'm using http://florian.ca/ceph-calculator/ to calculate the 'safe' cluster size, and it reports 7019 GB as the 'risky' cluster size (basically RAW/2).

Thanks for your help,
Julien
 
Can you please post the complete ceph df detail output?

So the raw storage would be 14038 GB. With size=2 on the pool, I expected a total 'usable' storage of around 7000 GB.
That is dangerous, as in a failure condition the remaining copy might be in-flight and a subsequent failure might lead to data loss.
 
Here's the ceph df detail output:
Code:
GLOBAL:
    SIZE        AVAIL       RAW USED     %RAW USED     OBJECTS 
    38.5TiB     17.8TiB      20.6TiB         53.64       2.70M 
POOLS:
    NAME              ID     QUOTA OBJECTS     QUOTA BYTES     USED        %USED     MAX AVAIL     OBJECTS     DIRTY     READ        WRITE       RAW USED 
    ceph-ssd-fast     1      N/A               N/A             4.42TiB     68.24       2.06TiB     1159526     1.16M     6.78GiB     1.08GiB      8.84TiB 
    ceph-hdd          4      N/A               N/A             5.89TiB     55.08       4.80TiB     1545262     1.55M      237MiB      172MiB      11.8TiB

We have 2 pools; I'm only interested in the ssd-fast pool. And yesterday I added another 894 GB OSD to host3.
I watched the rebalancing process: the USED bytes didn't change at all, but %USED moved down and up a lot. I think it's mainly a question of how utilization is spread across the OSDs.
When I ran
# ceph osd test-reweight-by-utilization
and then
# ceph osd reweight-by-utilization

USED didn't change, but I gained some AVAIL storage. Something about the CRUSH map is still beyond my understanding.
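For the record, both commands also accept optional arguments; if I read the Luminous documentation correctly, the defaults are an overload threshold of 120 (percent of the average utilization), a maximum weight change of 0.05 per OSD, and at most 4 OSDs adjusted per run. A more explicit invocation would look like this (the values are only an example, always dry-run first):
Code:
# dry-run: show which OSDs would get a new reweight, without changing anything
ceph osd test-reweight-by-utilization 110 0.05 8

# apply it for real:
#   110  -> only touch OSDs above 110% of the average utilization
#   0.05 -> change each reweight by at most 0.05
#   8    -> adjust at most 8 OSDs in one run
ceph osd reweight-by-utilization 110 0.05 8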

And yes, size=2 is somewhat dangerous, just like RAID5 vs RAID6. But rebalancing starts as soon as an OSD is marked out. Capacity vs. safety is always a trade-off to consider.

Julien
 
And how much are you missing now?
 
Code:
ID CLASS    WEIGHT  REWEIGHT SIZE    USE     AVAIL   %USE  VAR  PGS
 6 ssd-fast 1.74649  1.00000 1.75TiB  956GiB  832GiB 53.45 1.00  54
14 ssd-fast 1.74649  0.90002 1.75TiB 1.11TiB  655GiB 63.39 1.18  64
 2 ssd-fast 0.87320  1.00000  894GiB  566GiB  328GiB 63.32 1.18  32
 4 ssd-fast 1.74649  0.95001 1.75TiB 1.09TiB  670GiB 62.53 1.17  63
 5 ssd-fast 1.74649  0.85004 1.75TiB 1.17TiB  585GiB 67.27 1.25  68
15 ssd-fast 0.87320  1.00000  894GiB  444GiB  450GiB 49.65 0.93  25
 3      hdd 3.63860  1.00000 3.64TiB 1.87TiB 1.77TiB 51.27 0.96  81
 7      hdd 3.63860  1.00000 3.64TiB 1.79TiB 1.84TiB 49.31 0.92  78
 0 ssd-fast 0.87320  0.95001  894GiB  547GiB  347GiB 61.21 1.14  31
 1 ssd-fast 1.74649  1.00000 1.75TiB  993GiB  796GiB 55.50 1.04  56
 8      hdd 2.72890  1.00000 2.73TiB 1.31TiB 1.42TiB 48.10 0.90  57
 9      hdd 5.45789  1.00000 5.46TiB 2.97TiB 2.49TiB 54.39 1.01 129
10      hdd 2.72890  1.00000 2.73TiB 1.27TiB 1.46TiB 46.41 0.87  55
11      hdd 5.45789  1.00000 5.46TiB 2.58TiB 2.88TiB 47.24 0.88 112
12 ssd-fast 1.74649  0.90002 1.75TiB 1.00TiB  761GiB 57.47 1.07  58
13 ssd-fast 1.74649  0.95001 1.75TiB 1.06TiB  708GiB 60.44 1.13  61
                       TOTAL 38.5TiB 20.6TiB 17.9TiB 53.62

Considering only ssd-fast: 14932 GB raw. With size=2, I would expect ~7466 GB of USED + MAX AVAIL in ceph df, but it 'only' reports 6480 GB. That leaves around 1 TB in the void ;-)

But as I reported above, that's probably due to some OSDs having a higher USED than others (see the VAR column), and 'playing' with reweight allows me to recover some free space after the rebalancing.
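A quick way to see that spread is to filter ceph osd df for the ssd-fast OSDs and sort them by utilization (assuming the column layout shown above, where %USE is column 8, VAR column 9 and PGS column 10):
Code:
# ssd-fast OSDs only, fullest first; MAX AVAIL is effectively capped by the fullest one
ceph osd df | awk '$2 == "ssd-fast" {print $8, "osd."$1, "VAR="$9, "PGS="$10}' | sort -nr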

Best regards,
Julien
 
The distribution is a little bit uneven. The PG count per OSD should be close to 100.
https://ceph.io/pgcalc/

I also recommend upgrading, and not only because PVE 5.4 goes EoL in June: the upgrade will also bring Ceph Nautilus, which shows more accurately how much space is available per device class.
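As a rough sketch of what that means here (using the pool name from this thread and the ~100 PGs/OSD rule of thumb from pgcalc, so treat the numbers as an example, not a recommendation): with 10 ssd-fast OSDs and size=2, the target works out to roughly 512 PGs for the pool.
Code:
# rule of thumb: target pg_num ~= (100 PGs per OSD * number of OSDs in the rule) / pool size
echo $(( 100 * 10 / 2 ))                  # = 500 -> round up to the next power of two: 512

# check what the pool currently uses
ceph osd pool get ceph-ssd-fast pg_num

# raising it (pg_num can only be increased on Luminous, and it will trigger rebalancing):
# ceph osd pool set ceph-ssd-fast pg_num 512
# ceph osd pool set ceph-ssd-fast pgp_num 512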
 
The difference is probably due to uneven distribution, yes.

I just finished the migration from DRBD to Ceph and the upgrade of all 'old' PVE 5.2 nodes to the latest PVE 5.4. The upgrade to PVE 6 and Nautilus is the next move, but it has to be carefully planned and tested as we're in a production environment.

Thanks for your advice,
Julien
 
