[SOLVED] Ceph pool size and OSD data distribution

lifeboy

Renowned Member
Note: This is more of an effort to understand how the system works than to get support. I know PVE 5 is not supported anymore...

I have a 7-node cluster which is complaining:
Code:
root@s1:~# ceph -s
  cluster:
    id:     a6092407-216f-41ff-bccb-9bed78587ac3
    health: HEALTH_WARN
            1 nearfull osd(s)
            4 pool(s) nearfull
 
  services:
    mon: 3 daemons, quorum sm1,2,s5
    mgr: s1(active), standbys: s5, sm1
    mds: cephfs-1/1/1 up  {0=s1=up:active}, 2 up:standby
    osd: 23 osds: 23 up, 23 in
 
  data:
    pools:   4 pools, 1312 pgs
    objects: 1.26M objects, 4.64TiB
    usage:   11.4TiB used, 8.48TiB / 19.8TiB avail
    pgs:     1312 active+clean
 
  io:
    client:   542KiB/s wr, 0op/s rd, 32op/s wr

I see that the distribution of data over the OSDs is very uneven. In particular, host s1 has six 300GB SAS drives that are identical in spec, yet one is more than 89% in use while another is at just over 40%. What causes this?

Code:
root@s1:~# ceph osd df tree
ID  CLASS WEIGHT   REWEIGHT SIZE    USE     DATA    OMAP    META    AVAIL   %USE  VAR  PGS TYPE NAME    
 -1       19.82628        - 19.8TiB 11.4TiB 11.3TiB 2.18GiB 32.9GiB 8.48TiB 57.25 1.00   - root default  
 -2        6.36676        - 6.37TiB 3.04TiB 3.03TiB  653MiB 11.3GiB 3.33TiB 47.76 0.83   -     host hp1  
  3   hdd  0.90959  1.00000  931GiB  422GiB  420GiB 84.6MiB 2.21GiB  509GiB 45.35 0.79 143         osd.3
  4   hdd  0.68210  1.00000  699GiB  265GiB  264GiB 66.7MiB  957MiB  433GiB 37.95 0.66  94         osd.4
  6   hdd  0.68210  1.00000  699GiB  308GiB  307GiB 64.7MiB  988MiB  390GiB 44.15 0.77  99         osd.6
  7   hdd  0.68210  1.00000  699GiB  346GiB  345GiB 74.4MiB  988MiB  353GiB 49.51 0.86 109         osd.7
 16   hdd  0.90959  1.00000  931GiB  461GiB  460GiB  103MiB 1.13GiB  470GiB 49.51 0.86 145         osd.16
 19   hdd  0.90959  1.00000  931GiB  516GiB  514GiB 96.2MiB 2.06GiB  415GiB 55.40 0.97 140         osd.19
 22   hdd  0.68210  1.00000  699GiB  290GiB  288GiB 68.9MiB 1.91GiB  408GiB 41.55 0.73  98         osd.22
 24   hdd  0.90959  1.00000  931GiB  505GiB  504GiB 94.8MiB 1.17GiB  426GiB 54.21 0.95 150         osd.24
 -3        1.63440        - 1.63TiB 1.07TiB 1.06TiB  236MiB 5.77GiB  582GiB 65.22 1.14   -     host s1  
 10   hdd  0.27240  1.00000  279GiB  152GiB  151GiB 19.9MiB 1004MiB  127GiB 54.35 0.95  44         osd.10
 11   hdd  0.27240  1.00000  279GiB  114GiB  113GiB 43.3MiB  981MiB  165GiB 40.91 0.71  63         osd.11
 12   hdd  0.27240  1.00000  279GiB  180GiB  179GiB 41.4MiB  983MiB 98.6GiB 64.66 1.13  58         osd.12
 13   hdd  0.27240  1.00000  279GiB  190GiB  189GiB 33.8MiB  990MiB 89.4GiB 67.96 1.19  52         osd.13
 14   hdd  0.27240  1.00000  279GiB  249GiB  248GiB 48.6MiB  975MiB 30.0GiB 89.26 1.56  67         osd.14
 15   hdd  0.27240  1.00000  279GiB  207GiB  206GiB 49.2MiB  975MiB 72.0GiB 74.17 1.30  60         osd.15
 -4        2.72888        - 2.73TiB 1.71TiB 1.70TiB  279MiB 4.47GiB 1.02TiB 62.64 1.09   -     host s2  
  9   hdd  1.81929  1.00000 1.82TiB 1.15TiB 1.15TiB  196MiB 2.35GiB  685GiB 63.21 1.10 390         osd.9
 17   hdd  0.90959  1.00000  931GiB  573GiB  571GiB 83.3MiB 2.12GiB  359GiB 61.50 1.07 181         osd.17
 -6        1.81929        - 1.82TiB 1.24TiB 1.24TiB  203MiB 2.34GiB  594GiB 68.12 1.19   -     host s4  
 18   hdd  1.81929  1.00000 1.82TiB 1.24TiB 1.24TiB  203MiB 2.34GiB  594GiB 68.12 1.19 407         osd.18
 -7        2.72888        - 2.73TiB 1.73TiB 1.72TiB  341MiB 3.48GiB 1.00TiB 63.25 1.10   -     host s5  
  2   hdd  1.81929  1.00000 1.82TiB 1.09TiB 1.09TiB  203MiB 2.06GiB  747GiB 59.89 1.05 368         osd.2
 20   hdd  0.90959  1.00000  931GiB  652GiB  650GiB  138MiB 1.42GiB  280GiB 69.96 1.22 215         osd.20
-15        2.72888        - 2.73TiB 1.41TiB 1.41TiB  307MiB 2.98GiB 1.32TiB 51.76 0.90   -     host s6  
  0   hdd  1.81929  1.00000 1.82TiB  923GiB  921GiB  182MiB 1.81GiB  940GiB 49.56 0.87 358         osd.0
  1   hdd  0.90959  1.00000  931GiB  523GiB  522GiB  125MiB 1.18GiB  408GiB 56.18 0.98 187         osd.1
 -5        1.81918        - 1.82TiB 1.16TiB 1.15TiB  211MiB 2.56GiB  679GiB 63.56 1.11   -     host sm1  
  5   hdd  0.90959  1.00000  931GiB  558GiB  557GiB  116MiB 1.23GiB  373GiB 59.94 1.05 182         osd.5
  8   hdd  0.90959  1.00000  931GiB  626GiB  624GiB 95.5MiB 1.33GiB  306GiB 67.18 1.17 198         osd.8
                      TOTAL 19.8TiB 11.4TiB 11.3TiB 2.18GiB 32.9GiB 8.48TiB 57.25                        
MIN/MAX VAR: 0.66/1.56  STDDEV: 12.03

Is this maybe better asked in the ceph forums?

thanks

Roland
 
The capacity distribution is very uneven. In theory Ceph can handle that, but in practice the algorithm produces the outliers that you see.

You should try to distribute your OSDs better across the nodes. In this case I would start to swap some between "hp1" and "s1".
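
If physically swapping drives is not practical, a software-only alternative is the mgr balancer module, which is available since Luminous (the release PVE 5 ships). A rough sketch, assuming all clients speak at least Luminous; check first and treat it as a starting point, not a tested recipe:

Code:
# check which feature releases the connected clients report
ceph features

# upmap mode requires Luminous or newer clients
ceph osd set-require-min-compat-client luminous

# enable the balancer and let it optimize PG placement with upmap
ceph mgr module enable balancer
ceph balancer mode upmap
ceph balancer on
ceph balancer status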
 
I'm afraid the drive tech of these machines is substantially different. The S1 is a Sunfire X4150 with 2.5" SAS drives, whereas the HP is a ProLiant DL320s G1 with 5.25" SATA drives :-)
I'm going to try to adjust the weight of the OSD that's too full to see if I can bring it down that way. This is after all my dev/test/backup cluster...
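
For the record, what I have in mind is something like the following (using osd.14 as the example, since it is the fullest one on s1; the 0.85 value is just a starting guess that I would adjust in small steps):

Code:
# temporarily lower the reweight of the overfull OSD so some PGs move off it
ceph osd reweight 14 0.85

# alternatively, let Ceph pick candidates itself: dry-run first, then apply
ceph osd test-reweight-by-utilization
ceph osd reweight-by-utilization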

Furthermore, I suppose nothing much has changed in the distribution algorithm in newer versions of Ceph, or has it?
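
For what it's worth, the CRUSH tunables profile the cluster is currently running with can at least be checked like this (just a quick look; the output depends on the release):

Code:
# show the CRUSH tunables currently in effect
ceph osd crush show-tunables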
 
