CEPH pools have too many placement groups

mojsiuk

Hi, we have a new installation: a 3-node Ceph cluster on Proxmox 6.3. Each node has 3 OSDs. We created a new pool via the GUI with the default 128 PGs. Now we have the health warning "pools have too many placement groups".

Detail:

1 pools have too many placement groups
Pool POOL_CEPH has 128 placement groups, should have 32

A strange value. Below is the output of ceph osd df tree:

ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS TYPE NAME
-1 7.85962 - 7.9 TiB 1.2 TiB 1.2 TiB 36 MiB 9.0 GiB 6.7 TiB 15.13 1.00 - root default
-3 2.61987 - 2.6 TiB 406 GiB 403 GiB 14 MiB 3.0 GiB 2.2 TiB 15.13 1.00 - host pve11
0 ssd 0.87329 1.00000 894 GiB 180 GiB 179 GiB 5.5 MiB 1018 MiB 714 GiB 20.18 1.33 57 up osd.0
1 ssd 0.87329 1.00000 894 GiB 117 GiB 116 GiB 3.4 MiB 1021 MiB 777 GiB 13.08 0.86 37 up osd.1
2 ssd 0.87329 1.00000 894 GiB 108 GiB 107 GiB 5.2 MiB 1019 MiB 786 GiB 12.13 0.80 35 up osd.2
-5 2.61987 - 2.6 TiB 406 GiB 403 GiB 12 MiB 3.0 GiB 2.2 TiB 15.13 1.00 - host pve12
3 ssd 0.87329 1.00000 894 GiB 105 GiB 104 GiB 5.0 MiB 1019 MiB 789 GiB 11.78 0.78 34 up osd.3
4 ssd 0.87329 1.00000 894 GiB 142 GiB 141 GiB 4.0 MiB 1020 MiB 752 GiB 15.90 1.05 45 up osd.4
5 ssd 0.87329 1.00000 894 GiB 158 GiB 157 GiB 2.5 MiB 1022 MiB 736 GiB 17.71 1.17 50 up osd.5
-7 2.61987 - 2.6 TiB 406 GiB 403 GiB 10 MiB 3.0 GiB 2.2 TiB 15.13 1.00 - host pve13
6 ssd 0.87329 1.00000 894 GiB 148 GiB 147 GiB 4.5 MiB 1019 MiB 746 GiB 16.57 1.10 47 up osd.6
7 ssd 0.87329 1.00000 894 GiB 134 GiB 133 GiB 3.2 MiB 1021 MiB 761 GiB 14.94 0.99 43 up osd.7
8 ssd 0.87329 1.00000 894 GiB 124 GiB 123 GiB 2.5 MiB 1022 MiB 770 GiB 13.88 0.92 39 up osd.8
TOTAL 7.9 TiB 1.2 TiB 1.2 TiB 36 MiB 9.0 GiB 6.7 TiB 15.13
MIN/MAX VAR: 0.78/1.33 STDDEV: 2.60

What can we do? How many PGs is the best choice for 9 OSDs and one main data pool?
 
What can we do?

The autoscaler bases its calculation on the current usage of the pool, and right now I guess the pool is still quite empty. You can view the current detailed status and also set expected pool sizes to help the autoscaler estimate the needed number of PGs.

See the Ceph docs: https://docs.ceph.com/en/latest/rados/operations/placement-groups/
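
Something like this, as a sketch (the ratio and size values are only examples you would adjust to your expected usage; POOL_CEPH is the pool from your warning):

Code:
# show how the autoscaler currently sees each pool
ceph osd pool autoscale-status

# tell the autoscaler this pool is expected to hold roughly 80% of the cluster's data
ceph osd pool set POOL_CEPH target_size_ratio 0.8

# or give an expected absolute size instead
ceph osd pool set POOL_CEPH target_size_bytes 1T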

How many PGs is the best choice for 9 OSDs and one main data pool?
There is no one-size-fits-all answer. You can try the PG calculator on the Ceph website (https://ceph.io/pgcalc/). Use the "All in one" use case and adjust the parameters accordingly to get an idea.
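
As a rough illustration of the rule of thumb that calculator uses (assuming the common target of about 100 PGs per OSD and a replicated pool of size 3):

Code:
(9 OSDs x 100 target PGs per OSD) / 3 replicas = 300  ->  nearest power of two = 256

Keep in mind that the autoscaler instead sizes pools by the data they actually hold (or are expected to hold), which is why it suggests only 32 PGs while the pool is nearly empty.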
 
Hi, I just installed Ceph Octopus (15.2.10) on Proxmox and see this warning message:

Code:
1 pools have too many placement groups
Pool ceph-data has 128 placement groups, should have 32

root@ASLCC6:~# ceph -v
ceph version 15.2.10 (c5cb846e8e85c920ff64c704524419fc9d2b8c34) octopus (stable)

# pveversion
pve-manager/6.3-6/2184247e (running kernel: 5.4.106-1-pve)

I installed Ceph on 3 nodes, created 1 OSD on each node (3 OSDs in total), and created only 1 pool. This setup ran without issues on the old Ceph Nautilus (14.2.19).

I noticed there is a default pool, device_health_metrics. Is it possible that this additional pool interferes with Ceph's monitoring feature?

Code:
# pveceph pool ls
┌───────────────────────┬──────┬──────────┬────────┬───────────────────┬─────────────────┬────────────────────┬──────────────┐
│ Name                  │ Size │ Min Size │ PG Num │ PG Autoscale Mode │ Crush Rule Name │             %-Used │         Used │
╞═══════════════════════╪══════╪══════════╪════════╪═══════════════════╪═════════════════╪════════════════════╪══════════════╡
│ ceph-data             │    3 │        2 │    128 │ warn              │ replicated_rule │ 0.0288491807878017 │ 492493776408 │
├───────────────────────┼──────┼──────────┼────────┼───────────────────┼─────────────────┼────────────────────┼──────────────┤
│ device_health_metrics │    3 │        2 │      1 │ on                │ replicated_rule │                  0 │            0 │
└───────────────────────┴──────┴──────────┴────────┴───────────────────┴─────────────────┴────────────────────┴──────────────┘
 
Hi, I just installed Ceph Octopus (15.2.10) on Proxmox and see this warning message:
With the (mostly empty) pool having 128 PGs, the autoscaler calculates the number of PGs it currently needs; if that differs from the current value by a factor of 3 or more, it will either warn or change it automatically, depending on the pool's PG autoscale mode.
Code:
ceph osd pool autoscale-status

You can set a target size or target ratio for the pool so the autoscaler knows what is expected.
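
For example, assuming ceph-data will end up holding essentially all of the data in the cluster (adjust the ratio if that is not the case):

Code:
# let the autoscaler plan for ceph-data holding ~100% of the data
ceph osd pool set ceph-data target_size_ratio 1.0

# optionally let the autoscaler change pg_num itself instead of only warning
ceph osd pool set ceph-data pg_autoscale_mode on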

The device_health_metrics pool is used internally by Ceph to keep track of the state of the physical disks.
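
If you are curious, you can see what it tracks with the device monitoring commands (a quick sketch; the device ID is whatever ceph device ls reports on your cluster):

Code:
# list the physical devices Ceph knows about
ceph device ls

# show the collected health/SMART metrics for one device
ceph device get-health-metrics <devid>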
 
