CEPH pools have too many placement groups

mojsiuk

Hi, we have a new installation: a 3-node Ceph cluster on Proxmox 6.3. Each node has 3 OSDs. We created a new pool via the GUI with the default 128 PGs. Now we have the health warning "pools have too many placement groups".

Detail:

1 pools have too many placement groups
Pool POOL_CEPH has 128 placement groups, should have 32

A strange value. Below is the output of ceph osd df tree:

ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS TYPE NAME
-1 7.85962 - 7.9 TiB 1.2 TiB 1.2 TiB 36 MiB 9.0 GiB 6.7 TiB 15.13 1.00 - root default
-3 2.61987 - 2.6 TiB 406 GiB 403 GiB 14 MiB 3.0 GiB 2.2 TiB 15.13 1.00 - host pve11
0 ssd 0.87329 1.00000 894 GiB 180 GiB 179 GiB 5.5 MiB 1018 MiB 714 GiB 20.18 1.33 57 up osd.0
1 ssd 0.87329 1.00000 894 GiB 117 GiB 116 GiB 3.4 MiB 1021 MiB 777 GiB 13.08 0.86 37 up osd.1
2 ssd 0.87329 1.00000 894 GiB 108 GiB 107 GiB 5.2 MiB 1019 MiB 786 GiB 12.13 0.80 35 up osd.2
-5 2.61987 - 2.6 TiB 406 GiB 403 GiB 12 MiB 3.0 GiB 2.2 TiB 15.13 1.00 - host pve12
3 ssd 0.87329 1.00000 894 GiB 105 GiB 104 GiB 5.0 MiB 1019 MiB 789 GiB 11.78 0.78 34 up osd.3
4 ssd 0.87329 1.00000 894 GiB 142 GiB 141 GiB 4.0 MiB 1020 MiB 752 GiB 15.90 1.05 45 up osd.4
5 ssd 0.87329 1.00000 894 GiB 158 GiB 157 GiB 2.5 MiB 1022 MiB 736 GiB 17.71 1.17 50 up osd.5
-7 2.61987 - 2.6 TiB 406 GiB 403 GiB 10 MiB 3.0 GiB 2.2 TiB 15.13 1.00 - host pve13
6 ssd 0.87329 1.00000 894 GiB 148 GiB 147 GiB 4.5 MiB 1019 MiB 746 GiB 16.57 1.10 47 up osd.6
7 ssd 0.87329 1.00000 894 GiB 134 GiB 133 GiB 3.2 MiB 1021 MiB 761 GiB 14.94 0.99 43 up osd.7
8 ssd 0.87329 1.00000 894 GiB 124 GiB 123 GiB 2.5 MiB 1022 MiB 770 GiB 13.88 0.92 39 up osd.8
TOTAL 7.9 TiB 1.2 TiB 1.2 TiB 36 MiB 9.0 GiB 6.7 TiB 15.13
MIN/MAX VAR: 0.78/1.33 STDDEV: 2.60

What can we do? How many PGs is the best choice for 9 OSDs and one main data pool?
 
What can we do?

The autoscaler bases its calculation on the current usage of the pool, and right now I guess the pool is still quite empty. You can view the current detailed status and also set expected pool sizes to help the autoscaler estimate the needed number of PGs.

See the Ceph docs: https://docs.ceph.com/en/latest/rados/operations/placement-groups/
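
Something like this, as a sketch (the ratio and size values are only examples you would adjust to your expected usage; POOL_CEPH is the pool from your warning):

Code:
# show how the autoscaler currently sees each pool
ceph osd pool autoscale-status

# tell the autoscaler this pool is expected to hold roughly 80% of the cluster's data
ceph osd pool set POOL_CEPH target_size_ratio 0.8

# or give an expected absolute size instead
ceph osd pool set POOL_CEPH target_size_bytes 1T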

How many PGs is the best choice for 9 OSDs and one main data pool?
There is no one-size-fits-all answer. You can try the PG calculator on the Ceph website (https://ceph.io/pgcalc/). Use the "All in one" use case and adjust the parameters accordingly to get an idea.
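
As a rough illustration of the rule of thumb that calculator uses (assuming the common target of about 100 PGs per OSD and a replicated pool of size 3):

Code:
(9 OSDs x 100 target PGs per OSD) / 3 replicas = 300  ->  nearest power of two = 256

Keep in mind that the autoscaler instead sizes pools by the data they actually hold (or are expected to hold), which is why it suggests only 32 PGs while the pool is nearly empty.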
 
Hi, I just installed Ceph Octopus (15.2.10) on Proxmox and see this warning message:

Code:
1 pools have too many placement groups
Pool ceph-data has 128 placement groups, should have 32

root@ASLCC6:~# ceph -v
ceph version 15.2.10 (c5cb846e8e85c920ff64c704524419fc9d2b8c34) octopus (stable)

# pveversion
pve-manager/6.3-6/2184247e (running kernel: 5.4.106-1-pve)

I installed Ceph on 3 nodes, created 1 OSD on each node (3 OSDs in total), and created only 1 pool. This setup ran without issues on the old Ceph Nautilus (14.2.19).

I noticed there is a default pool, device_health_metrics. Is it possible that this additional pool interferes with Ceph's monitoring feature?

Code:
# pveceph pool ls
┌───────────────────────┬──────┬──────────┬────────┬───────────────────┬─────────────────┬────────────────────┬──────────────┐
│ Name                  │ Size │ Min Size │ PG Num │ PG Autoscale Mode │ Crush Rule Name │             %-Used │         Used │
╞═══════════════════════╪══════╪══════════╪════════╪═══════════════════╪═════════════════╪════════════════════╪══════════════╡
│ ceph-data             │    3 │        2 │    128 │ warn              │ replicated_rule │ 0.0288491807878017 │ 492493776408 │
├───────────────────────┼──────┼──────────┼────────┼───────────────────┼─────────────────┼────────────────────┼──────────────┤
│ device_health_metrics │    3 │        2 │      1 │ on                │ replicated_rule │                  0 │            0 │
└───────────────────────┴──────┴──────────┴────────┴───────────────────┴─────────────────┴────────────────────┴──────────────┘
 
Hi, I just installed Ceph Octopus (15.2.10) on Proxmox and see this warning message:
With the (mostly empty) pool having 128 PGs, the autoscaler calculates the number of PGs it currently needs; if that differs from the current value by a factor of 3 or more, it will either warn or change it automatically, depending on the pool's PG autoscale mode.
Code:
ceph osd pool autoscale-status

You can set a target size or target ratio for the pool so the autoscaler knows what is expected.
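
For example, assuming ceph-data will end up holding essentially all of the data in the cluster (adjust the ratio if that is not the case):

Code:
# let the autoscaler plan for ceph-data holding ~100% of the data
ceph osd pool set ceph-data target_size_ratio 1.0

# optionally let the autoscaler change pg_num itself instead of only warning
ceph osd pool set ceph-data pg_autoscale_mode on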

The device_health_metrics pool is used internally by Ceph to keep track of the state of the physical disks.
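
If you are curious, you can see what it tracks with the device monitoring commands (a quick sketch; the device ID is whatever ceph device ls reports on your cluster):

Code:
# list the physical devices Ceph knows about
ceph device ls

# show the collected health/SMART metrics for one device
ceph device get-health-metrics <devid>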
 
