Problem with CEPH after upgrade

Digitaldaz

I have just upgraded my three-node cluster and now Ceph is reporting a health warning.

I did the upgrade by moving all VMs off each machine, running an apt-get dist-upgrade, and then rebooting.

After each node came back up, Ceph showed as degraded and eventually went back to clean.

I now have a health warning:

HEALTH_WARN too many PGs per OSD (256 > max 200)

But I should only have 1024 PGs in total across 12 OSDs.

ceph -s gives:

  cluster:
    id:     fb702694-18de-46d5-bf0a-49fdfa29ba27
    health: HEALTH_WARN
            too many PGs per OSD (256 > max 200)

  services:
    mon: 3 daemons, quorum pve1,pve2,pve3
    mgr: pve1(active), standbys: pve2, pve3
    osd: 12 osds: 12 up, 12 in

  data:
    pools:   2 pools, 1024 pgs
    objects: 181k objects, 722 GB
    usage:   2128 GB used, 8601 GB / 10729 GB avail
    pgs:     1024 active+clean

  io:
    client: 211 kB/s wr, 0 op/s rd, 34 op/s wr
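
For reference, the per-pool pg_num and replication size behind those totals can be checked directly with the ceph CLI (the pool name below is only a placeholder):

# Show every pool with its pg_num and replicated size
ceph osd pool ls detail

# Or query a single pool (replace "rbd" with the actual pool name)
ceph osd pool get rbd pg_num
ceph osd pool get rbd size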

Can anyone explain what is happening here, please?

TIA
Daz
 
Search the forum; there are similar posts (search for "HEALTH_WARN too many PGs per OSD").
 
The calculator still tells me that for two pools with 12 OSDs I should have 512 PGs per pool, which is exactly what I have.
 
If your pools have size = 3, then each OSD holds (1024 * 3 / 12) = 256 placement groups.
Now you'll have to either:
- add a new node with 4 OSDs (or add 4 OSDs to the existing nodes), so there will be (1024 * 3 / 16) = 192 PGs per OSD (and this is the best way); or
- change the 'mon pg warn max per osd' setting to something higher than 256 (or to 0 to disable the warning entirely), as sketched below.
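
To make the arithmetic concrete, here is a rough sketch of the per-OSD count and of the config workaround (the threshold value is only an example, and on newer Luminous releases the option is called mon_max_pg_per_osd rather than mon pg warn max per osd):

# PGs per OSD = total PGs across all pools * replica size / number of OSDs
# 1024 * 3 / 12 = 256  -> above the max of 200, so HEALTH_WARN
# 1024 * 3 / 16 = 192  -> below the threshold after adding 4 OSDs

# See how the PGs are actually spread across the OSDs (PGS column)
ceph osd df

# Workaround: raise or disable the threshold in the [global] section of ceph.conf,
# then restart the monitors:
#   mon pg warn max per osd = 300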
 

Thank you, now this makes much more sense.