Too few PGs per OSD (28 < min 30)

luigi

Active Member
Feb 4, 2019
Hi,

On my Proxmox installation I have the following Ceph warning message:

too few PGs per OSD (28 < min 30)

How can I increase PGs per OSD?

Thanks
Luigi
 
You need to increase the number of PGs in your pools.

Tell us a few details about your Ceph setup (number of OSDs, pools, etc.), then we can tell you whether that makes sense or not.
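For reference, the change itself is normally just the pg_num setting on the pool. A rough sketch of what that looks like (the pool name and the target value are placeholders, so please don't run anything until we know your setup):

# check the current value first
ceph osd pool get <pool-name> pg_num

# then raise pg_num (and, on older releases, pgp_num as well)
ceph osd pool set <pool-name> pg_num 256
ceph osd pool set <pool-name> pgp_num 256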
 
Hi,

in the attached picture you can see that we have two pools: one for SSD disks (for VMs) and another one for HDD disks (for archive). Each pool has 128 PGs. The SSD pool has 15 SSD disks in total (5 SSD disks x 3 nodes).

The warning message is on SSD pool.

Thanks, Luigi
 

Attachments

  • PGs.png (10.7 KB)
Hm, it now looks to me like your Ceph is in a very critical state and will soon go into read-only mode.

Please post the output of ceph osd df tree.
 
Hi,

the required information is attached.

Thanks, Luigi
 

Attachments

  • ceph-df.jpg (330.5 KB)
Okay, that doesn't look as bad as the first screenshot. But there is a big difference in usage between some of the OSDs; you should fix that, because it also has a significant influence on your cluster's fill level.

See here: https://docs.ceph.com/en/latest/rados/operations/balancer/

But you won't get rid of the message even with that. Given the number of SSDs, I think you could go up to 256-512 PGs. With the HDDs, 256 PGs won't be an issue either.

Ceph recommends no more than 100 PGs per OSD. But don't turn everything up at once; it should always stay within limits, as the PG count can also have a significant influence on your performance. Also note that increasing the PGs initially puts a lot of load on the cluster while the data is redistributed. So do this pool by pool and activate the balancer beforehand so that you start from a good position.
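To make the order concrete, a minimal sketch of what I mean (the pool names are placeholders, and upmap mode assumes all clients are at least Luminous):

# enable the balancer first so the OSDs get evened out
ceph osd set-require-min-compat-client luminous
ceph balancer mode upmap
ceph balancer on
ceph balancer status

# then raise the PGs one pool at a time, waiting for HEALTH_OK in between
ceph osd pool set <ssd-pool> pg_num 256
# ... wait until all PGs are active+clean ...
ceph osd pool set <hdd-pool> pg_num 256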
 
Okay.
I have Proxmox VE 6.2.4 with Ceph version 14.2.9.

Thanks, Luigi
Update! Version-specific knowledge for these old versions is fading away.
https://pve.proxmox.com/wiki/Upgrade_from_6.x_to_7.0
https://pve.proxmox.com/wiki/Upgrade_from_7_to_8

The guides for the Ceph upgrades are also linked in these guides. Once you run a recent version, set a target ratio for these pools so that the (new) autoscaler can determine the right pg_num for the pools. AFAICT you have one pool per device class. So any target ratio will result in the autoscaler calculating with the assumption that the pool will use the full space in its device class.

Once you update, you will see a new pool, first called "device_health_metrics"; once you are on Ceph Quincy (17), it is renamed to ".mgr". You need to assign a device-class-specific rule to this pool as well for the autoscaler to work.
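Roughly, the commands would look like this (the rule names are just examples, and the pool names are placeholders for your SSD and HDD pools):

# replicated rules bound to a device class (names are examples)
ceph osd crush rule create-replicated replicated_ssd default host ssd
ceph osd crush rule create-replicated replicated_hdd default host hdd

# let the autoscaler assume each pool may use the full space of its device class
ceph osd pool set <ssd-pool> target_size_ratio 1
ceph osd pool set <hdd-pool> target_size_ratio 1

# give the .mgr (or device_health_metrics) pool a device-class rule as well
ceph osd pool set .mgr crush_rule replicated_ssd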
 
Hi aaron,
I read the first link (upgrade from 6.x to 7.0); there are a lot of steps to upgrade Proxmox and Ceph.

Do I need to shut down all VMs when I perform these steps?
Usually, how long does this upgrade take?
If something goes wrong, can I restart quickly and turn on the VMs?

Thanks, Luigi
 
Maybe a bit of an unpopular opinion, but...

I would recommend just creating a backup of all your VMs/containers and storing the backups on an external storage device.
Then do a wipe and a clean install of Proxmox VE 8.

Given that your system is already two major versions behind, in my eyes it is safer and quicker (since you need time for both the 6 to 7 and the 7 to 8 upgrade) to just do a full VM/container backup and then a wipe and install of Proxmox VE 8.
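For the backup step, something along these lines should do it (the storage name is only an example for an already configured external backup storage):

# back up all VMs and containers to an external storage
vzdump --all --storage external-backup --mode snapshot --compress zstd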

EDIT
I would say it is also safer since you do not carry over any version-specific configurations that might or might not break newer versions.
Even though it is (as far as I know) officially supported to upgrade the OS between major versions, I always try to do a clean install between major versions. (And I always do so for Debian major version changes, which is also relevant in your case.)
 
You can live migrate the VMs away, upgrade one node, migrate them back, and continue with the next node. Make sure to follow each step exactly and don't skip versions.

When doing live migrations, keep in mind that migrating within the same version, or from an older PVE version to a newer one, should always work. Live migrating from a newer to an older version, though, might not always work.
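As a rough example of the per-node workflow (the VM ID and node name are placeholders):

# move a running VM to another node before upgrading this one
qm migrate 100 target-node --online

# after the upgrade, migrate it back and continue with the next node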

@Daniel_Dog has a point though, if you don't mind the downtime and can back up all VMs, starting with a completely fresh installation could be faster than doing 2 PVE major version upgrades and 4 Ceph version upgrades.
 
