Change in Ceph layout increases number of required PGs

René Pfeiffer

Hello!
We run a Proxmox cluster comprising four nodes. It is connected to an external Ceph cluster, also with four nodes. All nodes are evenly distributed across two physical locations. We would like to change the CRUSH map to tell Ceph which nodes are at which physical location, so I tried setting rack and datacenter labels to mark how the nodes are distributed.

Whenever I set the location1 and location2 labels, Ceph reports 166% misplaced PGs and stops serving I/O requests. If I reset the location, the cluster works again. I researched the status message, and the documentation says that some CRUSH map modifications require increasing the number of PGs. Is there a way to set the different location labels without changing the number of PGs?
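
For reference, this is roughly how I added the locations; the bucket names (dc1/dc2) and node names (ceph1–ceph4) below are placeholders, not our real ones:

```bash
# Create one datacenter bucket per location and hang them off the root
ceph osd crush add-bucket dc1 datacenter
ceph osd crush add-bucket dc2 datacenter
ceph osd crush move dc1 root=default
ceph osd crush move dc2 root=default

# Move the host buckets under their respective datacenter
ceph osd crush move ceph1 datacenter=dc1
ceph osd crush move ceph2 datacenter=dc1
ceph osd crush move ceph3 datacenter=dc2
ceph osd crush move ceph4 datacenter=dc2
```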

The Ceph cluster runs "version 15.2.9 (357616cbf726abb779ca75a551e8d02568e15b17) octopus" in Docker containers. An upgrade is scheduled, but it won't happen within the next three months.

Best,
René.
 
The Ceph cluster runs "version 15.2.9 (357616cbf726abb779ca75a551e8d02568e15b17) octopus" in Docker containers.
This means that the Ceph cluster is not deployed on Proxmox VE directly, right?

What CRUSH rule is assigned to that pool? Since it is still on Octopus, it can't be a stretch cluster (stretch mode was introduced in Pacific).
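
You can check it like this (replace `<pool>` and `<rule>` with the actual names):

```bash
# Which CRUSH rule does the pool use?
ceph osd pool get <pool> crush_rule

# Dump that rule to see its failure domain and device class constraints
ceph osd crush rule dump <rule>
```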

Do you see a lot of recovery traffic once you set the location? The CRUSH rule might enforce placement constraints that redistribute data to achieve better fault tolerance. If the recovery traffic is too high, that can cause issues for client operations.
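
If recovery does kick in and hurts client I/O, it can usually be throttled; for example (these values are just a conservative starting point, not a recommendation for your specific cluster):

```bash
# Limit concurrent backfill and recovery operations per OSD
ceph config set osd osd_max_backfills 1
ceph config set osd osd_recovery_max_active 1
```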

What does `ceph -s` report for the pool once you set the location in the CRUSH map? That should give you some hints as well.
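
Besides `ceph -s`, these usually help to narrow it down:

```bash
ceph -s              # overall status, incl. misplaced/degraded PG counts
ceph health detail   # per-check details behind the warnings
ceph osd tree        # verify the hierarchy looks as intended after the change
```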
 
Right, Ceph was deployed separately. The upgrade is planned to enable stretch cluster mode.

When I set the location, there is no recovery traffic. I suspect the problem is in the CRUSH rules, because we have one pool for SSDs and one for HDDs. I will prepare more information and post it here.
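
For context, the rules were created per device class, roughly like this (the rule and pool names here are examples, not our actual ones):

```bash
# One replicated rule per device class
ceph osd crush rule create-replicated replicated-ssd default host ssd
ceph osd crush rule create-replicated replicated-hdd default host hdd

# Assign the rules to the matching pools
ceph osd pool set pool-ssd crush_rule replicated-ssd
ceph osd pool set pool-hdd crush_rule replicated-hdd
```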
 
