I still don't get Ceph...


Jun 20, 2020

I have this little three node PVE/Ceph cluster.

Due to performance issues, I got around to swapping my Ceph OSD SSDs once again. I marked one OSD on one of the nodes as out, and Ceph started rebalancing/remapping/backfilling (as expected). After that was done, I stopped the OSD and, I think, nothing happened (or maybe it did some more rebalancing/backfilling, I am not sure). Anyway, I waited a couple of hours and there was no further activity; the Ceph circle was all green.

Now, when I destroyed the OSD, Ceph started rebalancing/remapping/backfilling again. That was unexpected. I thought that once an OSD was stopped and the cluster had done its thing, it would no longer rely on that OSD, so destroying it would not matter anymore.

Can someone help me understand why it behaves this way?

Note that you can control whether rebalancing (and other data movement) happens via flags. It is of course also affected by your replication settings and CRUSH rules: e.g., with 3/2 (size/min_size) and failure domain host, you might not always see recovery/rebalancing, even when OSDs are removed.
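To make the flags part concrete, here is a minimal Python sketch that wraps `ceph osd set <flag>` and `ceph osd unset <flag>`. The flag names `norebalance`, `nobackfill` and `norecover` are real Ceph OSD flags; the helper names `flag_command` and `paused` are made up for this example, and it assumes the `ceph` CLI is on the PATH of a node with admin access.

```python
import subprocess
from contextlib import contextmanager

# Real Ceph OSD flags that pause data movement.
PAUSE_FLAGS = ("norebalance", "nobackfill", "norecover")


def flag_command(action, flag):
    """Build the argv for `ceph osd set <flag>` or `ceph osd unset <flag>`."""
    if action not in ("set", "unset"):
        raise ValueError(f"unknown action: {action}")
    if flag not in PAUSE_FLAGS:
        raise ValueError(f"unexpected flag: {flag}")
    return ["ceph", "osd", action, flag]


@contextmanager
def paused(flags=PAUSE_FLAGS, run=subprocess.check_call):
    """Hypothetical helper: set the flags on entry, always unset them on exit.

    `run` is injectable so the sketch can be tried without a cluster.
    """
    for flag in flags:
        run(flag_command("set", flag))
    try:
        yield
    finally:
        # Unset in reverse order, even if the maintenance step failed.
        for flag in reversed(flags):
            run(flag_command("unset", flag))
```

Usage would look like `with paused(): ...` around the maintenance step. Keep in mind that the flags only postpone the data movement; once they are unset, Ceph still does whatever the current CRUSH map requires.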
You have changed the cluster topology when destroying the OSD.
Every change to the cluster topology makes the CRUSH algorithm relocate some placement groups.
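One way to picture the difference between "out" and "destroyed" is the toy model below. It is not Ceph's real CRUSH code: the hash, the single flat bucket, and the function names `straw` and `place` are simplifications invented for illustration. It assumes that an out OSD keeps its CRUSH weight and is only rejected after being drawn (which forces a retry), while destroying the OSD in PVE removes it from the CRUSH map altogether.

```python
import hashlib
import math


def straw(pg, osd, attempt, weight):
    """Straw2-style draw: ln(u) / weight, with u hash-derived in (0, 1]."""
    digest = hashlib.sha256(f"{pg}:{osd}:{attempt}".encode()).digest()
    u = (int.from_bytes(digest[:8], "big") + 1) / 2**64
    return math.log(u) / weight


def place(pg, crush_weights, out=frozenset(), max_attempts=50):
    """Pick one OSD for a PG (toy model, one replica, one bucket).

    An 'out' OSD still takes part in the draw because its CRUSH weight
    is unchanged; if it wins, it is rejected and the draw is retried
    with a new attempt number.
    """
    for attempt in range(max_attempts):
        winner = max(
            crush_weights,
            key=lambda o: straw(pg, o, attempt, crush_weights[o]),
        )
        if winner not in out:
            return winner
    raise RuntimeError("no usable OSD found")


weights = {"osd.0": 1.0, "osd.1": 1.0, "osd.2": 1.0, "osd.3": 1.0}
pgs = range(256)

# 1) everything in, 2) osd.3 marked out, 3) osd.3 removed from the map
all_in = {pg: place(pg, weights) for pg in pgs}
after_out = {pg: place(pg, weights, out={"osd.3"}) for pg in pgs}
remaining = {o: w for o, w in weights.items() if o != "osd.3"}
after_destroy = {pg: place(pg, remaining) for pg in pgs}

moved_by_out = [pg for pg in pgs if all_in[pg] != after_out[pg]]
moved_by_destroy = [pg for pg in pgs if after_out[pg] != after_destroy[pg]]

print("PGs moved by marking out:   ", len(moved_by_out))
print("PGs moved again by removing:", len(moved_by_destroy))
```

In this model, the PGs that move a second time are a subset of the ones that originally lived on osd.3: while the OSD was out they were placed by a retry, and once it is gone they are placed by the first draw among the remaining OSDs, which is often a different one. A real cluster adds the host hierarchy on top (the host bucket's weight also shrinks), so the actual set of affected PGs can differ.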
Hmm, and do you know which PGs are relocated and why?
Thanks, I am aware of the flags, and I guess I want Ceph to do what has to be done. That's okay for me.

It is just that I had expected all the necessary rebalancing etc. to happen after I downed and outed the OSD. With that done, I expected the OSD to be gone in the eyes of Ceph. And that destroying wouldn't trigger any further action from Ceph.