Ceph deep-scrubbing performance optimization

hauke_laging

Apr 1, 2025
We are using Ceph on three nodes (10G network). There is one HDD pool (3 OSDs per node) and one NVMe pool; the NVMe devices also hold the WALs and DBs for the HDD OSDs. osd_max_scrubs is set to 1 for each OSD.
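
For reference, this is how that limit is set here (assuming the value lives in the cluster config database rather than in ceph.conf):

# default for all OSDs
ceph config set osd osd_max_scrubs 1
# verify what one daemon actually uses
ceph config get osd.0 osd_max_scrubs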

During the deep-scrub window (I have limited it to a few hours during the night) the cluster is terribly slow. I do not have benchmark results, but even booting a VM takes forever. I am trying to understand better how deep scrubbing works so that I may be able to improve the settings.
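
The night window is configured roughly like this (the hours below are just an example, not necessarily our exact values; as far as I can tell, deep scrubs follow the same window as normal scrubs):

# only schedule scrubs between 01:00 and 05:00 local time
ceph config set osd osd_scrub_begin_hour 1
ceph config set osd osd_scrub_end_hour 5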

In my understanding a deep scrub causes the PG to be read on all nodes and is "started" on the primary OSD for that PG. I do not know whether the scrubbing of the PG's replicas on the other OSDs is counted against the per-OSD scrub maximum and shown in the ceph -s numbers (I guess not, as I sometimes see just one PG being scrubbed, which otherwise does not seem to make sense).
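
To see what is actually running I have been checking the PG states; something like this should list the PGs currently in a deep scrub together with their up/acting OSD sets (the pattern assumes the usual "active+clean+scrubbing+deep" state string):

ceph pg dump pgs | grep 'scrubbing+deep'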

I wonder whether three deep-scrub operations can be active on the same OSD at the same time (which would be terrible for performance): one initiated on this OSD and two initiated on other OSDs that hold a PG replica on this OSD. In that case it might improve performance a lot to set the nodeep-scrub flag on all OSDs of two of the nodes (per scrub period).
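
What I have in mind would be something like the following; I only find the flag documented cluster-wide and per pool, so I am not sure it can actually be scoped to individual OSDs ("hdd-pool" is just a placeholder for our pool name):

# cluster-wide: stop scheduling new deep scrubs
ceph osd set nodeep-scrub
# re-enable afterwards
ceph osd unset nodeep-scrub
# per-pool variant
ceph osd pool set hdd-pool nodeep-scrub 1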

Is my understanding correct?