CephFS constant high write I/O to the metadata pool

Jul 23, 2021
Hi,

I'm seeing constant 25-50 MB/s writes to the CephFS metadata pool even when all clients are idle and the cluster is idle and in a clean state. Surely this can't be normal?

Even with this write rate everything runs smoothly, but it's annoying to see this constant high write activity.
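
For reference, the per-pool client write rate can be watched like this; the pool name is just an example, substitute whatever your metadata pool is called:

    # show per-pool client I/O rates (read/write throughput and ops)
    ceph osd pool stats cephfs_metadata

    # or keep it refreshing
    watch -n 2 'ceph osd pool stats cephfs_metadata'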

The setup:
- a hyperconverged 3-node core cluster (1 active and 2 standby MDS) + 6 storage nodes
- altogether 12 SSD OSDs, 24 HDD OSDs, 12 NVMe OSDs
- Ceph 16.2.9

Any ideas on where to look?
Cheers!
o.
 
Hi, yes - in my case it was a symptom of a bunch of systems running updatedb for locate simultaneously, with those runs traversing onto the CephFS mounts.
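
If someone else needs to track down which clients are generating bursts like this, the active MDS can list its client sessions. This is only a sketch: the MDS name is a placeholder and the exact fields shown differ between releases:

    # list client sessions on the active MDS (replace <mds-name>);
    # per-client caps and request load help spot the noisy machines
    ceph tell mds.<mds-name> session ls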

Here's my post to ceph-users and a link to that thread:

"Hi - mostly as a note to future me and if anyone else looking for the same
issue...

I finally solved this a couple of months ago. No idea what is wrong with Ceph, but the root cause triggering this MDS issue was that I had several workstations and a couple of servers where the updatedb of "locate" was being run by the daily cron at exactly the same time every night, causing momentary high strain on the MDS, which then somehow screwed up the metadata caching and flushing, creating this cumulative write I/O.

The thing to note here is that there's a difference between the "locate" and "mlocate" packages. The default updatedb config (on Ubuntu at least) for "mlocate" does skip scanning CephFS filesystems, but not so for "locate", which happily ventures onto all of your CephFS mounts :|"

https://www.spinics.net/lists/ceph-users/msg83113.html
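
For anyone wanting to keep updatedb off their CephFS mounts, this is the kind of change that does it. Treat the values below as a sketch: the config file location and defaults differ between the mlocate/plocate and findutils "locate" packages and between distros, so check your own setup:

    # mlocate/plocate: /etc/updatedb.conf - make sure the ceph filesystem
    # types are in the prune list (Ubuntu's mlocate defaults already skip
    # them), or prune the mount point itself (the path is an example)
    PRUNEFS="<existing entries> ceph fuse.ceph"
    PRUNEPATHS="<existing entries> /mnt/cephfs"

    # findutils "locate": its updatedb accepts the same idea as options;
    # note these override the package defaults, so merge, don't replace
    updatedb --prunefs="<existing entries> ceph fuse.ceph" --prunepaths="<existing entries> /mnt/cephfs"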

Cheers!