After upgrading to Ceph Reef, getting DB spillover.

BloodBlight

I have a slightly unusual configuration, but nothing too crazy, I don't think. I have 2 OSDs (HDDs) per DB/WAL device: the DB is on an SSD and the WAL on a tiny Optane. This is a 3-node cluster using erasure coding and a custom balancing algorithm (something I cooked up). Okay, "weirdness" out of the way.
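
For reference, this is roughly how I confirm which block.db / block.wal each OSD is mapped to (assuming standard ceph-volume LVM OSDs; osd.3 is just one of the affected OSDs as an example):
Code:
# run on each node; lists block, block.db and block.wal devices per OSD
ceph-volume lvm list

# device and BlueFS details for a single OSD as the cluster sees them
ceph osd metadata 3 | grep -E 'bluefs|devices'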

After upgrading to Ceph Reef (not right away, about two days after), I now get this warning:
Code:
4 OSD(s) experiencing BlueFS spillover

Ceph health detail shows:
Code:
[WRN] BLUEFS_SPILLOVER: 4 OSD(s) experiencing BlueFS spillover
     osd.3 spilled over 3.2 GiB metadata from 'db' device (19 GiB used of 45 GiB) to slow device
     osd.4 spilled over 4.3 GiB metadata from 'db' device (19 GiB used of 45 GiB) to slow device
     osd.5 spilled over 3.1 GiB metadata from 'db' device (20 GiB used of 45 GiB) to slow device
     osd.6 spilled over 4.5 GiB metadata from 'db' device (18 GiB used of 45 GiB) to slow device
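
To see exactly what BlueFS has placed on each device, the per-OSD admin socket can be queried (sketch only; run on the node hosting each affected OSD and substitute the OSD id):
Code:
# per-device BlueFS usage (db / wal / slow)
ceph daemon osd.3 bluefs stats

# raw counters, including db_used_bytes and slow_used_bytes
ceph daemon osd.3 perf dump bluefs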

Note that the amount spilling over is very small compared to the size of each DB and its remaining free space (each device is over 50% free).

I checked bluestore_max_alloc_size; it is set to 0 on all OSDs...
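
A few other things I'm planning to check, sketched below. The paths assume a non-containerized OSD layout under /var/lib/ceph/osd, and osd.3 is only used as the example id, so treat this as a rough checklist rather than a confirmed fix:
Code:
# per-OSD config values that affect how BlueFS places data
ceph config get osd.3 bluestore_max_alloc_size
ceph config get osd.3 bluestore_volume_selection_policy

# manual RocksDB compaction; small spillovers are often cleared by this
ceph tell osd.3 compact

# if spillover persists, spilled metadata can be migrated back to the DB
# device with the OSD stopped:
# ceph-bluestore-tool bluefs-bdev-migrate --path /var/lib/ceph/osd/ceph-3 \
#     --devs-source /var/lib/ceph/osd/ceph-3/block \
#     --dev-target /var/lib/ceph/osd/ceph-3/block.db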

Ideas? Things to check?
 
Small update to this: osd.3 seems to have reduced its spillover slightly, while the others have increased. There is no replication happening at this time, and very little I/O being done in general... osd.5's spillover seems to have grown by more than a GiB. This seems really odd.

Code:
[WRN] BLUEFS_SPILLOVER: 4 OSD(s) experiencing BlueFS spillover
     osd.3 spilled over 3.1 GiB metadata from 'db' device (20 GiB used of 45 GiB) to slow device
     osd.4 spilled over 4.6 GiB metadata from 'db' device (19 GiB used of 45 GiB) to slow device
     osd.5 spilled over 4.3 GiB metadata from 'db' device (19 GiB used of 45 GiB) to slow device
     osd.6 spilled over 4.7 GiB metadata from 'db' device (18 GiB used of 45 GiB) to slow device
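
To figure out whether the spilled amount is genuinely still growing (rather than the health message just lagging behind compactions), I'll probably watch the raw counters for a while. A rough sketch, with the ids adjusted to whichever OSDs live on each node:
Code:
# prints bytes used on the db vs slow device for each local OSD
for id in 3 4; do
    echo "osd.$id"
    ceph daemon osd.$id perf dump bluefs | grep -E '"(db_used_bytes|slow_used_bytes)"'
done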