Ceph 19.2.1 2 OSD(s) experiencing slow operations in BlueStore

Just to ask, are we sure this is a problem? They added a warning for slowness, but has something actually gotten slower, or are they just alerting on the same behavior now?
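If anyone wants to check whether ops are actually stalling rather than the new warning just being chattier, something like this should show which OSDs are flagged and what the slow ops were (osd.<id> being whatever the health output names; run the daemon command on the node hosting that OSD):

Code:
ceph health detail                           # expands the warning and lists the affected OSDs
ceph daemon osd.<id> dump_historic_slow_ops  # recent slow ops and where they spent their time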
 
There have been 8 OSD errors today, but I haven't found anything to be particularly slow.
 
Hi,

Same issue here after updating from 8.3.5 to 8.4.1 and Ceph from 19.2.0 to 19.2.1.

Any help?

Thanks
 
I don't think we've seen this alert for a few weeks now.

FWIW I saw this post, but did not change any of our (19.2.1) settings:
 
One possibly related note, especially for those with multiple OSD classes: we set our few remaining HDDs to primary-affinity 0, so the primary read would always be from an SSD.

View:
Code:
ceph osd tree

Set:
Code:
ceph osd primary-affinity osd.12 0
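In case it saves someone a bit of typing, here is a minimal sketch that applies the same setting to every spinner in one go, assuming they are all in the hdd device class:

Code:
# zero the primary affinity of every OSD in the hdd device class
for id in $(ceph osd crush class ls-osd hdd); do
    ceph osd primary-affinity osd.$id 0
done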
 
> I don't think we've seen this alert for a few weeks now.
>
> FWIW I saw this post, but did not change any of our (19.2.1) settings:

It is helpful for me.
 
I had this slow operations in BlueStore issue as well and failed to resolve it with the fixes discussed here, just like the rest of you. I think I've found out the power supply in that machine was failing. I've just replaced it and have had no immediate issues (which I was having with BlueStore, and with disk I/O in general, on the old PSU). I highly doubt this helps anyone else; just sharing my observation in case it does help someone.
 
We have some remaining SAS 10k drives. On the prior platform they had a read/write cache SSD, which we're using for DB/WAL. They'll get replaced eventually.
For this specific case, I think it's normal to have random slow-op errors, as your PGs and replicas can be on storage of different speeds (so a primary write on a fast SSD will always wait for the replica on a slow HDD), and for reads it's really Russian roulette.
(Personally, I'd create two different pools, to be sure not to have that random behaviour.)
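For anyone wanting to try the two-pool route, a rough sketch, assuming the OSDs already carry the right ssd/hdd device classes; the rule and pool names below are just examples:

Code:
# CRUSH rules restricted to one device class each
ceph osd crush rule create-replicated rule_ssd default host ssd
ceph osd crush rule create-replicated rule_hdd default host hdd
# point each pool at the matching rule
ceph osd pool set pool_fast crush_rule rule_ssd
ceph osd pool set pool_slow crush_rule rule_hdd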
 
> for reads it's really Russian roulette

For reads, by default it's random, but affinity is set to 0 on the HDDs per my post above, so now all reads are from SSD.

I’m not saying you’re wrong about the rest, just that I haven’t seen this thread’s warning in a few weeks (both before and after that change, which was a week or so ago). So the warning is not really a factor for us, I guess. (And my point/question above was: since there wasn’t a visible warning before, was there actually a problem that people weren’t aware of? Or is there just new warning text people are concerned about, and nothing has changed performance-wise?)

It’s early, so I’m not at higher-math levels, but a “slow write” chance would depend on the number of hosts that have HDDs at all, and also on the relative number of HDDs vs SSDs in them? (A host with all SSDs would have zero chance, of course; a host with, say, 2 HDDs out of 6 drives has a 33% chance.)

Not sure I follow how a separate HDD pool helps with the random behavior…doesn’t matter anyway for us, as the HDDs are only in two servers, so not enough for 3/2 replication by themselves.

I was pointing out the affinity setting because if there are fewer random reads on a drive, I’d expect it to be a bit “faster” for writes due to less head seeking.
 
I see the same message and I only have SSDs. The strange thing is that the message goes away after 4-6 days and comes back after 7 days, without rebooting or anything else.

My cluster includes 5 devices and has been running for 5 years. The last Reef version had the problem, but not the ones before it.
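For what it's worth, the coming-and-going could just be the warning's tracking window rather than anything newly going wrong. If I'm reading the Squid changes right, this health warning is driven by a pair of OSD options (bluestore_slow_ops_warn_threshold and bluestore_slow_ops_warn_lifetime), which you could inspect or loosen, e.g.:

Code:
ceph config get osd bluestore_slow_ops_warn_threshold   # how many slow ops trip the warning
ceph config get osd bluestore_slow_ops_warn_lifetime    # how long (in seconds) a slow op keeps counting
ceph config set osd bluestore_slow_ops_warn_threshold 5 # example: only warn once 5 slow ops accumulate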
 