Ceph 19.2.1: 2 OSD(s) experiencing slow operations in BlueStore

I upgraded to 19.2.2 before the weekend as well, but no luck:

HEALTH_WARN: 2 OSD(s) experiencing slow operations in BlueStore
osd.9 observed slow operation indications in BlueStore
osd.15 observed slow operation indications in BlueStore
 
I'm seeing all of the mentioned BlueStore warnings... Additionally, taking snapshots now takes forever.
This used to be a matter of seconds; now it takes minutes, and the snaptrim process loads the CPU for a very long time.

I feel this was introduced in Ceph 19.2.2. I will try to gather proper data for this; just sharing for now, in case others are experiencing it as well.
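
A rough way to see how much snaptrim work is actually queued, plus the throttle that governs it. This is just a generic sketch using standard Ceph options, not something taken from this thread:
Bash:
# count PGs currently in a snaptrim/snaptrim_wait state
ceph pg dump pgs_brief 2>/dev/null | grep -c snaptrim
# current trim throttle for SSD-backed OSDs (seconds of sleep between trims)
ceph config get osd osd_snap_trim_sleep_ssd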
 
After data recovery and upgrading to 19.2.2, I am now getting this on one of my pure SSD-class pools (no separate WAL or DB, basic replication only):


Code:
[WRN] BLUESTORE_SLOW_OP_ALERT: 1 OSD(s) experiencing slow operations in BlueStore
     osd.9 observed slow operation indications in BlueStore
[WRN] DB_DEVICE_STALLED_READ_ALERT: 1 OSD(s) experiencing stalled read in db device of BlueFS
     osd.9 observed stalled read indications in DB device

As best I can tell, this disk is perfectly healthy. :/
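
In case it helps anyone comparing notes, these are the kinds of checks I'd base "the disk looks healthy" on. osd.9 and /dev/sdX are placeholders, and the ceph daemon commands have to run on the host that carries the OSD:
Bash:
# SMART status of the underlying device
smartctl -a /dev/sdX
# ops currently blocked inside the OSD (admin socket, run on the OSD's host)
ceph daemon osd.9 dump_blocked_ops
# ops currently in flight, with how long they have been waiting
ceph daemon osd.9 dump_ops_in_flight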
What is your SSD model?
 
The problem still exists, and it is not solved. I can only look at the error message and pray that it will not crash. Haha.
 
A little info from here:

After changing a Crucial CT240BX500SSD1 to a WD Blue, the problem is gone.
 
Hi there!

As far as I can tell from the docs, it's not a disk failure; it's not even an error condition.

This feature was introduced in Reef with 18.2.5 and Squid with 19.2.1.

You can find the documentation here.
German-language blog post here.

You can adjust the two variables to your needs:
Code:
ceph config set global bluestore_slow_ops_warn_lifetime 21600
ceph config set global bluestore_slow_ops_warn_threshold 5
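
If it helps, the result can be verified afterwards. A small sketch, with osd.9 simply reused as the example OSD from earlier in the thread:
Bash:
# value an individual OSD resolves after the global change
ceph config get osd.9 bluestore_slow_ops_warn_threshold
# all slow-ops related overrides currently stored in the mon config db
ceph config dump | grep bluestore_slow_ops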

I would be very careful changing them both or setting thresholds too low, but you're the expert in your environment.
For my staging cluster, it silences the unnecessary noise, but I'm also not dealing with performance issues there, so who knows...
Now I can go forward with prod.

Regards and happy hacking,
Marianne
 
I have to say that I also found this documentation and set bluestore_slow_ops_warn_threshold per problematic OSD, and the warning is gone!

So it really does seem to be a feature...
 
Yeah, I still think something is up... I have three SSD OSDs across hosts, with different types, brands and controllers, all reporting this. I mean, MAYBE all three are being bogged down enough to delay I/O for more than a second, but it's not that busy... Maybe this is some sort of round-trip time that includes processing outside of the actual reads/writes...

I wish the docs said what these values are measured in. bluestore_slow_ops_warn_threshold seems to default to 1, so I assume 1 second. It looks like the default for bluestore_slow_ops_warn_lifetime is only 600 (10 minutes? Hours?).
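
Until then, the built-in config help should print the description, type and default straight from the running cluster, which beats guessing:
Bash:
# shows description, type and default value for each option
ceph config help bluestore_slow_ops_warn_threshold
ceph config help bluestore_slow_ops_warn_lifetime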

Will experiment here.
 
I've observed this for quite some time; it only happens on SSD OSDs, never on NVMe or HDD OSDs.
I'm still using Ceph 17.2.8-pve2.

When it happens, I run in the CLI:

ceph config set osd.x bluestore_slow_ops_warn_threshold 120

and the error goes away.

However, it comes back again randomly on a different SSD OSD, even after plugging in a new one.
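
For what it's worth, the override can also be dropped again later (e.g. after swapping the drive), with osd.x being the same placeholder as above:
Bash:
# remove the per-OSD override so the default/global value applies again
ceph config rm osd.x bluestore_slow_ops_warn_threshold
# check what the OSD resolves afterwards
ceph config get osd.x bluestore_slow_ops_warn_threshold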
 
Yes, I'm using 7200 RPM HDDs and had to set the following as a workaround:
Bash:
ceph config set class:hdd bluestore_slow_ops_warn_lifetime 21600
ceph config set class:hdd bluestore_slow_ops_warn_threshold 320
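
To double-check that the class:hdd mask is really what the OSDs pick up, something like this should do; osd.12 is only a made-up example ID:
Bash:
# ask a running HDD OSD which threshold it actually uses
ceph tell osd.12 config get bluestore_slow_ops_warn_threshold
# list the class-level overrides stored in the mon config db
ceph config dump | grep slow_ops_warn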
 
I see that 19.2.3 has been released. I don't know when it will be available. Will it solve this problem?
 
I think I was able to trigger these warnings with minor network disruptions to the cluster network (<<5 s; an LACP bond renegotiated), and in my case I don't think it has anything to do with the drives. It seems like the performance monitoring is just on a hair trigger, and any network hiccup will raise this warning too? As noted above, restarting the OSD does clear the warning until something else happens.

I see multiple references to this being a new Ceph feature (i.e. we got more observability; nothing new is broken?). It does sound useful for flagging drives with the wrong firmware for array use, the kind that go offline for multiple seconds retrying a failed read.
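
In case anyone needs it, the restart that clears the counter looks slightly different depending on how the OSDs are deployed; osd.9 is again just the example ID from this thread:
Bash:
# cephadm / orchestrator managed clusters
ceph orch daemon restart osd.9
# classic systemd-managed OSDs (e.g. Proxmox), run on the OSD's host
systemctl restart ceph-osd@9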
 
On my server it has disappeared... Strange.
Usually it disappears on Monday and Tuesday, and comes back on Wednesday, Thursday, and Friday. This is positively correlated with how busy your system is.