[SOLVED] Issue with spikes in OSD apply latency max

Ting

Member
Oct 19, 2021
Hi,

I have an 8-node HA cluster, with Ceph running on 6 of the nodes and 12 OSDs in total. Once a week I see a spike in the max OSD apply latency reading (see attached picture). How can I troubleshoot this and find out which node and which OSD is causing it? Any help would be much appreciated.

1718406200186.png
 
I think I found my answer.

On the OSD page in the GUI, I found that two OSDs had the highest latency. I replaced those two disks, and everything seems to be fine now.
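For anyone who wants to do the same check from the command line instead of the GUI, here is a rough Python sketch. It assumes the `ceph` CLI is available on the node and that `ceph osd perf --format json` returns a list named `osd_perf_infos` (nested under `osdstats` on newer releases, at the top level on older ones). The JSON layout is an assumption, so check the output on your own version first. The function names `read_osd_perf` and `slowest_osds` are made up for this example.

```python
import json
import subprocess


def read_osd_perf():
    """Run `ceph osd perf` with JSON output and return the parsed result."""
    out = subprocess.run(
        ["ceph", "osd", "perf", "--format", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def slowest_osds(perf, top_n=2):
    """Return (osd_id, apply_latency_ms, commit_latency_ms) for the slowest OSDs."""
    # Assumption: newer releases nest the list under "osdstats",
    # older ones keep "osd_perf_infos" at the top level.
    infos = perf.get("osdstats", perf).get("osd_perf_infos", [])
    rows = [
        (
            info["id"],
            info["perf_stats"]["apply_latency_ms"],
            info["perf_stats"]["commit_latency_ms"],
        )
        for info in infos
    ]
    # Highest apply latency first.
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:top_n]
```

Calling `slowest_osds(read_osd_perf())` on a Ceph node should list the two OSDs with the highest apply latency at that moment. Since the spike only shows up about once a week, it would need to be run (or scheduled) around the time of the spike. `ceph osd find <id>` should then show which host a given OSD lives on.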