Ceph OSD apply latency is high

Edwin Ye

New Member
Mar 30, 2019
Hello Sirs.

Has anyone encountered the same issue?

I found that one of the OSDs in our production Proxmox Ceph cluster had a high apply latency (around 500 ms).
This degraded the performance of the whole cluster. After I restarted the OSD, cluster performance returned to normal.

Why does one OSD with high apply latency degrade the performance of the whole Ceph cluster?
How can I fix this issue?
If I want to monitor the apply latency of all OSDs, what threshold in milliseconds would be best practice?

Thank you in advance.

Edwin.
 
Why does one OSD with high apply latency degrade the performance of the whole Ceph cluster?
Because Ceph is a distributed storage system, all Ceph services are connected with each other, and the weakest link determines the performance of the cluster.
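
To make the "weakest link" point concrete: a replicated write is acknowledged only after every OSD holding a copy has completed it, so the write is as slow as the slowest OSD involved. The sketch below is a toy model, not Ceph code. It assumes 12 OSDs, 3 replicas, a 5 ms baseline and one OSD at 500 ms, and it picks replica sets at random instead of using CRUSH; all of these numbers and names are made up for illustration.

```python
import random


def write_latency_ms(acting_set, osd_latency_ms):
    """A replicated write finishes only when its slowest replica has finished."""
    return max(osd_latency_ms[osd] for osd in acting_set)


def simulate(num_osds=12, replicas=3, num_writes=10_000,
             slow_osd=0, slow_ms=500.0, normal_ms=5.0, seed=1):
    """Return (fraction of writes hit by the slow OSD, mean write latency in ms).

    Replica sets are drawn uniformly at random, which is a simplification:
    a real cluster places data with CRUSH.
    """
    rng = random.Random(seed)
    latency = {osd: normal_ms for osd in range(num_osds)}
    latency[slow_osd] = slow_ms  # the single misbehaving OSD

    samples = [
        write_latency_ms(rng.sample(range(num_osds), replicas), latency)
        for _ in range(num_writes)
    ]
    affected = sum(1 for s in samples if s >= slow_ms) / num_writes
    mean = sum(samples) / num_writes
    return affected, mean
```

Under these assumptions roughly a quarter of all writes (3 replicas out of 12 OSDs) wait for the slow OSD, which pulls the average write latency far above the 5 ms baseline even though eleven of the twelve OSDs are healthy.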

How can I fix this issue?
This requires monitoring the cluster and checking the health of every subsystem involved.

If I want to monitor the apply latency of all OSDs, what threshold in milliseconds would be best practice?
As low as possible. The right value depends on your hardware and performance requirements. For details on reading the performance counters, see the link below.
https://access.redhat.com/documenta...tml/administration_guide/performance_counters
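
For a quick check without digging into the full counter set, `ceph osd perf` prints the commit and apply latency of every OSD. Below is a minimal sketch of a watcher built on its JSON output. The 100 ms threshold is only a placeholder to tune for your hardware, and the JSON field names (`osd_perf_infos`, `perf_stats`, `apply_latency_ms`, and the `osdstats` wrapper used by some releases) are assumptions that should be verified against the output of your own cluster.

```python
import json
import subprocess

THRESHOLD_MS = 100  # hypothetical placeholder, not a recommended value


def parse_osd_perf(raw_json):
    """Map OSD id -> apply latency in ms from `ceph osd perf` JSON output.

    Assumption: some releases wrap the list in an "osdstats" object while
    others put "osd_perf_infos" at the top level, so both shapes are accepted.
    """
    data = json.loads(raw_json)
    infos = data.get("osdstats", data).get("osd_perf_infos", [])
    return {info["id"]: info["perf_stats"]["apply_latency_ms"] for info in infos}


def slow_osds(latencies, threshold_ms=THRESHOLD_MS):
    """Return the sorted ids of OSDs at or above the threshold."""
    return sorted(osd for osd, ms in latencies.items() if ms >= threshold_ms)


def fetch_osd_perf():
    """Run the ceph CLI; needs a node with a working ceph.conf and keyring."""
    result = subprocess.run(
        ["ceph", "osd", "perf", "--format", "json"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def report():
    """Print one line per OSD whose apply latency reaches the threshold."""
    for osd in slow_osds(parse_osd_perf(fetch_osd_perf())):
        print(f"osd.{osd}: apply latency at or above {THRESHOLD_MS} ms")
```

Calling `report()` from cron or a monitoring agent on a cluster node would flag an OSD like the one in the first post. Comparing each OSD against the rest of the cluster is usually more telling than a fixed number, since normal latency differs a lot between HDD and SSD setups.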
 
