ceph latency spikes 2-3 times per day

RobFantini · Aug 14, 2019

Hello
We are using Zabbix graphs to monitor Ceph latency. See the attached example.
Currently we have just seven 2 TB P3700 NVMe drives active.

At the time of the spikes there is very little activity from users or cron jobs. The Zabbix network graphs show below-average activity at the time of most spikes.

To try to see if there is bad hardware, we'd like to set up per osd latency history data. Does anyone have suggestions on how to do so?

Alwin · Aug 14, 2019

How do you gather the data? The Ceph manager should be providing this data already.

To have a quick look, you can issue the following command on the respective nodes.

Code:

ceph daemon osd.<ID> dump_historic_slow_ops
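For the per-OSD latency history asked about in the first post, one possible approach is to sample `ceph osd perf -f json` at a fixed interval and append the result to a log. The sketch below only covers the parsing step. It assumes the JSON carries an `osd_perf_infos` list (nested under `osdstats` on newer releases) whose entries have `id`, `commit_latency_ms` and `apply_latency_ms`; check this against the output of your own Ceph version. The function name `parse_osd_perf` and the sample data are made up for illustration.

```python
import json
import time


def parse_osd_perf(raw_json, timestamp=None):
    """Turn `ceph osd perf -f json` output into one row per OSD."""
    data = json.loads(raw_json)
    # Assumption: newer releases nest the list under "osdstats", older ones do not.
    infos = data.get("osdstats", data).get("osd_perf_infos", [])
    ts = int(time.time()) if timestamp is None else timestamp
    rows = []
    for info in sorted(infos, key=lambda i: i["id"]):
        stats = info["perf_stats"]
        rows.append((ts, info["id"],
                     stats["commit_latency_ms"],
                     stats["apply_latency_ms"]))
    return rows


# Hypothetical sample in the assumed shape, so the sketch runs without a cluster.
sample = json.dumps({"osdstats": {"osd_perf_infos": [
    {"id": 1, "perf_stats": {"commit_latency_ms": 4, "apply_latency_ms": 4}},
    {"id": 0, "perf_stats": {"commit_latency_ms": 1, "apply_latency_ms": 2}},
]}})

for row in parse_osd_perf(sample, timestamp=0):
    # timestamp, OSD, commit latency (ms), apply latency (ms)
    print("%d,osd.%d,%d,%d" % row)
```

On a live node the input would come from running `ceph osd perf -f json` (for example from a cron job once a minute) and the rows would be appended to a file, so a spike can later be traced to a single OSD.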

RobFantini · Aug 14, 2019

Alwin said:
How do you gather the data? The Ceph manager should be providing this data already.

The data is sent by Ceph. I followed parts of https://docs.ceph.com/docs/master/mgr/zabbix/ and only used the template from the Debian package. On PVE:

Code:

# ceph zabbix config-show
{"zabbix_port": 10051, "zabbix_host": "10.1.3.55", "identifier": "ceph-pve.localdomain.com", "zabbix_sender": "/usr/bin/zabbix_sender", "interval": 60}

I have a local wiki page with close to complete setup info, including a picture of the Zabbix config. Let me know if it is wanted.

Alwin said:
To have a quick look, you can issue the following command on the respective nodes.

Code:

ceph daemon osd.<ID> dump_historic_slow_ops

Thanks for that!

RobFantini · Aug 14, 2019

I have a couple of questions related to the op tracker. I did a search and am still unsure.

Code:

# ceph daemon osd.0 dump_historic_slow_ops
op_tracker tracking is not enabled now, so no ops are tracked currently, even those get stuck. Please enable "osd_enable_op_tracker", and the tracker will start to track new ops received afterwards.

So it needs to be enabled in ceph.conf with this in the osd section:

Code:

osd_enable_op_tracker = "true"

Questions:
1- Does this need to be set?

Code:

# at global section
debug optracker = 0/0

2- Could you remind me how to push those settings into a running Ceph system, or do I need to restart services?

RobFantini · Aug 14, 2019

To apply those settings:

Code:

ceph tell osd.*  injectargs '--osd_enable_op_tracker=true'
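Once the tracker is enabled, the JSON printed by `ceph daemon osd.<ID> dump_historic_slow_ops` can be ranked by duration to see which operations were slowest on a given OSD. The following is a minimal sketch, not a tested tool. It assumes the output holds a list of ops under the key `Ops` (or `ops`, as `dump_historic_ops` appears to use), each with `duration` and `description` fields; verify this on your version. The function name `slowest_ops` and the sample descriptions are made up.

```python
import json


def slowest_ops(raw_json, top=5):
    """Return (duration_seconds, description) for the slowest tracked ops."""
    data = json.loads(raw_json)
    # Assumption: the list lives under "Ops" or "ops" depending on the command.
    ops = data.get("Ops", data.get("ops", []))
    ranked = sorted(ops,
                    key=lambda op: float(op.get("duration", 0.0)),
                    reverse=True)
    return [(float(op.get("duration", 0.0)), op.get("description", ""))
            for op in ranked[:top]]


# Hypothetical sample in the assumed shape; real descriptions look different.
sample_ops = json.dumps({"Ops": [
    {"description": "example op A", "duration": 0.5},
    {"description": "example op B", "duration": 12.25},
    {"description": "example op C", "duration": 3.0},
]})

for duration, description in slowest_ops(sample_ops, top=2):
    print("%.2fs  %s" % (duration, description))
```

Running this for each OSD shortly after a spike shows whether the slow operations cluster on one drive, which would point to hardware, or are spread across all of them.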

Alwin · Aug 14, 2019

Did you resolve all your questions, or are some still open?
