IO Usage in Ceph

Volker Lieder

Nov 6, 2017
Hi,
we are using Proxmox 5.1-36 and Ceph Luminous.
Sometimes we see higher IO latency on some PGs, and when I run iostat -x 1 in a shell, I see one physical drive used by Ceph with high await and %util values.
Is it possible to identify the VM that is causing the await times and IO usage?
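For reference, this is roughly what I look at on the node (the device name is just an example from our setup):

    iostat -x 1        # one drive, e.g. /dev/sdd, shows high await and %util
    ceph-disk list     # map /dev/sdd back to its OSD id

What I am missing is the step from that OSD back to the VM that generates the load.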

Best regards,
Volker
 
Well, you can find out which client is doing heavy IO, but only in fairly general terms. It sounds to me as if some disks are slower than others or more congested. What do your crush map and OSD tree look like? Are there mixed disks in a root bucket?
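You can check that with something like the following (typical commands, adjust to your cluster):

    ceph osd tree              # crush hierarchy: which OSDs sit in which root/host bucket
    ceph osd df tree           # per-OSD utilization and PG count, uneven distribution is easy to spot
    ceph osd crush rule dump   # which rules/roots your pools actually use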
 
Hi Alwin,
the usage jumps between different OSDs, and every time I see it, it affects all three Ceph nodes, one OSD per hardware node.
The HDDs are all the same model. Is it possible to identify IO usage inside Ceph? Only one node shows roughly 15 MB/s in the Proxmox IO view, which is not that much. Or is the Proxmox view not showing this correctly?
Ceph runs over 56 Gbit InfiniBand, so I don't think that should be a bottleneck.
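As far as I can see, inside Ceph I only get rather coarse numbers, e.g.:

    ceph osd perf          # commit/apply latency per OSD
    ceph osd pool stats    # client IO rates per pool

but nothing that points to a single VM.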

Regards,
Volker
 
There are various perf counters, but I guess they might be rather general:
http://docs.ceph.com/docs/luminous/dev/perf_counters/

But with some disks showing high wait and utilization while others don't, it sounds to me as if either the PGs are not evenly distributed (crush map) or some of the OSDs are starving. Are you using a RAID controller for the OSDs?
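If you want to dig into the busy OSD, a rough way to do it (run on the node that hosts that OSD; osd.12, pool rbd and the image name are just placeholders) is:

    ceph daemon osd.12 perf dump            # raw perf counters of that one OSD
    ceph daemon osd.12 dump_historic_ops    # recent slow ops; object names look like rbd_data.<image-id>.<offset>
    rbd -p rbd info vm-100-disk-1 | grep block_name_prefix   # compare the rbd_data.<image-id> prefix to find the matching VM disk

That at least lets you map heavy ops on one OSD back to a specific RBD image, i.e. a specific VM disk.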

EDIT: corrected the link, the version was wrong.