Ceph Slow Requests

TecScott · Feb 12, 2019

I've currently got a 4 node cluster running Ceph on Proxmox 5.1 and noticed recently I'm getting a lot of blocked requests due to request_slow.

For example:

2019-02-12 11:47:33 cluster [WRN] Health check failed: 6 slow requests are blocked > 32 sec (REQUEST_SLOW)
2019-02-12 11:47:47 cluster [WRN] Health check update: 4 slow requests are blocked > 32 sec (REQUEST_SLOW)
2019-02-12 11:47:53 cluster [WRN] Health check update: 2 slow requests are blocked > 32 sec (REQUEST_SLOW)
2019-02-12 11:48:03 cluster [INF] Health check cleared: REQUEST_SLOW (was: 2 slow requests are blocked > 32 sec)

There are currently 2 OSD's per node, 4TB each at 7.2K RPM. These use a journal disk with is a NVME SSD drive.

Latency is always shown as 0 for commit and 0-2 for apply.

Any suggestions on ways to investigate what's causing issues? It's causing noticeable performance issues on VM's, particularly Windows Server 2016.

Alwin · Feb 12, 2019

Check the ceph logs '/var/log/ceph/' to verify which parts of Ceph are involved. And please describe your system in more detail, so we can get a better picture of your cluster.

TecScott · Feb 12, 2019

The logs only show what I've said really, the main log (ceph.log) shows health check failed: 2 slow requests are blocked > 32 sec (REQUEST_SLOW) then 1 slow request, then 3 slow, then 4 slow, then health check cleared after around 30 seconds and it's back to healthy.

ceph-mon.x.log shows at the same time a log_channel(cluster) log message with the same detail and sending the message to other monitors.

I don't find anything useful in the osd logs (just level-0 table started, so many bytes okay, delete type=0)

What other detail would you like to know? There are 4 nodes, 2 OSD's per node, NVMe SSD for journal, 7.2k disks for storage, 10GbE network for ceph.

Alwin · Feb 13, 2019

Did you check all ceph logs on all hosts? In some of the logs you may find entries that indicate which OSDs where involved with the slow requests.

TecScott said:
What other detail would you like to know? There are 4 nodes, 2 OSD's per node, NVMe SSD for journal, 7.2k disks for storage, 10GbE network for ceph.

How are the disks connected (HBA/Raid)? What CPU and RAM?

TecScott · Feb 13, 2019

Checked all logs on all nodes and there doesn't seem to be anything indicating what OSD's were the cause.

It may be unrelated but I've noticed that the SWAP usage on the hosts was pretty high (3GB+) although the RAM usage is only at 40-50%.

The nodes have 80GB RAM and 1x Xeon E5-2620 v4's each.

They're passed straight through as JBOD (no RAID), each physical disk is an OSD.

Alwin · Feb 13, 2019

TecScott said:
It may be unrelated but I've noticed that the SWAP usage on the hosts was pretty high (3GB+) although the RAM usage is only at 40-50%.

Depends on what was swapped out and if it was used at the time. If you have a performance monitoring it may show this. Seems like there may have been a resource spike that caused the delay.

TecScott said:
They're passed straight through as JBOD (no RAID), each physical disk is an OSD.

As a side note putting disk through as JBOD is not the same, as using a HBA.

TecScott · Feb 13, 2019

The IO delay on all nodes seems to sit around 10% too.

The SWAP usage is also consistent (i.e. it's not spiking to 3GB, it's consistently sitting around 3GB) even though the RAM is 40-50%.

Alwin · Feb 13, 2019

Are you talking about now, or when the slow requests showed up?

But in general, two spinners will do roughly ~160 MB/s (good ones), so it could well be that there are just not enough OSDs to cope with the load. But for now everything is guesswork.

TecScott · Feb 15, 2019

At all times, the SWAP seems to have been a result of the swappiness setting (at 40% it'll start using SWAP?).

The IO delay is always around 10% on each of the 4 hosts, however.

Any recommendation (i.e. what logs, debug logs, etc) to try to get to the bottom of what's causing this would be appreciated, if we can determine it's due to the slow disks then we can look into getting faster disks, or are you suggested more OSD's if that'll resolve the issue.

Alwin · Feb 18, 2019

There are different things you can test, disable SWAP (swapoff), run iotop,sar etc. Go through all of the logs and if there isn't any usable information, you can increase log verbosity.
http://docs.ceph.com/docs/luminous/rados/troubleshooting/log-and-debug/

Search

Search

Ceph Slow Requests

TecScott

Active Member

Alwin

Proxmox Retired Staff

TecScott

Active Member

Alwin

Proxmox Retired Staff

TecScott

Active Member

Alwin

Proxmox Retired Staff

TecScott

Active Member

Alwin

Proxmox Retired Staff

TecScott

Active Member

Alwin

Proxmox Retired Staff