High iodelay on one of three identical nodes

proxwolfe

Hi,

I have a three node PVE cluster with identical nodes. Each node has an SSD that is part of a Ceph pool (I know, I should have more SSDs in the pool). And each node also has two HDDs that are part of another Ceph pool.

I replaced the three enterprise grade SSDs with three other, larger enterprise grade SSDs. Two nodes show low iodelay (2%) while one node shows very high iodelay (25%). The three nodes are basically identical (make, model, CPU, memory), and the old as well as the new enterprise grade SSDs are identical (make, model, size). The only difference is the CPU and memory load, which is lower (!) on the node with the high iodelay than on the other two nodes. I even migrated / shut down all VMs on the offending node but the iodelay is still there.

IIRC, the high iodelay was not there before I replaced the SSDs (but I did not check before). So it could -- somehow -- be the one SSD in the offending node. Or something else I'm missing. How can I figure out where this is coming from?

Thanks!
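
For what it's worth, the iodelay value in the PVE summary is essentially the CPU iowait percentage, so a generic way to narrow it down (assuming the sysstat package is installed) is to watch per-device latency on the slow node and compare it with a healthy one:

# iostat -x 5

The device whose await (or r_await/w_await) and %util values stand out on the slow node is the likely culprit.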
 
Could be a bad SSD indeed.

What is the SSD model?

Regarding your HDDs and SSDs: have you created two different CRUSH rules and set one rule for each pool?

I even migrated / shut down all VMs on the offending node but the iodelay is still there.
VMs read/write to all OSDs in the cluster, not only the local OSDs, so migrating or shutting down the VMs on the local node has no impact (you would need to shut down all VMs in the cluster).
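
If you want to see that for yourself, you can ask Ceph where the objects of an image actually live (the pool and image names below are only examples):

# rbd info <pool>/<vm-disk-image>
# ceph osd map <pool> <block_name_prefix>.0000000000000000

rbd info prints the block_name_prefix of the image, and ceph osd map then shows the acting OSD set for one of its objects, which will usually span several hosts.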
 
VMs read/write to all OSDs in the cluster, not only the local OSDs, so migrating or shutting down the VMs on the local node has no impact (you would need to shut down all VMs in the cluster).
True. I guess there could have been a VM that uses a local drive - but there wasn't.

Regarding your HDDs and SSDs: have you created two different CRUSH rules and set one rule for each pool?
Yes, two separate rules for each pool.
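
For reference, device-class based rules are usually created and assigned roughly like this (the rule and pool names here are placeholders, not necessarily mine):

# ceph osd crush rule create-replicated replicated_ssd default host ssd
# ceph osd crush rule create-replicated replicated_hdd default host hdd
# ceph osd pool set <ssd-pool> crush_rule replicated_ssd
# ceph osd pool set <hdd-pool> crush_rule replicated_hdd
# ceph osd crush rule ls

The last command lets you verify which rules exist.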

What is the SSD model?
It's a Samsung PM863a with 3.84TB capacity.
 
Is your Ceph cluster OK, or is something currently running? Are you sure the I/O delay is coming from the SSDs and not from local storage or the HDDs? Can you maybe take a screenshot from PVE with the overview of the OSDs? And tell me a little more about your hardware, network, layout, PGs, replicas, etc. Everything can have an impact on your performance.
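
Most of that can be collected from any node with something like:

# ceph -s
# ceph osd df tree
# ceph osd pool ls detail
# pveversion -v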
 
Is your Ceph cluster OK, or is something currently running? Are you sure the I/O delay is coming from the SSDs and not from local storage or the HDDs? Can you maybe take a screenshot from PVE with the overview of the OSDs? And tell me a little more about your hardware, network, layout, PGs, replicas, etc. Everything can have an impact on your performance.
I'd say it's definitely not local storage (because there are no VMs running on the node anymore).

It could be the HDDs for sure. But shouldn't that affect the other nodes as well? (They are all practically identical).

I will take a screenshot later and post the technical details.
 
Where/how do you check I/O delay?

Because, with Ceph and QEMU using librbd, you won't be able to see the latency of the VMs from the host (but inside the VM you can see it).


Here is a useful command to see real VM latency from the host:
# rbd perf image iotop
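
If I remember the syntax correctly, there is also a non-interactive variant, and it can be pointed at a single pool (the pool name is a placeholder):

# rbd perf image iostat <pool>

Both commands need a reasonably recent Ceph release (Nautilus or later).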
 
Where/how do you check I/O delay?
The PVE GUI.

Because, with Ceph and QEMU using librbd, you won't be able to see the latency of the VMs from the host.
Yeah, that is something that would be useful.

Here is a useful command to see real VM latency from the host:
# rbd perf image iotop
Cool. Thanks. Will try that.

But in the case at hand, my issue does not seem to be coming from a VM, because there are no VMs left on the node, and the VMs running on the other nodes do not affect those nodes' iodelay nearly as badly.

My guess is that this is coming from one of the drives, likely the recently replaced SSD (which, however, is identical to the recently replaced SSDs on the other nodes). I'd try another SSD, but I don't have another identical one at hand.

So I'm wondering how I could test whether the drive(s) are causing the delay...
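
A few things that might help narrow it down (the OSD id and device name below are placeholders):

# ceph osd perf
# smartctl -a /dev/sdX
# ceph tell osd.N bench

ceph osd perf shows the commit/apply latency of every OSD, so a single slow OSD should stand out immediately; smartctl shows the health and wear level of the suspect SSD; and ceph tell osd.N bench runs a write benchmark against a single OSD, so you can compare the new SSDs across the three nodes.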
 
