Elaborate on IO delay in server summary

Shadow Sysop

Member
Mar 7, 2021
I'm curious about the actual meaning of the IO delay in the server's Summary. For example, when I clone a VM from a template, I expect high IO. All my clone templates are on their own dedicated drive. When I clone a VM from one of these templates, the IO delay for the node jumps as high as 30-35%, but VMs on that same node on a different disk seem relatively unaffected. Even the node's performance in general seems about normal, which is odd, because if the high IO delay were hitting, say, the local disk, I would expect to see some performance degradation. That's not the case in this specific instance. So am I right to assume that the IO delay spikes during the clone activity are restricted to the disk that hosts all those templates? I'm only asking the community because I've never seen such high IO activity and honestly expected my PVE to be screaming bloody murder with IO delays of 30-40%. Any clarification would be greatly appreciated.
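
(For reference, one quick way to check whether the clone really only hits the template disk is to watch per-device utilisation on the node while the clone runs, e.g. with iostat -x, or with a rough Python sketch like the one below that samples /proc/diskstats twice; the device filter and the 1-second interval are just examples, nothing PVE-specific.)

Code:
#!/usr/bin/env python3
"""Rough per-disk busy check: sample /proc/diskstats twice and report how
busy each block device was in between (same idea as iostat's %util)."""

import time

def read_io_ticks():
    """Return {device: milliseconds spent doing I/O} from /proc/diskstats."""
    ticks = {}
    with open("/proc/diskstats") as f:
        for line in f:
            fields = line.split()
            # field 3 is the device name, field 13 is io_ticks (ms spent doing I/O)
            name, io_ms = fields[2], int(fields[12])
            if not name.startswith(("loop", "ram")):
                ticks[name] = io_ms
    return ticks

INTERVAL = 1.0  # seconds between the two samples

before = read_io_ticks()
time.sleep(INTERVAL)
after = read_io_ticks()

for dev in sorted(after):
    delta_ms = after[dev] - before.get(dev, after[dev])
    print(f"{dev:12s} ~{100.0 * delta_ms / (INTERVAL * 1000):5.1f}% busy")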
 
IO delay is basically the same as the Linux kernel's %iowait.
From the 'sar' manpage:

Percentage of time that the CPU or CPUs were idle during which the system had an outstanding disk I/O request.

So a high iowait does not mean anything has to be slow; it is just the share of CPU time spent sitting idle while I/O was outstanding.

If your CPU is fully loaded, this value will generally be lower, even if a lot of IO is going on.

So it always has to be seen in context with CPU usage and general system load.
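
If you want to see where that number comes from, here is a rough Python sketch that reads the aggregate CPU counters from /proc/stat (the same place sar gets them from) and works out the iowait share over a one-second interval; the field positions assume a reasonably current Linux kernel.

Code:
#!/usr/bin/env python3
"""Show where %iowait comes from: the share of CPU time the CPUs spent idle
while at least one disk request was outstanding, read from /proc/stat."""

import time

def cpu_times():
    """Return the aggregate CPU time counters from the first line of /proc/stat."""
    with open("/proc/stat") as f:
        fields = f.readline().split()
    # fields: 'cpu' user nice system idle iowait irq softirq steal ...
    return [int(x) for x in fields[1:]]

INTERVAL = 1.0  # seconds between the two samples

t1 = cpu_times()
time.sleep(INTERVAL)
t2 = cpu_times()

deltas = [b - a for a, b in zip(t1, t2)]
total = sum(deltas[:8]) or 1          # user..steal; avoid division by zero
print(f"iowait: {100.0 * deltas[4] / total:.1f}%   "
      f"idle:   {100.0 * deltas[3] / total:.1f}%")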
 
Sorry for jumping into the old thread, but if IO delay is iowait, is there any way to reduce it?
It seems like the CPUs are idling almost all the time while there is disk I/O waiting. Why? I'd expect this to happen when the CPUs are at 100%, but they aren't.

 
A bit of an answer for reference:
This was definitely caused by ZFS and the passthrough from the HW controller to Proxmox.
I've reinstalled Proxmox on the same machine (R620, 10 drives), but on virtual disks, so the HW RAID controller manages the actual drive pools instead of ZFS on Proxmox. The configuration is RAID1 for the system (2 disks) and RAID10 (2x4 disks) for data.

Immediately, iowait (i.e. IO delay) went from 7% to 0.02-0.1% with all the same LXCs/VMs on the host as before.

TL;DR: stopped using ZFS, using LVM on HW RAID, solved all the performance issues.
 
I obviously didn't; I had it in HBA mode and it was utterly bad. I even changed controllers and reflashed the firmware because of it.
But I switched back to HW RAID and never looked back at ZFS.
 
ZFS has a lot of overhead because it does sync writes as well as all the copy-on-write and additional integrity checks your HW RAID and LVM are missing.
Higher IO delay is to be expected, and that is one of the reasons why enterprise SSDs are highly recommended for demanding workloads.
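
If you want a feeling for how much those sync writes cost on a given datastore, a toy comparison like the one below usually makes the gap obvious (the file name, block size and count are made up; it is not a proper benchmark). Enterprise SSDs with power-loss protection can acknowledge the fsync()s much faster, which is where that recommendation comes from.

Code:
#!/usr/bin/env python3
"""Toy illustration of why sync writes are expensive: buffered writes vs.
writes flushed to stable storage with fsync() after every block. The file
name, block size and write count are arbitrary; run it on the datastore
you actually care about."""

import os
import time

PATH = "syncwrite-test.bin"   # hypothetical test file in the current directory
BLOCK = b"x" * 4096           # 4 KiB per write
COUNT = 1000                  # number of writes per run

def run(sync_each_write):
    start = time.perf_counter()
    with open(PATH, "wb") as f:
        for _ in range(COUNT):
            f.write(BLOCK)
            if sync_each_write:
                f.flush()
                os.fsync(f.fileno())   # force the block down to the device
    os.remove(PATH)
    return time.perf_counter() - start

buffered = run(sync_each_write=False)
synced = run(sync_each_write=True)
print(f"buffered: {buffered:.3f}s   fsync per write: {synced:.3f}s")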
 