Hello,
We've recently been seeing performance problems with VMs in general, and particularly with those on ZFS. We start with a couple of strikes against us: we're using RAID-Z2 on 7200 RPM disks, which I know is a poor fit for VM workloads. However, we're seeing the same sluggish I/O on striped mirrors of SSDs.
According to iostat, our hardware could be performing better. Look at this output.
zd32 is the virtual disk of a Win7 VM that's installing updates, using the latest VirtIO drivers available for it. What interests me in the iostat output is that zd32 shows 99.6% utilization while the underlying disks sit at roughly 45-70%. I have seen the disks climb to 100% when there's a lot of activity, but right now no users are connected to any of the VMs. The only load is this one Win7 VM applying updates.
Code:
Device:   rrqm/s  wrqm/s     r/s     w/s   rkB/s   wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda         0.00    0.00   77.00  115.00  308.00  640.00     9.88     2.02   10.33    9.92   10.61   3.58  68.80
sdb         0.00    0.00   81.00  116.00  324.00  644.00     9.83     1.32    6.72    6.42    6.93   2.50  49.20
...
sde         0.00    0.00   77.00  117.00  308.00  640.00     9.77     1.16    6.25    5.25    6.91   2.35  45.60
sdf         0.00    0.00   78.00  116.00  312.00  640.00     9.81     1.25    6.45    5.64    7.00   2.47  48.00
...
zd32        0.00    0.00    0.00  197.00    0.00  788.00     8.00     1.09    5.54    0.00    5.54   5.06  99.60
So the question is: Should zd32 be 100% utilized if the underlying disks are not 100% utilized?
For what it's worth, this particular example is Proxmox 4.4-18, but I've also seen this on 4.4-22.
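As a rough cross-check on those numbers, here is a small Python sketch. It assumes that iostat's %util is the share of each second in which the device had at least one request outstanding, so that it should come out close to (r/s + w/s) x svctm. The row values are copied from the output above; the names `ROWS`, `estimated_util` and `estimates` are made up for this illustration.

```python
# Rows copied from the iostat output above:
# (device, r/s, w/s, svctm in ms, %util as reported by iostat)
ROWS = [
    ("sda", 77.0, 115.0, 3.58, 68.80),
    ("sdb", 81.0, 116.0, 2.50, 49.20),
    ("sde", 77.0, 117.0, 2.35, 45.60),
    ("sdf", 78.0, 116.0, 2.47, 48.00),
    ("zd32", 0.0, 197.0, 5.06, 99.60),
]


def estimated_util(reads_per_s, writes_per_s, svctm_ms):
    """Estimate %util as busy milliseconds per second of wall-clock time.

    Assumption: requests do not overlap, so busy time is simply
    (number of requests per second) * (service time per request).
    """
    busy_ms_per_s = (reads_per_s + writes_per_s) * svctm_ms
    return busy_ms_per_s / 1000.0 * 100.0


def estimates(rows=ROWS):
    """Return {device: estimated %util} for every row."""
    return {dev: estimated_util(r, w, svc) for dev, r, w, svc, _ in rows}


if __name__ == "__main__":
    for dev, r, w, svc, reported in ROWS:
        est = estimated_util(r, w, svc)
        print(f"{dev:5s} estimated {est:6.2f}%  reported {reported:6.2f}%")
```

If that assumption holds, zd32 reads as busy simply because 197 writes per second at about 5 ms each fill nearly the whole second, which would make %util a measure of time spent busy rather than of remaining throughput.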