[SOLVED] RaidZ2 terribly high IO delay

egy87

Member
Apr 27, 2020
Dear all,

I'm struggling with severe IO delay on one of our servers.

The configuration of the server is:
- Dell R620
- H310 SAS controller flashed in IT mode
- 8x 600 GB SAS hard disks, 10k rpm
- 80 GB of RAM
- 2x Intel(R) Xeon(R) CPU E5-2620
- ZFS RAIDZ2 with SLOG and cache on a separate NVMe drive

When copying big files inside a VM, or during backup/restore of a VM, the registered IO delay is almost 50%, and looking at atop all disks are 100% busy. Also, the server and all VMs become unresponsive.

All SMART information from the disks is OK, so I don't think the problem is in the disks. I have also tried to play with different ZFS settings and configurations, but without result.
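For what it's worth, besides atop the per-disk load can also be watched with zpool iostat (the pool name rpool below is just an example):

    # per-vdev read/write operations and bandwidth, refreshed every second
    zpool iostat -v rpool 1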

Any suggestion?
 
RAIDZ2 with HDDs has low IOPS, so it becomes a bottleneck. A ZFS striped mirror may work a bit better.
You can set bandwidth limits for backup/restore and clone operations in Datacenter -> Options -> Bandwidth Limits
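If you prefer the config file, the same limits can, as far as I know, also be set in /etc/pve/datacenter.cfg. The values are in KiB/s; the numbers below are just examples (51200 KiB/s is roughly 50 MiB/s):

    bwlimit: restore=51200,clone=51200,move=51200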
 
Yup, 8x 10K RPM HDDs in raidz2 = roughly the IOPS that a single 10K RPM HDD can handle. So if you are lucky you get something like 200 IOPS for the entire pool.
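If you want to put a number on it, a quick fio run of synchronous 4K random writes gives a rough idea. The file path and size below are only examples, and fio will create a scratch file there (delete it afterwards):

    # 4K synchronous random writes for 30 seconds against a scratch file on the pool
    fio --name=randwrite --filename=/rpool/data/fio-test.file --size=2G \
        --bs=4k --rw=randwrite --ioengine=psync --sync=1 \
        --runtime=30 --time_based --group_reporting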
 
Thank you for the answer!!!

Do you think that switching from raidz2 to a hardware RAID 6 with an H710P controller could solve the problem?
 
I don't think so. Neither with RAID6 nor with RAIDZ2 will the IOPS performance scale with the number of drives. In theory it would look like this:
                          8 disk raidz2 / raid6    8 disk striped mirror / raid10
    Usable capacity:      53 - 75%                 50%
    IOPS:                 1x                       4x
    Read throughput:      6x                       8x
    Write throughput:     6x                       4x
    Drives that may fail: 2                        1 - 4

So no matter how many disks your raid6/raidz2 consists of, it will never get more IOPS performance than a single HDD.
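Just to illustrate the striped mirror / RAID10 layout from the table: a pool out of the same 8 disks would be created along these lines (the pool name and device names are placeholders; in practice you would use /dev/disk/by-id/ paths):

    # four mirrored pairs, striped together (RAID10-style layout)
    zpool create tank \
      mirror sda sdb \
      mirror sdc sdd \
      mirror sde sdf \
      mirror sdg sdh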
 
Ok! In this case I have another question.

In terms of IOPS, is there a big difference between RAID 5 and RAID 6? I ask because we have many servers in production with hardware RAID 5 (from 6 to 10 disks each) and none of them has this problem.
 
If you are using hardware RAID there is sometimes a cache, which can speed up the array. That is a difference between software and hardware RAID. We are using RAID5 everywhere, but with SSDs and RAID cards.
 
No, both RAID5 and RAID6 are limited, IOPS-wise, to the performance of a single disk.
 
Do you know the standard block size for a VM disk image? Or better, is there a standard block size that Proxmox "sees" for reads and writes to a virtual machine disk?

Considering the limited IOPS I have, I would try to optimize the ZFS blocksize.
 
Is blocksize what you are looking for? It is a property of the Storage and is used when creating new virtual disks for VMs (on that ZFS storage).
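For example, if the ZFS storage is called local-zfs (just an example name), it can be changed roughly like this; it only applies to newly created disks:

    # set the volblocksize used for new VM disks on that storage (16k is only an example value)
    pvesm set local-zfs --blocksize 16k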
 
Don't choose too small a blocksize or you get a lot of padding overhead: https://www.delphix.com/blog/delphi...or-how-i-learned-stop-worrying-and-love-raidz

And don't fill your pool to more than 80% or it will get slow too. So 20% should always be kept free.
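You can keep an eye on the fill level and fragmentation with:

    # capacity, free space and fragmentation per pool
    zpool list -o name,size,allocated,free,fragmentation,capacity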
 
Do you know the standard block size for a VM disk image?
Most OSes use 4K blocksizes.

Keep in mind that ordinary RAID uses stripes to store the data, and these stripes are larger than a normal 512 B or 2048 B block on your disk, something in the order of 32K to 128K. So you have to read and write more in order to change one bit. ZFS, on the other hand, can be optimized with respect to the data stored. Imagine you have 8 disks (<2TB) in a striped mirror setup with 512 B block sizes: you can then store one 2048 B block as 4x 512 B blocks, one on each stripe, which is perfect, so that one 2048 B block is read from all stripes. The same goes for e.g. a default 4K block of a filesystem like NTFS or EXT4: it'll read two 512 byte blocks from each stripe, or one 512 byte block from each disk, which is perfect!
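To see what an existing VM disk actually uses, you can query the zvol and the pool (the dataset and pool names below are only examples):

    # block size of an existing VM disk and the pool's sector size setting
    zfs get volblocksize rpool/data/vm-100-disk-0
    zpool get ashift rpool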
 
After some days of trials and tests I decided to re-install Proxmox in a RAID 10 configuration. Now performance is way better and IO delay is almost zero.

Thank you everybody for suggestions and help!