Windows VM: poor disk latency

simplerezo

Hi!

Using diskspd (https://github.com/microsoft/diskspd) from Microsoft, we benchmarked disk speed on a Windows Server 2022 VM running alone on a Proxmox host (of course).

Code:
[...]
Total IO
thread |       bytes     |     I/Os     |    MiB/s   |  I/O per s |  AvgLat  | LatStdDev |  file
-----------------------------------------------------------------------------------------------------
[...]
-----------------------------------------------------------------------------------------------------
total:       72978333696 |      1113561 |     231.99 |    3711.82 |   68.966 |   148.804

[...]
  %-ile |  Read (ms) | Write (ms) | Total (ms)
----------------------------------------------
    min |      0.090 |      2.059 |      0.090
   25th |      2.917 |     53.505 |      4.996
   50th |      5.914 |     83.493 |     10.338
   75th |      9.033 |    192.702 |     65.206
   90th |     12.087 |    394.104 |    192.690
   95th |     14.286 |    515.227 |    351.609
   99th |     20.499 |    881.862 |    649.537
3-nines |     37.756 |   2056.043 |   1718.616
4-nines |     63.900 |   3498.892 |   2928.629
5-nines |     98.070 |   3506.438 |   3504.983
6-nines |    292.115 |   3506.912 |   3506.904
7-nines |    292.115 |   3506.912 |   3506.912
8-nines |    292.115 |   3506.912 |   3506.912
9-nines |    292.115 |   3506.912 |   3506.912
    max |    292.115 |   3506.912 |   3506.912

Throughput is good, but our concern is the latency (AvgLat: 69 ms), which seems pretty high.
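For reference, a latency-focused diskspd run usually looks something like the one below. The exact command behind the results above isn't shown, so the file path, test size, and flags are purely illustrative:

Code:
diskspd.exe -c16G -b8K -d60 -t4 -o4 -r -w30 -L C:\diskspd-test.dat

Here -b8K is the block size, -t4/-o4 set threads and outstanding I/Os, -r -w30 requests random I/O with 30% writes, and -L captures the latency percentiles shown above.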

Configuration:
- HW: H730p (HBA mode) - 3 SSDs
- ZFS RAIDZ1, lz4 on, dedup off
- VirtIO SCSI + SCSI disk with writeback + discard
- VirtIO drivers + QEMU Guest Agent installed

I ran a sysbench fileio test on the host (/rpool) and the latency is good:
Code:
Latency (ms):
         min:                                    0.00
         avg:                                    0.04
         max:                                   44.16
         95th percentile:                        0.21
         sum:                                29590.82
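For comparison, a typical sysbench fileio workflow is prepare/run/cleanup. The exact options behind the host numbers above aren't shown, so the file size and test mode here are just an example:

Code:
sysbench fileio --file-total-size=4G prepare
sysbench fileio --file-total-size=4G --file-test-mode=rndrw --time=60 run
sysbench fileio --file-total-size=4G cleanup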

How can we improve this?
 
RAIDZ1 isn't great for latency/IOPS. A RAIDZ1 of 3 disks also means you are wasting a lot of capacity due to padding overhead, unless you increased the volblocksize from the default 8K to 16K. And keep in mind that a ZFS pool should always have at least 20% of its capacity free, or the pool will become slow.
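A minimal way to check both points, assuming the default rpool layout and an example VM disk / storage name (adjust to your setup):

Code:
# how full is the pool?
zpool list rpool

# current volblocksize of one VM disk (dataset name is an example)
zfs get volblocksize rpool/data/vm-100-disk-0

# default block size for *new* zvols on the Proxmox ZFS storage;
# existing disks keep their volblocksize and must be recreated or moved
pvesm set local-zfs --blocksize 16k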
 
You are probably still wasting about 17% of your raw capacity due to padding overhead if you are not using a volblocksize of at least 16K. Compare the used space on the ZFS pool with the sum of all guest filesystems: everything will consume roughly 33% more space, so storing 1TB of data results in about 1.33TB used on the pool, because each 1TB of data comes with an additional ~333GB of padding blocks. If you also include the 20% that should always be kept free, only about 40% of the raw capacity is usable when using zvols with the default 8K volblocksize.
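A quick way to do that comparison, sketched with example dataset names:

Code:
# raw pool usage
zpool list -o name,size,alloc,free,capacity rpool

# per-zvol usage vs. the logical volume size presented to the guest
zfs list -o name,used,volsize,refreservation -r rpool/data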
You should also make sure that your guest's blocksize matches the volblocksize of your zvols. When using a 16K volblocksize, you could format your NTFS partitions with a 16K cluster size.
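On the Windows side that would look roughly like this (drive letters are examples, and reformatting wipes the volume, so only do it on a fresh data disk):

Code:
:: show the current cluster size ("Bytes Per Cluster")
fsutil fsinfo ntfsinfo C:

:: format a data volume with a 16K allocation unit size
format D: /FS:NTFS /A:16K /Q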
 