Hi!
I've set about figuring out exactly how big the IO performance drop in KVM is compared to host ZFS performance. I have a Supermicro platform, 2 x Xeon Gold 6226R, 128 GB DDR4 RAM. The storage is 2 x Intel D3-S4610 (SSDSC2KG480G8) in a ZFS mirror, pool ashift set to 13. Fresh install of PVE 6.3-2, with no other CTs/VMs running on this node except a single test VM (ZVol volblocksize=8k, VirtIO SCSI single, iothread=1, no cache). I created a separate dataset for ZFS benchmarking with recordsize set to 8k, which I think makes for a fairer comparison against the 8k volblocksize ZVol.
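For reference, the relevant VM disk settings boil down to something like this in the VM config (VMID and storage name are just illustrative; the options mirror what I described above):

scsihw: virtio-scsi-single
scsi0: local-zfs:vm-100-disk-0,iothread=1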
The goal was to compare raw SSD read performance to native ZFS and to a VM ZVol. I tried to eliminate the ARC as much as possible, so I set zfs_arc_max to 4G, used a 16G benchmark datafile, and executed sync ; echo 3 > /proc/sys/vm/drop_caches before every bench run. I used fio for all the tests. The results for most workloads were as expected, but random reads show strange behaviour: they do not scale with iodepth at all on host ZFS, yet they scale perfectly well in the VM.
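Concretely, this is what I mean (a sketch; the ARC limit is set once, 4294967296 being 4 GiB in bytes, and the cache drop runs before each benchmark):

echo 4294967296 > /sys/module/zfs/parameters/zfs_arc_max
sync ; echo 3 > /proc/sys/vm/drop_caches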
Typical fio job file:
[global]
bs=8k
iodepth=1
direct=1
ioengine=libaio
numjobs=1
name=RandRead1
rw=randread
runtime=90
[job1]
filename=./generic.test
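Each data point below comes from a run equivalent to this job file, just sweeping iodepth from the shell, roughly like so (the loop is a sketch; file name as above):

for qd in 1 2 4 8 16 32; do
    sync ; echo 3 > /proc/sys/vm/drop_caches
    fio --name=RandRead --filename=./generic.test --rw=randread --bs=8k \
        --direct=1 --ioengine=libaio --numjobs=1 --runtime=90 --iodepth=$qd
done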
Results, raw SSD performance (MB/s@IOPS):
iodepth=1: 50.4@6.4k
iodepth=2: 95.4@12.2k
iodepth=4: 170@21.8k
iodepth=8: 280@35.7k
iodepth=16: 403@51.6k
iodepth=32: 455@58.2k
Host ZFS performance (MB/s@IOPS):
iodepth=1: 34.3@4.3k
iodepth=2: 38.9@4.9k
iodepth=4: 38.8@4.9k
iodepth=8: 37.9@4.8k
iodepth=16: 36.9@4.7k
iodepth=32: 38.8@4.9k
KVM ZVol performance (MB/s@IOPS):
iodepth=1: 28.1@3.6k
iodepth=2: 75@9.6k
iodepth=4: 127@16.3k
iodepth=8: 233@31.1k
iodepth=16: 412@52.8k
iodepth=32: 495@63.4k
Is there any explanation for this? Does it really mean that async reads are processed synchronously in ZFS?