Hi guys, I thought I would share my data with you.
I already know from experience hosting ext4-based disks on top of spinning-disk storage with Proxmox that 64k yielded a large performance improvement along with lower i/o delay.
So now I have done some testing on SSD-backed storage, with Windows NTFS in the guest.
There is a lot of data to digest; I planned to test more and may do so in a later post.
First, some information about the host (example commands for this setup follow the list). Sorry for the limited test hardware; I don't have much at hand.
ZFS 2.0.5, pool is a single 512GB Samsung 850 Pro SSD on an onboard SATA port
Autotrim enabled, ashift=12 on the pool
LZ4 compression on both datasets and volumes
VM disks are VirtIO SCSI, 1 gig in size, discard on, SSD emulation on
Volumes were created with thin provisioning disabled
Dataset-backed disks were created using the raw disk format
QEMU disks had iothread and SSD emulation enabled
Host CPU: Ryzen 2600X
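For reference, something roughly like the following reproduces this layout. The pool name (tank), dataset name (tank/vmdata-64k), zvol name (tank/vm-100-disk-0), device path and sizes are placeholders for illustration, not my actual names.

# Create the pool with ashift=12 and autotrim, then enable LZ4 (device path is a placeholder)
zpool create -o ashift=12 -o autotrim=on tank /dev/disk/by-id/ata-Samsung_SSD_850_PRO_512GB_XXXX
zfs set compression=lz4 tank

# Dataset-backed disks: set the recordsize on the dataset that holds the .raw files
zfs create -o recordsize=64k tank/vmdata-64k

# Volume-backed disks: volblocksize can only be set at creation time; without -s the zvol is not thin provisioned
zfs create -V 60G -o volblocksize=64k tank/vm-100-disk-0

In practice Proxmox creates the zvols itself when you add a disk on zfspool storage; the equivalent knob there is the storage's blocksize setting.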
The tests were done with CrystalDiskMark 8.
I am providing the data collected from the Proxmox performance graphs as well as the CrystalDiskMark numbers from inside the guest; a couple of commands for cross-checking the host side from the CLI are below.
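If you want to cross-check the Proxmox graphs from the host shell while a run is going, something like the following works (tank is again a placeholder pool name):

zpool iostat -v tank 1    # per-vdev bandwidth and IOPS, refreshed every second
arcstat 1                 # ARC size and hit ratio, to see how much primarycache is doing
vmstat 1                  # the "wa" column roughly tracks the i/o delay graph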
I expect these results would still be broadly relevant for an SSD mirror setup; raidz, however, probably warrants separate data.
Future test results will add a 16k size, tests done on an SSD mirror, and tests done on a spinning-disk mirror.
I tested the following configurations.
4k cluster size on NTFS, 4k volblocksize, 4k recordsize.
64k cluster size on NTFS, 64k volblocksize, 64k recordsize.
QEMU caching modes tested: nocache, directsync, writeback.
ZFS primarycache was tested at both all and metadata; sync was tested at standard and always.
The aim was to test overall performance with and without host-side caching (the commands I used to switch these settings are shown just below). In the guest OS the default caching policy was left in place: write caching enabled, but write-cache flushing not disabled.
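For anyone wanting to repeat this, these are roughly the knobs I was flipping between runs. The VM ID (100), storage name (local-zfs), dataset name and disk name are placeholders.

# Host-side ZFS caching for a given dataset or zvol
zfs set primarycache=all sync=standard tank/vmdata-64k
zfs set primarycache=metadata sync=always tank/vmdata-64k

# QEMU cache mode plus the other disk options (discard, SSD emulation, iothread); use cache=none for nocache
qm set 100 --scsi0 local-zfs:vm-100-disk-0,cache=writeback,discard=on,ssd=1,iothread=1

A changed cache mode only takes effect once the VM is restarted with the new setting.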
Guest OS was fully patched Windows 10 20H2
4 vCPUs
8GB of RAM
Host performance data (peak readings) for qemu cache nocache, zfs primarycache=all sync=standard
4k dataset, i/o delay 5.99%, vm read speed 621.5M, vm write speed 282.6M
4k volume, i/o delay 7.74%, vm read speed 1.39G, vm write speed 1.27G
64k dataset, i/o delay 3.91%, vm read speed 858.06M, vm write speed 423.43M
64k volume, i/o delay 4.59%, vm read speed 1.29G, vm write speed 847.4M
Host performance data (peak readings) for qemu cache nocache, zfs primarycache=metadata sync=always
4k dataset, i/o delay 1.96%, vm read speed 51.99M, vm write speed 5.23M
4k volume, i/o delay 20.43%, vm read speed 316.21M, vm write speed 27.86M
64k dataset, i/o delay 2.81%, vm read speed 59.64M, vm write speed 4.75M
64k volume, i/o delay 27.19%, vm read speed 235.47M, vm write speed 53.98M
Host performance data (peak readings) for qemu cache directsync, zfs primarycache=all sync=standard
4k dataset, i/o delay 2.83%, vm read speed 408.52M, vm write speed 28.88M
4k volume, i/o delay 7.07%, vm read speed 1.37G, vm write speed 35.79M
64k dataset, i/o delay 1.96%, vm read speed 991.31M, vm write speed 38.05M
64k volume, i/o delay 4.09%, vm read speed 1.36G, vm write speed 35.77M
Host performance data (peak readings) for qemu cache directsync, zfs primarycache=metadata sync=always
4k dataset, i/o delay 2.19%, vm read speed 72.86M, vm write speed 7.01M
4k volume, i/o delay 20.38%, vm read speed 187.03M, vm write speed 19.09M
64k dataset, i/o delay 2.49%, vm read speed 94.71M, vm write speed 7.17M
64k volume, i/o delay 27.42%, vm read speed 338.45M, vm write speed 62.48M
Host performance data (peak readings) for qemu cache writeback, zfs primarycache=all sync=standard
4k dataset, i/o delay 1.62%, vm read speed 1.33G, vm write speed 539.96M
4k volume, i/o delay 7%, vm read speed 1.57G, vm write speed 1.34G
64k dataset, i/o delay 1.12%, vm read speed 1.59G, vm write speed 1.27G
64k volume, i/o delay 1.04%, vm read speed 1.64G, vm write speed 1.47G
Host performance data (peak readings) for qemu cache writeback, zfs primarycache=metadata sync=always
4k dataset, i/o delay 19.05%, vm read speed 226.57M, vm write speed 17.95M
4k volume, i/o delay 10.41%, vm read speed 1.34G, vm write speed 1.34G (yes the same)
64k dataset, i/o delay 34.62%, vm read speed 221.11M, vm write speed 39.86M
64k volume, i/o delay 3.88%, vm read speed 1.59G, vm write speed 1.48G
Analysing this data shows a few things. The ratio of performance to i/o delay is in some cases significantly better with 64k clusters. Writeback caching is very powerful, and for ZFS volumes it is incredibly powerful, to the point that a volume maintains most of its performance with ZFS caching effectively disabled (primarycache=metadata, sync=always), albeit with massively increased i/o delay. When caching is disabled, 64k usually has a clear advantage, which suggests to me its underlying performance is faster than 4k. However, in the default configuration (nocache with ZFS caching enabled), the host data shows 4k winning on write speeds for volumes; that is somewhat of an anomaly, but it is nevertheless the default configuration.
Guest performance data in next post.