Benchmarking ZFS for Database Performance (MSSQL)

This thread should provide some information on how you might be able to optimize I/O performance for your VMs.
Please: If you post answers with test-outputs - use the 'inline spoiler' ;)

For those that are interested in the charts and values see: docs.google.com/spreadsheets

Warning: We are no experts in benchmarking hardware. If you see an issue - please point it out in the answers.

Note: We have not yet found time to run benchmarks on an MSSQL instance inside a VM - we might add them later on.

If you have an idea on how to push RND64k-Read performance in a Windows-VM above 14.5GB/s => please give us a hint ;)

Thanks to @Dunuin for the post on how to benchmark SSDs and the additional information in other threads - we appreciate it :)

Our Goals:
  • Optimize I/O performance for MSSQL (on Windows- or Linux-VM)
  • Use ZFS to allow for Replication (and possible HA-Setup)
  • The setup should provide high 64k-Read throughput, as our monitoring (Prometheus & Grafana) of the existing server shows high read values
  • Have the disks in a RAID10-like array (2x ZFS mirrors striped in one zpool) - see the sketch below this list
  • As it is recommended to run MSSQL on a 64k FS, we mainly focused on 64k-RND-Read
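
For illustration - a minimal sketch of how such a RAID10-like pool would be created (pool name 'tank' and the device paths are placeholders, ashift=12 matches 4096-byte physical sectors):

  # two mirror vdevs striped together = RAID10-like layout
  zpool create -o ashift=12 tank \
      mirror /dev/disk/by-id/nvme-disk1 /dev/disk/by-id/nvme-disk2 \
      mirror /dev/disk/by-id/nvme-disk3 /dev/disk/by-id/nvme-disk4
  # verify the layout
  zpool status tank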

Setup:

Hardware:
  • Dell PowerEdge R7615 Server
  • 2x System disks NVMe
  • 4x 3.84 TB Enterprise NVMe Read-Optimized U.2 4th Gen (fdisk -l shows 4096-byte physical sectors - see the check after this section) directly available to ZFS
  • RAM: 386GB DDR5-4800
  • CPU: 1x AMD EPYC 9274F, 4.05 GHz, 24C/48T
PVE:
  • Version: 8.4.1
  • Debian 12, Kernel 6.8.12-11-pve
  • No other traffic/load/VMs on the tested server
VMs:
  • Windows: Windows Server 2025 Standard, 200GB RAM (DDR5), 16 Cores (4GHz), Windows Updates 2025-07, UEFI Boot, System-disk = same zfs pool as data, memory ballooning off, virtio-win 0.1.271, FS: NTFS 64k
  • Linux: Debian 12.7 netinstall, 200GB RAM (DDR5), 16 Cores (4GHz), Kernel: 6.1.0-37-amd64, System-disk = same zfs pool as data, 2GB Swap, memory ballooning off, FS: EXT4 and XFS tested
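
Side note - a quick way to double-check the sector sizes the disks report (device path is just an example); 4096-byte physical sectors would suggest ashift=12 (2^12 = 4096):

  # logical vs. physical sector size as reported by the disk
  lsblk -o NAME,LOG-SEC,PHY-SEC /dev/nvme0n1
  fdisk -l /dev/nvme0n1 | grep -i 'sector size'
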
Benchmarks:
  • The Sheet with results and charts: docs.google.com/spreadsheets
  • On Linux we used the tool 'fio' (apt install fio)
    The fio config we used: gist.github.com/NiceRath
    You might want to lower iodepth and numjobs to the point where raising them again shows no further improvement - we just eye-balled it (a rough sketch of such a job follows after this list)
  • On Windows we used the tool 'CrystalDiskMark'
  • On ZFS we disabled data caching in the ARC for all tests (unless stated otherwise): zfs set primarycache=metadata <pool>
  • The I/O delay was observed via Prometheus node-exporter and Grafana
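
The exact fio config is in the gist linked above; as a minimal sketch only (file path, size and the iodepth/numjobs values are placeholders to tune yourself, 'tank' is a placeholder pool), a 64k random-read job looks roughly like this:

  # disable ARC data caching on the pool (as noted above)
  zfs set primarycache=metadata tank
  # restore the default afterwards: zfs set primarycache=all tank

  # 64k random read against a test file on the pool
  # (direct=1 requests O_DIRECT; before ZFS 2.3 it is accepted but not 'real' Direct-IO)
  fio --name=rnd64k-read --filename=/tank/fio.testfile --size=32G \
      --rw=randread --bs=64k --ioengine=libaio --direct=1 \
      --iodepth=32 --numjobs=8 --runtime=60 --time_based --group_reporting
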
What we tested:
  • Disk-layer (1 disk, ZFS-ashift, for comparison: MD-R10)
  • ZFS-kernel-module parameter-tweaking
  • some zpool options
  • ZFS volsize for Linux-VM
  • VM-disk options for Windows (the CLI sketch after this list shows what these look like)
  • Some VM-disk options for Linux
  • Guest-FS for Linux (ext4, xfs)
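
To give an idea what the VM-disk options look like when set via the PVE CLI - a minimal sketch (VM-ID 100, storage 'local-zfs' and the disk volume are placeholders; the measured effect of each option is in the sheet):

  # VirtIO SCSI single controller (allows one I/O thread per disk)
  qm set 100 --scsihw virtio-scsi-single
  # data disk as SCSI with discard and iothread enabled, no host-side caching
  qm set 100 --scsi1 local-zfs:vm-100-disk-1,discard=on,iothread=1,cache=none
  # same disk with writeback caching for comparison
  qm set 100 --scsi1 local-zfs:vm-100-disk-1,discard=on,iothread=1,cache=writeback
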
Resources:

Interesting to us:
  • Importance of aligning block sizes: physical sector size <= 2^ashift <= ZFS volblocksize (zvols) / recordsize (datasets) <= VM FS block-size - see the check sketched at the end of this list
  • We were not yet able to get the Windows VM above 14.5GB/s RND64k-Read - not sure if we are missing something
  • VM-disk bus type SCSI (with the VirtIO SCSI controller) seems to perform best
  • VM-disk option 'discard' gives a nice boost when using ZFS under it
  • VM-disk caching could help you in edge-cases - but it seems to be a 'double-edged-sword'
  • VM-disk caching seems to behave very differently on Windows- vs Linux-VMs
  • Tweaking the ZFS-kernel-module parameters showed some speed-improvement but did also significantly increase I/O delay
  • The VM-FS can make a huge difference (vs VM raw block-device)
  • It's not that easy to get a 64k FS mounted on Linux, as the FS block size is limited by the kernel page size (getconf PAGE_SIZE).. :confused:
  • It can be hard to optimize ZFS recordsize and volblocksize for specific applications (MSSQL in our case) and general usage (basic VM I/O).
    The only workaround we see is to partition all physical disks and build multiple zpools from those partitions.
    But this might also lead to some loss in performance.
  • Just a thought: If the DB can run directly on Linux, application-layer redundancy can be configured for it and the HA-licensing is acceptable - it might be 'simpler' and more performant to use multiple bare-metal servers (fewer layers between the DB and the disks).
    But sadly that's not the case for us, as the app that uses the DB has dropped support for running it on Linux.. :confused:

  • Future: QEMU/VirtIO disk-multi-threading might be interesting
  • Future: ZFS Direct-IO might be able to boost NVMe performance (ZFS release 2.3.0 => available in PVE 9.0 :D)
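
A minimal sketch of how the block-size chain from the first point can be checked end to end (pool/zvol names are placeholders; on PVE the volblocksize of a zvol comes from the storage's 'Block size' setting and is fixed at disk creation):

  # host side: physical sectors, pool ashift and the zvol's volblocksize
  lsblk -o NAME,PHY-SEC,LOG-SEC /dev/nvme0n1
  zpool get ashift tank
  zfs get volblocksize tank/vm-100-disk-1

  # Linux guest: the FS block size is limited by the kernel page size (4k on amd64)
  getconf PAGE_SIZE

  # Windows guest (PowerShell): NTFS allocation unit size of the data volume
  # Get-Volume -DriveLetter D | Select-Object AllocationUnitSize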

Have a nice day (:
 