Dear Proxmox community,
After several searches in the forum, I couldn't find much information regarding ZFS storage and its performance tuning. Thus, I'd like to start this thread to share best practices, tests, and tuning tips on how you design your data storage.
Recently, I built a home NAS/server, and to decide whether to use the Local ZFS Pool Backend or to create a ZFS pool with datasets and use bind mounts, I ran the tests below.
The fio tests:
Code:
# SEQ Write 2GB file - web-server-simulation:
sync; fio --filename=testfile-seq-2g --size=2GB --direct=1 --rw=write --bs=1M --ioengine=libaio --numjobs=4 --iodepth=32 --name=seq-write-test --time_based --runtime=120 --group_reporting
# SEQ Read 2GB file:
sync; fio --filename=testfile-seq-2g --size=2GB --direct=1 --rw=read --bs=1M --ioengine=libaio --numjobs=4 --iodepth=32 --name=seq-read-test --time_based --runtime=120 --group_reporting
# Random write 4GB:
sync; fio --filename=testfile-rand-4g --size=4GB --direct=1 --rw=randwrite --bs=4k --ioengine=libaio --numjobs=4 --iodepth=32 --name=rand-write-test --time_based --runtime=120 --group_reporting
# Random read 4GB:
sync; fio --filename=testfile-rand-4g --size=4GB --direct=1 --rw=randread --bs=4k --ioengine=libaio --numjobs=4 --iodepth=32 --name=rand-read-test --time_based --runtime=120 --group_reporting
# Mixed Random Read/Write 4GB database file:
sync; fio --filename=database-testfile.blob --size=4GB --direct=1 --rw=randrw --bs=8k --ioengine=libaio --iodepth=32 --numjobs=8 --rwmixread=70 --name=db-mixed-rw-test --time_based --runtime=300 --group_reporting
# Multi-Threaded Application Simulation e.g. a data analytics tool:
fio --filename=testfile.blob --size=10GB --direct=1 --rw=readwrite --bs=64k --ioengine=libaio --iodepth=16 --numjobs=16 --name=multi-thread-app
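To keep the runs comparable between the two setups, the jobs above can be wrapped in a small script (a sketch; `TESTDIR` is a placeholder, point it at whichever mount you're measuring):

```shell
#!/bin/sh
# Run the same fio job mix against one directory, so both storage
# setups are measured with identical parameters.
TESTDIR="${1:-/mnt/benchmark}"   # placeholder path, pass the mount to test

run_job() {  # args: rw-pattern block-size file-size job-name
    sync
    fio --filename="$TESTDIR/testfile-$4" --size="$3" --direct=1 \
        --rw="$1" --bs="$2" --ioengine=libaio --numjobs=4 --iodepth=32 \
        --name="$4" --time_based --runtime=120 --group_reporting
}

run_job write     1M 2GB seq-write-test
run_job read      1M 2GB seq-read-test
run_job randwrite 4k 4GB rand-write-test
run_job randread  4k 4GB rand-read-test
```

Running it once against each mount point removes any chance of the two setups being tested with slightly different flags.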
Detailed ZFS pool information: RAID 10 (two striped mirrors) consisting of 4 HDDs plus 1 NVMe SSD partition as SLOG, created with the following command:
Code:
# zpool create \
-o ashift=12 \
-O encryption=on -O keylocation=file:///root/zfs-pool.key -O keyformat=raw \
-O acltype=posixacl -O xattr=sa -O dnodesize=auto \
-O compression=zstd-7 \
-O normalization=formD \
rpool mirror /dev/sda /dev/sdb mirror /dev/sdc /dev/sdd log /dev/nvme0n1p1
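Once the pool exists, it's worth double-checking that the mirror layout, the log device, and the properties set at creation actually took effect (standard OpenZFS commands; the exact output shape varies by system):

```shell
# Show the two mirror vdevs and the separate log (SLOG) device
zpool status rpool

# Confirm the pool-root properties set with -O at creation time
zfs get compression,encryption,recordsize,xattr rpool
```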
The fio test results below were measured inside an LXC container with two storage options mounted:
1. a disk created with the Local ZFS module,
2. a bind-mounted dataset.
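For context, a dataset gets bind-mounted into a container along these lines (a sketch; the dataset name, container ID 101, and mount path are placeholders, not taken from my setup):

```shell
# Create a dedicated dataset on the pool (name is a placeholder)
zfs create -o recordsize=64K rpool/share

# Bind-mount its host mountpoint into LXC container 101 at /mnt/share
pct set 101 -mp0 /rpool/share,mp=/mnt/share
```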
| blocksize=64K | SEQ Write 2GB file | SEQ Read 2GB file | Random write 4GB | Random read 4GB | Mixed Random R/W 4GB database file (read; write) | Multi-Threaded App Simulation (read; write) |
|---|---|---|---|---|---|---|
| Local ZFS module | 618MB/s | 5938MB/s | 21.0MB/s | 3320kB/s | 46.8MB/s; 20.1MB/s | 533MB/s; 533MB/s |
| Bind-mounted dataset | 399MB/s | 6201MB/s | 28.7MB/s | 178MB/s | 46.0MB/s; 19.7MB/s | 609MB/s; 610MB/s |
Then I re-ran the tests with blocksize=128K:
| blocksize=128K | SEQ Write 2GB file | SEQ Read 2GB file | Random write 4GB | Random read 4GB | Mixed Random R/W 4GB database file (read; write) | Multi-Threaded App Simulation (read; write) |
|---|---|---|---|---|---|---|
| Local ZFS module | 589MB/s | 6177MB/s | 15.9MB/s | 63.9MB/s | 31.7MB/s; 13.6MB/s | 561MB/s; 561MB/s |
| Bind-mounted dataset | 419MB/s | 6338MB/s | 17.8MB/s | 97.2MB/s | 31.0MB/s; 13.3MB/s | 999MB/s; 1000MB/s |
Conclusion:
You might say these numbers are to be expected, as many good resources on ZFS performance tuning can be found around the internet, e.g. [1], [2], [3], [4].
However, I was mainly curious about the raw data and comparison between the Local ZFS module and bind-mounted dataset performance.
So far, I'll stick with:
- For all LXC containers and VMs: a disk created with the Local ZFS module (blocksize=128K)
- For file-share applications (e.g. Nextcloud): a bind-mounted dataset with blocksize=64K
- For media apps (e.g. Plex): a bind-mounted dataset with blocksize=512K or 1M and compression=zstd-12.
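Translated into commands, those per-use-case choices could look roughly like this (a sketch; the dataset names are placeholders, and note that for disks on the Local ZFS backend the blocksize is set in the storage definition rather than on a dataset):

```shell
# File shares (e.g. Nextcloud): 64K records for mixed small/medium files
zfs create -o recordsize=64K rpool/nextcloud

# Media library (e.g. Plex): large records plus heavier compression
zfs create -o recordsize=1M -o compression=zstd-12 rpool/media

# VM/LXC disks on the Local ZFS backend: blocksize comes from the
# storage entry in /etc/pve/storage.cfg, e.g.
#   zfspool: local-zfs
#           pool rpool/data
#           blocksize 128K
#           content images,rootdir
```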
If you've performed any tests or have any tips, please share and comment.