[SOLVED] ZFS read performance issue

Apr 10, 2020
Yeah yeah I know, another ZFS perf issue... :p

Seriously though, I've read some threads on this forum and others about ZFS issues, sometimes read, sometimes write, sometimes both, and in my case it's a read issue.

See below the comparison between my HDD with ext4 and ZFS. I only tested with dd, which I know is not the most reliable, but I can run more tests with bonnie++ or fio, or post the output of iotop if requested.
It's a really simple test with one HDD in single-disk mode.
CPU: Ryzen, 24 threads
RAM: 64 GB

Simple GPT partition with gdisk + mkfs.ext4 with default parameters.
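Roughly equivalent to this (sgdisk shown instead of the interactive gdisk session; /dev/sdg is just taken from the mountpoint name):
Bash:
# one GPT partition spanning the whole disk, ext4 with default parameters
sgdisk -n 1:0:0 -t 1:8300 /dev/sdg
mkfs.ext4 /dev/sdg1
mkdir -p /root/test-sdg
mount /dev/sdg1 /root/test-sdg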
Bash:
root@pve:/home/okeur# dd if=/dev/zero of=/root/test-sdg/test bs=1G count=64
64+0 records in
64+0 records out
68719476736 bytes (69 GB, 64 GiB) copied, 270.184 s, 254 MB/s
root@pve:/home/okeur# dd if=/root/test-sdg/test of=/dev/null
134217728+0 records in
134217728+0 records out
68719476736 bytes (69 GB, 64 GiB) copied, 261.992 s, 262 MB/s
root@pve:/home/okeur#

Single-disk pool created through the WebUI, ashift=12 and compression on.
Turning compression off does not change anything.
For all the read tests, the CPU hits 100% on one core only.
Bash:
root@pve:/home/okeur# dd if=/dev/zero of=/ZFS-SDG/test bs=1G count=64
64+0 records in
64+0 records out
68719476736 bytes (69 GB, 64 GiB) copied, 35.1505 s, 2.0 GB/s
root@pve:/home/okeur# dd if=/ZFS-SDG/test of=/dev/null
^C1993255+0 records in
1993254+0 records out
1020546048 bytes (1.0 GB, 973 MiB) copied, 48.1261 s, 21.2 MB/s
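
For reference, the CLI equivalent of what the WebUI created should be roughly this (device path is a placeholder):
Bash:
# single-disk pool, 4K sectors (ashift=12), compression enabled
zpool create -o ashift=12 ZFS-SDG /dev/sdg
zfs set compression=on ZFS-SDG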

For testing purposes, here is a raidz2 with 6 HDDs and a recordsize of 1M.
Bash:
root@pve:/home/okeur# dd if=/dev/zero of=/VM-Storage/test bs=1G count=64
64+0 records in
64+0 records out
68719476736 bytes (69 GB, 64 GiB) copied, 32.487 s, 2.1 GB/s
root@pve:/home/okeur# dd if=/VM-Storage/test of=/dev/null
^C1677102+0 records in
1677101+0 records out
858675712 bytes (859 MB, 819 MiB) copied, 118.629 s, 7.2 MB/s

Changing the recordsize to 128K helped a lot but still...

Bash:
root@pve:/home/okeur# zfs set recordsize=128K VM-Storage
root@pve:/home/okeur# dd if=/dev/zero of=/VM-Storage/test2 bs=1G count=64
64+0 records in
64+0 records out
68719476736 bytes (69 GB, 64 GiB) copied, 35.5218 s, 1.9 GB/s
root@pve:/home/okeur# dd if=/VM-Storage/test2 of=/dev/null
^C1794359+0 records in
1794358+0 records out
918711296 bytes (919 MB, 876 MiB) copied, 43.0306 s, 21.4 MB/s
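
For reference, recreating this raidz2 pool from the CLI would look something like this (device names are placeholders):
Bash:
# 6-disk raidz2, 4K sectors, 1M records for large sequential files
zpool create -o ashift=12 VM-Storage raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf
zfs set recordsize=1M VM-Storage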


Thanks guys!
 
It should be known that benchmarking ZFS with dd and /dev/zero is totally bogus.

Single-disk pool created through the WebUI, ashift=12 and compression on.
Turning compression off does not change anything.

That changes a lot: zeros are trivially easy to compress:

Code:
root@proxmox ~ > zfs create zpool/test
root@proxmox ~ > cd /zpool/test
root@proxmox /zpool/test > zfs set compression=off zpool/test
root@proxmox /zpool/test > dd if=/dev/zero of=test bs=1M count=16384
16384+0 records in
16384+0 records out
17179869184 bytes (17 GB, 16 GiB) copied, 53.0677 s, 324 MB/s
root@proxmox /zpool/test > zfs list zpool/test
NAME         USED  AVAIL     REFER  MOUNTPOINT
zpool/test  16.0G  3.30T     16.0G  /zpool/test
root@proxmox /zpool/test > zfs set compression=on zpool/test
root@proxmox /zpool/test > dd if=/dev/zero of=test bs=1M count=16384
16384+0 records in
16384+0 records out
17179869184 bytes (17 GB, 16 GiB) copied, 16.1673 s, 1.1 GB/s
root@proxmox /zpool/test > zfs list zpool/test
NAME         USED  AVAIL     REFER  MOUNTPOINT
zpool/test    24K  3.31T       24K  /zpool/test

You can see that with compression on, the 16G of zeros ends up as only 24K on disk, so you are not really measuring the disks at all. The effect is even greater if you use a multithreaded I/O tool.
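
If you want write numbers that mean something, feed the pool data that does not compress down to nothing, e.g. fio with refilled random buffers and a few parallel jobs (just a sketch, adjust directory and sizes to your setup):
Code:
# sequential write, 4 jobs, buffers refilled with random data so compression can't cheat
fio --name=seqwrite --directory=/zpool/test --rw=write --bs=1M --size=4G \
    --numjobs=4 --refill_buffers --end_fsync=1 --group_reporting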

Now to your other problem:

The main problem is the block size on read: without an explicit bs=, dd reads in 512-byte chunks, so every record is a tiny separate request:

Code:
root@proxmox /zpool/test > dd if=test of=/dev/null
^C3667523+0 records in
3667522+0 records out
1877771264 bytes (1.9 GB, 1.7 GiB) copied, 304.99 s, 6.2 MB/s

root@proxmox /zpool/test > dd if=test of=/dev/null bs=64k
262144+0 records in
262144+0 records out
17179869184 bytes (17 GB, 16 GiB) copied, 27.0728 s, 635 MB/s
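
The same goes for any read test: always give it an explicit block size. A rough fio equivalent of the read above would be (sketch, paths and sizes are examples):
Code:
# sequential read with a sane block size and several jobs in parallel
fio --name=seqread --directory=/zpool/test --rw=read --bs=1M --size=4G \
    --numjobs=4 --group_reporting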
 
Hmmm okay, so that would mean I don't actually have a performance issue; the tests just weren't run with the right parameters.
So what would be a reliable test I could base my tuning on? Something more like bonnie++ or fio?
 
Ok I understand why my first comparison did not make any sense.
I spent the last few hours playing with bonnie++ (not fio yet, unfortunately), but here are the results.
TL;DR: Everything is OK, you just need to compare apples with apples and pears with pears.
TL;DR 2: ZFS raidz2, even without compression, seems slightly faster than mdadm raid6, if I understood the bonnie++ outputs correctly.

mdadm RAID 6
Bash:
root@pve:/media/raid# bonnie++ -u root -r 64g -s 64g -d /media/raid/ -f -b -n 1 -c 4 -q
Version  1.98       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Name:Size etc        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
pve          64G::4            530m  32  293m  16            616m  19 349.1  12
Latency                         111ms     487ms             67065us     219ms
Version  1.98       ------Sequential Create------ --------Random Create--------
pve                 -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                  1  1024   0 +++++ +++  1024   0  1024   0 +++++ +++  1024   0
Latency               159ms      49us   69689us     121ms       7us   76839us

ZFS raidz2, compression off
Bash:
root@pve:/test# bonnie++ -u root -r 64g -s 64g -d /VM-Storage/comp_off -f -b -n 1 -c 4 -q
Version  1.98       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Name:Size etc        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
pve          64G::4            672m  66  316m  44            614m  48 322.0  10
Latency                         104ms     679ms               627ms     194ms
Version  1.98       ------Sequential Create------ --------Random Create--------
pve                 -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                  1  1024   1 +++++ +++  1024   1  1024   0 +++++ +++  1024   0
Latency             62977us       7us     147ms   99213us      14us     125ms

ZFS raidz2, compression lz4
Bash:
root@pve:/test# bonnie++ -u root -r 64g -s 64g -d /VM-Storage/comp_lz4 -f -b -n 1 -c 4 -q
Version  1.98       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Name:Size etc        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
pve          64G::4            1.1g  88  996m  91            2.3g  99  1747  36
Latency                         103ms     101ms              6658us     150ms
Version  1.98       ------Sequential Create------ --------Random Create--------
pve                 -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                  1  1024   0 +++++ +++  1024   0  1024   0 +++++ +++  1024   0
Latency               108ms       5us   89195us   75387us       5us   98611us

ZFS raidz2, compression gzip.
CPU went through the roof with this one... all 24 cores at around 80%. Reading the docs, I understand why; this one was just for testing purposes.
Bash:
root@pve:/test# bonnie++ -u root -r 64g -s 64g -d /VM-Storage/comp_gzip -f -b -n 1 -c 4 -q
Version  1.98       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Name:Size etc        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
pve          64G::4            947m  89  572m  93            1.0g  99  1462  83
Latency                       54663us   59014us              5264us     107ms
Version  1.98       ------Sequential Create------ --------Random Create--------
pve                 -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                  1  1024   0 +++++ +++  1024   0  1024   0 +++++ +++  1024   0
Latency             97446us       6us   66677us   88705us       5us   68223us
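
Side note: to check how much each codec actually saved on disk, something like this should do (assuming the comp_* directories are child datasets, otherwise check the parent):
Bash:
# compare on-disk savings per compression codec
zfs get -r compression,compressratio VM-Storage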

I'll play with fio tomorrow, but everything seems OK to me now that I understand... ;)
 
