Dell R340 - LVM -> ZFS weirdness I'm trying to understand... Please help?

SwissHomeLabber
New Member · Mar 29, 2021 · Zurich, Switzerland
Hello!

I have two boxes with the following configurations:

Box1:

Dell R340
Intel E-2136
128GB RAM
8x Micron 5100 ECO 3.84TB SSD
Dell PERC H730P (2GB cache)

Box2:
Dell R340
Intel E-2134
64GB RAM
8x Samsung PM863a 3.84TB SSD
Dell PERC H730P (2GB cache)

I've used these in a hardware RAID6 setup (via the Dell PERC controller) with Proxmox on LVM.

My colleague has convinced me of some of the benefits of ZFS, so I thought I'd try it out and do some basic testing with https://github.com/rsyring/disk-bench/blob/master/readme.rst and pveperf on vanilla Proxmox 6.3-1 installs, with the disks in passthrough (HBA mode) and using ZFS RAIDZ2.

From the results, it looks like ZFS has some good benefits, even if it reduces my usable space from ~20.9TiB to 18TiB for what I understand to be ZFS reservations/housekeeping space.

What I don't understand is the significant drop in FSYNC performance in pveperf, and the large drop in the 4kQD32write tests.

Could this be attributed to the PERC H730P controller's onboard cache?

I know 6.3-1 doesn't have ZFS 2.0.x, and I will do some more testing when I get time, but does anyone have any configuration advice for general performance optimisation? The typical workload is homelab VMs, some small databases, a Bitcoin node, basic web serving, and applications like Nextcloud. The setup is already overkill for these purposes, but the large drop in 4kQD32write is my biggest worry if I spin up more intensive applications like a torrent seedbox or another disk-intensive program.

Any advice is appreciated, thank you! :)




Code:
Box1 - LVM

root@box1:~# pveperf
CPU BOGOMIPS:      79199.76
REGEX/SECOND:      4657388
HD SIZE:           245.08 GB (/dev/mapper/pve-root)
BUFFERED READS:    686.53 MB/sec
AVERAGE SEEK TIME: 0.19 ms
FSYNCS/SECOND:     9995.65
DNS EXT:           29.54 ms
DNS INT:           18.84 ms (test.test)

root@box1:/tmp# disk-bench /tmp
╭──────────────┬──────────────┬──────────────┬──────────────┬──────────────┬──────────────┬──────────────┬──────────────┬──────────────╮
│ Stats (MB/s) │      seqread │     randread │   4kQD32read │   4kQD16read │     seqwrite │    randwrite │  4kQD32write │  4kQD16write │
├──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┤
│              │        654.4 │        178.4 │         40.1 │         29.8 │      2,041.2 │      3,240.5 │      1,410.5 │         16.1 │
╰──────────────┴──────────────┴──────────────┴──────────────┴──────────────┴──────────────┴──────────────┴──────────────┴──────────────╯

Box1 - ZFS

root@box1:~# pveperf
CPU BOGOMIPS:      79199.76
REGEX/SECOND:      4658684
HD SIZE:           18457.93 GB (rpool/ROOT/pve-1)
FSYNCS/SECOND:     1139.34
DNS EXT:           27.13 ms
DNS INT:           17.31 ms (test.test)

root@box1:~# disk-bench /tmp
╭──────────────┬──────────────┬──────────────┬──────────────┬──────────────┬──────────────┬──────────────┬──────────────┬──────────────╮
│ Stats (MB/s) │      seqread │     randread │   4kQD32read │   4kQD16read │     seqwrite │    randwrite │  4kQD32write │  4kQD16write │
├──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┤
│              │      4,680.8 │        552.3 │        376.0 │        105.7 │      1,556.0 │      2,635.8 │         92.8 │         57.0 │
╰──────────────┴──────────────┴──────────────┴──────────────┴──────────────┴──────────────┴──────────────┴──────────────┴──────────────╯



Box2 - LVM

root@box2:~# pveperf
CPU BOGOMIPS:      55998.56
REGEX/SECOND:      4705716
HD SIZE:           245.08 GB (/dev/mapper/pve-root)
BUFFERED READS:    1455.23 MB/sec
AVERAGE SEEK TIME: 0.17 ms
FSYNCS/SECOND:     10491.70
DNS EXT:           33.52 ms
DNS INT:           19.65 ms (test.test)

root@box2:/tmp# disk-bench /tmp
╭──────────────┬──────────────┬──────────────┬──────────────┬──────────────┬──────────────┬──────────────┬──────────────┬──────────────╮
│ Stats (MB/s) │      seqread │     randread │   4kQD32read │   4kQD16read │     seqwrite │    randwrite │  4kQD32write │  4kQD16write │
├──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┤
│              │      1,332.1 │        353.5 │         44.0 │         33.9 │      1,879.1 │      3,396.4 │      1,463.4 │         18.3 │
╰──────────────┴──────────────┴──────────────┴──────────────┴──────────────┴──────────────┴──────────────┴──────────────┴──────────────╯

Box2 - ZFS

root@box2:~# pveperf
CPU BOGOMIPS:      55998.56
REGEX/SECOND:      4716347
HD SIZE:           18457.92 GB (rpool/ROOT/pve-1)
FSYNCS/SECOND:     1183.06
DNS EXT:           29.77 ms
DNS INT:           17.52 ms (test.test)


root@box2:~# disk-bench /tmp
╭──────────────┬──────────────┬──────────────┬──────────────┬──────────────┬──────────────┬──────────────┬──────────────┬──────────────╮
│ Stats (MB/s) │      seqread │     randread │   4kQD32read │   4kQD16read │     seqwrite │    randwrite │  4kQD32write │  4kQD16write │
├──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┤
│              │      2,402.1 │      1,055.1 │        298.1 │         77.6 │      1,605.0 │      3,124.3 │         64.0 │         41.8 │
╰──────────────┴──────────────┴──────────────┴──────────────┴──────────────┴──────────────┴──────────────┴──────────────┴──────────────╯
 
First off, if you plan to store VMs on these ZFS storages, don't use raidz but a pool of mirrored disks. The way ZFS block-device datasets (used for VM disks) handle parity in a raidz means you will most likely end up with much less usable space than expected. The IOPS performance of a pool made up of mirrors is also much better.
For more details, there is a section about this in the docs.
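
As a rough sketch only (the pool name "tank", the device paths, and ashift=12 for 4K-sector SSDs are placeholders, not specifics from this thread), a striped-mirror pool over eight disks would look roughly like this; the equivalent choice in the Proxmox installer is ZFS RAID10:

Code:
# four two-way mirror vdevs striped together -> roughly 4x 3.84 TB usable, and IOPS scale with the vdev count
zpool create -o ashift=12 tank \
    mirror /dev/disk/by-id/ssd1 /dev/disk/by-id/ssd2 \
    mirror /dev/disk/by-id/ssd3 /dev/disk/by-id/ssd4 \
    mirror /dev/disk/by-id/ssd5 /dev/disk/by-id/ssd6 \
    mirror /dev/disk/by-id/ssd7 /dev/disk/by-id/ssd8

You give up some capacity compared to raidz2, but random write performance scales with the number of vdevs, which is exactly where the 4kQD32write numbers suffer.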

Regarding the worse performance: ZFS does have more overhead than a simple file system, but in return you get a lot of nice features. The RAID controller might be one cause. On some older HP DL380 G8 servers I saw that switching from the RAID controller to an HBA reduced the disk commit latency quite a bit.
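
As a quick sanity check (the device name is just an example, and behind a MegaRAID virtual disk smartctl would need its -d megaraid,N option instead), you can verify the OS really sees the raw SSDs and not single-disk virtual disks:

Code:
# the real drive model/serial should show up, not a DELL/PERC virtual disk
lsblk -o NAME,MODEL,SERIAL,SIZE
smartctl -i /dev/sda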

For a ZFS storage I can recommend leaving some RAM to spare, as ZFS will by default use up to 50% of the RAM as a read cache (the ARC). Having a dedicated ZIL/SLOG device for sync writes can also help. Some small Intel Optane SSDs work very well here, as they don't have an internal cache and provide consistently fast writes. More than a few GB are usually not needed for the ZIL/SLOG!
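
For example (the 16 GiB cap and the NVMe device path are placeholders, not recommendations for this exact setup), capping the ARC and attaching a SLOG to an existing pool looks roughly like this:

Code:
# cap the ARC at 16 GiB (value is in bytes), applied when the zfs module loads
echo "options zfs zfs_arc_max=17179869184" > /etc/modprobe.d/zfs.conf
update-initramfs -u        # so the limit is also applied at boot with root on ZFS

# add a dedicated log (SLOG) device for sync writes; mirror it if you want to survive a device failure
zpool add rpool log /dev/nvme0n1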