ZFS on RAID 10 with 4 x PCIe5 NVMe Drives: Performance Insights and Questions

fmoreira86

Hello everyone.

I recently acquired two HPE DL380 Gen11 servers. Each node has 512 GB of RAM and four 3.2 TB PCIe 5.0 NVMe drives (HPE-branded, but manufactured by KIOXIA). My goal was not shared storage but a cluster with VM replication, so I created a logical volume for VMs on the NVMe drives (directly attached to the CPU, of course). My boot logical volume consists of two Samsung SSDs, but that's not relevant to this discussion.

In some tests run inside Windows VMs (deployed following the recommendations from the Proxmox wiki), I'm getting a maximum of around 6,500 MB/s.

Don't get me wrong: I am very satisfied with the overall performance of the system, but given the hardware I have, I feel I'm paying a significant penalty for ZFS. I know ZFS is not a filesystem focused on performance, but I think we are all always looking for a bit more ;) My ZFS configuration is the default, with ashift=12 and compression enabled. I have atime disabled.
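For anyone who wants to reproduce a comparable layout, here is a minimal sketch that only builds (and never runs) a `zpool create` command for two striped mirrors with the settings mentioned above. The pool name `tank` and the `/dev/nvme*n1` device paths are assumptions for illustration, not the names used on these servers, and `compression=on` is assumed because the post only says compression is enabled.

```python
def zpool_create_argv(pool, disks, ashift=12, compression="on", atime="off"):
    """Build (but do not execute) a `zpool create` command for striped mirrors."""
    if len(disks) < 2 or len(disks) % 2 != 0:
        raise ValueError("striped mirrors need an even number of disks (>= 2)")
    argv = [
        "zpool", "create",
        "-o", f"ashift={ashift}",            # pool property: 2**12 = 4096-byte sectors
        "-O", f"compression={compression}",  # dataset property on the root dataset
        "-O", f"atime={atime}",              # skip access-time updates on reads
        pool,
    ]
    # Pair the disks up: each pair becomes one mirror vdev, and ZFS stripes
    # across the mirror vdevs (the "RAID 10" layout).
    for i in range(0, len(disks), 2):
        argv += ["mirror", disks[i], disks[i + 1]]
    return argv


if __name__ == "__main__":
    # Hypothetical device names; check the real ones before using any of this.
    example_disks = [f"/dev/nvme{i}n1" for i in range(4)]
    print(" ".join(zpool_create_argv("tank", example_disks)))
```

Printing the command instead of executing it keeps the sketch harmless; the resulting pool's settings could then be checked with `zpool get ashift` and `zfs get compression,atime`.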

For your reference, HPE states that these disks are capable of:

MAX Seq Reads / Max Seq Writes Throughput (MiB/s): 13,852 / 6,802 MB/s

Read IOPS
Random Read IOPS (4KiB, Q=16) = 215,995
Max Random Read IOPS (4KiB) = 1,159,100 @ Q128
Write IOPS
Random Write IOPS (4KiB, Q=16) = 599,050
Max Random Write IOPS (4KiB) = 649,489 @ Q16
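To put those numbers next to the ~6,500 MB/s measured in the VM, here is a rough back-of-the-envelope sketch. It assumes an idealized striped-mirror pool in which reads can be served by all four drives and writes are limited by the two mirror vdevs, and it ignores PCIe, CPU, ZFS and virtualization overhead entirely. The throughput figures are taken as quoted (the spec line mixes MiB/s and MB/s, so treat the results as approximate).

```python
KIB = 1024
MIB = 1024 ** 2


def iops_to_mib_s(iops, block_bytes=4 * KIB):
    """Convert an IOPS figure at a given block size into MiB/s."""
    return iops * block_bytes / MIB


# Per-drive figures quoted from the HPE spec above.
SEQ_READ = 13_852
SEQ_WRITE = 6_802
MAX_RAND_READ_IOPS = 1_159_100
MAX_RAND_WRITE_IOPS = 649_489

DRIVES = 4
MIRROR_VDEVS = DRIVES // 2

# Idealized ceilings for a pool of two 2-way mirrors (an assumption, not a
# measurement): reads can be served by every drive, while each write has to
# land on both sides of a mirror.
read_ceiling = DRIVES * SEQ_READ
write_ceiling = MIRROR_VDEVS * SEQ_WRITE

if __name__ == "__main__":
    print(f"4 KiB max random read  per drive: {iops_to_mib_s(MAX_RAND_READ_IOPS):.1f} MiB/s")
    print(f"4 KiB max random write per drive: {iops_to_mib_s(MAX_RAND_WRITE_IOPS):.1f} MiB/s")
    print(f"ideal pool sequential read ceiling : {read_ceiling}")
    print(f"ideal pool sequential write ceiling: {write_ceiling}")
    print(f"measured 6500 vs read ceiling: {6500 / read_ceiling:.1%}")
```

These ceilings are upper bounds only; a real pool will land well below them, so the interesting comparison is against the same test on a non-ZFS volume rather than against the raw spec sheet.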

Any ideas? :)

[Attached screenshots: 1742074418212.png, 1742076569015.png, 1742074446669.png, 1742074491758.png]

EDIT:

Tested with:

[Attached screenshots: 1742077168239.png, 1742077095610.png, 1742077119810.png]
 
ZFS indeed isn't comparable to a regular filesystem.
Run the same test on LVM-thin to see how large the ZFS penalty actually is.

Virtualization has a penalty too.
Q1 with a single thread is the worst case for virtualization.

IOPS are missing from your CDM screenshot.
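The point about Q1 can be illustrated with Little's law: at queue depth 1 only one I/O is in flight, so IOPS is simply the inverse of the per-I/O latency, and any latency added by the hypervisor cuts IOPS directly. The sketch below uses made-up latencies (80 µs native, 40 µs of added virtualization overhead) purely as an example; they are not measurements from this system.

```python
def iops_at_queue_depth(latency_us, queue_depth=1):
    """Little's law: IOPS = outstanding I/Os / per-I/O latency (in seconds)."""
    if latency_us <= 0:
        raise ValueError("latency must be positive")
    return queue_depth * 1_000_000 / latency_us


if __name__ == "__main__":
    native_us = 80.0    # hypothetical bare-metal latency per 4 KiB I/O
    overhead_us = 40.0  # hypothetical extra latency added by virtualization

    native_q1 = iops_at_queue_depth(native_us)
    guest_q1 = iops_at_queue_depth(native_us + overhead_us)
    print(f"Q1 native: {native_q1:.0f} IOPS")
    print(f"Q1 in VM : {guest_q1:.0f} IOPS ({1 - guest_q1 / native_q1:.0%} lower)")

    # At higher queue depths many I/Os overlap, so the same added latency is
    # hidden until the device or CPU saturates. This line assumes latency
    # stays flat as depth grows, which is optimistic.
    print(f"Q32 in VM: {iops_at_queue_depth(native_us + overhead_us, 32):.0f} IOPS (idealized)")
```

This is why Q1T1 results usually show the largest gap between bare metal and a guest, while deep-queue results sit much closer to the hardware limits.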
 
I just added a screenshot showing IOPS ;)

I also added a test with:

[Attached screenshot: 1742077183964.png]
 