ZFS for storage: 4K random write/read values are very low

eyup51

New Member
Nov 1, 2024
Hello everyone, I'm new to the forum.

I'm configuring a new server and plan to share storage via NFS. My setup will run Windows/Linux VMs on BL460c nodes, and I anticipate that 4K random write/read performance will significantly impact VM performance. I have a few questions and issues, so I'd appreciate any insights or experiences you can share.

Server Specifications for NFS File Server:
  • HPE DL380 GEN10 Server
  • 2x Gold 6150 CPUs (2.7-3.7 GHz)
  • 512GB or 1024GB 2400 MHz RAM
  • 8x 8TB Intel SSD DC P4510 U.2 NVMe SSDs (configured in ZFS RAID10)
  • TrueNAS Core

The core issue is a major performance drop that has left me uncertain about using ZFS. Despite running multiple fio tests based on forum and ChatGPT guidance, my results consistently show poor 4K random write performance, almost as if the disk is performing at one-tenth its capability. Here’s an example result:

  • 4K random write on ext4/XFS: IOPS = 189k, bandwidth = 738 MiB/s
  • 4K random write on ZFS: IOPS = 28.3k, bandwidth = 110 MiB/s

Sample Test Setup:
  • CPU: Ryzen 7900X
  • RAM: 192GB
  • Disks: 2x 4TB Nextorage SSD NE1N4TB
  • ZFS Configuration: Mirror (RAID1 equivalent)
  • Recordsize: 16K
  • Sync: Standard
  • Compression: LZ4
  • Ashift: 12
  • Atime: Off
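
For reference, the CLI equivalent of the configuration above would look roughly like this (a sketch only; the pool name and nvme device names are placeholders):

Code:
# two-disk mirror pool, 4K sectors (ashift=12)
zpool create -o ashift=12 testpool mirror /dev/nvme0n1 /dev/nvme1n1
# dataset properties from the list above; sync stays at the default (standard)
zfs set recordsize=16K testpool
zfs set compression=lz4 testpool
zfs set atime=off testpool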

Test Parameters Used:

Code:
fio --name=test --size=4G --filename=tempfile --bs=4K --rw=randwrite --ioengine=sync --numjobs=64 --iodepth=32 --runtime=60 --group_reporting

Results:

Code:
write: IOPS=4665, BW=18.2MiB/s (19.1MB/s)(1094MiB/60012msec); 0 zone resets
clat (usec): min=4, max=94195, avg=13710.46, stdev=5785.48
lat (usec): min=4, max=94195, avg=13710.57, stdev=5785.39
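
A side note on the test itself: as far as I know, fio ignores --iodepth when --ioengine=sync is used, so the command above is effectively queue depth 1 per job. A sketch of an async variant that should better exercise the drives (assuming a fio build with io_uring support; the job name and file name are arbitrary):

Code:
# async 4K random writes; io_uring honors --iodepth, unlike the sync engine
fio --name=zfs-4k-randwrite --size=4G --filename=tempfile \
    --bs=4K --rw=randwrite --ioengine=io_uring \
    --numjobs=8 --iodepth=32 --runtime=60 --time_based --group_reporting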


Questions:
  1. Why does 4K random write/read performance drop so drastically as soon as I use ZFS?
  2. If I add a dedicated SLOG device for the ZIL, would it help improve these values? Since I'm already using NVMe drives, is an additional NVMe SLOG necessary? What percentage of improvement could I realistically expect?
  3. In a live environment with Proxmox QEMU virtualization, would low 4K random write/read values affect general VM performance (e.g., browsing) on Windows and Linux VMs?
  4. Proxmox documentation suggests that with RAIDZ2, I might only achieve the IOPS of a single disk. Given that ZFS on a single disk seems to perform 10x slower, would RAIDZ2 inherit this reduction?
  5. The specs of the P4510 U.2 NVMe list up to 637,000 IOPS for reads and 139,000 IOPS for writes. The source I linked shows 190,000 IOPS on XFS. With an 8-disk ZFS RAID10 setup (sketched after this list), is it technically feasible to achieve 400K 4K random write IOPS?
  6. The server will not run VMs locally; it will only share storage via TrueNAS. Should RAM be 512GB or 1024GB?
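
For context on questions 2 and 5, here is roughly how I intend to lay out the 8-disk pool: four two-way mirrors striped together (RAID10), plus an optional dedicated SLOG. This is only a sketch; the pool name, device names, and the ninth SLOG device are hypothetical:

Code:
# four 2-way mirror vdevs striped together (ZFS RAID10)
zpool create -o ashift=12 tank \
  mirror /dev/nvme0n1 /dev/nvme1n1 \
  mirror /dev/nvme2n1 /dev/nvme3n1 \
  mirror /dev/nvme4n1 /dev/nvme5n1 \
  mirror /dev/nvme6n1 /dev/nvme7n1
# optional separate log device for sync writes (question 2)
zpool add tank log /dev/nvme8n1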

Link to similar issue: extremely poor performance for ZFS 4k randwrite on NVMe compared to XFS

Thanks in advance for any guidance or experience you can share!
 
