Hello everyone, I'm new to the forum.
I'm configuring a new server and plan to share storage via NFS. My setup will run Windows/Linux VMs on BL460c nodes, and I anticipate that 4K random write/read performance will significantly impact VM performance. I have a few questions and issues, so I'd appreciate any insights or experiences you can share.
Server Specifications for NFS File Server:
- HPE DL380 Gen10 Server
- 2x Intel Xeon Gold 6150 CPUs (2.7-3.7 GHz)
- 512GB or 1024GB 2400 MHz RAM
- 8x 8TB Intel DC P4510 U.2 NVMe SSDs (configured in ZFS RAID10)
- TrueNAS CORE
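For clarity, the RAID10 layout I have in mind is four striped mirror vdevs, roughly like the sketch below (pool name and device names are placeholders; TrueNAS would normally build this through its UI):
Code:
zpool create -o ashift=12 tank \
  mirror nvd0 nvd1 \
  mirror nvd2 nvd3 \
  mirror nvd4 nvd5 \
  mirror nvd6 nvd7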
The core issue is a major performance drop that has left me uncertain about using ZFS. Despite running multiple fio tests based on forum and ChatGPT guidance, my results consistently show poor 4K random write performance, almost as if the disk is performing at one-tenth its capability. Here’s an example result:
- Performance on ext4 or XFS: IOPS = 189k, Bandwidth = 738 MiB/s
- Performance on ZFS: IOPS = 28.3k, Bandwidth = 110 MiB/s
Sample Test Setup (a separate bench system, not the server above):
- CPU: Ryzen 7900X
- RAM: 192GB
- Disks: 2x 4TB Nextorage SSD NE1N4TB
- ZFS Configuration: Mirror (RAID1)
- Record Size: 16K
- Sync: Standard
- Compression: LZ4
- Ashift: 12
- Atime: Off
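The test pool was created along these lines (a rough sketch; tank and the device paths are placeholders for my actual names):
Code:
# Two-disk mirror, 4K sectors
zpool create -o ashift=12 tank mirror /dev/nvme0n1 /dev/nvme1n1
# Dataset properties as listed above; sync is left at its default (standard)
zfs set recordsize=16K tank
zfs set compression=lz4 tank
zfs set atime=off tank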
Test Parameters Used:
Code:
fio --name=test --size=4G --filename=tempfile --bs=4K --rw=randwrite --ioengine=sync --numjobs=64 --iodepth=32 --runtime=60 --group_reporting
Results:
Code:
write: IOPS=4665, BW=18.2MiB/s (19.1MB/s)(1094MiB/60012msec); 0 zone resets
clat (usec): min=4, max=94195, avg=13710.46, stdev=5785.48
lat (usec): min=4, max=94195, avg=13710.57, stdev=5785.39
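As far as I understand, with ioengine=sync fio ignores iodepth, so each of the 64 jobs issues one synchronous write at a time. A variant I could also try with an async engine is sketched below (assumes Linux with io_uring support; note that --direct=1 may be a no-op on ZFS depending on the OpenZFS version):
Code:
fio --name=test-async --size=4G --filename=tempfile --bs=4K --rw=randwrite \
    --ioengine=io_uring --direct=1 --numjobs=4 --iodepth=32 \
    --runtime=60 --time_based --group_reporting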
Questions:
- Why does 4K random write/read performance drop so drastically as soon as I use ZFS?
- If I add a dedicated SLOG device (for the ZIL), would it help improve these values? Since I'm already using NVMe drives, is an additional NVMe SLOG necessary, and what improvement could I realistically expect? (See the command sketch after this list.)
- In a live environment with Proxmox QEMU virtualization, would low 4K random write/read values affect general VM performance (e.g., browsing) on Windows and Linux VMs?
- Proxmox documentation suggests that with RAIDZ2, I might only achieve the IOPS of a single disk. Given that ZFS on a single disk seems to perform 10x slower, would RAIDZ2 inherit this reduction?
- The specs of the P4510 U.2 NVMe list up to 637,000 IOPS for reads and 139,000 IOPS for writes. The source I linked shows 190,000 IOPS on XFS. With an 8-disk ZFS RAID10 setup, is it technically feasible to achieve 400K 4K random write IOPS?
- On the server I'm preparing, no VMs will run locally; it will only share storage via TrueNAS. Should I go with 512GB or 1024GB of RAM?
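For the SLOG question, the change I have in mind is roughly this (a sketch; tank and the device path are placeholders, and I know sync=disabled is unsafe for production and only useful to check whether sync writes are the bottleneck):
Code:
# Add a dedicated log (SLOG) vdev
zpool add tank log /dev/nvme8n1
# Diagnostic only: test whether sync writes are the limiting factor
zfs set sync=disabled tank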
Link to similar issue: extremely poor performance for ZFS 4k randwrite on NVMe compared to XFS
Thanks in advance for any guidance or experience you can share!