ZFS Config Help for Proxmox Backup Server (PBS) - 22x 16TB HDDs (RAIDZ2 vs. dRAID2)

Nov 8, 2024
Hi everyone.

I’m building a dedicated Proxmox Backup Server and would appreciate your feedback on the best ZFS layout for my hardware. My primary goals are high random-I/O performance (for garbage collection and small writes), robust data integrity, and reasonable capacity.

Hardware

  • 22 × 16 TB HDDs (2 reserved as hot spares)
  • 2 × 3.84 TB MU NVMe (to be used as a mirrored special-metadata vdev)
  • 2 × 480 GB RI NVMe (mirrored OS boot; remaining partitions for SLOG)

Option A: Traditional RAIDZ2

  • Data pool: 2 × 10-drive RAIDZ2 vdevs
  • Hot spares: 2 HDDs as global spares
  • Performance vdevs:
    – Special metadata: mirror of 3.84 TB NVMes
    – SLOG: mirror of leftover 480 GB NVMe partitions
Pros: Well-understood, excellent parallel I/O from two vdevs, proven reliability
Cons: Slow resilver on 16 TB drives
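For reference, the create command I have in mind for Option A would look roughly like this (pool name, drive letters and partition numbers are placeholders; I'd use /dev/disk/by-id paths in practice):

Code:
    zpool create backup \
        raidz2 sda sdb sdc sdd sde sdf sdg sdh sdi sdj \
        raidz2 sdk sdl sdm sdn sdo sdp sdq sdr sds sdt \
        spare sdu sdv \
        special mirror nvme0n1 nvme1n1 \
        log mirror nvme2n1p4 nvme3n1p4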


Option B: dRAID2

  • Data pool: single draid2:10d:22c:2s vdev (10 data + 2 parity per stripe, 2 distributed spares)
  • Performance vdevs: same NVMe mirrors as in Option A
Pros: 5–10× faster resilver, instant use of distributed spares
Cons: Fixed stripe width reduces space efficiency for small blocks (they are padded to a full stripe), still only 2 drive failures tolerated
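And Option B as a single dRAID vdev (same placeholder names; the two spares are part of the draid2:10d:22c:2s layout, so there are no separate spare devices):

Code:
    zpool create backup \
        draid2:10d:22c:2s \
        sda sdb sdc sdd sde sdf sdg sdh sdi sdj sdk \
        sdl sdm sdn sdo sdp sdq sdr sds sdt sdu sdv \
        special mirror nvme0n1 nvme1n1 \
        log mirror nvme2n1p4 nvme3n1p4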


My Questions:

  1. Random I/O & Throughput: Which option gives better IOPS/throughput for PBS workloads?
  2. Resilver Time & Risk: Real-world resilver times on 16 TB drives—does dRAID’s speed justify its lower failure tolerance?
  3. Capacity Efficiency: Post-parity usable space difference between configurations?
  4. NVMe Metadata/SLOG: Does using a special vdev and SLOG make the HDD layout choice less critical?
  5. Complexity vs. Expansion: For a fixed, large pool, is dRAID worth the added complexity over RAIDZ2?
I’m currently leaning toward 2 × 10-drive RAIDZ2 for its maturity, but dRAID’s faster rebuilds are tempting. Any real-world experience, benchmarks, or tuning tips would be hugely helpful!

Thanks in advance.
 
2 × 480 GB RI NVMe (mirrored OS boot; remaining partitions for SLOG)
A SLOG only helps with sync writes. As far as I know, PBS writes its .chunks "normally", i.e. async. (Can someone prove me wrong?)

And that SLOG only has to absorb the incoming data of a single 5-second interval while a second TXG (transaction group) is being flushed out to the HDDs, so it holds at most about 10 seconds' worth of data. With 1 GBit/s you can get 125 MB/s × 5 s × 2 TXGs = 1.25 GB. With 10 GBit/s that's 12.5 GB, and with 100 GBit/s it's 125 GB. If the SLOG is larger than that, the additional space will never be used.
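If you want to verify that on your own box, watch the log vdev during a backup run and see whether it gets any writes at all. A quick sketch, assuming a pool named backup with a datastore dataset backup/pbs (both placeholders):

Code:
    # Per-vdev I/O stats once per second; watch the "logs" section during a backup.
    zpool iostat -v backup 1

    # The dataset's sync setting (default "standard" = only honor explicit sync requests).
    zfs get sync backup/pbs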

The Special Device is much more important - and worth it.

If you go for RaidZ2, the Special Device should be a three-way mirror. RaidZ2 tolerates losing two HDDs --> the special vdev should also tolerate two device failures, because losing the special vdev means losing the pool.
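A minimal sketch of what that looks like (device and dataset names are placeholders, and the special_small_blocks value is just an example, not a PBS recommendation):

Code:
    # Three-way special mirror at creation time:
    #   special mirror nvme0n1 nvme1n1 nvme2n1

    # Or, if the pool already exists with a two-way special mirror,
    # attach a third NVMe to that mirror:
    zpool attach backup nvme0n1 nvme2n1

    # Optionally let small blocks land on the special vdev too (per dataset):
    zfs set special_small_blocks=64K backup/pbs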


Which option gives better IOPS/throughput for PBS workloads?
Mirrored vdevs! Plus that Special Device.

Restoring large VMs will always be slow if your main storage is a) HDD and b) spread over only a couple of vdevs. Remember that a RaidZx vdev delivers roughly the IOPS of a single drive!

And for reading data the physical heads still have to seek a zillion times - with or w/o Special Device! There is no optimization like on the write path, where ZFS can aggregate a whole 5-second TXG into mostly sequential I/O.
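If you want numbers rather than theory, a small random-read test directly on the datastore path makes that per-vdev IOPS limit visible. A rough sketch, assuming the datastore lives at /mnt/datastore/pbs (path, sizes and job counts are placeholders, not a tuned PBS benchmark):

Code:
    # Small random reads -- the pattern that hurts most on wide RAIDZ vdevs.
    fio --name=pbs-randread --directory=/mnt/datastore/pbs \
        --rw=randread --bs=4k --size=8G --numjobs=4 --iodepth=16 \
        --ioengine=libaio --runtime=60 --time_based --group_reporting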


Disclaimer: your setup is way larger than my small clusters and PBS instances - which is where my own experience comes from.