ZFS recommendation for 8x 22TB HDD + 2x 2TB NVMe SSD

erpel

New Member
Feb 8, 2025
At the moment, I have a server at Hetzner with 4x 16 TB HDD (no SSDs) that I use as a central offsite backup target for multiple local PBS instances.
I use ZFS with two mirror vdevs of 2 HDDs each, i.e. like a RAID10, giving 2x 16 TB of effective storage space.

That worked quite well, but now the HDDs are rather full and I need to expand.
Also, on the old server, some tasks like verify took a pretty long time (~20 h for one of the datastores).

I want to switch to an SX135 with 128GB RAM, 8x 22TB HDD and 2x 1.92TB NVMe SSD.
https://www.hetzner.com/de/dedicated-rootserver/matrix-sx/

The question is now how to layout the ZFS pool.

My idea is this:
Create a zfs pool "rpool" on a small partition (e.g. 256GB) on the SSDs (as mirrored vdev).
This would leave > 1.5TB free on the SSDs.

Create a zfs pool "hddpool" as RAIDZ-2 (this would give me 6x 22TB effective storage space) or RAID10/mirror vdevs like before (this would give me 4x 22TB).

What layout would you recommend?
Is it beneficial to use the remaining space on the SSDs as a ZFS log, cache, or special device?

Does a log and/or cache device help a PBS?
I think a special device will only speed up access to the metadata, but that might still help a bit?

Thanks a lot for your help!

/Christian
 
Run the installation to create a single "rpool" utilizing all HDDs. Side effect: each and every disk will be bootable. For PBS I would highly recommend creating mirrors, for the usual and often discussed reasons.

My point is: after installation (with everything basically running), and before putting much data onto it, add those two NVMe drives as a mirrored "special device".
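
With hypothetical device names, that step is a one-liner (as far as I know, a special vdev cannot be removed again once the pool contains a RAIDZ vdev, so get the redundancy right up front):

# add both NVMe drives (or partitions of them) as a mirrored special vdev
zpool add rpool special mirror /dev/nvme0n1 /dev/nvme1n1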

Maybe 1.92 TB is a little too large to be used "just" for metadata. But I would not partition the drives to use some of that space for cache/SLOG. Instead, I would read some articles about placing "small blocks" onto this fast vdev. Setting the parameters can be tricky; be sure to understand the mechanism before you go into production.
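
The relevant knob is the special_small_blocks dataset property; a minimal sketch, with 16K purely as an example threshold rather than a recommendation:

# store all blocks <= 16K on the special vdev, in addition to metadata
zfs set special_small_blocks=16K rpool

# verify; the threshold must stay below the recordsize (default 128K),
# otherwise ALL newly written data lands on the special vdev
zfs get recordsize,special_small_blocks rpool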

(( An additional cache is useless, as your data volume is by far too large and most data is not read multiple times in sequence; a SLOG would help for sync writes, but that aspect is not really critical from my point of view. Neither will help with reading the actual data - the physical heads have to move in any case. Metadata on a separate device will raise IOPS, as no head movement is needed for reading AND writing metadata anymore. ))

Without a special device, rotating rust is not usable for a large PBS - at least not in my universe. PBS needs IOPS!

Good luck :-)

PS: Redundancy levels should be consistent. If you go for (not recommended) RAIDZ2, you should use a triple mirror for the special device --> two devices are allowed to fail in either vdev.
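
In zpool terms that is simply a three-way mirror for the special vdev (hypothetical device names again):

# special vdev matching RAIDZ2 redundancy: survives two device failures
zpool add rpool special mirror /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1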
 
Ok, thank you. I'll try a RAID10-like setup with a mirrored special device on the SSDs and play a bit with the small-blocks size.
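
If I understand the docs correctly, I can look at the pool's actual block-size histogram before picking a threshold, e.g.:

# print detailed block statistics, including a block-size histogram
# (reads a lot of metadata, so this can take a while)
zdb -Lbbbs hddpool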
 
The small-blocks support of the special device won't speed up the consistency check (verify) for you, because PBS stores all its data as many small chunk files and you only have 2 TB of fast storage. That is why Proxmox advises using SSDs. A user with the same problem saw good performance using btrfs rather than ZFS, because the filesystem is much faster. Therefore I would recommend a btrfs RAID10 if you can only use HDDs.
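
If someone goes that route, creating the filesystem would look roughly like this (placeholder device names; raid10 needs at least four devices):

# btrfs RAID10 across all eight HDDs, for both data and metadata
mkfs.btrfs -d raid10 -m raid10 /dev/sd[a-h]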