PBS with Expander Backplanes and JBOD

OsvaldoP

Hi everyone,

We need a slightly larger PBS and are considering using 1 storage SSD and 10x20TB HDDs for long-term archiving.

Since this will involve quite a few disks, I wanted to ask for your recommendations on how to connect them.

We previously had a system with an expander backplane, where GC (Garbage Collection) and Verify weren't particularly good compared to directly connecting the HDDs to an HBA. We were using RaidZ1. As I can't rule out that it was due to my configuration, I'd like to ask for your experiences.

Alternatively, another HBA with a JBOD as an extension would be an option, although we'd probably end up with expander backplanes again soon.

What experience do you have with expander backplanes and JBOD?

Many thanks for your feedback.
Roland
 
With those backplanes it sometimes works and sometimes doesn't. You really need to test it with ZFS. I've had some 40-disk backplanes work and some not. But with ZFS RAID10, plus maybe 2 NVMe/SSDs for caching, you're good to go.
 
It appears we had a "bad" expander in a Supermicro server (sorry, model unknown).

Regarding the cache, you're referring to an L2ARC consisting of two 4/8 TB SSDs?
Given this, is it necessary to configure the pool as RAID 10? Is RAID-Z1 a bad idea?

kr
 
No, I'm talking about a special device in ZFS. It can speed up garbage collection. It is always good to have one, whether your ZFS pool is 10 TB or 200 TB.
This. Their redundancy should match the redundancy of the HDD RAID though; a broken special device will make the whole pool unusable.
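
A minimal sketch of adding such a mirrored special vdev to an existing pool (pool and device names are placeholders, use your /dev/disk/by-id/ paths):

Code:
# add a mirrored special vdev to an existing pool named "backup"
zpool add backup special mirror /dev/disk/by-id/nvme-SSD1 /dev/disk/by-id/nvme-SSD2
# note: only newly written metadata lands on the special vdev;
# existing data keeps its metadata on the HDDs until it is rewritten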
 
What experience do you have with expander backplanes and JBOD?
Generally, pretty good. Some notes:

1. Stay away from ancient SAS2 Expanders. They are quite weak and can severely limit IOPS.

2. I have personally experienced some weird issues with newer Broadcom HBAs combined with expander backplanes (but this only happened in one machine type with ASRock Rack X570 and Ryzen 5000).
If you achieve 500 MB/s for a single-stream read (plain dd) from an SSD that is plugged into the expander backplane, all is good (a sample dd invocation follows after this list).

3. Expanders eat IOPS. Not an issue with HDDs, but don't expect great performance from SSDs behind an expander.

4. Stay away from ancient SAS2 controllers. The 9300 series is pretty good, or the 9500. The 9400 and 9600 have some weird bugs.
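
For the test mentioned in point 2, something like this works (the device name is a placeholder for the SSD behind the expander):

Code:
# raw sequential single-stream read, bypassing the page cache
dd if=/dev/sdX of=/dev/null bs=1M count=4096 iflag=direct status=progress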
 
Did I understand you correctly that you don't want to mirror your data or put it on a RAIDZ? Shouldn't backups have some data protection?

In general, PBS performance isn't great with HDDs, since PBS's deduplication of backup data splits the data into a lot of small files called chunks, which all need to be read for most operations (see https://pbs.proxmox.com/docs/introduction.html#architecture and https://pbs.proxmox.com/docs/technical-overview.html#chunks ).
RAIDZ itself is already problematic in terms of performance. This is especially true for running VMs from it, but (to a lesser extent) also for bulk data like backups (see https://forum.proxmox.com/threads/fabu-can-i-use-zfs-raidz-for-my-vms.159923/ and https://forum.proxmox.com/threads/zfs-vs-single-disk-configuration-recomendation.138161/#post-616199 ).
There is a reason why the PBS manual recommends using datacenter-grade SSDs for storing the backups (see https://pbs.proxmox.com/docs/installation.html#recommended-server-system-requirements ).

For example, on my homelab PBS the latest GC job showed the following stats:
2025-04-23T02:00:20+02:00: Original data usage: 70.316 TiB
2025-04-23T02:00:20+02:00: On-Disk usage: 272.909 GiB (0.38%)
2025-04-23T02:00:20+02:00: On-Disk chunks: 244296
2025-04-23T02:00:20+02:00: Deduplication factor: 263.84
2025-04-23T02:00:20+02:00: Average chunk size: 1.144 MiB

So my datastore has 244296 small files whose metadata needs to be read for every GC job. A verify job will also need to read all of them every time I want to verify everything. On a real server (not a homelab anymore) the file count will obviously be a lot higher, and all of them will need to be read by backup/restore/verify and garbage collection jobs. So high IOPS on the datastore is really needed for sufficient PBS performance.
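
If you want a feel for how many chunk files your own datastore has, you can count them directly (the datastore path is a placeholder; PBS keeps the chunks under the datastore's .chunks directory):

Code:
find /path/to/datastore/.chunks -type f | wc -l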
For that reason, neither RAIDZ nor HDDs are a good idea for a PBS datastore. Of course, SSDs can be quite expensive depending on the amount of data you need to store.

So my recommendation would be to set the HDDs up as striped mirrors (ZFS speak for RAID10) and add some small SSDs as a special device (a capacity of around 2% of the HDD capacity should be enough), mirrored to match the redundancy of your HDD pool. You will still have at least the redundancy of a mirror and much improved performance: the striped mirror setup of the HDDs improves read performance even for non-metadata, and the special device takes all new metadata (existing data would need to be rewritten, e.g. with zfs send/receive), so any operation that works with metadata (e.g. garbage collection) profits from it. A special device can also be used with RAIDZ if a striped mirror wouldn't give you the needed capacity. A minimal sketch of such a pool layout follows below.
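A minimal sketch with 10 HDDs (pool and device names are placeholders, adjust to your disk count):

Code:
# 10 HDDs as striped mirrors (RAID10) plus a mirrored SSD special vdev
zpool create backup \
  mirror /dev/disk/by-id/hdd01 /dev/disk/by-id/hdd02 \
  mirror /dev/disk/by-id/hdd03 /dev/disk/by-id/hdd04 \
  mirror /dev/disk/by-id/hdd05 /dev/disk/by-id/hdd06 \
  mirror /dev/disk/by-id/hdd07 /dev/disk/by-id/hdd08 \
  mirror /dev/disk/by-id/hdd09 /dev/disk/by-id/hdd10 \
  special mirror /dev/disk/by-id/ssd01 /dev/disk/by-id/ssd02

By default the special vdev holds metadata only; small data blocks can additionally be redirected to it via the special_small_blocks dataset property if desired.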
Further reading:
https://openzfs.org/wiki/Features#Improve_N-way_mirror_read_performance
https://pve.proxmox.com/wiki/ZFS_on_Linux#sysadmin_zfs_raid_considerations
https://pbs.proxmox.com/docs/sysadmin.html#local-zfs-special-device
https://www.truenas.com/community/threads/zfs-raidz2-or-striped-mirror.95027/
 
Hey Robin,

First time thinking about connecting a JBOD to PBS. Since I don't have much experience, I was looking at this Supermicro 826BE2C-R802JBOD [0].

[0] https://www.supermicro.com/en/products/chassis/2u/826/sc826be2c-r802jbod




Johannes, thanks for your detailed answer!

Backups should be safe, but I thought RAID-Z also gives some protection.

I know PBS prefers SSDs; I've read the manual multiple times. But it always comes down to cost. That's why I keep thinking about RAID-Z1 or RAID-Z2.
I know RAID-Z has space overhead, but it should be less than with RAID 10?

We use a RAID-Z1 pool with 6x 20TB HDDs, currently using about 55TB.
PVE Backup goes to SSD storage first and then gets synced to the HDD storage.

A GC takes about 3.5 hours. GC data is in [1].
Interesting, our deduplication factor is 11, yours is 263.
We have a mix of about 100 Windows and Linux VMs.
The customer has about 40 VMs with normal storage change rates.

RaidCalc:
RaidZ1 10x 20TB HDD => Usable Storage ~124TB
Raid10 10x 20TB HDD => Usable Storage ~70TB
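
For reference, the raw parity arithmetic behind that comparison (before ZFS allocation overhead, TB/TiB conversion and whatever free-space reserve the calculator assumes):

RaidZ1: (10 - 1) x 20TB = 180TB raw usable
Raid10: (10 / 2) x 20TB = 100TB raw usable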

[1]

Code:
2025-04-22T21:31:36+02:00: Original data usage: 482.56 TiB
2025-04-22T21:31:36+02:00: On-Disk usage: 43.708 TiB (9.06%)
2025-04-22T21:31:36+02:00: On-Disk chunks: 22683969
2025-04-22T21:31:36+02:00: Deduplication factor: 11.04
2025-04-22T21:31:36+02:00: Average chunk size: 2.02 MiB
 
Hey Robin,

First time thinking about connecting a JBOD to PBS. Since I don't have much experience, I was looking at this Supermicro 826BE2C-R802JBOD [0].

[0] https://www.supermicro.com/en/products/chassis/2u/826/sc826be2c-r802jbod
Generally, good choice (SAS3 Expander Backplane).

But you have chosen the BE2C model, which has a dual expander and is usually much more expensive.
I suppose you're not planning a dual storage-head setup (see https://github.com/ewwhite/zfs-ha/wiki for what I'm talking about)?

You probably want the BE1C model then, which is a standard single expander and cheaper.
 
@Robin C.

You are right, a single expander is ok. I picked the wrong URL, see [0].
There is also interesting info about expanders in podcast [1] (in German).

[0] https://www.supermicro.com/en/products/chassis/2u/826/sc826be1c-r609jbod
[1] https://www.thomas-krenn.com/de/tkm...ckplanes-und-raid-controller-thomas-krenn-ag/


@ness1602
Thanks, I will read more about RAID10. If everyone here prefers it, it must be true :)

@Gabriel
We have already updated some systems; however, we encountered the error described in [3]. An update is pending though.

[3] https://forum.proxmox.com/threads/3-4-0-upgrade-–-detected-x-index-files-outside-expected-directory-structure.164968/#post-765418

kr
Roland