Building a new PBS server, have a few design questions

danradom

I'm replacing my existing PBS server. The current server's datastores are attached via NFS over a 10G network. Metadata lives on NFS as well. The underlying storage is a 14 TB mdadm RAID1. The two datastores currently require 600 GB of storage.

For the replacement server I have two Samsung 970 EVO Plus 500 GB NVMe drives. The datastore storage will be four 1 TB SSDs in an mdadm RAID6 array. I also have a 240 GB SSD for the OS if that is desirable.

My questions: is it better to install the OS and PBS on the 240 GB SSD and use the 500 GB NVMe disks as a mirrored ZFS special device, or should I install the OS and PBS on the NVMe drives and keep the metadata on the RAID6 disks with the datastores? Also, how much of an advantage would I get from a ZFS double-parity array without ECC RAM? This is a low-budget build and I have zero experience with ZFS.

Thanks in advance.
 
I think using those 970 EVOs is wasted money. I would just install PBS to those 4x 1TB SSDs in a striped mirror (raid10) for performance or a raidz1 (raid5) for capacity, and use the same ZFS pool for your PBS datastore too. PBS itself only needs a few GBs and doesn't need fast storage, and 500GB is way too big for a special device. So even if you used those 970 EVOs for PBS + special device, you would only use about 32GB of that capacity.
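To put rough numbers on the layout choice, here is a minimal sketch that compares raw usable capacity for four 1 TB disks against the 600 GB the datastores need today. The function and layout names are made up for illustration, and the figures ignore ZFS metadata overhead, padding, and TB vs. TiB, so real usable space will be somewhat lower.

```python
# Rough capacity comparison for the layouts discussed above.
# Assumption: raw numbers only, no filesystem overhead or reserved space.

def usable_tb(layout: str, disks: int, disk_tb: float) -> float:
    """Return raw usable capacity in TB for a simple single-vdev-type layout."""
    if layout == "striped_mirror":      # RAID10: every disk has one mirror partner
        if disks % 2 != 0:
            raise ValueError("striped mirror needs an even number of disks")
        return disks / 2 * disk_tb
    if layout == "raidz1":              # RAID5-like: one disk's worth of parity
        return (disks - 1) * disk_tb
    if layout == "raidz2":              # RAID6-like: two disks' worth of parity
        return (disks - 2) * disk_tb
    raise ValueError(f"unknown layout: {layout}")


REQUIRED_TB = 0.6  # the two datastores currently need about 600 GB

for layout in ("striped_mirror", "raidz1", "raidz2"):
    capacity = usable_tb(layout, disks=4, disk_tb=1.0)
    headroom = capacity - REQUIRED_TB
    print(f"{layout:15s} usable={capacity:.1f} TB  headroom={headroom:.1f} TB")
```

With four disks, the striped mirror and the double-parity layout both end up at about 2 TB raw, while raidz1 gives about 3 TB, so all three cover the current 600 GB with room to grow.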
If you think you need more reliability than a striped mirror/raidz1, then I would rather set up a second PBS and create a sync job between them. For the second PBS you could use an offsite server or your old server (or even just a VM/LXC) with its NFS datastore. Then you would get a fast primary PBS based on SSDs and a slow secondary PBS in case your primary PBS fails.
Not using ECC will work, but you can never fully trust your data, because a ZFS scrub can't detect corruption that happened in RAM before the data was written to the ZFS pool. So running without ECC isn't ideal, no matter what kind of raid level you use. PBS also has checksumming through its verify jobs, but I'm not sure that can fully replace the ZFS checksumming. As far as I understand, PVE checksums the chunks (so it can check whether a chunk already exists on the datastore), and I guess a PBS verify job just checks whether each chunk still matches the checksum PVE created. In that case it wouldn't be that bad without ECC.
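To illustrate the idea of content-addressed chunks and a verify pass, here is a simplified sketch. It is not PBS's actual on-disk format or code; `ChunkStore`, `chunk_digest`, and the in-memory dict are hypothetical stand-ins, and SHA-256 over the raw bytes is assumed as the chunk identifier.

```python
import hashlib


def chunk_digest(data: bytes) -> str:
    """Hex SHA-256 digest used here as the chunk's identifier."""
    return hashlib.sha256(data).hexdigest()


class ChunkStore:
    """Toy in-memory chunk store keyed by content digest."""

    def __init__(self) -> None:
        self.chunks: dict[str, bytes] = {}

    def insert(self, data: bytes) -> str:
        digest = chunk_digest(data)
        # A chunk whose digest is already known is not stored again (deduplication).
        self.chunks.setdefault(digest, data)
        return digest

    def verify(self) -> list[str]:
        """Re-hash every stored chunk and return digests that no longer match."""
        return [d for d, data in self.chunks.items() if chunk_digest(data) != d]


store = ChunkStore()
good = store.insert(b"backup data block")
assert store.verify() == []            # freshly written chunks verify cleanly

# Corruption after the checksum was computed (e.g. on disk) is detected.
store.chunks[good] = b"backup data blocK"
assert store.verify() == [good]

# Corruption before the checksum was computed (e.g. a bit flip in RAM on the
# machine that hashes the data) is checksummed as if it were correct.
flipped = store.insert(b"backup dat\x00 block")
assert flipped not in store.verify()
```

The last case is the one ECC protects against: neither a scrub nor a verify pass can flag data that was already wrong when its checksum was created.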
 
Okay, that makes sense about the NVMe drives being wasted. I do intend to run a second PBS in a KVM guest on the PVE cluster, so I will be syncing the datastores to my NAS, accessed via NFS from the PBS VM. I suppose RAID10 would be beneficial over RAID6, but I've never really done single parity for important stuff. I guess the second PBS would cover me.

Thanks for this info. It is helpful.