What is the best solution for using an HDD as a PBS backup server?

melon

Currently, I am using a RAID10 ZFS array composed of 12 4TB enterprise-grade hard drives.

The backup speed is satisfactory and meets my requirements.

I have found that processing and verifying backups takes a considerable amount of time, nearly 20 hours each time.

I want to know what determines the performance of backup verification. Is it sequential read/write or random read/write?

If ZFS RAID10 does not improve the speed of backup verification, can I try setting up multiple RAID1 arrays?

For example, I could change my current 12 4TB hard drives from RAID10 to 6 RAID1 arrays.
 
Currently, I am using a RAID10 ZFS array composed of 12 4TB enterprise-grade hard drives.
PBS splits the data into small files of around two to four MB (chunks). Verify and garbage collection need to read every file, which, together with the high number of files, hurts performance. That's the reason the manual recommends using enterprise SSDs as datastore media.
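For a sense of scale, you can count the chunk files on an existing datastore yourself; PBS keeps them under the datastore's .chunks directory (the mount point /mnt/datastore below is just a placeholder for your own path):

    find /mnt/datastore/.chunks -type f | wc -l

Every one of those files has to be touched during GC, and fully read during verify.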

However: you could add an SSD mirror as a special device to your HDD pool and rewrite the saved data with zfs send/receive. Then the metadata will be stored on the SSDs, resulting in a speedup. Please note that reading the actual data will still be slow, since it is still on the HDDs.
Another caveat: the SSD mirror special device should have at least the same redundancy as the HDD datastore, since a lost special device will break the whole pool:
https://pbs.proxmox.com/docs/sysadmin.html#zfs-administration
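Roughly, it could look like this (the pool name tank, the dataset names, and the device paths are placeholders, not your actual setup):

    # add a mirrored special vdev for metadata (use your real by-id paths)
    zpool add tank special mirror /dev/disk/by-id/ssd-A /dev/disk/by-id/ssd-B
    # existing data keeps its old metadata location; rewriting it via
    # snapshot + send/receive into a new dataset moves the metadata to the SSDs
    zfs snapshot -r tank/pbs@migrate
    zfs send -R tank/pbs@migrate | zfs receive tank/pbs-new

Afterwards you would point the datastore at the new dataset and destroy the old one.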
 
You could add an SSD mirror as a special device to your HDD pool and rewrite the saved data with zfs send/receive.

Actually I am doing this right now: a RAIDZ2(!) pool got a triple-mirror special device via USB a week ago.

Note: RAIDZ/Z2 on rust is bad, and USB is... not recommended. Rotating rust is not recommended for PBS at all. I would call it a worst-case scenario.

Reading and writing some million files takes some time: there is "only" 6 TB in this PBS datastore. The conversion by "read, write-copy, delete original" has been running for seven days now and is 28 percent done. So another month or so of continuously reading and writing data is required... :)

Maybe send/receive would have done a better job. I did not expect such a long duration, and it was not clear to me whether send/receive would do what I wanted: migrate all metadata (and some "small files") onto the special device. Now I will just let it run...


Again: this is "experimental" and this setup is definitely not recommended - it is just one of several PBS instances in my homelab.
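For reference, the "read, write-copy, delete original" conversion amounts to little more than a loop like this (the /tank/pbs path is a placeholder; stop all PBS jobs first, and do not try this on a datastore you care about):

    # rewrite every chunk file so the fresh copy (and its metadata)
    # is allocated with the special device present
    find /tank/pbs/.chunks -type f ! -name '*.tmp' | while IFS= read -r f; do
        cp -p -- "$f" "$f.tmp" && mv -- "$f.tmp" "$f"
    done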
 
PBS splits the data into small files of around two to four MB (chunks). Verify and garbage collection need to read every file, which, together with the high number of files, hurts performance. That's the reason the manual recommends using enterprise SSDs as datastore media.

However: you could add an SSD mirror as a special device to your HDD pool and rewrite the saved data with zfs send/receive. Then the metadata will be stored on the SSDs, resulting in a speedup. Please note that reading the actual data will still be slow, since it is still on the HDDs.
Another caveat: the SSD mirror special device should have at least the same redundancy as the HDD datastore, since a lost special device will break the whole pool:
https://pbs.proxmox.com/docs/sysadmin.html#zfs-administration

Thank you for your response.

I would like to know whether adding a mirrored special device to the storage pool will actually increase the speed of verifying backups, or whether verification will still be slow.
 
A special device only makes one of the two actions, verify or GC, faster.
There is a post somewhere (I'm too lazy to look for it) where someone switched from ZFS to Btrfs and found verify and GC to be much faster.
 
A special device only makes one of the two actions, verify or GC, faster.

According to the threads I have read here, it speeds up both, but GC more than verify (since verify also needs to read the actual data, while GC mostly deals with metadata).
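If you want to see how much actually lands on the special vdev after such a rewrite, the per-vdev allocation is visible with (pool name tank is a placeholder):

    zpool list -v tank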
There is a post somewhere (I'm too lazy to look for it) where someone switched from ZFS to Btrfs and found verify and GC to be much faster.
And its support is considered experimental, and it has fewer features than ZFS, so I would take this with a grain of salt.
 