PBS with a volume >100 TB — how do you handle verify?

biowan1

New Member
Sep 11, 2025
Hello everyone,


I’m a new PBS user and would like some feedback based on your experience. I have a large backup storage system (RAID6) using a Dell PERC H965i RAID controller. One of my largest VMs is over 10 TB, and the verify duration has already exceeded 24 hours for this VM alone.

Is verify absolutely necessary on RAID6 storage, especially considering that the retention period is less than 6 months? How do you handle this on your side?

Thanks in advance for your feedback.
Best regards
b
 
Is verify absolutely necessary on RAID6 storage, especially considering that the retention period is less than 6 months? How do you handle this on your side?

HW RAID or ZFS RAIDZ won't tell you whether the backup data is still consistent. So yes: IMHO it's absolutely necessary.
What kind of storage are you using? HDDs, SSDs, or a fusion pool (HDDs with SSDs as a special device in ZFS)?

If you can't afford faster storage, I would put the large VMs in their own namespace and change the verify schedule. For example, normal VMs might get verified every day and the large VMs only every week or two.
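Roughly like this, assuming a datastore named "store1" and a namespace named "big-vms" (both names are just examples, and the exact CLI syntax may differ on your version - the same can be done in the GUI):

# create a separate namespace for the huge VMs
export PBS_REPOSITORY='root@pam@localhost:store1'
proxmox-backup-client namespace create big-vms

# then point the backup jobs of the large VMs at that namespace and give it
# its own verify job with a weekly or bi-weekly schedule
# (Datastore -> Verify Jobs in the GUI)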
Edit: As UdoB said, a mirror or striped mirror (aka RAID1/RAID10) of HDDs together with an SSD-based special device mirror should speed up the process at least a little bit.
 
Assuming HDDs:
You mentioned RAID6. This gives you the IOPS of a single disk. A verify needs to read the actual data, calculate the checksum and compare it with the stored original checksum. There is no single, large file to be read. Instead it needs to read tens of thousands of chunks, possibly distributed across millions of separately stored sectors (because of fragmentation), and that takes... a long time. Remember: there is physical head movement involved.
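If you want to put a number on that, you can count the chunk files a verify has to open and hash (the datastore path below is just an example, adjust it to your setup):

# every one of these small files must be read and checksummed during a verify
find /mnt/datastore/store1/.chunks -type f | wc -l
du -sh /mnt/datastore/store1/.chunks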

Technically a verify is not necessary. But you can only be sure that you can still read the data if you actually do this once in a while. My personal choice is to re-verify every few months.

From my own (definitely limited) perspective the only acceptable construct uses pairs of mirrors and a fast + reliable(!) special device for the metadata - as I use ZFS everywhere ;-)

Of course your mileage may vary. But the massive duration is... expected, if my assumptions are right.
 
Let me share my story with quite large PBS datastores. We have two PBS servers, one about 120 TiB and one about 150 TiB. They are physical servers with 20 NVMe disks (15 TB each) and no hardware RAID controller; they run ZFS RAIDZ2.

In the past few months Proxmox has improved the verify process. Now, if you click Advanced, you can tune how many threads will be used for reading and verifying. I have pushed these numbers to 16/32.

This has improved speed significantly (from 36 hours to about 12 hours). Nevertheless, I noticed that speed started decreasing a bit in the past few weeks, disproportionately to the amount of added backups. With the help of Google/Gemini I came up with the idea of adding a dedicated disk used just for an L2ARC cache, so I could offload a lot of metadata pressure from RAM. Oh yes, one appliance has 1 TB of RAM and the other 768 GB.

The caching disk is a super fast Optane disk (384 GB) that had just been sitting on a shelf doing nothing for a while. I used one Optane per PBS, and use it only for metadata.
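For reference, this is roughly what it looks like on the ZFS side (the pool name "backup" and the device path are just examples; check with arc_summary first whether metadata is actually being evicted from the ARC):

# check current ARC / metadata usage
arc_summary | less

# add the Optane as an L2ARC cache device
zpool add backup cache /dev/disk/by-id/nvme-optane-example

# only cache metadata in the L2ARC, not data blocks
zfs set secondarycache=metadata backup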

The GC and Verification times went down.
 
The caching disk is a super fast Optane disk (384 GB) that had just been sitting on a shelf doing nothing for a while. I used one Optane per PBS, and use it only for metadata.
You are aware that if the metadata-containing special device in a ZFS pool gets lost, the whole pool is gone? For that reason the special device (aka metadata store) should be set up as a mirror.
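In case someone wants to go the special device route instead of L2ARC: it should be added as a mirror of at least two devices, and it only holds metadata written after it was added (pool name and device paths below are just examples):

# add a mirrored special device for metadata
zpool add backup special mirror /dev/disk/by-id/nvme-ssd1 /dev/disk/by-id/nvme-ssd2

# optionally also store small blocks on the special device
zfs set special_small_blocks=4K backup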
 
Ah sorry, I didn't get the part about the caching. If this works for you, that's great. I would still expect that a special device would give a further speedup.
 
Ah sorry, I didn't get the part about the caching. If this works for you, that's great. I would still expect that a special device would give a further speedup.
Yes, it would give even better performance. In that case I would need to use two disks, and I am running low on empty disk slots. So for now I will use a single disk as a caching disk.
 