Validation & GC probably slow

AST

Hello

What is your experience with verification and GC/prune of backups?
I currently have ~5 TB of data on the PBS. Verification takes up to 10 hours there, and the last GC run took 12 hours.
Neither RAM, disk I/O, nor CPU seems to be busy.

Greetings, Patrick
 
I'm struggling with this too.

I currently verify every datastore daily. With about 14 TB of data and re-verification after 30 days, a verification run takes about 16 hours. During verification the disks are pretty busy, currently reading about 170 MB per second. That rate will probably go up again after I add a new raidz1 vdev to the pool, but I wonder whether this really scales that much.
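
As a rough sanity check on these numbers, the time for a full sequential read is simply size divided by throughput. A minimal sketch, assuming decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and a constant read rate; the function names are illustrative only:

```python
def full_read_hours(data_tb: float, rate_mb_s: float) -> float:
    """Hours needed to read data_tb terabytes at rate_mb_s megabytes per second."""
    seconds = (data_tb * 1e12) / (rate_mb_s * 1e6)
    return seconds / 3600


def tb_read(hours: float, rate_mb_s: float) -> float:
    """Terabytes read in the given number of hours at rate_mb_s megabytes per second."""
    return hours * 3600 * rate_mb_s * 1e6 / 1e12


if __name__ == "__main__":
    # Reading all 14 TB at 170 MB/s
    print(round(full_read_hours(14, 170), 1), "hours")
    # Data actually covered by a 16-hour run at 170 MB/s
    print(round(tb_read(16, 170), 1), "TB")
```

At 170 MB/s, a 16-hour run covers roughly 9.8 TB, and reading all 14 TB would take about 23 hours, so the read rate of the pool is the limiting factor here.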

Although I get that verification is useful, I wonder if there is a more efficient way to do this. (Running backups on SSD is not a viable option)
 
Interesting.

Two things: a silly one and a technical one.
First the silly one: I had backups (and thus immediate verification) and GC/prune all starting at 00:00. I guess the disks were simply overloaded by that. Now I have set backups for 00:00 and GC for 12:00.
At the same time I upgraded the RAM from 32 GB to 64 GB, and I think verification and GC perform better now.
 
@t.lamprecht In https://forum.proxmox.com/threads/is-verify-task-needed-with-zfs-backed-datastore.84081/ you say that verification with ZFS may be overkill. However, when a backup fails because a datastore is full, verification mentions that.

Is it possible to create two types of verification, just like Ceph has 'scrub' and 'deep-scrub'? Checksumming filesystems like ZFS don't really require a deep-scrub, but a scrub would still check whether all chunks needed for a backup are available (or something like that).

I'm still reading about 20 TB per day for verification, partly because the GUI clearly shows it's unhappy when stuff is not verified.
 
You can already configure verification to not re-verify everything every time, but to re-verify each snapshot at least once every X days. New snapshots get verified by the next verify job, while old ones are only re-checked every once in a while to make sure they didn't get corrupted.
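
The policy described here can be sketched as a simple selection rule. This only models the behaviour as described in this post and is not PBS's actual implementation; `needs_verify` and its parameters are hypothetical names:

```python
from datetime import datetime, timedelta
from typing import Optional


def needs_verify(last_verified: Optional[datetime],
                 now: datetime,
                 reverify_after_days: int) -> bool:
    """Return True if a snapshot should be picked up by the next verify job.

    Never-verified snapshots are always selected; verified ones are selected
    again only once their last verification is at least reverify_after_days old.
    """
    if last_verified is None:
        return True
    return now - last_verified >= timedelta(days=reverify_after_days)


if __name__ == "__main__":
    now = datetime(2021, 3, 1)
    print(needs_verify(None, now, 30))                   # new snapshot
    print(needs_verify(datetime(2021, 2, 20), now, 30))  # verified 9 days ago
    print(needs_verify(datetime(2021, 1, 15), now, 30))  # verified 45 days ago
```

With a rule like this, the daily read volume depends on how much new data arrives plus how many old snapshots cross the age threshold on a given day.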
 
I have that configured. Still verification takes half a day.

But the point is: is verification useful on filesystems like ZFS? @t.lamprecht seems to suggest it isn't (which makes sense). But the GUI gently forces you to do it anyway.
 
Note that nowhere do I state that it does not make sense or provides no gain on such storage.
Verification certainly has some use and protects against possible issues besides bitrot that ZFS cannot know of.

If you can trust the bitrot detection and repair capability of the underlying storage and just want a "do all referenced chunks exist" check, then you already get that by running GC. GC goes over all indexes to touch the chunks in use, and if one does not exist it will warn about that.
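
To illustrate what such an existence-only check amounts to, here is a toy model. It is not how PBS GC is implemented and it does not parse real index files: snapshots are given as plain lists of hex digests, and the on-disk layout (one file per chunk, in a subdirectory named after the first four hex characters of the digest) is an assumption made for this sketch. All names are hypothetical.

```python
from pathlib import Path
from typing import Dict, List


def chunk_path(store: Path, digest: str) -> Path:
    """Assumed layout for this sketch: <store>/<first 4 hex chars>/<digest>."""
    return store / digest[:4] / digest


def missing_chunks(store: Path,
                   snapshots: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each snapshot name to the referenced digests that have no file on disk.

    Only existence is checked; chunk contents are never read or hashed,
    so corruption inside an existing chunk is not detected.
    """
    missing: Dict[str, List[str]] = {}
    for name, digests in snapshots.items():
        absent = [d for d in digests if not chunk_path(store, d).is_file()]
        if absent:
            missing[name] = absent
    return missing
```

Because it only stats files instead of reading them, a check like this is cheap compared to a full verification, which is also why it cannot replace one.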

Verification should IMO always do the full thing. There is no use in checking only half and then reporting "fully verified" over the GUI/API; that just builds a false sense of trust/reliability.
 

I agree.

However, I have the feeling I'm not the only one struggling with what the sensible approach is in this situation. So, if I may try to conclude:

When using a checksumming filesystem (e.g. ZFS) as the storage layer for PBS, verifying weekly or monthly is sufficient. The GC run will find failed backups and missing chunks and notify the user about them.
The full verification is useful for explicitly checking the backup.
 
