verification questions

veehexx

Member
Jun 9, 2022
I run PVE & PBS for my home network.
Production PVE is fully SSD/NVMe, and also runs a 30-day short-term-retention PBS server as a VM. ZFS on the disk storage, using a .raw VM disk with ZFS inside.
My long-term PBS VM (on a second PVE host) runs on HDDs; it's fairly power-hungry and noisy, so it does not run 24/7. BTRFS on the physical disks, with a .raw VM disk with ZFS inside.

I'm aware of the scheduled (re)verification task I can configure myself; however, I have disabled these.


1)
I have a host backup (running via proxmox-backup-client with metadata detection mode) going to the long-term PBS (HDD), which runs against a Backup folder on a fileshare. Nothing too weird here - basic cfg, conf, tar.gz, xml etc. that gets written to manually. 2.5GB total, which takes 9 seconds to back up, with 0 bytes changed between two checkpoints.

Why does verification occur on these taking over 10 minutes? It seems there should be no processing required with no new files, and it should succeed within seconds.

I have another job at 35GB with 0 bytes changed. This takes 20+ minutes to verify, and the times seem to scale up to the multi-TB jobs, so it's not specific to one job.

2)
It appears that any backup job (whether VM, CT, or host type) always triggers a verify immediately after the backup has completed.
How do I disable that feature?

My dataset is split across multiple backups; my entire raw data size is around 11TB, with a churn of maybe 30GB in a heavier week.
Sync copies are naturally fast with that little change to be transferred (completed within about 5 minutes), but it seems PBS is doing verification on the full data size, so 11TB every verification.

Partly I want to handle this via a scheduled verify, or disable it entirely and rely on ZFS or BTRFS scrubs to catch bitrot on either of them.
 
Why does verification occur on these taking over 10 minutes?
Verifying means actually reading all relevant chunks, calculating their checksums, and comparing each with another copy of the checksum that was stored persistently when the backup was created.

It is often stated that PBS needs plenty of IOPS. My suspicion: you are seeing the effect of a slow storage device (IOPS-wise) here. (Plus a network round trip plus some protocol overhead, if I read "that runs against a Backup folder on a fileshare" correctly.) Compare: https://pbs.proxmox.com/docs/installation.html#recommended-server-system-requirements

(( Personally I do utilize rotating rust too, for the usual reasons. But I always help it by adding a fast Special Device to it, for both Meta-Data and "Small Blocks". This approach works well for me. ))
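The read-and-rehash nature of a verify can be sketched in a few lines of plain shell. This is a toy model, not PBS code: it only assumes a PBS-style content-addressed chunk store, where each chunk file is named after its SHA-256 digest (the real layout lives under the datastore's `.chunks/` directory; all paths here are illustrative):

```shell
# Stand-in for a datastore's chunk directory:
store=$(mktemp -d)/.chunks
mkdir -p "$store/0000"

# Create one demo "chunk" and store it under its own digest,
# the way a content-addressed store names chunk files:
printf 'some chunk payload' > "$store/0000/staging"
digest=$(sha256sum "$store/0000/staging" | awk '{print $1}')
mv "$store/0000/staging" "$store/0000/$digest"

# The verify itself: every referenced chunk is fully read back from
# disk and re-hashed, so runtime scales with total referenced data
# and storage IOPS, not with how little changed since the last backup.
for chunk in "$store"/*/*; do
    actual=$(sha256sum "$chunk" | awk '{print $1}')
    if [ "$actual" = "$(basename "$chunk")" ]; then
        echo "OK  $(basename "$chunk")"
    else
        echo "BAD $(basename "$chunk")"
    fi
done
```

The loop makes the cost model visible: a snapshot with 0 bytes changed still references the same chunks, and each one has to be read in full, which is why small-file random reads on HDDs dominate the runtime.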
 
It appears that any backup job (whether VM, CT, or host type) always triggers a verify immediately after the backup has completed.
How do I disable that feature?
The datastore configuration on the PBS has an "Options" tab. There you'll find "Verify New Snapshots".

Probably that's the relevant toggle.
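If you prefer the CLI, the checkbox maps to a datastore property. A sketch, assuming a datastore named `longterm` (a placeholder) and that the property is called `verify-new` as in the datastore configuration; check `proxmox-backup-manager datastore update --help` on your host to confirm:

```shell
# Disable the automatic verify that runs right after each backup
# ("longterm" is a placeholder datastore name):
proxmox-backup-manager datastore update longterm --verify-new false

# Show the datastore's current options to confirm the change:
proxmox-backup-manager datastore show longterm
```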
 
The datastore configuration on the PBS has an "Options" tab. There you'll find "Verify New Snapshots".

Probably that's the relevant toggle.
THAT'S the option! I had a feeling I'd seen it in the past, but I was looking in the wrong menu and couldn't find it mentioned in the docs again!

Verifying means actually reading all relevant chunks, calculating their checksums, and comparing each with another copy of the checksum that was stored persistently when the backup was created.

It is often stated that PBS needs plenty of IOPS. My suspicion: you are seeing the effect of a slow storage device (IOPS-wise) here. (Plus a network round trip plus some protocol overhead, if I read "that runs against a Backup folder on a fileshare" correctly.) Compare: https://pbs.proxmox.com/docs/installation.html#recommended-server-system-requirements

(( Personally I do utilize rotating rust too, for the usual reasons. But I always help it by adding a fast Special Device to it, for both Meta-Data and "Small Blocks". This approach works well for me. ))
Raw PBS performance seems to top out around 250MB/s from the host, which is about what I'd expect from real-world data on HDDs (BTRFS RAID10; iperf3 can max out the 10G network, dd runs around 500MB/s). I will take another look at the special device (I briefly saw mentions of this in another thread), but if I've understood the deployment correctly, and specifically ZFS across a variety of disk sizes, then even if it's possible it would be a substantial change to my existing storage, and I possibly don't have enough SATA/SAS ports.

What I was getting at with that question is: with 0 bytes changed since the last backup, what data is actually being verified? PBS already knows the most recent backup's successfully verified state, and the next backup has no changes, so to my mind there would be nothing to verify.
 
What I was getting at with that question is: with 0 bytes changed since the last backup, what data is actually being verified?
It is especially the technical readability of the data chunks from the physical disk that is confirmed by this process. (And then, to be sure that the read data is actually correct, the checksum is used.)
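To make that concrete: a verify catches silent on-disk corruption that a backup run with 0 bytes changed never looks at, because the backup client only compares source data, not what is already sitting on the datastore's disks. A toy sketch in plain shell (file names are made up for the demo):

```shell
# Store a "chunk" and remember the digest recorded at backup time:
dir=$(mktemp -d)
printf 'chunk payload' > "$dir/chunk"
recorded=$(sha256sum "$dir/chunk" | awk '{print $1}')

# Simulate bitrot: the file is still perfectly readable,
# but one byte silently flipped on disk:
printf 'chunk paYload' > "$dir/chunk"

# A verify re-reads and re-hashes; the mismatch exposes the damage:
actual=$(sha256sum "$dir/chunk" | awk '{print $1}')
if [ "$actual" = "$recorded" ]; then
    echo "chunk OK"
else
    echo "chunk CORRUPT"
fi
```

A ZFS/BTRFS scrub covers a similar failure mode at the block level, which is why disabling PBS verification and relying on scrubs is a defensible trade-off on a trusted pool.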