Resolving corrupt chunks

tyaz

Mar 23, 2025
Over the last few days, I've been getting corrupted backups.

The verification tool states things like:

2025-03-23T12:59:13+00:00: can't verify chunk, load failed - store 'backup', unable to load chunk 'e0975b280f78a0069b30e3c5f07b4b74633fb0f7ac322ce17a5bdd3938f28907' - No such file or directory (os error 2)
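For anyone wanting to see which chunks are affected, here is a minimal Python sketch that pulls the store name and chunk digest out of verify log lines like the one above. It assumes the usual on-disk layout where a chunk lives under `.chunks/<first four hex characters>/<digest>` inside the datastore; the datastore root `/mnt/datastore/backup` and the helper names are hypothetical, so adjust them to your setup.

```python
import re
from pathlib import PurePosixPath

# Matches the "unable to load chunk" message format shown in the log line above.
CHUNK_ERROR = re.compile(
    r"store '(?P<store>[^']+)', unable to load chunk '(?P<digest>[0-9a-f]{64})'"
)


def missing_chunks(log_text):
    """Return sorted, de-duplicated (store, digest) pairs reported as unloadable."""
    return sorted({(m["store"], m["digest"]) for m in CHUNK_ERROR.finditer(log_text)})


def chunk_path(datastore_root, digest):
    """Assumed layout: <root>/.chunks/<first 4 hex chars of digest>/<digest>."""
    return PurePosixPath(datastore_root) / ".chunks" / digest[:4] / digest


log = (
    "2025-03-23T12:59:13+00:00: can't verify chunk, load failed - store 'backup', "
    "unable to load chunk "
    "'e0975b280f78a0069b30e3c5f07b4b74633fb0f7ac322ce17a5bdd3938f28907' "
    "- No such file or directory (os error 2)"
)

for store, digest in missing_chunks(log):
    # "/mnt/datastore/backup" is a hypothetical datastore root.
    print(store, chunk_path("/mnt/datastore/backup", digest))
```

Feeding the whole verify task log into `missing_chunks` gives a list of distinct missing chunks, which is usually much shorter than the number of failed snapshots.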

I'm trying to understand how to fix this going forward so future backups are fine. I'm running the latest version of PBS... Between verification / garbage collection and prune, will it fix itself or do I need to manually go in and do something?
 
I'm trying to understand how to fix this going forward so future backups are fine.

You need to find a way to give reliable storage to PBS. (You did not tell us what storage your PBS datastore is on.)

When one chunk is "unable to load" all backups referencing this one are... not usable anymore. That's the price tag for de-duplication.
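To illustrate why one missing chunk takes several backups with it, here is a toy Python model of a de-duplicating chunk store. It is only a sketch and not how PBS is implemented: the in-memory dict, the function names and the sample blocks are all made up for illustration.

```python
import hashlib


def digest(data: bytes) -> str:
    """Content address of a chunk (SHA-256 hex digest)."""
    return hashlib.sha256(data).hexdigest()


def backup(store: dict, blocks: list) -> list:
    """Store each block once under its digest; return the snapshot's list of digests."""
    index = []
    for block in blocks:
        d = digest(block)
        store.setdefault(d, block)  # de-duplication: identical blocks are kept only once
        index.append(d)
    return index


def is_restorable(store: dict, index: list) -> bool:
    """A snapshot is usable only if every chunk it references is still present."""
    return all(d in store for d in index)


store = {}
snapshots = {
    "monday": backup(store, [b"os", b"app", b"data-v1"]),
    "tuesday": backup(store, [b"os", b"app", b"data-v2"]),
    "wednesday": backup(store, [b"os2", b"app2", b"data-v3"]),
}

# Simulate the storage losing a single shared chunk.
del store[digest(b"app")]

broken = [name for name, index in snapshots.items() if not is_restorable(store, index)]
print(broken)  # ['monday', 'tuesday']
```

Both snapshots that shared the lost chunk become unusable, while the one that never referenced it is unaffected.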

"Reliable storage" (for me) means ZFS with striped mirrors. On SSD - if you have money for that.

There are compromises you can possibly make: at one instance I run a rotating-rust RaidZ2 with an NVMe "Special Device". Constructs like this are not recommended, but depending on your personal pain threshold a lot of "things" do work. If you actually do something similar, do test it! Often! And then again. Every few months!

Again: you need reliable storage for a reliable (backup) system --> build it with redundancy for error correction. And for PBS you additionally want IOPS, IOPS, IOPS... ;-)
 
I'm using S3 for my storage. My question is more about how does proxmox fix this going forward? I understand those backups are corrupt and I won't be able to use them going forward, but now that proxmox knows those are corrupt, will future backups work fine or will this be an issue until I remove all the corrupt backups?
 
Just run prune + garbage collection. Unused chunks are discarded 24 hours later. This will not "repair" a damaged chunk.

To "repair" it there is only one single way: make a new backup while the corresponding data is still available on the source. The problem is that usually you do not know if that is the case.

So..., I would just create new backups and delete the old known-bad ones.
 
Hi,
I'm using S3 for my storage. My question is more about how does proxmox fix this going forward?
There is no official support for S3 (although it is a work in progress, see https://bugzilla.proxmox.com/show_bug.cgi?id=2943). So you are running an unsupported and untested storage backend. May I ask how exactly you are using S3 as a storage backend?

I understand those backups are corrupt and I won't be able to use them going forward, but now that proxmox knows those are corrupt, will future backups work fine or will this be an issue until I remove all the corrupt backups?
If you verify backup snapshots and they fail verification, these snapshots will never be used as reference snapshots for incremental backups. The subsequent backup run will re-upload all chunks, which, if you are lucky, also "heals" the previous snapshot, provided the data for the previously corrupt chunk is re-uploaded.
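The "healing" effect can be sketched with a toy content-addressed store in Python. This is a simplified model under stated assumptions, not PBS code: chunks are keyed by the SHA-256 of their content, a full upload rewrites every chunk, and all names and sample data are hypothetical.

```python
import hashlib


def digest(data: bytes) -> str:
    """Content address of a chunk (SHA-256 hex digest)."""
    return hashlib.sha256(data).hexdigest()


def upload_all(store: dict, blocks: list) -> list:
    """Full upload without a reference snapshot: write every chunk, return the index."""
    index = []
    for block in blocks:
        d = digest(block)
        store[d] = block  # same content yields the same digest and the same slot
        index.append(d)
    return index


def verify(store: dict, index: list) -> bool:
    """A snapshot verifies only if all chunks it references are present."""
    return all(d in store for d in index)


def old_snapshot_healed(source_now: list) -> bool:
    """Lose one chunk of an old snapshot, take a new full backup, re-verify the old one."""
    store = {}
    old = upload_all(store, [b"kernel", b"database", b"logs-1"])
    del store[digest(b"database")]  # the chunk that went missing
    assert not verify(store, old)   # old snapshot now fails verification
    upload_all(store, source_now)   # next backup re-uploads everything
    return verify(store, old)


# The lost block still exists unchanged on the source: the old snapshot is healed.
print(old_snapshot_healed([b"kernel", b"database", b"logs-2"]))     # True
# The block has changed on the source in the meantime: the old snapshot stays broken.
print(old_snapshot_healed([b"kernel", b"database-v2", b"logs-2"]))  # False
```

The second case is the "if you are lucky" part: the old snapshot can only be healed while the source still holds exactly the data that was in the lost chunk.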