[SOLVED] Proxmox Backup Remote Sync Error

Jarvar

I have two onsite Proxmox Backup Servers at two locations. Both sync to an offsite remote (pull). Then there is a cloud Proxmox Backup Server that pulls from the offsite remote and from both onsite locations.
After everything was synced, a couple of backups had errors saying "unable to load blob" - No such file or directory (os error 2).
I deleted one, but how do I sync it back?
Any help would be much appreciated.
Thank you.
 
you cannot sync a single snapshot (only a single group, which will only pull in new snapshots, not any "holes" in the past).

you can create a new namespace and pull the whole group there - it will re-use any chunks that already exist in the datastore.
 
Thank you. What would your recommendation be? With the two errors, is there any way to fill them in, or flag them to be re-checked, without doing a whole sync?
It's taken a few days to sync almost 2 TB of data.
 
if you create a namespace on the existing datastore all the chunks which are already there will not be re-downloaded, so the transfer size should be small (basically only snapshot manifests + chunks missing for the group you are filtering for would be downloaded). you can then either leave that namespace as archive, or move the missing snapshot directories from one namespace to the original one manually (on the shell, being careful to move only what's needed).

@sterzy is working on a "re-sync" feature that would allow re-syncing missing and/or corrupt snapshots directly as part of the sync, but that is not yet available.
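
For reference, such a filtered pull into a separate namespace could look roughly like this on the shell. This is only a sketch: the remote name "offsite", the datastore/namespace names and the exact option spellings are assumptions (namespaces need PBS 2.2 or newer), so check proxmox-backup-manager pull --help before running it.

Code:
# hypothetical names: remote "offsite", remote + local datastore "store001",
# new local namespace "resync" (create it first via Datastore -> Content -> Add Namespace)
proxmox-backup-manager pull offsite store001 store001 \
    --ns resync \
    --group-filter group:vm/107
# chunks already present in the datastore are re-used, so only missing
# chunks and the snapshot metadata get transferred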
 
Okay, I have fixed it by creating a namespace under root and then syncing. I see that one error likely occurred because the snapshot was pruned during the sync, which took a week or more.
However, can I delete the VMs from root? Or will that also delete them from the namespace nested under root?
And will I be able to sync them back to root if I have them in the namespace?
Thank you for your time. I hope this makes sense.
 
the namespaces only contain the "logical" or metadata part of the backup snapshots; the actual data is in chunks in the chunk store, which is shared by the whole datastore.

so if you have two copies of the same snapshot referencing the same chunks in two different namespaces, deleting one of them will not affect the other. but it's not trivial to detect whether this is true for *all* snapshots in a given namespace, so unless you are 100% sure, I'd not clear them out. my suggestion was to only sync those groups where you are missing snapshots, and then to move those freshly synced snapshots (the directories containing their indices, e.g. vm/123/2022-XXX) into the original namespace, as sketched below.
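
Purely as an illustration of that manual move, assuming the datastore lives under /mnt/disk001/store001 as later in this thread, and that nothing (backup, prune, GC) is touching that group at the time; the namespace name and timestamp are placeholders:

Code:
# move one freshly synced snapshot directory from the namespace back into root
mv /mnt/disk001/store001/ns/NAMESPACE/vm/123/2022-10-09T21:00:00Z \
   /mnt/disk001/store001/vm/123/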
 
I want to be safe. So say I have one error on a snapshot of vm 107 with a specific date, for example the 9th of October. Can I delete that from root and then go into the namespace and mv or cp it from there back to root?
 
if you still have the (original, corrupt) snapshot, you can keep it, sync the group vm/107 into a new namespace on the same datastore (which will download all the snapshots of that group, but will re-use any existing, valid chunks!), and then re-verify the original snapshot (which should show that it picks up the newly downloaded chunks and is no longer marked as corrupt).
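
A single snapshot can be re-verified from the GUI (Datastore -> Content -> Verify on that snapshot); as a broader alternative on the shell, a whole-datastore verification can be started like this (the datastore name "store001" is taken from later in this thread):

Code:
# starts a verification task over all snapshots in the datastore
proxmox-backup-manager verify store001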
 
I'm trying to do that. This is the error I get
Code:
Task viewer: Snapshot store001:/vm/107/62F027D2 - Verification Output
Status: Stop
2022-10-13T06:44:18-04:00: verify store001:vm/107/2022-08-07T21:00:02Z - manifest load error: unable to load blob '"/mnt/disk001/store001/vm/107/2022-08-07T21:00:02Z/index.json.blob"' - No such file or directory (os error 2)
2022-10-13T06:44:18-04:00: Failed to verify the following snapshots/groups:
2022-10-13T06:44:18-04:00:   vm/107/2022-08-07T21:00:02Z
2022-10-13T06:44:18-04:00: TASK ERROR: verification failed - please check the log for details
 
did you delete the files but leave the directory?
 
Thank you for your timely responses. I did not delete the corrupt snapshot in the original root directory.
I created a namespace as suggested and synced the group vm/107 into the new namespace without any errors.
 
can you post the output of the following:

Code:
ls -lha /mnt/disk001/store001/vm/107/2022-08-07T21:00:02Z/

and (replace NAMESPACE with your namespace)

Code:
ls -lha /mnt/disk001/store001/ns/NAMESPACE/vm/107/2022-08-07T21:00:02Z/
 
@fabian This is the output.
Thank you for your time.

Code:
root@pbs001:~# ls -lha /mnt/disk001/store001/vm/107/2022-08-07T21:00:02Z/
total 8.8M
drwxr-xr-x  2 backup backup 4.0K Oct 11 16:07 .
drwxr-xr-x 24 backup backup 4.0K Oct 12 22:02 ..
-rw-r--r--  1 backup backup 1.8M Oct  5 16:16 drive-scsi0.img.tmp
-rw-r--r--  1 backup backup 7.0M Oct  5 14:18 drive-scsi1.img.fidx
-rw-r--r--  1 backup backup  737 Oct  5 14:18 index.json.tmp
-rw-r--r--  1 backup backup  429 Oct  5 14:18 qemu-server.conf.blob
root@pbs001:~# ls -lha /mnt/disk001/store001/ns/NAMESPACE/vm/107/2022-08-28T21\:00\:01Z/
total 8.8M
drwxr-xr-x  2 backup backup 4.0K Oct 13 06:45 .
drwxr-xr-x 18 backup backup 4.0K Oct 12 07:42 ..
-rw-r--r--  1 backup backup 1.9K Oct 12 07:40 client.log.blob
-rw-r--r--  1 backup backup 4.1K Oct 12 07:40 drive-efidisk0.img.fidx
-rw-r--r--  1 backup backup 1.8M Oct 12 07:40 drive-scsi0.img.fidx
-rw-r--r--  1 backup backup 7.0M Oct 12 07:40 drive-scsi1.img.fidx
-rw-r--r--  1 backup backup  740 Oct 12 07:40 index.json.blob
-rw-r--r--  1 backup backup  459 Oct 12 07:40 qemu-server.conf.blob
root@pbs001:~#
 
okay, so you can probably cp /mnt/disk001/store001/ns/NAMESPACE/vm/107/2022-08-28T21\:00\:01Z/* /mnt/disk001/store001/vm/107/2022-08-07T21:00:02Z/

but I have to say the contents of the original dir look pretty weird - did that PBS system crash at some point? how often do you sync? is that the original one (where the backup was made) or a sync target?
 
Hello @fabian, yes the local onsite PBS did crash at one point, but most likely that was after the 08-28 date. In September there were I/O errors because the disk was full, likely due to a failed backup and /var/log filling up.
I have the third PBS syncing from both the local onsite PBS001 and another remote site, PBS002, which I have access to.
I think the error occurred when the sync was terminated; it ran for over a week due to the large volume, so it was not one long, continuous, seamless sync.
Looks like the cp /mnt/disk001/store001/ns/NAMESPACE/vm/107/2022-08-28T21\:00\:01Z/* /mnt/disk001/store001/vm/107/2022-08-07T21:00:02Z/

worked. We're not at disaster recovery now, but it's always good to sort these things out before we get there. I likely don't even need that snapshot, but I never know what will happen.
 
@fabian or anybody knowledgeable about this:
Is it possible to sync from two sources to the same location?
Local site A1, remote site A2, and remote site A3.
Will one complete the unfinished syncs of the other, or will it cause trouble?
Or what would be the best setup? A2 is an offsite location. A3 has a faster connection but remote storage.
Thank you for any advice.
 
I guess you mean like in the following situation?

Code:
A --> B -> D
  \-> C -/

If you do that, I would set up two namespaces on D, so that the two sync jobs with two different sources share the chunk store (the bulk of the data), but cannot interfere with each other w.r.t. locking backup groups and snapshots.
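
A rough sketch of how that could be configured on D; the job IDs, remote names, namespaces and schedule here are made up, and the exact parameter names should be double-checked against proxmox-backup-manager sync-job create --help:

Code:
# two pull sync jobs into the same datastore, each with its own target namespace,
# so they share the chunk store but never lock the same groups/snapshots
proxmox-backup-manager sync-job create pull-from-b \
    --remote site-b --remote-store store001 --store store001 --ns from-b --schedule hourly
proxmox-backup-manager sync-job create pull-from-c \
    --remote site-c --remote-store store001 --store store001 --ns from-c --schedule hourly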
 
@fabian
I'm going to post this on here:
How would I move VM backups from one datastore to another datastore on the same PBS server?
I understand that if it's just moving backups within the same store, they will share the chunks.
What would be the best way to move or copy from one datastore to the other? I'd like to run GC and pruning on each set independently.
Also, it'd be easier to back up each store separately instead of one large one, depending on where the backups are coming from.
Thank you
 
either use proxmox-backup-manager pull or set up one or more sync jobs (they can be disabled and just executed via "Run Now").
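
As a hedged example of the pull variant between two datastores on the same PBS (names and parameters are assumptions from memory, so verify them against the man page): the server first references itself as a remote, then pulls from the source datastore into the target one.

Code:
# hypothetical names: remote "pbs-self", source datastore "store001", target "store002"
proxmox-backup-manager remote create pbs-self \
    --host 127.0.0.1 --auth-id 'sync@pbs' --password 'xxxxx' \
    --fingerprint '<server cert fingerprint>'
proxmox-backup-manager pull pbs-self store001 store002
# afterwards GC and prune run independently per datastore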
 
