[SOLVED] Proxmox Backup Remote Sync Error

Jarvar

I have two onsite Proxmox Backup Servers at two locations. Both sync to an offsite remote (pull). Then there is a cloud Proxmox Backup Server that pulls from the offsite remote and from both onsite locations.
After everything was synced, a couple of backups had errors saying "unable to load blob" - No such file or directory (os error 2).
I deleted one, but how do I sync it back?
Any help would be much appreciated.
Thank you.
 
you cannot sync a single snapshot (only a single group, which will only pull in new snapshots, not any "holes" in the past).

you can create a new namespace and pull the whole group there - it will re-use any chunks that already exist in the datastore.
 
Thank you. What would your recommendation be? With the two errors, is there any way to fill them in, or flag them to be re-checked, without doing a whole sync?
It's taken a few days to sync almost 2 TB of data.
 
if you create a namespace on the existing datastore all the chunks which are already there will not be re-downloaded, so the transfer size should be small (basically only snapshot manifests + chunks missing for the group you are filtering for would be downloaded). you can then either leave that namespace as archive, or move the missing snapshot directories from one namespace to the original one manually (on the shell, being careful to move only what's needed).

@sterzy is working on a "re-sync" feature that would allow re-syncing missing and/or corrupt snapshots directly as part of the sync, but that is not yet available.
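
For reference, such a filtered pull into a separate namespace could look roughly like this on the shell. This is only a sketch: the remote name "offsite", the datastore/namespace names and the exact option spellings are assumptions (namespaces need PBS 2.2 or newer), so check proxmox-backup-manager pull --help before running it.

Code:
# hypothetical names: remote "offsite", remote + local datastore "store001",
# new local namespace "resync" (create it first via Datastore -> Content -> Add Namespace)
proxmox-backup-manager pull offsite store001 store001 \
    --ns resync \
    --group-filter group:vm/107
# chunks already present in the datastore are re-used, so only missing
# chunks and the snapshot metadata get transferred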
 
Okay, I have fixed it by creating a namespace under root and then syncing. I see that one error likely occurred because the snapshot was pruned during the sync, which took a week or more.
However, can I delete the VMs from root? Or will that also delete them from the namespace nested under root?
And will I be able to sync them back to root if I have them in the namespace?
Thank you for your time. I hope this makes sense.
 
the namespaces only contain the "logical" or metadata part of the backup snapshots; the actual data is in chunks in the chunk store, which is shared by the whole datastore.

so if you have two copies of the same snapshot referencing the same chunks in two different namespaces, deleting one of them will not affect the other. but it's not trivial to detect whether this is true for *all* snapshots in a given namespace, so unless you are 100% sure, I'd not clear them out. my suggestion was to only sync those groups where you are missing snapshots, and then to move those freshly synced snapshots (the directories containing their indices, e.g. vm/123/2022-XXX) into the original namespace, as sketched below.
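
Purely as an illustration of that manual move, assuming the datastore lives under /mnt/disk001/store001 as later in this thread, and that nothing (backup, prune, GC) is touching that group at the time; the namespace name and timestamp are placeholders:

Code:
# move one freshly synced snapshot directory from the namespace back into root
mv /mnt/disk001/store001/ns/NAMESPACE/vm/123/2022-10-09T21:00:00Z \
   /mnt/disk001/store001/vm/123/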
 
I want to be safe. So say I have one error on a snapshot of vm 107 with a specific date, for example the 9th of October. Can I delete that from root and then go into the namespace and mv or cp it from there back to root?
 
if you still have the (original, corrupt) snapshot, you can keep it, sync the group vm/107 into a new namespace on the same datastore (which will download all the snapshots of that group, but will re-use any existing, valid chunks!), and then re-verify the original snapshot (which should show that it picks up the newly downloaded chunks and is no longer marked as corrupt).
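
A single snapshot can be re-verified from the GUI (Datastore -> Content -> Verify on that snapshot); as a broader alternative on the shell, a whole-datastore verification can be started like this (the datastore name "store001" is taken from later in this thread):

Code:
# starts a verification task over all snapshots in the datastore
proxmox-backup-manager verify store001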
 
I'm trying to do that. This is the error I get
Code:
Task viewer: Snapshot store001:/vm/107/62F027D2 - Verification Output
Status: Stop
2022-10-13T06:44:18-04:00: verify store001:vm/107/2022-08-07T21:00:02Z - manifest load error: unable to load blob '"/mnt/disk001/store001/vm/107/2022-08-07T21:00:02Z/index.json.blob"' - No such file or directory (os error 2)
2022-10-13T06:44:18-04:00: Failed to verify the following snapshots/groups:
2022-10-13T06:44:18-04:00:   vm/107/2022-08-07T21:00:02Z
2022-10-13T06:44:18-04:00: TASK ERROR: verification failed - please check the log for details
 
did you delete the files but leave the directory?
 
Thank you for your timely responses. I did not delete the corrupt snapshot in the original root directory.
I created a namespace as suggested and synced the group vm/107 into the new namespace without any errors.
 
can you post the output of the following:

Code:
ls -lha /mnt/disk001/store001/vm/107/2022-08-07T21:00:02Z/

and (replace NAMESPACE with your namespace)

Code:
ls -lha /mnt/disk001/store001/ns/NAMESPACE/vm/107/2022-08-07T21:00:02Z/
 
@fabian This is the output.
Thank you for your time.

Code:
root@pbs001:~# ls -lha /mnt/disk001/store001/vm/107/2022-08-07T21:00:02Z/
total 8.8M
drwxr-xr-x  2 backup backup 4.0K Oct 11 16:07 .
drwxr-xr-x 24 backup backup 4.0K Oct 12 22:02 ..
-rw-r--r--  1 backup backup 1.8M Oct  5 16:16 drive-scsi0.img.tmp
-rw-r--r--  1 backup backup 7.0M Oct  5 14:18 drive-scsi1.img.fidx
-rw-r--r--  1 backup backup  737 Oct  5 14:18 index.json.tmp
-rw-r--r--  1 backup backup  429 Oct  5 14:18 qemu-server.conf.blob
root@pbs001:~# ls -lha /mnt/disk001/store001/ns/NAMESPACE/vm/107/2022-08-28T21\:00\:01Z/
total 8.8M
drwxr-xr-x  2 backup backup 4.0K Oct 13 06:45 .
drwxr-xr-x 18 backup backup 4.0K Oct 12 07:42 ..
-rw-r--r--  1 backup backup 1.9K Oct 12 07:40 client.log.blob
-rw-r--r--  1 backup backup 4.1K Oct 12 07:40 drive-efidisk0.img.fidx
-rw-r--r--  1 backup backup 1.8M Oct 12 07:40 drive-scsi0.img.fidx
-rw-r--r--  1 backup backup 7.0M Oct 12 07:40 drive-scsi1.img.fidx
-rw-r--r--  1 backup backup  740 Oct 12 07:40 index.json.blob
-rw-r--r--  1 backup backup  459 Oct 12 07:40 qemu-server.conf.blob
root@pbs001:~#
 
okay, so you can probably cp /mnt/disk001/store001/ns/NAMESPACE/vm/107/2022-08-28T21\:00\:01Z/* /mnt/disk001/store001/vm/107/2022-08-07T21:00:02Z/

but I have to say the contents of the original dir look pretty weird - did that PBS system crash at some point? how often do you sync? is that the original one (where the backup was made) or a sync target?
 
Hello @fabian, yes the local onsite PBS did crash at one point, but most likely that was after the 08-28 date. In September there were I/O errors because the disk was full, likely due to a failed backup and /var/log filling up.
I have the third PBS syncing from both the local onsite PBS001 and another remote site, PBS002, which I have access to.
I think the error occurred when the sync was terminated; it ran for over a week due to the large volume, so it was not one long, continuous, seamless sync.
Looks like the cp /mnt/disk001/store001/ns/NAMESPACE/vm/107/2022-08-28T21\:00\:01Z/* /mnt/disk001/store001/vm/107/2022-08-07T21:00:02Z/

worked. We're not at disaster recovery now, but it's always good to sort these things out before we get there. I likely don't even need that snapshot, but I never know what will happen.
 
@fabian or anybody knowledgeable about this:
Is it possible to sync from two sources to the same location?
Local site A1, remote site A2, and remote site A3.
Will one complete the unfinished syncs of the other, or will it cause trouble?
Or what would be the best setup? A2 is an offsite location. A3 has a faster connection but remote storage.
Thank you for any advice.
 
I guess you mean like in the following situation?

Code:
A --> B -> D
  \-> C -/

If you do that, I would set up two namespaces on D, so that the two sync jobs with two different sources share the chunk store (the bulk of the data), but cannot interfere with each other w.r.t. locking backup groups and snapshots.
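
A rough sketch of how that could be configured on D; the job IDs, remote names, namespaces and schedule here are made up, and the exact parameter names should be double-checked against proxmox-backup-manager sync-job create --help:

Code:
# two pull sync jobs into the same datastore, each with its own target namespace,
# so they share the chunk store but never lock the same groups/snapshots
proxmox-backup-manager sync-job create pull-from-b \
    --remote site-b --remote-store store001 --store store001 --ns from-b --schedule hourly
proxmox-backup-manager sync-job create pull-from-c \
    --remote site-c --remote-store store001 --store store001 --ns from-c --schedule hourly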
 
@fabian
I'm going to post this on here:
How would I move VM backups from one datastore to another datastore on the same PBS server?
I understand that if it's just moving backups within the same store, they will share the chunks.
What would be the best way to move or copy from one datastore to the other? I'd like to run GC and pruning on each set independently.
Also, it'd be easier to back up each store separately instead of one large one, depending on where the backups are coming from.
Thank you
 
either use proxmox-backup-manager pull or set up one or more sync jobs (they can be disabled and just executed via "Run Now").
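
As a hedged example of the pull variant between two datastores on the same PBS (names and parameters are assumptions from memory, so verify them against the man page): the server first references itself as a remote, then pulls from the source datastore into the target one.

Code:
# hypothetical names: remote "pbs-self", source datastore "store001", target "store002"
proxmox-backup-manager remote create pbs-self \
    --host 127.0.0.1 --auth-id 'sync@pbs' --password 'xxxxx' \
    --fingerprint '<server cert fingerprint>'
proxmox-backup-manager pull pbs-self store001 store002
# afterwards GC and prune run independently per datastore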
 
