Experiencing a lot of failed syncs and some garbage collection failures

kurdam

Active Member
Sep 29, 2020
Hi,
I have a PBS setup that is kind of special, which is why I included a diagram.
We currently have a PBS setup that works between two sites linked with an IPsec site-to-site VPN tunnel (so everything appears as one big network for us).

In the blue bubble you have the structure of our site 1 (simplified, of course), where we have a Proxmox server running on a dedicated machine with several VMs, including our first PBS. This Proxmox host has an NFS mount to the filer NAS1, on which I created the first PBS datastore that contains all our backups.

In the green bubble we have our second site, where our Proxmox runs directly on the second filer. This Proxmox host virtualizes our second PBS (and some other offsite backups and VMs) and is the mount point for the second PBS datastore.

I have set up a complete sync of all backups from site 1 to site 2 every 6 hours.

First, I have to say that it works remarkably well for a product this young; we are quite impressed.
Unfortunately, we are experiencing some failed syncs between site 1 and site 2 (roughly one out of every 2 or 3 runs) with no particular explanation. I checked the logs and all I can see is this message: TASK ERROR: sync failed with some errors

I suppose there are some packet drops due to the VPN connection, or high usage of NAS1 (we have a LOT of traffic going to that machine). What I'm asking myself is: is there a way to implement a retry or a small wait mechanism to avoid the failed syncs? Because it seems like as soon as it sees a network packet drop, the sync process crashes.
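
In the meantime I'm considering a crude external retry around a manual pull as a workaround, something like this sketch (the remote and datastore names are placeholders, and I'm not sure it's the right approach):

Code:
#!/bin/sh
# crude workaround: retry a manual pull up to 3 times,
# waiting 5 minutes between attempts
# (<remote>, <remote-store> and <local-store> are placeholders)
for i in 1 2 3; do
    proxmox-backup-manager pull <remote> <remote-store> <local-store> && break
    sleep 300
done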

Regarding garbage collection, I often get an error that looks like this: TASK ERROR: unexpected error on datastore traversal: No such file or directory (os error 2) -
I suppose it is also probably due to a brief network inaccessibility.
I think the two problems are caused by the same thing.

Thank you in advance for your help.

Regards,
Thomas
 

Attachments

  • archi PBS.png
  • Capture.PNG
please post the full version information and the sync task log - it should contain a lot more info than just the fact that the sync failed. syncing happens over HTTP/TLS/TCP, so dropping a packet here or there shouldn't fail the sync; the lower parts of the network stack retry those transparently. of course, longer issues/timeouts can cause the sync to fail.
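
the version information can be gathered like this, which should print all relevant package versions:

Code:
proxmox-backup-manager versions --verbose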

for datastores on a network share, a service/network interruption can of course also lead to issues (blocking file access, files not found, ..), but that is usually all or nothing and not just failing for individual paths. you can play around with mount settings and the like, but a reliable connection is pretty much a must for reliable PBS operations.
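
for NFS, the mount options are a good place to start - something roughly like this (the export path is a placeholder and the values are only examples, they need tuning for your setup):

Code:
# /etc/fstab - illustrative NFS mount for a PBS datastore
# 'hard' makes the client block and retry instead of returning I/O errors;
# timeo (in tenths of a second) and retrans control the retry behaviour
nas1:/export/pbs  /mnt/datastore/DS_HDD  nfs  hard,timeo=600,retrans=5,noatime  0  0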
 
Here is a typical error for a sync task:

Task viewer: Sync Job PBS-:Filer1-WIN:DS_HDD:s-02f0435e-75fd - Remote Sync

......
2022-03-14T06:02:45+01:00: re-sync snapshot "vm/1014/2022-03-09T08:00:02Z" done
2022-03-14T06:02:45+01:00: percentage done: 29.17% (14/48 groups)
2022-03-14T06:02:45+01:00: skipped: 17 snapshot(s) (2021-11-05T14:49:16Z .. 2022-03-02T08:00:02Z) older than the newest local snapshot
2022-03-14T06:02:46+01:00: percentage done: 31.25% (15/48 groups)
2022-03-14T06:02:46+01:00: sync group vm/1015 failed - unable to acquire lock on snapshot directory "/mnt/datastore/DS_HDD/vm/1015/2022-03-13T06:15:49Z" - internal error - tried creating snapshot that's already in use
2022-03-14T06:02:46+01:00: re-sync snapshot "vm/1016/2022-03-13T20:39:30Z"
2022-03-14T06:02:46+01:00: no data changes
2022-03-14T06:02:46+01:00: re-sync snapshot "vm/1016/2022-03-13T20:39:30Z" done
2022-03-14T06:02:46+01:00: percentage done: 33.33% (16/48 groups)
2022-03-14T06:02:46+01:00: skipped: 20 snapshot(s) (2021-11-28T20:05:37Z .. 2022-03-12T20:54:36Z) older than the newest local snapshot
2022-03-14T06:02:46+01:00: re-sync snapshot "vm/1018/2022-03-13T06:41:37Z"
2022-03-14T06:02:46+01:00: no data changes
2022-03-14T06:02:46+01:00: re-sync snapshot "vm/1018/2022-03-13T06:41:37Z" done
2022-03-14T06:02:46+01:00: percentage done: 35.42% (17/48 groups)
2022-03-14T06:02:46+01:00: skipped: 20 snapshot(s) (2021-11-28T06:27:18Z .. 2022-03-12T06:35:10Z) older than the newest local snapshot
2022-03-14T06:02:47+01:00: re-sync snapshot "vm/1019/2022-03-13T06:52:56Z"
2022-03-14T06:02:47+01:00: no data changes
2022-03-14T06:02:47+01:00: re-sync snapshot "vm/1019/2022-03-13T06:52:56Z" done
2022-03-14T06:02:47+01:00: percentage done: 37.50% (18/48 groups)
2022-03-14T06:02:47+01:00: skipped: 20 snapshot(s) (2021-11-28T06:38:39Z .. 2022-03-12T06:46:32Z) older than the newest local snapshot
2022-03-14T06:02:47+01:00: re-sync snapshot "vm/1020/2022-02-01T20:40:03Z"
2022-03-14T06:02:47+01:00: no data changes
2022-03-14T06:02:47+01:00: re-sync snapshot "vm/1020/2022-02-01T20:40:03Z" done
2022-03-14T06:02:47+01:00: percentage done: 39.58% (19/48 groups)
2022-03-14T06:02:47+01:00: skipped: 19 snapshot(s) (2021-11-28T20:05:00Z .. 2022-01-31T20:27:39Z) older than the newest local snapshot
2022-03-14T06:02:48+01:00: re-sync snapshot "vm/2000/2022-03-09T08:08:49Z"
2022-03-14T06:02:48+01:00: no data changes
2022-03-14T06:02:48+01:00: re-sync snapshot "vm/2000/2022-03-09T08:08:49Z" done
2022-03-14T06:02:48+01:00: percentage done: 41.67% (20/48 groups)
2022-03-14T06:02:48+01:00: skipped: 17 snapshot(s) (2021-11-05T14:49:16Z .. 2022-03-02T08:00:02Z) older than the newest local snapshot
2022-03-14T06:02:48+01:00: re-sync snapshot "vm/2001/2022-03-13T10:02:36Z"
2022-03-14T06:02:48+01:00: no data changes
2022-03-14T06:02:48+01:00: re-sync snapshot "vm/2001/2022-03-13T10:02:36Z" done
2022-03-14T06:02:48+01:00: percentage done: 43.75% (21/48 groups)
2022-03-14T06:02:48+01:00: skipped: 19 snapshot(s) (2021-12-26T10:07:45Z .. 2022-03-11T10:02:38Z) older than the newest local snapshot
2022-03-14T06:02:48+01:00: re-sync snapshot "vm/2002/2022-03-09T08:10:55Z"
2022-03-14T06:02:48+01:00: no data changes
2022-03-14T06:02:48+01:00: re-sync snapshot "vm/2002/2022-03-09T08:10:55Z" done
2022-03-14T06:02:48+01:00: percentage done: 45.83% (22/48 groups)
2022-03-14T06:02:48+01:00: skipped: 16 snapshot(s) (2021-11-05T14:49:16Z .. 2022-03-02T08:01:38Z) older than the newest local snapshot
2022-03-14T06:02:49+01:00: re-sync snapshot "vm/2003/2022-03-13T10:18:15Z"
2022-03-14T06:02:49+01:00: no data changes
2022-03-14T06:02:49+01:00: re-sync snapshot "vm/2003/2022-03-13T10:18:15Z" done
2022-03-14T06:02:49+01:00: percentage done: 47.92% (23/48 groups)
2022-03-14T06:02:49+01:00: skipped: 15 snapshot(s) (2022-02-20T10:20:55Z .. 2022-03-11T10:18:20Z) older than the newest local snapshot
2022-03-14T06:02:49+01:00: re-sync snapshot "vm/2004/2022-03-13T10:20:22Z"
2022-03-14T06:02:49+01:00: no data changes
2022-03-14T06:02:49+01:00: re-sync snapshot "vm/2004/2022-03-13T10:20:22Z" done
2022-03-14T06:02:49+01:00: percentage done: 50.00% (24/48 groups)
2022-03-14T06:02:49+01:00: skipped: 15 snapshot(s) (2022-02-20T10:22:32Z .. 2022-03-11T10:20:27Z) older than the newest local snapshot
2022-03-14T06:02:50+01:00: re-sync snapshot "vm/2501/2022-03-13T17:16:49Z"
2022-03-14T06:02:50+01:00: no data changes
2022-03-14T06:02:50+01:00: re-sync snapshot "vm/2501/2022-03-13T17:16:49Z" done
2022-03-14T06:02:50+01:00: percentage done: 52.08% (25/48 groups)
2022-03-14T06:02:50+01:00: skipped: 17 snapshot(s) (2022-02-06T17:07:55Z .. 2022-03-12T17:16:00Z) older than the newest local snapshot
2022-03-14T06:02:50+01:00: re-sync snapshot "vm/2502/2022-03-13T17:17:10Z"
2022-03-14T06:02:50+01:00: no data changes
2022-03-14T06:02:50+01:00: re-sync snapshot "vm/2502/2022-03-13T17:17:10Z" done
2022-03-14T06:02:50+01:00: percentage done: 54.17% (26/48 groups)
2022-03-14T06:02:50+01:00: skipped: 17 snapshot(s) (2022-02-06T17:08:17Z .. 2022-03-12T17:16:20Z) older than the newest local snapshot
2022-03-14T06:02:50+01:00: re-sync snapshot "vm/500/2022-03-13T10:00:02Z"
2022-03-14T06:02:50+01:00: no data changes
2022-03-14T06:02:50+01:00: re-sync snapshot "vm/500/2022-03-13T10:00:02Z" done
2022-03-14T06:02:50+01:00: percentage done: 56.25% (27/48 groups)
2022-03-14T06:02:50+01:00: skipped: 20 snapshot(s) (2021-11-28T10:00:02Z .. 2022-03-11T10:00:02Z) older than the newest local snapshot
2022-03-14T06:02:51+01:00: re-sync snapshot "vm/501/2022-03-13T10:06:18Z"
2022-03-14T06:02:51+01:00: no data changes
2022-03-14T06:02:51+01:00: re-sync snapshot "vm/501/2022-03-13T10:06:18Z" done
2022-03-14T06:02:51+01:00: percentage done: 58.33% (28/48 groups)
2022-03-14T06:02:51+01:00: skipped: 18 snapshot(s) (2022-01-30T10:08:22Z .. 2022-03-11T10:06:22Z) older than the newest local snapshot
2022-03-14T06:02:51+01:00: re-sync snapshot "vm/502/2022-03-13T10:06:23Z"
2022-03-14T06:02:51+01:00: no data changes
2022-03-14T06:02:51+01:00: re-sync snapshot "vm/502/2022-03-13T10:06:23Z" done
2022-03-14T06:02:51+01:00: percentage done: 60.42% (29/48 groups)
2022-03-14T06:02:51+01:00: skipped: 20 snapshot(s) (2021-11-28T10:06:19Z .. 2022-03-11T10:06:28Z) older than the newest local snapshot
2022-03-14T06:02:52+01:00: re-sync snapshot "vm/503/2022-03-13T17:00:08Z"
2022-03-14T06:02:52+01:00: no data changes
2022-03-14T06:02:52+01:00: re-sync snapshot "vm/503/2022-03-13T17:00:08Z" done
2022-03-14T06:02:52+01:00: percentage done: 62.50% (30/48 groups)
2022-03-14T06:02:52+01:00: skipped: 18 snapshot(s) (2022-01-30T17:00:07Z .. 2022-03-12T17:00:06Z) older than the newest local snapshot
2022-03-14T06:02:52+01:00: re-sync snapshot "vm/504/2022-03-13T10:00:02Z"
2022-03-14T06:02:52+01:00: re-sync snapshot "vm/504/2022-03-13T10:00:02Z" done
2022-03-14T06:02:52+01:00: percentage done: 64.58% (31/48 groups)
2022-03-14T06:02:52+01:00: skipped: 19 snapshot(s) (2021-11-24T18:00:02Z .. 2022-03-11T10:00:02Z) older than the newest local snapshot
2022-03-14T06:02:53+01:00: re-sync snapshot "vm/505/2022-03-13T09:00:01Z"
2022-03-14T06:02:53+01:00: re-sync snapshot "vm/505/2022-03-13T09:00:01Z" done
2022-03-14T06:02:53+01:00: percentage done: 66.67% (32/48 groups)
2022-03-14T06:02:53+01:00: skipped: 21 snapshot(s) (2021-10-31T09:00:02Z .. 2022-03-12T09:00:02Z) older than the newest local snapshot
2022-03-14T06:02:53+01:00: re-sync snapshot "vm/506/2022-03-13T10:00:09Z"
2022-03-14T06:02:53+01:00: re-sync snapshot "vm/506/2022-03-13T10:00:09Z" done
2022-03-14T06:02:53+01:00: percentage done: 68.75% (33/48 groups)
2022-03-14T06:02:53+01:00: skipped: 20 snapshot(s) (2021-11-28T10:00:02Z .. 2022-03-11T10:00:13Z) older than the newest local snapshot
2022-03-14T06:02:54+01:00: re-sync snapshot "vm/507/2022-03-13T00:30:08Z"
2022-03-14T06:02:54+01:00: re-sync snapshot "vm/507/2022-03-13T00:30:08Z" done
2022-03-14T06:02:54+01:00: percentage done: 70.74% (33/48 groups, 21/22 snapshots in group #34)
2022-03-14T06:02:54+01:00: sync snapshot "vm/507/2022-03-14T00:30:05Z"
2022-03-14T06:02:54+01:00: sync archive qemu-server.conf.blob
2022-03-14T06:02:55+01:00: sync archive drive-scsi1.img.fidx
2022-03-14T06:11:39+01:00: downloaded 5894055751 bytes (10.74 MiB/s)
2022-03-14T06:11:39+01:00: sync archive drive-scsi0.img.fidx
2022-03-14T06:11:44+01:00: downloaded 51483852 bytes (10.68 MiB/s)
2022-03-14T06:11:44+01:00: got backup log file "client.log.blob"
2022-03-14T06:11:44+01:00: sync snapshot "vm/507/2022-03-14T00:30:05Z" done
2022-03-14T06:11:44+01:00: percentage done: 70.83% (34/48 groups)
2022-03-14T06:11:44+01:00: skipped: 20 snapshot(s) (2021-11-28T00:30:01Z .. 2022-03-12T00:30:05Z) older than the newest local snapshot
2022-03-14T06:11:44+01:00: re-sync snapshot "vm/508/2022-03-13T10:07:43Z"
2022-03-14T06:11:45+01:00: re-sync snapshot "vm/508/2022-03-13T10:07:43Z" done
2022-03-14T06:11:45+01:00: percentage done: 72.92% (35/48 groups)
2022-03-14T06:11:45+01:00: skipped: 20 snapshot(s) (2021-11-28T10:07:41Z .. 2022-03-11T10:07:49Z) older than the newest local snapshot
2022-03-14T06:11:45+01:00: re-sync snapshot "vm/509/2022-03-13T10:00:14Z"
2022-03-14T06:11:45+01:00: re-sync snapshot "vm/509/2022-03-13T10:00:14Z" done
2022-03-14T06:11:45+01:00: percentage done: 75.00% (36/48 groups)
2022-03-14T06:11:45+01:00: skipped: 16 snapshot(s) (2022-02-13T10:00:14Z .. 2022-03-11T10:00:17Z) older than the newest local snapshot
2022-03-14T06:11:45+01:00: re-sync snapshot "vm/510/2022-03-13T10:08:00Z"
2022-03-14T06:11:45+01:00: re-sync snapshot "vm/510/2022-03-13T10:08:00Z" done
2022-03-14T06:11:45+01:00: percentage done: 77.08% (37/48 groups)
2022-03-14T06:11:45+01:00: skipped: 19 snapshot(s) (2021-11-28T10:07:45Z .. 2022-03-11T10:07:57Z) older than the newest local snapshot
2022-03-14T06:11:46+01:00: re-sync snapshot "vm/511/2022-03-13T10:00:21Z"
2022-03-14T06:11:46+01:00: re-sync snapshot "vm/511/2022-03-13T10:00:21Z" done
2022-03-14T06:11:46+01:00: percentage done: 79.17% (38/48 groups)
2022-03-14T06:11:46+01:00: skipped: 20 snapshot(s) (2021-11-28T10:06:43Z .. 2022-03-11T10:00:22Z) older than the newest local snapshot
2022-03-14T06:11:46+01:00: re-sync snapshot "vm/512/2022-03-13T17:00:30Z"
2022-03-14T06:11:46+01:00: re-sync snapshot "vm/512/2022-03-13T17:00:30Z" done
2022-03-14T06:11:46+01:00: percentage done: 81.25% (39/48 groups)
2022-03-14T06:11:46+01:00: skipped: 8 snapshot(s) (2021-11-25T18:00:02Z .. 2022-03-12T17:00:27Z) older than the newest local snapshot
2022-03-14T06:11:47+01:00: re-sync snapshot "vm/513/2022-03-13T10:00:22Z"
2022-03-14T06:11:47+01:00: re-sync snapshot "vm/513/2022-03-13T10:00:22Z" done
2022-03-14T06:11:47+01:00: percentage done: 83.33% (40/48 groups)
2022-03-14T06:11:47+01:00: skipped: 18 snapshot(s) (2021-12-26T18:00:02Z .. 2022-03-11T10:00:23Z) older than the newest local snapshot
2022-03-14T06:11:47+01:00: re-sync snapshot "vm/514/2022-03-13T17:00:36Z"
2022-03-14T06:11:47+01:00: re-sync snapshot "vm/514/2022-03-13T17:00:36Z" done
2022-03-14T06:11:47+01:00: percentage done: 85.42% (41/48 groups)
2022-03-14T06:11:47+01:00: skipped: 19 snapshot(s) (2021-12-26T18:00:12Z .. 2022-03-12T17:00:32Z) older than the newest local snapshot
2022-03-14T06:11:47+01:00: re-sync snapshot "vm/515/2022-01-02T18:00:27Z"
2022-03-14T06:11:47+01:00: no data changes
2022-03-14T06:11:47+01:00: re-sync snapshot "vm/515/2022-01-02T18:00:27Z" done
2022-03-14T06:11:47+01:00: percentage done: 87.50% (42/48 groups)
2022-03-14T06:11:47+01:00: skipped: 16 snapshot(s) (2021-12-05T18:00:22Z .. 2022-01-01T18:03:45Z) older than the newest local snapshot
2022-03-14T06:11:48+01:00: re-sync snapshot "vm/516/2022-01-13T17:02:09Z"
2022-03-14T06:11:48+01:00: no data changes
2022-03-14T06:11:48+01:00: re-sync snapshot "vm/516/2022-01-13T17:02:09Z" done
2022-03-14T06:11:48+01:00: percentage done: 89.58% (43/48 groups)
2022-03-14T06:11:48+01:00: skipped: 17 snapshot(s) (2021-12-10T18:00:02Z .. 2022-01-12T17:02:19Z) older than the newest local snapshot
2022-03-14T06:11:48+01:00: re-sync snapshot "vm/517/2022-01-13T17:22:49Z"
2022-03-14T06:11:48+01:00: no data changes
2022-03-14T06:11:48+01:00: re-sync snapshot "vm/517/2022-01-13T17:22:49Z" done
2022-03-14T06:11:48+01:00: percentage done: 91.67% (44/48 groups)
2022-03-14T06:11:48+01:00: skipped: 17 snapshot(s) (2021-12-10T18:00:07Z .. 2022-01-12T17:20:04Z) older than the newest local snapshot
2022-03-14T06:11:48+01:00: re-sync snapshot "vm/518/2022-01-19T17:02:24Z"
2022-03-14T06:11:48+01:00: no data changes
2022-03-14T06:11:48+01:00: re-sync snapshot "vm/518/2022-01-19T17:02:24Z" done
2022-03-14T06:11:48+01:00: percentage done: 93.75% (45/48 groups)
2022-03-14T06:11:48+01:00: skipped: 17 snapshot(s) (2021-12-19T18:00:01Z .. 2022-01-18T17:01:39Z) older than the newest local snapshot
2022-03-14T06:11:49+01:00: re-sync snapshot "vm/519/2022-01-19T17:18:19Z"
2022-03-14T06:11:49+01:00: no data changes
2022-03-14T06:11:49+01:00: re-sync snapshot "vm/519/2022-01-19T17:18:19Z" done
2022-03-14T06:11:49+01:00: percentage done: 95.83% (46/48 groups)
2022-03-14T06:11:49+01:00: skipped: 17 snapshot(s) (2021-12-19T18:00:05Z .. 2022-01-18T17:14:10Z) older than the newest local snapshot
2022-03-14T06:11:49+01:00: re-sync snapshot "vm/520/2022-01-25T17:09:06Z"
2022-03-14T06:11:49+01:00: no data changes
2022-03-14T06:11:49+01:00: re-sync snapshot "vm/520/2022-01-25T17:09:06Z" done
2022-03-14T06:11:49+01:00: percentage done: 97.92% (47/48 groups)
2022-03-14T06:11:49+01:00: skipped: 17 snapshot(s) (2021-12-26T18:00:24Z .. 2022-01-24T17:02:33Z) older than the newest local snapshot
2022-03-14T06:11:50+01:00: re-sync snapshot "vm/521/2022-02-27T17:05:15Z"
2022-03-14T06:11:50+01:00: no data changes
2022-03-14T06:11:50+01:00: re-sync snapshot "vm/521/2022-02-27T17:05:15Z" done
2022-03-14T06:11:50+01:00: percentage done: 100.00% (48/48 groups)
2022-03-14T06:11:50+01:00: skipped: 18 snapshot(s) (2021-12-26T18:00:28Z .. 2022-02-26T17:02:55Z) older than the newest local snapshot
2022-03-14T06:11:50+01:00: TASK ERROR: sync failed with some errors.
 
Here is a typical error for a GC:

2022-03-14T00:00:00+01:00: starting garbage collection on store Filer1-WIN
2022-03-14T00:00:00+01:00: task triggered by schedule 'daily'
2022-03-14T00:00:00+01:00: Start GC phase1 (mark used chunks)
2022-03-14T00:00:00+01:00: TASK ERROR: unexpected error on datastore traversal: No such file or directory (os error 2) - "/mnt/datastore/Filer1-WIN/vm/1008/2022-02-26T20:00:02Z"
 
Here is a typical verify error:

2022-03-12T09:17:39+01:00: percentage done: 70.00% (35/50 groups)
2022-03-12T09:17:39+01:00: verify group Filer3-WIN:vm/1014 (18 snapshots)
2022-03-12T09:17:39+01:00: SKIPPED: verify Filer3-WIN:vm/1014/2022-03-09T08:00:02Z (recently verified)
2022-03-12T09:17:39+01:00: percentage done: 70.11% (35/50 groups, 1/18 snapshots in group #36)
2022-03-12T09:17:39+01:00: SKIPPED: verify Filer3-WIN:vm/1014/2022-03-02T08:00:02Z (recently verified)
2022-03-12T09:17:39+01:00: percentage done: 70.22% (35/50 groups, 2/18 snapshots in group #36)
2022-03-12T09:17:39+01:00: SKIPPED: verify Filer3-WIN:vm/1014/2022-02-23T08:00:01Z (recently verified)
2022-03-12T09:17:39+01:00: percentage done: 70.33% (35/50 groups, 3/18 snapshots in group #36)
2022-03-12T09:17:39+01:00: SKIPPED: verify Filer3-WIN:vm/1014/2022-02-16T08:00:02Z (recently verified)
2022-03-12T09:17:39+01:00: percentage done: 70.44% (35/50 groups, 4/18 snapshots in group #36)
2022-03-12T09:17:39+01:00: verify Filer3-WIN:vm/1014/2022-02-09T08:00:02Z
2022-03-12T09:17:39+01:00: check qemu-server.conf.blob
2022-03-12T09:17:39+01:00: check drive-ide0.img.fidx
2022-03-12T09:28:40+01:00: verified 24327.49/47080.00 MiB in 661.32 seconds, speed 36.79/71.19 MiB/s (0 errors)
2022-03-12T09:28:40+01:00: percentage done: 70.56% (35/50 groups, 5/18 snapshots in group #36)
2022-03-12T09:28:40+01:00: SKIPPED: verify Filer3-WIN:vm/1014/2022-02-02T08:00:02Z (recently verified)
2022-03-12T09:28:40+01:00: percentage done: 70.67% (35/50 groups, 6/18 snapshots in group #36)
2022-03-12T09:28:40+01:00: SKIPPED: verify Filer3-WIN:vm/1014/2022-01-26T08:00:02Z (recently verified)
2022-03-12T09:28:40+01:00: percentage done: 70.78% (35/50 groups, 7/18 snapshots in group #36)
2022-03-12T09:28:40+01:00: SKIPPED: verify Filer3-WIN:vm/1014/2022-01-19T08:00:02Z (recently verified)
2022-03-12T09:28:40+01:00: percentage done: 70.89% (35/50 groups, 8/18 snapshots in group #36)
2022-03-12T09:28:40+01:00: SKIPPED: verify Filer3-WIN:vm/1014/2022-01-12T08:00:02Z (recently verified)
2022-03-12T09:28:40+01:00: percentage done: 71.00% (35/50 groups, 9/18 snapshots in group #36)
2022-03-12T09:28:40+01:00: SKIPPED: verify Filer3-WIN:vm/1014/2022-01-05T08:00:02Z (recently verified)
2022-03-12T09:28:40+01:00: percentage done: 71.11% (35/50 groups, 10/18 snapshots in group #36)
2022-03-12T09:28:40+01:00: SKIPPED: verify Filer3-WIN:vm/1014/2021-12-29T08:00:02Z (recently verified)
2022-03-12T09:28:40+01:00: percentage done: 71.22% (35/50 groups, 11/18 snapshots in group #36)
2022-03-12T09:28:40+01:00: SKIPPED: verify Filer3-WIN:vm/1014/2021-12-22T08:00:02Z (recently verified)
2022-03-12T09:28:40+01:00: percentage done: 71.33% (35/50 groups, 12/18 snapshots in group #36)
2022-03-12T09:28:40+01:00: SKIPPED: verify Filer3-WIN:vm/1014/2021-12-15T08:00:02Z (recently verified)
2022-03-12T09:28:40+01:00: percentage done: 71.44% (35/50 groups, 13/18 snapshots in group #36)
2022-03-12T09:28:40+01:00: verify Filer3-WIN:vm/1014/2021-12-08T08:00:02Z
2022-03-12T09:28:40+01:00: check qemu-server.conf.blob
2022-03-12T09:28:40+01:00: check drive-ide0.img.fidx
2022-03-12T09:33:16+01:00: verified 10309.23/21232.00 MiB in 275.44 seconds, speed 37.43/77.08 MiB/s (0 errors)
2022-03-12T09:33:16+01:00: percentage done: 71.56% (35/50 groups, 14/18 snapshots in group #36)
2022-03-12T09:33:16+01:00: SKIPPED: verify Filer3-WIN:vm/1014/2021-12-01T08:00:02Z (recently verified)
2022-03-12T09:33:16+01:00: percentage done: 71.67% (35/50 groups, 15/18 snapshots in group #36)
2022-03-12T09:33:16+01:00: SKIPPED: verify Filer3-WIN:vm/1014/2021-11-24T08:00:02Z (recently verified)
2022-03-12T09:33:16+01:00: percentage done: 71.78% (35/50 groups, 16/18 snapshots in group #36)
2022-03-12T09:33:16+01:00: SKIPPED: verify Filer3-WIN:vm/1014/2021-11-10T08:00:01Z (recently verified)
2022-03-12T09:33:16+01:00: percentage done: 71.89% (35/50 groups, 17/18 snapshots in group #36)
2022-03-12T09:33:16+01:00: SKIPPED: verify Filer3-WIN:vm/1014/2021-11-05T14:49:16Z (recently verified)
2022-03-12T09:33:16+01:00: percentage done: 72.00% (36/50 groups)
2022-03-12T09:33:16+01:00: verify group Filer3-WIN:vm/1015 (22 snapshots)
2022-03-12T09:33:16+01:00: verify Filer3-WIN:vm/1015/2022-03-12T06:09:20Z
2022-03-12T09:33:16+01:00: check qemu-server.conf.blob
2022-03-12T09:33:16+01:00: check drive-ide1.img.fidx
2022-03-12T10:00:10+01:00: verified 69701.73/69804.00 MiB in 1613.50 seconds, speed 43.20/43.26 MiB/s (0 errors)
2022-03-12T10:00:10+01:00: check drive-ide0.img.fidx
2022-03-12T10:08:09+01:00: verified 10525.49/22032.00 MiB in 479.05 seconds, speed 21.97/45.99 MiB/s (0 errors)
2022-03-12T10:08:09+01:00: percentage done: 72.09% (36/50 groups, 1/22 snapshots in group #37)
2022-03-12T10:08:09+01:00: SKIPPED: verify Filer3-WIN:vm/1015/2022-03-11T06:07:12Z (recently verified)
2022-03-12T10:08:09+01:00: percentage done: 72.18% (36/50 groups, 2/22 snapshots in group #37)
2022-03-12T10:08:09+01:00: SKIPPED: verify Filer3-WIN:vm/1015/2022-03-10T06:09:00Z (recently verified)
2022-03-12T10:08:09+01:00: percentage done: 72.27% (36/50 groups, 3/22 snapshots in group #37)
2022-03-12T10:08:09+01:00: SKIPPED: verify Filer3-WIN:vm/1015/2022-03-09T06:17:33Z (recently verified)
2022-03-12T10:08:09+01:00: percentage done: 72.36% (36/50 groups, 4/22 snapshots in group #37)
2022-03-12T10:08:09+01:00: SKIPPED: verify Filer3-WIN:vm/1015/2022-03-08T06:06:24Z (recently verified)
2022-03-12T10:08:09+01:00: percentage done: 72.45% (36/50 groups, 5/22 snapshots in group #37)
2022-03-12T10:08:09+01:00: SKIPPED: verify Filer3-WIN:vm/1015/2022-03-07T06:07:17Z (recently verified)
2022-03-12T10:08:09+01:00: percentage done: 72.55% (36/50 groups, 6/22 snapshots in group #37)
2022-03-12T10:08:09+01:00: SKIPPED: verify Filer3-WIN:vm/1015/2022-03-06T06:06:58Z (recently verified)
2022-03-12T10:08:09+01:00: percentage done: 72.64% (36/50 groups, 7/22 snapshots in group #37)
2022-03-12T10:08:09+01:00: SKIPPED: verify Filer3-WIN:vm/1015/2022-03-05T06:07:54Z (recently verified)
2022-03-12T10:08:09+01:00: percentage done: 72.73% (36/50 groups, 8/22 snapshots in group #37)
2022-03-12T10:08:09+01:00: SKIPPED: verify Filer3-WIN:vm/1015/2022-03-04T06:08:04Z (recently verified)
2022-03-12T10:08:09+01:00: percentage done: 72.82% (36/50 groups, 9/22 snapshots in group #37)
2022-03-12T10:08:09+01:00: SKIPPED: verify Filer3-WIN:vm/1015/2022-03-03T06:07:32Z (recently verified)
2022-03-12T10:08:09+01:00: percentage done: 72.91% (36/50 groups, 10/22 snapshots in group #37)
2022-03-12T10:08:09+01:00: SKIPPED: verify Filer3-WIN:vm/1015/2022-03-02T06:07:07Z (recently verified)
2022-03-12T10:08:09+01:00: percentage done: 73.00% (36/50 groups, 11/22 snapshots in group #37)
2022-03-12T10:08:09+01:00: SKIPPED: verify Filer3-WIN:vm/1015/2022-03-01T06:08:23Z (recently verified)
2022-03-12T10:08:09+01:00: percentage done: 73.09% (36/50 groups, 12/22 snapshots in group #37)
2022-03-12T10:08:09+01:00: SKIPPED: verify Filer3-WIN:vm/1015/2022-02-28T06:08:18Z (recently verified)
2022-03-12T10:08:09+01:00: percentage done: 73.18% (36/50 groups, 13/22 snapshots in group #37)
2022-03-12T10:08:09+01:00: SKIPPED: verify Filer3-WIN:vm/1015/2022-02-27T06:08:05Z (recently verified)
2022-03-12T10:08:09+01:00: percentage done: 73.27% (36/50 groups, 14/22 snapshots in group #37)
2022-03-12T10:08:09+01:00: SKIPPED: verify Filer3-WIN:vm/1015/2022-02-26T06:07:22Z (recently verified)
2022-03-12T10:08:09+01:00: percentage done: 73.36% (36/50 groups, 15/22 snapshots in group #37)
2022-03-12T10:08:09+01:00: SKIPPED: verify Filer3-WIN:vm/1015/2022-02-25T06:07:19Z (recently verified)
2022-03-12T10:08:09+01:00: percentage done: 73.45% (36/50 groups, 16/22 snapshots in group #37)
2022-03-12T10:08:09+01:00: SKIPPED: verify Filer3-WIN:vm/1015/2022-02-19T06:08:34Z (recently verified)
2022-03-12T10:08:09+01:00: percentage done: 73.55% (36/50 groups, 17/22 snapshots in group #37)
2022-03-12T10:08:09+01:00: SKIPPED: verify Filer3-WIN:vm/1015/2022-02-13T06:08:54Z (recently verified)
2022-03-12T10:08:09+01:00: percentage done: 73.64% (36/50 groups, 18/22 snapshots in group #37)
2022-03-12T10:08:09+01:00: SKIPPED: verify Filer3-WIN:vm/1015/2022-02-06T06:10:10Z (recently verified)
2022-03-12T10:08:09+01:00: percentage done: 73.73% (36/50 groups, 19/22 snapshots in group #37)
2022-03-12T10:08:09+01:00: SKIPPED: verify Filer3-WIN:vm/1015/2022-01-30T06:02:14Z (recently verified)
2022-03-12T10:08:09+01:00: percentage done: 73.82% (36/50 groups, 20/22 snapshots in group #37)
2022-03-12T10:08:09+01:00: SKIPPED: verify Filer3-WIN:vm/1015/2021-12-26T06:01:56Z (recently verified)
2022-03-12T10:08:09+01:00: percentage done: 73.91% (36/50 groups, 21/22 snapshots in group #37)
2022-03-12T10:08:09+01:00: SKIPPED: verify Filer3-WIN:vm/1015/2021-11-28T06:01:23Z (recently verified)
2022-03-12T10:08:09+01:00: percentage done: 74.00% (37/50 groups)
2022-03-12T10:08:09+01:00: verify group Filer3-WIN:vm/1016 (21 snapshots)
2022-03-12T10:08:09+01:00: verify Filer3-WIN:vm/1016/2022-03-11T20:46:06Z
2022-03-12T10:08:09+01:00: check qemu-server.conf.blob
2022-03-12T10:08:09+01:00: check drive-ide1.img.fidx
2022-03-12T10:25:51+01:00: verified 34329.90/101164.00 MiB in 1062.05 seconds, speed 32.32/95.25 MiB/s (0 errors)
2022-03-12T10:25:51+01:00: check drive-ide0.img.fidx
2022-03-12T10:51:19+01:00: verified 49531.96/88820.00 MiB in 1528.11 seconds, speed 32.41/58.12 MiB/s (0 errors)
2022-03-12T10:51:19+01:00: TASK ERROR: verification failed - job aborted
 
for sync, the following looks suspicious:

Code:
2022-03-14T06:02:46+01:00: sync group vm/1015 failed - unable to acquire lock on snapshot directory "/mnt/datastore/DS_HDD/vm/1015/2022-03-13T06:15:49Z" - internal error - tried creating snapshot that's already in use

was there some other operation on that group at the time? (overlapping sync, prune, backup tasks?)

the verification job was aborted, and the GC seems to run into an issue with your NAS (the system logs around that time might give you more information)
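
you can also cross-check which tasks were running around that time, e.g. along these lines (the UPID is a placeholder):

Code:
proxmox-backup-manager task list
proxmox-backup-manager task log <UPID>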
 
Yeah, it looks like there is some overlapping, but it doesn't seem to be the cause of the failed sync, because the process continues after that.
I'm going to look into it in the meantime and maybe space the sync task and the GC task further apart.
What is really weird is that it's not failing every time. Sometimes it does, sometimes not.
 
well, the sync skips that group, syncs the rest, but still marks the job as failed since it had issues with that specific group at that specific point in time.
 
OK, if that's how it works, I understand better.
I have changed the times of these actions (sync, GC, ...) a bit; I'll see in the next few days if it did the trick.
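
For reference, this is roughly the kind of change I made, staggering the schedules so the sync and GC no longer coincide (the calendar events below are only examples, not my exact ones):

Code:
# run syncs at 00:30, 06:30, 12:30, 18:30 and GC at 02:00
proxmox-backup-manager sync-job update s-02f0435e-75fd --schedule '0/6:30'
proxmox-backup-manager datastore update Filer1-WIN --gc-schedule '02:00'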

Thank you,
Regards
Thomas
 
It looks like moving the syncs a bit later solved the problem for me.
Thank you for your help
 