Since in the 'national' forum no one gaves an answer...
I have a problem with a big sync job.
I sync all the data from one PBS to an other at an other location.
After many tries I have now from all VMs something on the sync PBS.
But it fails daily on 2 images: ct/129 and vm/100 with - broken pipe
I don't think that it is a network problem, because why should it always fail on these 2 images and never on others.
Here is the log from today: (search for failed)
Also it is anoying, that on the master PBS every job is listed as error cause of lost connection. But this is an other 'problem'.
The one above is a real problem, because I have no actual backup of these 2 images at the other location.
I have a problem with a big sync job.
I sync all the data from one PBS to an other at an other location.
After many tries I have now from all VMs something on the sync PBS.
But it fails daily on 2 images: ct/129 and vm/100 with - broken pipe
I don't think that it is a network problem, because why should it always fail on these 2 images and never on others.
Here is the log from today: (search for failed)
Code:
Proxmox
Backup Server 1.0-13
()
2021-04-13T04:00:00+02:00: Starting datastore sync job 'pm-pbs-01:ednt:ednt:s-f47b89b8-4867'
2021-04-13T04:00:00+02:00: task triggered by schedule '*-*-* 04:00:00'
2021-04-13T04:00:00+02:00: Sync datastore 'ednt' from 'pm-pbs-01/ednt'
2021-04-13T04:00:01+02:00: found 22 groups to sync
2021-04-13T04:00:01+02:00: re-sync snapshot "ct/129/2021-04-11T21:00:15Z"
2021-04-13T04:00:01+02:00: no data changes
2021-04-13T04:00:01+02:00: re-sync snapshot "ct/129/2021-04-11T21:00:15Z" done
2021-04-13T04:00:01+02:00: percentage done: 4.24% (0 of 22 groups, 14 of 15 group snapshots)
2021-04-13T04:00:01+02:00: sync snapshot "ct/129/2021-04-12T21:00:23Z"
2021-04-13T04:00:01+02:00: sync archive pct.conf.blob
2021-04-13T04:00:01+02:00: sync archive root.pxar.didx
2021-04-13T04:12:53+02:00: percentage done: 4.55% (0 of 22 groups, 15 of 15 group snapshots)
2021-04-13T04:12:53+02:00: sync group ct/129 failed - broken pipe
2021-04-13T04:12:53+02:00: re-sync snapshot "ct/130/2021-04-11T22:09:24Z"
2021-04-13T04:12:53+02:00: no data changes
2021-04-13T04:12:53+02:00: re-sync snapshot "ct/130/2021-04-11T22:09:24Z" done
2021-04-13T04:12:53+02:00: percentage done: 8.79% (1 of 22 groups, 14 of 15 group snapshots)
2021-04-13T04:12:53+02:00: sync snapshot "ct/130/2021-04-12T22:09:14Z"
2021-04-13T04:12:53+02:00: sync archive pct.conf.blob
2021-04-13T04:12:53+02:00: sync archive root.pxar.didx
2021-04-13T04:13:01+02:00: downloaded 235033279 bytes (29.31 MiB/s)
2021-04-13T04:13:01+02:00: sync archive catalog.pcat1.didx
2021-04-13T04:13:01+02:00: downloaded 771040 bytes (5.18 MiB/s)
2021-04-13T04:13:01+02:00: got backup log file "client.log.blob"
2021-04-13T04:13:01+02:00: sync snapshot "ct/130/2021-04-12T22:09:14Z" done
...
...
2021-04-13T04:22:18+02:00: sync archive catalog.pcat1.didx
2021-04-13T04:22:18+02:00: downloaded 942561 bytes (3.04 MiB/s)
2021-04-13T04:22:18+02:00: got backup log file "client.log.blob"
2021-04-13T04:22:18+02:00: sync snapshot "ct/177/2021-04-12T21:13:38Z" done
2021-04-13T04:22:18+02:00: percentage done: 45.45% (9 of 22 groups, 14 of 14 group snapshots)
2021-04-13T04:22:18+02:00: delete vanished snapshot "ct/177/2021-03-28T21:12:28Z"
2021-04-13T04:22:19+02:00: re-sync snapshot "ct/191/2021-04-12T00:12:12Z"
2021-04-13T04:22:19+02:00: no data changes
2021-04-13T04:22:19+02:00: re-sync snapshot "ct/191/2021-04-12T00:12:12Z" done
2021-04-13T04:22:19+02:00: percentage done: 49.70% (10 of 22 groups, 14 of 15 group snapshots)
2021-04-13T04:22:19+02:00: sync snapshot "ct/191/2021-04-13T00:14:31Z"
2021-04-13T04:22:19+02:00: sync archive pct.conf.blob
2021-04-13T04:22:19+02:00: sync archive root.pxar.didx
2021-04-13T04:22:40+02:00: downloaded 44116503 bytes (1.98 MiB/s)
2021-04-13T04:22:40+02:00: sync archive catalog.pcat1.didx
2021-04-13T04:22:40+02:00: downloaded 169802 bytes (2.00 MiB/s)
2021-04-13T04:22:40+02:00: got backup log file "client.log.blob"
2021-04-13T04:22:40+02:00: sync snapshot "ct/191/2021-04-13T00:14:31Z" done
2021-04-13T04:22:40+02:00: percentage done: 50.00% (10 of 22 groups, 15 of 15 group snapshots)
2021-04-13T04:22:40+02:00: delete vanished snapshot "ct/191/2021-03-28T01:00:51Z"
2021-04-13T04:22:40+02:00: re-sync snapshot "vm/100/2021-04-04T21:00:02Z"
2021-04-13T04:22:40+02:00: no data changes
2021-04-13T04:22:40+02:00: re-sync snapshot "vm/100/2021-04-04T21:00:02Z" done
2021-04-13T04:22:40+02:00: percentage done: 51.95% (11 of 22 groups, 6 of 14 group snapshots)
2021-04-13T04:22:40+02:00: sync snapshot "vm/100/2021-04-05T21:00:02Z"
2021-04-13T04:22:40+02:00: sync archive qemu-server.conf.blob
2021-04-13T04:22:40+02:00: sync archive drive-scsi4.img.fidx
2021-04-13T04:46:31+02:00: percentage done: 52.27% (11 of 22 groups, 7 of 14 group snapshots)
2021-04-13T04:46:31+02:00: sync group vm/100 failed - broken pipe
2021-04-13T04:46:31+02:00: re-sync snapshot "vm/108/2021-04-11T21:09:28Z"
2021-04-13T04:46:31+02:00: no data changes
2021-04-13T04:46:31+02:00: re-sync snapshot "vm/108/2021-04-11T21:09:28Z" done
2021-04-13T04:46:31+02:00: percentage done: 58.77% (12 of 22 groups, 13 of 14 group snapshots)
2021-04-13T04:46:31+02:00: sync snapshot "vm/108/2021-04-12T21:10:52Z"
2021-04-13T04:46:31+02:00: sync archive qemu-server.conf.blob
2021-04-13T04:46:32+02:00: sync archive drive-sata0.img.fidx
2021-04-13T04:46:47+02:00: downloaded 22690831 bytes (1.43 MiB/s)
2021-04-13T04:46:47+02:00: got backup log file "client.log.blob"
2021-04-13T04:46:47+02:00: sync snapshot "vm/108/2021-04-12T21:10:52Z" done
2021-04-13T04:46:47+02:00: percentage done: 59.09% (12 of 22 groups, 14 of 14 group snapshots)
...
...
2021-04-13T06:42:15+02:00: sync snapshot "vm/170/2021-04-12T21:06:18Z" done
2021-04-13T06:42:15+02:00: percentage done: 95.45% (20 of 22 groups, 14 of 14 group snapshots)
2021-04-13T06:42:15+02:00: delete vanished snapshot "vm/170/2021-03-28T21:10:55Z"
2021-04-13T06:42:18+02:00: re-sync snapshot "vm/175/2021-04-11T21:13:49Z"
2021-04-13T06:42:18+02:00: no data changes
2021-04-13T06:42:18+02:00: re-sync snapshot "vm/175/2021-04-11T21:13:49Z" done
2021-04-13T06:42:18+02:00: percentage done: 99.68% (21 of 22 groups, 13 of 14 group snapshots)
2021-04-13T06:42:18+02:00: sync snapshot "vm/175/2021-04-12T21:19:19Z"
2021-04-13T06:42:18+02:00: sync archive qemu-server.conf.blob
2021-04-13T06:42:18+02:00: sync archive drive-scsi0.img.fidx
2021-04-13T06:58:43+02:00: downloaded 7418314244 bytes (7.19 MiB/s)
2021-04-13T06:58:43+02:00: got backup log file "client.log.blob"
2021-04-13T06:58:43+02:00: sync snapshot "vm/175/2021-04-12T21:19:19Z" done
2021-04-13T06:58:43+02:00: percentage done: 100.00% (21 of 22 groups, 14 of 14 group snapshots)
2021-04-13T06:58:43+02:00: delete vanished snapshot "vm/175/2021-03-28T21:18:26Z"
2021-04-13T06:58:43+02:00: TASK ERROR: sync failed with some errors.
Also it is anoying, that on the master PBS every job is listed as error cause of lost connection. But this is an other 'problem'.
The one above is a real problem, because I have no actual backup of these 2 images at the other location.