Hey guys,
I’m running a PBS sync at one of our offsite locations that pulls datastore contents from our main site.
Recently, these sync jobs have started failing due to timeouts:
Code:
2025-09-08T04:12:40+03:00: sync snapshot vm/126/2025-09-07T21:17:02Z
2025-09-08T04:12:40+03:00: sync archive qemu-server.conf.blob
2025-09-08T04:12:40+03:00: sync archive drive-virtio0.img.fidx
2025-09-08T04:38:15+03:00: removing backup snapshot "/mnt/datastore/prx-cluster-backup-lab/ns/bmd-prod-prx-bck-ns/vm/126/2025-09-07T21:17:02Z"
2025-09-08T04:38:15+03:00: percentage done: 100.00% (26/26 groups)
2025-09-08T04:38:15+03:00: sync group vm/126 failed - timed out
2025-09-08T04:38:15+03:00: Finished syncing namespace bmd-prod-prx-ns, current progress: 25 groups, 2 snapshots
2025-09-08T04:38:15+03:00: TASK ERROR: sync failed with some errors.
On the local site I see this:
Code:
2025-09-06T04:05:21+03:00: download chunk "/mnt/datastore/proxmox_vm_backups/.chunks/f299/f299c82ed672bce5854ba50acbb137145141d4d04db8c1de09cf7913c49a9ca4"
2025-09-06T04:05:21+03:00: GET /chunk
2025-09-06T04:05:21+03:00: download chunk "/mnt/datastore/proxmox_vm_backups/.chunks/bc2b/bc2b8b8ebf0765fa7437acd7a0a3f77078730b3ab80099cdffe0615e65da2844"
2025-09-06T04:05:21+03:00: GET /chunk
2025-09-06T04:05:21+03:00: download chunk "/mnt/datastore/proxmox_vm_backups/.chunks/f0c2/f0c243f603abe11be9e6e5689d47b16a1b63791c8dba6101a2490f594a5e410c"
2025-09-06T04:21:12+03:00: TASK ERROR: connection error: timed out: timed out
We have an IPsec site-to-site tunnel established between those locations.
The tunnel itself is functioning correctly: I ran a sync job during both the re-authentication and rekeying phases, and it completed without issues, so the timeouts must be caused by something else. Has anybody else had a similar issue?
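To put a number on the stall, the gap between the last logged activity and the timeout can be read off the timestamps in the two logs above. Here is a minimal Python sketch of that arithmetic. The `largest_gap` helper is my own hypothetical function, not part of PBS, and it assumes every log line starts with an ISO 8601 timestamp followed by `: `. The sample lines are copied from the logs in this post.

```python
from datetime import datetime


def largest_gap(log_lines):
    """Return (gap_seconds, message_before, message_after) for the longest
    pause between two consecutive timestamped lines, or None if fewer than
    two lines could be parsed."""
    entries = []
    for line in log_lines:
        # The first ": " separates the timestamp from the message.
        stamp, sep, message = line.partition(": ")
        if not sep:
            continue
        try:
            entries.append((datetime.fromisoformat(stamp), message))
        except ValueError:
            continue  # not a timestamped line, skip it

    best = None
    for (t0, m0), (t1, m1) in zip(entries, entries[1:]):
        gap = (t1 - t0).total_seconds()
        if best is None or gap > best[0]:
            best = (gap, m0, m1)
    return best


# Last lines of the sync task log on the pulling (offsite) side.
SYNC_LOG = [
    "2025-09-08T04:12:40+03:00: sync archive qemu-server.conf.blob",
    "2025-09-08T04:12:40+03:00: sync archive drive-virtio0.img.fidx",
    "2025-09-08T04:38:15+03:00: sync group vm/126 failed - timed out",
]

# Last lines of the log from the other side.
READER_LOG = [
    "2025-09-06T04:05:21+03:00: GET /chunk",
    '2025-09-06T04:05:21+03:00: download chunk "/mnt/datastore/proxmox_vm_backups/.chunks/f0c2/f0c243f603abe11be9e6e5689d47b16a1b63791c8dba6101a2490f594a5e410c"',
    "2025-09-06T04:21:12+03:00: TASK ERROR: connection error: timed out: timed out",
]

if __name__ == "__main__":
    for name, log in (("sync", SYNC_LOG), ("reader", READER_LOG)):
        gap, before, after = largest_gap(log)
        print(f"{name}: {gap / 60:.1f} min of silence before: {after}")
```

For these excerpts the silent period comes out to 1535 seconds (about 25.6 minutes) in the sync log and 951 seconds (about 15.9 minutes) in the other log. The two excerpts are from different nights, so the numbers are not directly comparable.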