Sync job fails every time

filippoclikkami · May 22, 2025

Hi, I have a local PBS that should sync to a remote one every night, but the job fails constantly. PBS is now running in a VM on my PVE; I tried reinstalling from scratch, but got the same result.

Both the local and the remote PBS run 3.4.1. The errors started with the very first sync, so I thought the job was too heavy and tried with only one VM group, which has only one snapshot, but every time I get this error:

...
2025-05-22T01:00:03+02:00: Found 1 groups to sync (out of 9 total)
2025-05-22T01:01:58+02:00: Percentage done: 16.67% (1/6 snapshots)
2025-05-22T01:01:58+02:00: Encountered errors: error:0A0003FC:SSL routines:ssl3_read_bytes:sslv3 alert bad record mac:../ssl/record/rec_layer_s3.c:1605:SSL alert number 20
2025-05-22T01:01:58+02:00: Failed to push group vm/106 to remote!
2025-05-22T01:01:58+02:00: Finished syncing root namespace, current progress: 0 groups, 1 snapshots
2025-05-22T01:01:58+02:00: TASK ERROR: Sync failed with some errors!
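
For anyone comparing several failed runs, here is a minimal sketch that pulls the TLS alert number and the time until the first error out of a task log like the one above. The helper names (`parse_log`, `tls_alerts`, `seconds_until_first_error`) are made up for illustration and are not part of PBS; the log text is copied from this post. Alert 20 is `bad_record_mac` in the TLS specification, meaning a record failed its integrity check after the session was already established.

```python
import re
from datetime import datetime

# Log lines copied verbatim from the failed sync task in this post.
LOG = """\
2025-05-22T01:00:03+02:00: Found 1 groups to sync (out of 9 total)
2025-05-22T01:01:58+02:00: Percentage done: 16.67% (1/6 snapshots)
2025-05-22T01:01:58+02:00: Encountered errors: error:0A0003FC:SSL routines:ssl3_read_bytes:sslv3 alert bad record mac:../ssl/record/rec_layer_s3.c:1605:SSL alert number 20
2025-05-22T01:01:58+02:00: Failed to push group vm/106 to remote!
2025-05-22T01:01:58+02:00: Finished syncing root namespace, current progress: 0 groups, 1 snapshots
2025-05-22T01:01:58+02:00: TASK ERROR: Sync failed with some errors!
"""

# An ISO 8601 timestamp with UTC offset, then ": ", then the message.
ENTRY_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}[+-]\d{2}:\d{2}): (.*)$"
)
ALERT_RE = re.compile(r"SSL alert number (\d+)")


def parse_log(text):
    """Return a list of (timestamp, message) tuples for matching lines."""
    entries = []
    for line in text.splitlines():
        match = ENTRY_RE.match(line)
        if match:
            entries.append((datetime.fromisoformat(match.group(1)), match.group(2)))
    return entries


def tls_alerts(entries):
    """Collect every TLS alert number mentioned in the log messages."""
    alerts = []
    for _, message in entries:
        match = ALERT_RE.search(message)
        if match:
            alerts.append(int(match.group(1)))
    return alerts


def seconds_until_first_error(entries):
    """Seconds between the first log entry and the first reported error."""
    start = entries[0][0]
    for timestamp, message in entries:
        if "Encountered errors" in message:
            return (timestamp - start).total_seconds()
    return None


entries = parse_log(LOG)
print("TLS alert numbers:", tls_alerts(entries))
print("Seconds until first error:", seconds_until_first_error(entries))
```

On this log it reports alert 20 after 115 seconds, so the connection was up and transferring data for almost two minutes before the record failed verification.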

I searched for errors on SSLCERT, but I don't think that's the problem. I also have other PBS instances that sync regularly without an SSL certificate. What could be causing it?

Thanks in advance.

Chris · May 22, 2025

Hi,
did you specify the correct fingerprint when setting up the remote? Note that for self-signed certificates this must be set, otherwise no TLS session can be established.

filippoclikkami · May 22, 2025

Chris said:
Hi,
did you specify the correct fingerprint when setting up the remote? Note that for self-signed certificates this must be set, otherwise no TLS session can be established.

Yes, I specified the fingerprint of the datastore used for syncing.

Chris · Friday at 08:39

filippoclikkami said:
2025-05-22T01:00:03+02:00: Found 1 groups to sync (out of 9 total)
2025-05-22T01:01:58+02:00: Percentage done: 16.67% (1/6 snapshots)

Ah, yes. Sorry, I missed that the sync starts out just fine, as seen from the logs above. So this is more likely a networking issue. How are you connecting the local PBS instance to the remote? Is the traffic going through some VPN, and if so, which VPN solution are you using?

Edit: Also have a look at https://forum.proxmox.com/threads/pbs-sync-failed-each-time.113921/post-573939

filippoclikkami · Friday at 12:10

There is no VPN involved. The local PBS is behind a firewall; the remote one is publicly reachable at the moment. The NIC of the local PBS VM is VirtIO; according to the linked post, the problem there was with e1000e. The same PVE that runs the PBS VM also runs other VMs, and they work fine.

Wibla · Saturday at 09:35

It is my experience that the sync process is very brittle. It fails without (apparently) retrying, leaving you with the only option of having to re-start the sync job in its entirety and hope it works the second time. This effectively means you have to babysit the thing, you can't trust the scheduled sync job to actually finish properly on its own.

I see this quite a bit using VPN (tailscale) to transfer backups, which probably isn't a "supported config" anyway, but I've never had a problem using rsync to transfer pve backups over the same link before...

alietz · Saturday at 11:59

I haven't found that the sync process is brittle. We sync GBytes of backup data with PBS every day via an IPSEC VPN (between locations) and an external PBS provider with no issues. Of course, it will stop with an error when there are network issues, but that almost never happens for us.

You said you were running PBS in a VM -- what is the underlying hardware? PBS needs fast I/O, so it should be running on (enterprise) SSDs ideally, or with a special ZFS device for metadata if you are running on HDs.

filippoclikkami · Sunday at 13:38

alietz said:
I haven't found that the sync process is brittle. We sync GBytes of backup data with PBS every day via an IPSEC VPN (between locations) and an external PBS provider with no issues. Of course, it will stop with an error when there are network issues, but that almost never happens for us.

You said you were running PBS in a VM -- what is the underlying hardware? PBS needs fast I/O, so it should be running on (enterprise) SSDs ideally, or with a special ZFS device for metadata if you are running on HDs.

I think the hardware isn't the issue: the host has all SSDs on a hardware RAID controller, with no errors on the controller or the disks.

Wibla said:
It is my experience that the sync process is very brittle. It fails without (apparently) retrying, leaving you with the only option of having to re-start the sync job in its entirety and hope it works the second time. This effectively means you have to babysit the thing, you can't trust the scheduled sync job to actually finish properly on its own.

I see this quite a bit using VPN (tailscale) to transfer backups, which probably isn't a "supported config" anyway, but I've never had a problem using rsync to transfer pve backups over the same link before...

I don't use a VPN in this case, and I also retried by running the sync job manually, but the result is the same every time. Repeating the process from scratch doesn't help either: new VM, only one snapshot of only one VM, then start the sync job... no luck. I also tried syncing between two PBS VM instances on the same PVE node, same error. Just for clarity, other PBS instances sync without issue and everything is similar to this one, including the same firewall, which by default doesn't block this traffic. I'm quite sure some minor thing is causing all of this.

EDIT:
If it helps, the datastore is on a QNAP NAS and backup jobs work fine (as on the other PBS instances where syncs work).

Chris · Monday at 09:38

filippoclikkami said:
I also tried syncing between two PBS VM instances on the same PVE node, same error. Just for clarity, other PBS instances sync without issue and everything is similar to this one, including the same firewall, which by default doesn't block this traffic.

So if it is limited to this node/VMs, you should double check your network settings for this one in particular. Are the VMs attached to the same bridge? Does a direct backup to the local target VM work? Do you have some custom network settings in your /etc/network/interfaces for this node, or maybe a different routing setup?

filippoclikkami · Monday at 10:50

Backups from VMs to the local PBS work fine (same node); only the sync fails, whether the target is local or external. The VMs are all on the same vmbr, and the network settings are as usual, with no customization. I'm posting the configuration in case I'm missing something; the network is a /16.

PVE:
auto lo
iface lo inet loopback

auto eno49
iface eno49 inet manual

iface eno1 inet manual

iface eno2 inet manual

iface eno3 inet manual

iface eno4 inet manual

iface eno50 inet manual

auto vmbr0
iface vmbr0 inet static
    address 192.168.8.221/16
    gateway 192.168.8.1
    bridge-ports eno49
    bridge-stp off
    bridge-fd 0

VMPBS:
auto lo
iface lo inet loopback

auto enp6s18
iface enp6s18 inet static
    address 192.168.1.247/16
    gateway 192.168.1.3
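
As a quick sanity check of the two configurations above, here is a small sketch using Python's standard `ipaddress` module. It only verifies that both addresses fall into the same /16 and that each gateway is on-link; the labels in `CONFIG` and the `check` helper are made up for illustration, and the addresses are copied from this post. It does not say anything about whether the network is the cause of the sync failures.

```python
import ipaddress

# (address/prefix, gateway) pairs copied from the configuration above.
CONFIG = {
    "PVE (vmbr0)": ("192.168.8.221/16", "192.168.8.1"),
    "VMPBS (enp6s18)": ("192.168.1.247/16", "192.168.1.3"),
}


def check(config):
    """Return the network and gateway reachability for each host."""
    report = {}
    for name, (address, gateway) in config.items():
        iface = ipaddress.ip_interface(address)
        report[name] = {
            "network": str(iface.network),
            "gateway": gateway,
            # True if the gateway lies inside the host's own subnet.
            "gateway_on_link": ipaddress.ip_address(gateway) in iface.network,
        }
    return report


report = check(CONFIG)
same_subnet = len({entry["network"] for entry in report.values()}) == 1
gateways = {entry["gateway"] for entry in report.values()}

for name, entry in report.items():
    print(name, entry)
print("Same subnet:", same_subnet)
print("Distinct default gateways:", sorted(gateways))
```

Both hosts end up in 192.168.0.0/16 with on-link gateways, so the addressing itself is consistent. The only difference it surfaces is that the PVE host and the PBS VM use different default gateways (192.168.8.1 and 192.168.1.3).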

Chris · Monday at 16:31

So just to double check: You tested backing up a VM from the same PVE node attached to the same bridge to the PBS instance which is also the target (remote) for your local sync test, and while the backup works, the sync job fails? Or did you only test backups to the PBS instance acting as source for the sync job?

I'm asking since the sync job in push direction uses the backup API of the target, so it should behave just like a backup job. And if the backups work but the sync does not, the source PBS might be the issue.

filippoclikkami · Monday at 18:54

List of all tests run so far:

BK VMs to localpbs - works
SYNC localpbs to remotepbs - failed
BK VMs to localpbs2 - works
SYNC localpbs2 to localpbs - failed
SYNC localpbs2 to remotepbs - failed
SYNC localpbs(different datastore) to remotepbs - failed

Chris · Tuesday at 09:32

That is strange, can you check if the same issue also appears when you set up a pull sync job from localpbs2 to localpbs and also vice versa?

Chris · Tuesday at 17:33

Also, although virtio-net devices should be able to pass packets without segmentation from guest to host, can you check if disabling segmentation offloading inside the PBS VM has an effect? There is a nice article explaining some more details: https://zenn.dev/yutarohayakawa/articles/eafcc91dc290ab#virtio-net
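
A minimal sketch of how such a test could be scripted, assuming `ethtool` is installed inside the PBS VM and the interface is `enp6s18` as in the configuration posted earlier. The choice of features (`tso` and `gso`) is my assumption of what "segmentation offloading" covers here, and the helper names are hypothetical. The script only prints the command by default; actually applying it needs root, and the setting does not persist across reboots.

```python
import shlex
import subprocess


def offload_off_command(iface, features=("tso", "gso")):
    """Build an 'ethtool -K' command that switches the given features off."""
    cmd = ["ethtool", "-K", iface]
    for feature in features:
        cmd += [feature, "off"]
    return cmd


def apply(cmd, dry_run=True):
    """Print the command, or run it when dry_run is False (requires root)."""
    if dry_run:
        print(shlex.join(cmd))
        return None
    return subprocess.run(cmd, check=True)


# Interface name taken from the VM's /etc/network/interfaces in this thread.
cmd = offload_off_command("enp6s18")
apply(cmd)  # prints: ethtool -K enp6s18 tso off gso off
```

The current state can be inspected with `ethtool -k enp6s18` before and after, and the features can be switched back with `on` instead of `off` once the sync job has been retested.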

filippoclikkami · Tuesday at 18:50

Chris said:
That is strange, can you check if the same issue also appears when you set up a pull sync job from localpbs2 to localpbs and also vice versa?

I tested a little more and... push and pull sync both work between localpbs and localpbs2, which is very strange. I can't figure out why it wouldn't work before.

Chris said:
Also, although virtio-net devices should be able to pass packets without segmentation from guest to host, can you check if disabling segmentation offloading inside the PBS VM has an effect? There is a nice article explaining some more details: https://zenn.dev/yutarohayakawa/articles/eafcc91dc290ab#virtio-net

I'll try this one now.
