Sync job fails every time

filippoclikkami

Hi, I have a local PBS that should be synced to a remote one every night, but the sync fails constantly. PBS is now running in a VM on my PVE; I tried reinstalling from scratch, but with the same result.

Both the local and the remote PBS run 3.4.1. I started getting the error with the very first sync, so I thought the job was too heavy and tried with only one VM group, which has only one snapshot, but every time I get this error:

...
2025-05-22T01:00:03+02:00: Found 1 groups to sync (out of 9 total)
2025-05-22T01:01:58+02:00: Percentage done: 16.67% (1/6 snapshots)
2025-05-22T01:01:58+02:00: Encountered errors: error:0A0003FC:SSL routines:ssl3_read_bytes:sslv3 alert bad record mac:../ssl/record/rec_layer_s3.c:1605:SSL alert number 20
2025-05-22T01:01:58+02:00: Failed to push group vm/106 to remote!
2025-05-22T01:01:58+02:00: Finished syncing root namespace, current progress: 0 groups, 1 snapshots
2025-05-22T01:01:58+02:00: TASK ERROR: Sync failed with some errors!

I searched for SSL certificate errors, but I don't think that's the problem. I also have other PBS instances that sync regularly without an SSL certificate. What could be causing it?

Thanks in advance.
 
Hi,
Did you specify the correct fingerprint when setting up the remote? Note that for self-signed certificates this must be set, otherwise no TLS session can be established.
 
Yes, I specified the fingerprint of the datastore used for syncing.
 
2025-05-22T01:00:03+02:00: Found 1 groups to sync (out of 9 total)
2025-05-22T01:01:58+02:00: Percentage done: 16.67% (1/6 snapshots)
Ah, yes. Sorry, I missed that the sync starts out just fine, as seen from the logs above. So this is more likely a networking issue. How are you connecting the local PBS instance to the remote? Is the traffic going through some VPN, and if so, which VPN solution are you using?

Edit: Also have a look at https://forum.proxmox.com/threads/pbs-sync-failed-each-time.113921/post-573939
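
A quick way to confirm that the session ran for a while before breaking is to compare the timestamps in the task log. Below is a minimal Python sketch, assuming the log lines from the first post are pasted in as text. The `LOG` constant and the helper names are made up for illustration and are not part of PBS.

```python
from datetime import datetime

# Excerpt of the task log from the first post (pasted as plain text).
LOG = """\
2025-05-22T01:00:03+02:00: Found 1 groups to sync (out of 9 total)
2025-05-22T01:01:58+02:00: Percentage done: 16.67% (1/6 snapshots)
2025-05-22T01:01:58+02:00: Encountered errors: error:0A0003FC:SSL routines:ssl3_read_bytes:sslv3 alert bad record mac:../ssl/record/rec_layer_s3.c:1605:SSL alert number 20
2025-05-22T01:01:58+02:00: TASK ERROR: Sync failed with some errors!
"""


def parse_line(line):
    """Split 'TIMESTAMP: message' into (datetime, message)."""
    # The timestamp contains ':' but never ': ', so the first ': ' ends it.
    stamp, _, message = line.partition(": ")
    return datetime.fromisoformat(stamp), message


def seconds_until_first_error(log_text):
    """Seconds between the first log line and the first line mentioning an error."""
    entries = [parse_line(l) for l in log_text.splitlines() if l.strip()]
    start = entries[0][0]
    for when, message in entries:
        if "error" in message.lower():
            return (when - start).total_seconds()
    return None  # no error line found


print(seconds_until_first_error(LOG))
```

For the excerpt above this reports 115 seconds between the start of the sync and the first error, which fits the observation that the connection was established and data was flowing before the "bad record mac" alert (TLS alert number 20) arrived.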
 
There is no VPN involved. The local PBS is behind a firewall; the remote one is publicly reachable at the moment. The NIC of the local PBS VM is VirtIO, whereas according to the linked post the problem there was with e1000e. The same PVE node that runs the PBS VM also runs other VMs, and they work fine.
 
It is my experience that the sync process is very brittle. It fails without (apparently) retrying, leaving you with the only option of having to re-start the sync job in its entirety and hope it works the second time. This effectively means you have to babysit the thing, you can't trust the scheduled sync job to actually finish properly on its own.

I see this quite a bit using VPN (tailscale) to transfer backups, which probably isn't a "supported config" anyway, but I've never had a problem using rsync to transfer pve backups over the same link before...
 
I haven't found that the sync process is brittle. We sync GBytes of backup data with PBS every day via an IPSEC VPN (between locations) and an external PBS provider with no issues. Of course, it will stop with an error when there are network issues, but that almost never happens for us.

You said you were running PBS in a VM -- what is the underlying hardware? PBS needs fast I/O, so it should be running on (enterprise) SSDs ideally, or with a special ZFS device for metadata if you are running on HDs.
 
I think the hardware isn't the issue: the host has all SSDs on a hardware RAID controller, with no errors on the controller or the disks.
I don't use a VPN in this case, and I also retried by running the sync job manually, but the result is the same every time. Repeating the process from scratch doesn't help either: new VM, only one snapshot of only one VM, then start the sync job... no luck. I also tried syncing between two PBS VM instances on the same PVE node, with the same error. Just for clarity, other PBSs sync without issue and everything is set up similarly to this one, including the same firewall, which by default doesn't block this traffic. I'm quite sure it's some minor thing that is causing all this :rolleyes:

EDIT:
If it helps, the datastore is on a QNAP NAS and backup jobs work fine (as on the other PBSs where syncs work).
 
I also tried syncing between two PBS VM instances on the same PVE node, with the same error. Just for clarity, other PBSs sync without issue and everything is set up similarly to this one, including the same firewall, which by default doesn't block this traffic.
So if it is limited to this node and its VMs, you should double-check the network settings for this node in particular. Are the VMs attached to the same bridge? Does a direct backup to the local target VM work? Do you have some custom network settings in your /etc/network/interfaces for this node, or maybe a different routing setup?
 
Backups from the VMs to the local PBS work fine (same node); only the sync fails, both to the local and to the external target. The VMs are all on the same vmbr, and the network settings are as usual, with no customization. I'm posting the configuration in case I'm missing something; the network is a /16.

PVE:
auto lo
iface lo inet loopback

auto eno49
iface eno49 inet manual

iface eno1 inet manual

iface eno2 inet manual

iface eno3 inet manual

iface eno4 inet manual

iface eno50 inet manual

auto vmbr0
iface vmbr0 inet static
address 192.168.8.221/16
gateway 192.168.8.1
bridge-ports eno49
bridge-stp off
bridge-fd 0


VMPBS:
auto lo
iface lo inet loopback

auto enp6s18
iface enp6s18 inet static
address 192.168.1.247/16
gateway 192.168.1.3
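
As a sanity check on the two configurations above, here is a minimal Python sketch using the standard `ipaddress` module. It only checks that both addresses sit in the same /16 and that each gateway is on-link; the `describe` helper is hypothetical, and the result alone does not show whether the network is the cause.

```python
import ipaddress


def describe(name, cidr, gateway):
    """Summarise one interface stanza: its network and whether the gateway is on-link."""
    iface = ipaddress.ip_interface(cidr)
    gw = ipaddress.ip_address(gateway)
    return {
        "name": name,
        "network": iface.network,
        "gateway": gw,
        "gateway_on_link": gw in iface.network,
    }


# Values copied from the two configurations posted above.
pve = describe("PVE vmbr0", "192.168.8.221/16", "192.168.8.1")
pbs = describe("PBS VM enp6s18", "192.168.1.247/16", "192.168.1.3")

same_network = pve["network"] == pbs["network"]
same_gateway = pve["gateway"] == pbs["gateway"]

for entry in (pve, pbs):
    print(entry["name"], entry["network"], "gateway on-link:", entry["gateway_on_link"])
print("same network:", same_network, "| same gateway:", same_gateway)
```

Both addresses fall into 192.168.0.0/16 and both gateways are on-link, but the host and the VM use different default gateways (192.168.8.1 and 192.168.1.3), which may or may not be intentional.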
 
So just to double check: You tested backing up a VM from the same PVE node attached to the same bridge to the PBS instance which is also the target (remote) for your local sync test, and while the backup works, the sync job fails? Or did you only test backups to the PBS instance acting as source for the sync job?

I'm asking since a sync job in push direction uses the backup API of the target, so it should behave just like a backup job. If the backups work but the sync does not, the source PBS might be the issue.
 
List of all tests done so far:

Backup VMs to localpbs - works
SYNC localpbs to remotepbs - fails
Backup VMs to localpbs2 - works
SYNC localpbs2 to localpbs - fails
SYNC localpbs2 to remotepbs - fails
SYNC localpbs (different datastore) to remotepbs - fails
 
That is strange, can you check if the same issue also appears when you set up a pull sync job from localpbs2 to localpbs and also vice versa?
 
I tested a little more and... both push and pull sync now work between localpbs and localpbs2, which is very strange. I can't figure out why it wouldn't work before.
Also, although virtio-net devices should be able to pass packets without segmentation from guest to host, can you check if disabling segmentation offloading inside the PBS VM has an effect? There is a nice article explaining some more details: https://zenn.dev/yutarohayakawa/articles/eafcc91dc290ab#virtio-net
I'll try this one now.
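
For reference, the segmentation-offloading test suggested above could look roughly like this. It is a minimal Python sketch that only builds the `ethtool -K` command line. The interface name `enp6s18` is taken from the VM configuration posted earlier; the choice of features (`tso`, `gso`, `gro`) and the helper name are assumptions for illustration, not something confirmed in this thread.

```python
import shlex

# Offload features to toggle; this particular selection is an assumption.
OFFLOAD_FEATURES = ("tso", "gso", "gro")


def ethtool_offload_command(interface, enabled=False, features=OFFLOAD_FEATURES):
    """Build the ethtool argument list that switches the given offloads on or off."""
    state = "on" if enabled else "off"
    cmd = ["ethtool", "-K", interface]
    for feature in features:
        cmd += [feature, state]
    return cmd


# Print the command instead of running it, so it can be reviewed first.
print(shlex.join(ethtool_offload_command("enp6s18")))
```

This prints `ethtool -K enp6s18 tso off gso off gro off`, to be run as root inside the PBS VM. The setting does not survive a reboot, so it is easy to revert if it makes no difference; `ethtool -k enp6s18` shows the current state.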