How to push sync only a subset of backups?

soupdiver

Active Member
Feb 24, 2021
I have a local PBS server to which I back up from several local Proxmox hosts. I keep backups on a schedule like 12x hourly, 7x daily, 4x weekly, and so on.

I want to push-sync this PBS to an offsite one, but I don't want to sync all of the local backups. Just a subset like 1x hourly, 1x daily, and so on,
or just the last 1-2 of each hourly/daily/weekly.

Is there a way to configure my sync job like this?
 
Hey, that's a great question, but the answer is broad and goes in various directions.
You need to look at the Sync Job, and might separately consider Namespaces.

--------------------------------------------------------------------
Sync Job Advanced Setting

Tick the Advanced checkbox at the bottom of your Sync Job dialog.
You'll get a setting called "Transfer Last".
Change it from All to however many snapshots per group you want to sync.


1735842408019.png
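The same setting can be changed from the CLI if you prefer. A sketch, assuming a reasonably recent PBS; the job ID `s-offsite` is a placeholder for your actual sync job:

```shell
# List existing sync jobs to find the job ID
proxmox-backup-manager sync-job list

# Only transfer the last 2 snapshots of each backup group
proxmox-backup-manager sync-job update s-offsite --transfer-last 2
```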

--------------------------------------------------------------------
Sync Job Group Filter Tab

You can either Include or Exclude, so if you can come up with a criterion, you can use it to filter.

Type (VM, Container, or Host)
Group (lousy name that means VMID)
Regex (this can do anything at all, but it may take you weeks to figure out)

1735841961059.png
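The same filters can be set via the CLI's group-filter option. A sketch, using the filter syntax documented for recent PBS versions; the job ID is a placeholder:

```shell
# Only sync VM backups (other types: ct, host)
proxmox-backup-manager sync-job update s-offsite --group-filter type:vm

# Only sync a single guest by its ID
proxmox-backup-manager sync-job update s-offsite --group-filter group:vm/100

# Regex matched against the full group name, e.g. all VMs with IDs 100-199
proxmox-backup-manager sync-job update s-offsite --group-filter 'regex:^vm/1\d\d$'
```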


--------------------------------------------------------------------
Namespaces

You can create virtual directories that allow you to separate your backups and manage them.
They aren't really 'folders' in a file system, but they work pretty much the same way.
So if you wanted to break out the backups from these various hosts you mention into their own folders, this is how.


1735842543279.png
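Sync jobs can also target namespaces on both ends, which is what makes the namespace approach useful for syncing subsets. A sketch, where the datastore name `tank`, the remote name `offsite`, its datastore `backup`, and the namespace names are all placeholders:

```shell
# Sync only the "weekly" namespace from the remote PBS into a local
# namespace of the same name, leaving all other namespaces alone
proxmox-backup-manager sync-job create s-weekly \
    --store tank \
    --remote offsite --remote-store backup \
    --remote-ns weekly --ns weekly \
    --schedule daily
```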
 
  • Like
Reactions: Johannes S
Thanks a lot for your question @soupdiver - I am also facing this issue. I just want to transfer my weekly backups to my offsite PBS, but it doesn't work.

1738784328181.png

@tcabernoch Can you go deeper into the namespace topic? How do I move the weeklies automatically to a namespace (for example, one named "weekly"), and after that, how do I sync it to the offsite PBS?

Thanks a lot
atomique
 
I couldn't figure it out either... I just push the last backup for now and do pruning on the receiving side. Not really what I wanted, but it works for now.
 
For namespaces you have several options to implement your backup scheme:

- First, create several namespaces for your local backups, like localdaily, localweekly, localmonthly, and likewise for your remote backups (meaning the backups you want to sync to your remote PBS): remotedaily, remoteweekly, remotemonthly, etc. Depending on your use case you won't need all of them; just do what you are comfortable with.
- Create a prune job on your local namespaces (e.g. keep 24 hourly, 30 daily, 5 weekly and 12 monthly backups).
- Afterwards you have several options:
  1. Configure additional backup jobs with whatever schedule you wish. Data which is already in the datastore won't be added again.
  2. First create a sync job on your local PBS to sync from your local namespace(s) to your remote namespace, e.g. localdaily to remotedaily. Then create a prune job with your intended schedule for the remote backups on the namespace you want to sync to your remote PBS (e.g. keep 6 hourly, 14 daily, 2 weekly and 6 monthly). As a final step, create a sync job to push that namespace to your remote PBS.
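Option 2 above could be sketched on the CLI roughly like this. All datastore, remote, and namespace names plus the retention numbers are placeholders; `--sync-direction push` needs PBS 3.3 or newer, and step 1 assumes a PBS version that supports sync jobs with a local source (on older versions, configure a remote that points at the local host instead):

```shell
# 1) Staging sync job: copy snapshots from the local namespace into a
#    staging namespace on the same datastore
proxmox-backup-manager sync-job create s-stage \
    --store tank --remote-store tank \
    --remote-ns localdaily --ns remotedaily \
    --schedule hourly

# 2) Prune job that thins the staging namespace to the offsite retention
proxmox-backup-manager prune-job create p-stage \
    --store tank --ns remotedaily --schedule daily \
    --keep-hourly 6 --keep-daily 14 --keep-weekly 2 --keep-monthly 6

# 3) Push the staging namespace to the offsite PBS (PBS 3.3+)
proxmox-backup-manager sync-job create s-offsite \
    --store tank --remote offsite --remote-store backup \
    --ns remotedaily --remote-ns remotedaily \
    --sync-direction push --schedule daily
```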

I personally do it like this (I actually have different numbers, modified here to make things easier):
  • My local PBS has a namespace pve-cluster for the local backups; its prune schedule keeps something like 24 hourly, 30 daily, 6 weekly and 12 monthly backups.
  • I have a second namespace called remotesync; a local sync job syncs all backups from pve-cluster to remotesync every hour (e.g. 0:00, 1:00, 2:00, etc.), and a prune job makes sure that I keep just the last 7 daily, 6 weekly and 12 monthly backups there.
  • A pull-sync job on my remote PBS pulls everything new from the remotesync namespace on my local PBS. If I wanted, I could also do this as a push-sync, but I prefer pull-sync for security reasons (no local server of mine needs write access to the remote PBS, which is great for ransomware protection).
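The pull side of that setup might look roughly like this on the remote PBS. The hostname, auth-id, password, and datastore names are placeholders, and the user or token needs read access on the local PBS:

```shell
# On the REMOTE PBS: register the homelab PBS as a remote
# (you may also want --fingerprint for a self-signed certificate)
proxmox-backup-manager remote create homelab \
    --host pbs.home.example --auth-id 'sync@pbs' --password 'secret'

# ... and pull new snapshots from its "remotesync" namespace every hour
proxmox-backup-manager sync-job create s-pull \
    --store offsite-backup --remote homelab --remote-store tank \
    --remote-ns remotesync --schedule hourly
```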
 
  • Like
Reactions: tcabernoch and UdoB
> First you would create several namespaces for your local backups like localdaily, localweekly, localmonthly and for your remote backups

Ah okay, yeah, that kind of makes sense. I did not think about different namespaces for the different intervals. I currently have one namespace for each of the machines I back up... I guess I would have to multiply that then. I will see if the benefit outweighs the overhead.

thanks!
 
> First you would create several namespaces for your local backups like localdaily, localweekly, localmonthly and for your remote backups

> Ah okay, yeah, that kind of makes sense. I did not think about different namespaces for the different intervals. I currently have one namespace for each of the machines I back up... I guess I would have to multiply that then. I will see if the benefit outweighs the overhead.
Data which is already stored in the datastore (no matter in which namespace or from which VMs/hosts) will not be saved again, so this shouldn't cause much overhead. Actually, I think it's enough to have one namespace per cluster for your local backups and another one for the remote. This is (see above) how I do it in my homelab ;)

Having dedicated namespaces per cluster is a good idea, since it ensures that backups of VMs/LXCs from different clusters with identical IDs don't get mixed up. Having dedicated namespaces per node inside one cluster shouldn't hurt, but it also shouldn't be needed at all.
 
> Data which is already stored in the datastore (no matter in which namespace or from which VMs/hosts) will not be saved again, so this shouldn't cause much overhead. Actually, I think it's enough to have one namespace per cluster for your local backups and another one for the remote. This is (see above) how I do it in my homelab ;)

> Having dedicated namespaces per cluster is a good idea, since it ensures that backups of VMs/LXCs from different clusters with identical IDs don't get mixed up. Having dedicated namespaces per node inside one cluster shouldn't hurt, but it also shouldn't be needed at all.
I meant that dealing with multiple namespaces is the overhead, not storage-wise.
But yeah, doing it the way you described makes sense.
 
  • Like
Reactions: Johannes S