[SOLVED] Understanding Garbage Collection with Remotes

Tmanok

Hi Everyone,

I am trying to understand garbage collection a bit better, so I'll lay out two scenarios and ask you what happens in each.

PVE --> Backs up to --> Local PBS

Remote PBS <-- Pulls from Local PBS

1.
Garbage Collection on Local PBS: Keep last 10
Garbage Collection on Remote PBS: Keep last 20

Does the Remote PBS actually get to keep the last 20? Hopefully this is a simple question; I really need to be certain.

2.
Same topology as above but different GC
Garbage Collection on Local PBS: Keep last 20
Garbage Collection on Remote PBS: Keep last 10

I'm pretty sure I know this one from another forum post I asked, but to confirm: the last 20 are kept on the Local PBS and transferred, but the next Prune and Garbage Collection run on the Remote PBS will erase snapshots 11-20?

Cheers everyone!


Tmanok
 
Hi,

first please note that "Garbage Collection" and "Retention Policy" are not exactly the same thing.
Garbage Collection cleans up chunks that have not been referenced by any backup index for more than 24 hours, while the retention policy is applied on prune to determine which backup indexes can be removed.
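
To make that distinction concrete, here is a minimal Python sketch of the two operations over a toy datastore. It only models the behavior described above (prune deletes snapshot indexes per retention policy; GC marks the chunks that are still referenced and sweeps unreferenced ones past the grace period) and is an illustration, not the actual PBS implementation:

```python
GRACE_PERIOD = 24 * 3600  # chunks stay unreferenced for >24h before removal

# Toy datastore: snapshot name -> chunk digests referenced by its index.
snapshots = {
    "vm/100/2021-10-01": {"a1", "b2", "c3"},
    "vm/100/2021-10-02": {"a1", "d4"},
    "vm/100/2021-10-03": {"a1", "e5"},
}
# Chunk digest -> last time the chunk file was touched.
chunk_mtime = {"a1": 0, "b2": 0, "c3": 0, "d4": 0, "e5": 0}

def prune(snaps, keep_last):
    """Retention policy: delete all but the newest keep_last snapshot indexes."""
    for name in sorted(snaps)[:-keep_last]:
        del snaps[name]  # removes the index only, never any chunks

def garbage_collect(snaps, chunks, now):
    """Mark: touch every chunk referenced by a remaining index.
    Sweep: delete chunks that are unreferenced and past the grace period."""
    referenced = set().union(*snaps.values()) if snaps else set()
    for digest in list(chunks):
        if digest in referenced:
            chunks[digest] = now                    # marked
        elif now - chunks[digest] > GRACE_PERIOD:
            del chunks[digest]                      # swept: space is freed here

prune(snapshots, keep_last=2)
garbage_collect(snapshots, chunk_mtime, now=2 * GRACE_PERIOD)
print(sorted(snapshots))    # ['vm/100/2021-10-02', 'vm/100/2021-10-03']
print(sorted(chunk_mtime))  # ['a1', 'd4', 'e5'] - the shared 'a1' survives
```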

Garbage Collection on Local PBS: Keep last 10
Garbage Collection on Remote PBS: Keep last 20

Does the Remote PBS actually get to keep the last 20? Hopefully this is a simple question; I really need to be certain.
Normally yes, but that really depends on how often you sync to remote, how often you back up to local and how often you run a prune job.
For example, if you only sync once a day but back up and prune every hour, it may not hold, or at least not in the way you expect: by the time the sync runs, only the last ten of the 24 local backups made since the previous sync still exist, because prune has already "taken care" of the older ones. But the remote would still keep 20; it would be ten from the latest sync back to ten hours before it, then a hole, and then ten from the previous sync back to ten hours before that.
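
Here is a quick Python simulation of exactly that situation (hourly backup and prune with keep-last 10 on the source side, daily pull and keep-last 20 on the remote); the hours are made up and this only sketches the behavior:

```python
def prune_keep_last(snaps, n):
    """Retention: keep only the n newest snapshots."""
    return sorted(snaps)[-n:]

def sync_pull(source, target):
    """Pull only snapshots newer than the newest one already on the target."""
    newest = max(target) if target else -1
    return sorted(target + [s for s in source if s > newest])

local, remote = [], []
for hour in range(48):                        # two days of hourly backups
    local.append(hour)
    local = prune_keep_last(local, 10)        # hourly prune: keep-last 10
    if hour % 24 == 23:                       # sync once a day
        remote = sync_pull(local, remote)
        remote = prune_keep_last(remote, 20)  # remote retention: keep-last 20

print(remote)  # [14..23, 38..47]: twenty snapshots with a 14-hour hole
```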

So, if you really need to be certain, please also post the backup, sync and prune frequencies here, as that information is required to tell for sure.

That can actually be an argument for using the keep-hourly/daily/... retention policies over the keep-last one (or at least a combination of them), as with those you can ensure that you also keep some older backup snapshots, even if you generally back up at a high frequency and maybe do some manual backups on top now and then.
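
For illustration, a rough Python sketch of the difference between the two policy types (simplified; in PBS the keep-* options can also be combined, and a snapshot is kept if any of them selects it):

```python
from datetime import datetime, timedelta

def keep_last(snaps, n):
    """keep-last: the n newest snapshots, however close together they are."""
    return sorted(snaps)[-n:]

def keep_daily(snaps, n):
    """keep-daily: the newest snapshot of each of the last n calendar days."""
    kept, days = [], set()
    for ts in sorted(snaps, reverse=True):
        if ts.date() not in days and len(days) < n:
            kept.append(ts)
            days.add(ts.date())
    return sorted(kept)

# Hourly backups over three days.
snaps = [datetime(2021, 10, 1) + timedelta(hours=h) for h in range(72)]
print(keep_last(snaps, 4))   # four snapshots, all from the final four hours
print(keep_daily(snaps, 4))  # one per day: Oct 1, 2 and 3 at 23:00 each
```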

If you want to be sure that every backup that's made to local PBS gets also on the remote PBS you'd need to tune the frequencies of sync, backup and prune such that at least:

F(sync) ≥ F_max(backup) ≥ F(prune) and F(sync) > 2 * F(prune)

holds true. For example, if you back up daily you could sync also daily and trigger a prune weekly.
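
Or, as a trivial Python check (frequencies in runs per day; this is just shorthand for the rule of thumb above, nothing PBS itself enforces):

```python
def every_backup_reaches_remote(f_sync, f_backup, f_prune):
    """Rule of thumb: sync at least as often as you back up,
    and prune (much) less often than you sync."""
    return f_sync >= f_backup >= f_prune and f_sync > 2 * f_prune

# Daily backup, daily sync, weekly prune: fine.
print(every_backup_reaches_remote(f_sync=1, f_backup=1, f_prune=1 / 7))  # True
# Hourly backup and prune, daily sync: backups get pruned before syncing.
print(every_backup_reaches_remote(f_sync=1, f_backup=24, f_prune=24))    # False
```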

Garbage Collection on Local PBS: Keep last 20
Garbage Collection on Remote PBS: Keep last 10

I'm pretty sure I know this one from another forum post I asked, but to confirm: the last 20 are kept on the Local PBS and transferred, but the next Prune and Garbage Collection run on the Remote PBS will erase snapshots 11-20?
Yes and no: we normally do not sync older backups from a group if a newer one already exists on the target, so there's a cut-off, and again the answer depends on the other parameters (as above) to tell for sure.

For example, say you prune on the remote hourly and back up + sync daily on the local. Each prune on the remote then drops all but the last ten backups, but the next sync will not sync old backups over again only for them to be deleted; it will only look for backups newer than the newest one it got from the last sync.
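
As a toy Python example of that cut-off (integers stand in for daily snapshots; my sketch, not PBS internals):

```python
def sync_pull(source, target):
    """PBS-style cut-off: pull only snapshots newer than the newest one the
    target already has; pruned-away older ones are never re-transferred."""
    newest = max(target) if target else -1
    pulled = [s for s in source if s > newest]
    return sorted(target + pulled), pulled

local = list(range(1, 21))             # local keeps the last 20 (backups 1..20)
remote, pulled = sync_pull(local, [])
print(len(pulled))                     # 20: the first sync transfers everything

remote = remote[-10:]                  # remote prune keep-last 10: 11..20 remain
local = local[1:] + [21]               # next day: backup 21, local prunes to 2..21
remote, pulled = sync_pull(local, remote)
print(pulled)                          # [21]: backups 2..10 are NOT pulled again
```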
 
Hi Thomas,
Normally yes, but that really depends on how often you sync to remote
Ah right, I made some late-night assumptions when asking this. Daily pruning, GC, and the above retention policy scenarios; nightly/daily backups with weekly syncs. So with that in mind, it sounds unlikely that a remote could collect more than one additional backup over the local server per week. Or... your next comment:

But the remote would still keep 20; it would be ten from the latest sync back to ten hours before it, then a hole, and then ten from the previous sync back to ten hours before that.
Ok good to point out gaps in backups, that sounds somewhat horrific.

So I suppose that under the first scenario, the remote could collect the desired 20 backups over the course of three weeks if the weekly sync ran on a Saturday and the Pruning and Garbage Collection ran weekly on a Sunday instead of daily. That also assumes that I have the bandwidth to pull from the local to the remote within 24 hours, of course.


For example, say you prune on the remote hourly and back up + sync daily on the local. Each prune on the remote then drops all but the last ten backups, but the next sync will not sync old backups over again only for them to be deleted; it will only look for backups newer than the newest one it got from the last sync.
Ok this is very interesting to me, it conflicts with my (possibly incorrect) interpretation of another developer's comment from my other post:
you can set up more aggressive pruning on the sync target - that won't stop you from transferring lots of data, but reduces the amount stored long-term ;)
- Fabian https://forum.proxmox.com/threads/l...ote-off-site-backup-server.97474/#post-422144
My understanding from his comment, with more context from my original post, was that syncs try to pull/download all chunks that are on the local but not on the remote. If it is "smarter" than that, and the remote can actually differentiate between old chunks it does not need to pull and new chunks that do need to be pulled, then PBS is far more advanced than I had thought (congrats!).

Thanks Thomas, have a great day.


Tmanok
 
my comment refers to the following:

- remote PBS has snapshots A, B, C (with whatever prune + GC schedule)
- a sync happens
- local PBS now also has A, B, C
- local PBS has prune with --keep-last 1
- A and B are dropped by prune (and the chunks only referenced by those two snapshots are cleaned out with the next GC)
- before the next sync happens, remote PBS gets new snapshots D and E

now you run a sync - the sync will transfer D and E. the next prune will drop D (and B). the next GC will then drop all the chunks referenced by D (and B).

so if you look at the two syncs, for the first one the A and B snapshots (and all the chunks only referenced by those two) and for the second sync the D chunks (+ some metadata) were transferred unnecessarily. now repeat this thought experiment, but with long sync intervals and you might end up with days or weeks of chunks transferred, only to be pruned and GC'ed soon after. this won't be a problem in many setups (basically, the prune/GC, backup and sync schedules have to align just so, and you need to have either your transfer pipe or your local datastore as bottlenecks that you want to optimize for).
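
to make the timeline concrete, a small python rendering of the same sequence (letters stand in for snapshots, keep-last 1 on the pulling side; illustration only, not PBS internals):

```python
def sync_pull(remote, local):
    """pull snapshots later than the newest local one in the group"""
    newest = max(local) if local else ""
    return sorted(local + [s for s in remote if s > newest])

remote, local = ["A", "B", "C"], []

local = sync_pull(remote, local)   # first sync transfers A, B, C
local = sorted(local)[-1:]         # prune --keep-last 1: drops A and B
# (GC later removes the chunks only referenced by A and B)

remote += ["D", "E"]               # new snapshots appear on the remote
local = sync_pull(remote, local)   # second sync transfers D and E
print(local)                       # ['C', 'D', 'E'] before the next prune
local = sorted(local)[-1:]         # next prune keeps only the newest
print(local)                       # ['E']: transferring D was wasted work
```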

what PBS doesn't do is transfer all chunks and then checking - it operates on a group/snapshot level:
- get remote groups
- get remote snapshots per group
- transfer snapshots (+ needed chunks) one-by-one if later than last local snapshot in group (or all snapshots, if group does not exist or is empty)

so snapshots which already exist locally won't be re-synced. chunks which already exist locally are not re-downloaded. chunks which are not part of any snapshots being synced are not downloaded either ;)
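
in sketch form, a small python mock of that flow (the Store class and all names here are mine, purely illustrative, not the actual implementation):

```python
class Store:
    """toy datastore: {group: {snapshot: set of chunk digests}} plus a chunk pool"""
    def __init__(self, groups=None):
        self.groups = groups or {}
        self.chunks = set()

def pull(remote, local):
    """group/snapshot-level pull, following the three steps above"""
    transferred = 0
    for group, snaps in remote.groups.items():            # get remote groups
        local_snaps = local.groups.setdefault(group, {})  # snapshots per group
        cutoff = max(local_snaps, default=None)           # newest local snapshot
        for name in sorted(snaps):                        # one-by-one, oldest first
            if cutoff is not None and name <= cutoff:
                continue                                  # not later than cutoff: skip
            for digest in snaps[name] - local.chunks:     # only chunks we lack
                local.chunks.add(digest)
                transferred += 1
            local_snaps[name] = set(snaps[name])
    return transferred

remote = Store({"vm/100": {"s1": {"a", "b"}, "s2": {"a", "c"}}})
local = Store()
print(pull(remote, local))  # 3: shared chunk 'a' is downloaded only once
print(pull(remote, local))  # 0: everything already exists locally
```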

hope this helps :)
 
Hi Fabian,

Ok, I see that you used the reverse perspective: local = the pulling instance, aka the off-site backup, and remote = the pulled-from instance (which is the perspective used in the software itself). Note that my original question was asked the other way around: local = on-site and remote = off-site, i.e. the one performing the sync task. In future I will go with your perspective, as it is used in the software itself and makes sense.
now you run a sync - the sync will transfer D and E. the next prune will drop D (and B). the next GC will then drop all the chunks referenced by D (and B).
Wait what? A & B were already dropped at this point. Do you mean the next prune will drop D (and C)?

what PBS doesn't do is transfer all chunks and then checking - it operates on a group/snapshot level:
But it does transfer all "new" chunks that it has not seen before, even if some or most of those new chunks will be pruned and deleted right after the sync? That was what I was attempting to point out in scenario 2, which I think you just confirmed.

hope this helps :)
Of course, and I appreciate both your time and Thomas' time explaining this; it means a lot to me. Forum posts like this are what keep me expanding the "Proxmox Revolution" in British Columbia, Canada, lol. Kicking VMware and Hyper-V out wherever I find that crap now. XCP-ng I work alongside because of my boss, however. :rolleyes: :D
 
I meant that A & B on the remote side are pruned, but it does not make much difference ;) the thing is, sync is completely independent from prune and GC, so it does not know that of the "new" snapshots on the remote side only a subset is relevant, because the rest will be pruned locally anyway. so in the example it will see D and E as new snapshots after the previously synced snapshot C, and it will download both of them and all the chunks required, even though snapshot D (and all the chunks ONLY referenced by it) will be pruned and GC'ed soon after.

I'm working on a feature that allows filtering what gets synced (currently on the group level, so it's possible to say "only sync backup groups of type 'host'", or "only sync groups foo, bar and baz", or "only sync groups matching regular expression 'vm/1.*'") - that could easily be extended in the future to allow specifying stuff like "only sync last X snapshots in each group" or "only sync fully encrypted snapshots", but that part is not yet written ;) I have to think a bit whether adding an option to "apply pruning to local + syncable snapshots" before syncing to filter the syncable snapshots list makes sense, as a sort of optimization for this scenario (remote end keeps more snapshots than local end).
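
to illustrate the kind of group-level matching I mean, a hypothetical python sketch (the option names and semantics here are invented, since that code isn't written yet):

```python
import re

def group_matches(group, filters):
    """hypothetical group filter: each filter is a (kind, value) pair with
    kind in {'type', 'group', 'regex'}; a group is synced if any one matches"""
    gtype = group.split("/", 1)[0]  # e.g. 'vm/100' has backup type 'vm'
    for kind, value in filters:
        if kind == "type" and gtype == value:
            return True
        if kind == "group" and group == value:
            return True
        if kind == "regex" and re.match(value, group):
            return True
    return False

groups = ["host/foo", "vm/100", "vm/123", "ct/200"]
filters = [("type", "host"), ("regex", r"vm/1.*")]
print([g for g in groups if group_matches(g, filters)])
# ['host/foo', 'vm/100', 'vm/123']
```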
 
