Most efficient sync / prune / garbage collect strategy

sukerman

Aug 29, 2019
Hola,

I have a production server. I do nightly backups, which take about 6 hours. I want local backups to restore as fast as possible, so I have Proxmox Backup Server installed locally. In case the whole server is lost, I also sync the backups to a second, remote Proxmox Backup Server. Backup data on both backup servers is stored on spinning-disk RAID arrays.

q1) Should the production backup server avoid doing any prunes / garbage collect when I know the backup is likely still running? or is it / will it be someday automatically paused?

q2) Should the remote server avoid remote syncing with the production server whilst it is backing up / pruning / gc? Should the remote sync use 'delete vanished', or allow the copy then do local prune/gc. In that case should I avoid any kind of prune / gc during the remote sync?

q3) Should I avoid setting prune / gc, to hourly otherwise potentially they are running at the same time?

q4) Does 'delete vanish' actually delete anything? or does it require a prune, gc, or both?

Gut feeling says getting any HDD to do two things at once is probably not the best idea, but any advice on the least resource-intensive / fastest method, please?

Cheers,

Jack
 
q1) Should the production backup server avoid doing any prunes / garbage collect when I know the backup is likely still running? or is it / will it be someday automatically paused?
In the currently public version (0.9.1) there is a bug where prune and GC started at the same time may lead to a failed GC; this will be fixed in the next release. Doing prune/GC/verify while a backup is running is perfectly fine (concurrency is one of the strong points of PBS; nothing should interfere with anything else if at all avoidable).

In general though, the right order of operations is to run prune first, and then (scheduled an hour or so later) run GC. This way GC can collect any chunks that are no longer needed after the prune. If you want, you can also schedule a verify after that, though keep in mind that verify scheduling will be improved in coming versions too.

q2) Should the remote server avoid remote syncing with the production server whilst it is backing up / pruning / gc? Should the remote sync use 'delete vanished', or allow the copy then do local prune/gc. In that case should I avoid any kind of prune / gc during the remote sync?
Again, notwithstanding bugs, all operations should be safe to do concurrently.

Using 'delete vanished' removes any snapshots on the sync target that are no longer existing on the source, which I'd recommend if you want the same 'prune' schedule on both nodes. Alternatively, leave it off and set up a different 'prune' schedule on the target - that way you can for example keep older snapshots alive on your secondary, without wasting space on your production machine.

For scheduling, it probably makes sense to run the sync after 'prune' on the source, so it doesn't sync snapshots that are about to be removed anyway.

q3) Should I avoid setting prune / gc, to hourly otherwise potentially they are running at the same time?
Prune/GC every hour is not necessary. I'd recommend scheduling them once per day, with about an hour in between (depending on your average runtimes). See my answer to q1.
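To make the "once per day, about an hour in between" idea concrete, here is a rough Python sketch that derives staggered daily start times from estimated runtimes. The job runtimes, the 06:00 start and the `stagger` helper are all made-up illustrations, not PBS settings; plug in your own measured runtimes.

```python
from datetime import datetime, timedelta

def stagger(first_start, jobs, gap=timedelta(hours=1)):
    """Return (name, start) pairs so that each job starts after the
    previous job's estimated runtime plus a safety gap."""
    schedule = []
    start = first_start
    for name, est_runtime in jobs:
        schedule.append((name, start))
        start = start + est_runtime + gap
    return schedule

# Hypothetical average runtimes; measure your own.
jobs = [
    ("prune", timedelta(minutes=5)),
    ("gc", timedelta(hours=2)),
    ("verify", timedelta(hours=3)),
]

# Only the time of day matters here; the date is a placeholder.
plan = stagger(datetime(2020, 1, 1, 6, 0), jobs)
for name, start in plan:
    print(f"{name:7s} daily at {start:%H:%M}")
```

The resulting times would then be entered by hand as the daily schedule of each job.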

q4) Does 'delete vanish' actually delete anything? or does it require a prune, gc, or both?
It behaves the same as doing 'forget'. Thus, it deletes the snapshot metadata, not the actual chunks - you still need to schedule a GC on the target node for that.
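As a toy illustration of the difference between 'forget' and GC, the Python sketch below models a datastore as snapshot metadata plus a set of chunks. `ToyDatastore`, the snapshot names and the chunk ids are invented for the example; the real GC has more moving parts than this.

```python
class ToyDatastore:
    """Simplified model: snapshot metadata and chunks are kept separately."""

    def __init__(self):
        self.snapshots = {}   # snapshot name -> set of chunk ids it references
        self.chunks = set()   # chunk ids physically present on disk

    def add_snapshot(self, name, chunk_ids):
        self.snapshots[name] = set(chunk_ids)
        self.chunks |= set(chunk_ids)

    def forget(self, name):
        # Removes only the snapshot metadata; the chunks stay on disk.
        del self.snapshots[name]

    def garbage_collect(self):
        # Removes chunks that no remaining snapshot references.
        referenced = set().union(*self.snapshots.values())
        removed = self.chunks - referenced
        self.chunks &= referenced
        return len(removed)

target = ToyDatastore()
target.add_snapshot("vm/100/monday", {"a", "b", "c"})
target.add_snapshot("vm/100/tuesday", {"b", "c", "d"})

target.forget("vm/100/monday")          # what 'delete vanished' amounts to
chunks_after_forget = len(target.chunks)
removed_by_gc = target.garbage_collect()
chunks_after_gc = len(target.chunks)
print(chunks_after_forget, removed_by_gc, chunks_after_gc)
```

Disk usage only drops at the `garbage_collect()` step, which is why the target still needs its own GC schedule.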

In general, also measure your disk utilization (using 'atop' or similar) and see for yourself how your system scales.
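If you want to log utilization over time rather than watch 'atop' interactively, a rough alternative on Linux is to sample `/proc/diskstats` twice and compare the "time spent doing I/O" counter. The sketch below is only an assumption-laden example: the device name `sda` and the sample counter values are made up, and the helper names are not part of any PBS tooling.

```python
import time

def io_ticks(diskstats_text, device):
    """Return milliseconds spent doing I/O for `device`, parsed from
    the text of /proc/diskstats (tenth statistics field)."""
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) >= 13 and fields[2] == device:
            return int(fields[12])
    raise KeyError(device)

def utilization(before_ms, after_ms, interval_s):
    """Fraction of the interval during which the device was busy."""
    return (after_ms - before_ms) / (interval_s * 1000.0)

def sample(device, interval_s=5.0, path="/proc/diskstats"):
    """Measure utilization of `device` over `interval_s` seconds (Linux only)."""
    with open(path) as f:
        before = io_ticks(f.read(), device)
    time.sleep(interval_s)
    with open(path) as f:
        after = io_ticks(f.read(), device)
    return utilization(before, after, interval_s)

# Made-up sample line, only to show the parsing.
example = "   8       0 sda 100 0 200 50 10 0 80 20 0 1500 70"
busy_ms = io_ticks(example, "sda")
busy_fraction = utilization(busy_ms, busy_ms + 2500, 5.0)
print(busy_ms, busy_fraction)
```

On a real host you would call something like `sample("sda")` while backup, prune, GC or sync are running and compare the numbers.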
 
Thanks a lot, it means a great deal to get quality support and replies when you're relying on something for backups. I'm glad I switched from restic to PBS, which of course makes sense if you use Proxmox, and I'm glad it's good.

Good work! thanks again.
 
I had assumed that a remote sync target did not need GC, since it should be a duplicate of the main PBS system. However, that is not the case.

The remote had 2TB+ more disk usage after a couple of months of syncs. Running GC fixed that.

question: once pbs is stable, should gc still be needed at the remote?
 
question: once pbs is stable, should gc still be needed at the remote?
Yes, that is how it's intended to work. The chunk store and the snapshot metadata are treated as two separate entities. This is what allows deduplication even between backups of different hosts/VMs/CTs. The only connection between the two is writing chunks on backup, and removing them on GC - and since "sync" works on the snapshot level, it cannot remove chunks that are no longer needed. Running GC on target nodes is thus recommended, if you run with "remove vanished" enabled.
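To picture why a shared chunk store gives deduplication across different guests, here is a minimal Python sketch, assuming chunks are stored under their SHA-256 digest. The `store_backup` helper and the chunk contents are invented for illustration and leave out everything else a real datastore does.

```python
import hashlib

def store_backup(chunk_store, data_chunks):
    """Store each chunk under its SHA-256 digest and return the
    snapshot's index (the ordered list of digests it references)."""
    index = []
    for chunk in data_chunks:
        digest = hashlib.sha256(chunk).hexdigest()
        chunk_store.setdefault(digest, chunk)  # identical data is stored once
        index.append(digest)
    return index

chunk_store = {}
# Two hypothetical guests that share the same base image data.
index_vm_a = store_backup(chunk_store, [b"base-os", b"app-data-a"])
index_vm_b = store_backup(chunk_store, [b"base-os", b"app-data-b"])

print(len(index_vm_a) + len(index_vm_b), "chunk references,",
      len(chunk_store), "chunks stored")
```

Because a chunk can be referenced by many snapshot indexes, removing one snapshot cannot safely delete chunks on its own; only a pass over all remaining indexes (GC) can tell which chunks are unused.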
 
That makes sense, so we will always have GC set up at the remotes.
 
