Prune and Retention policy

snagles

Hi,
I am trying to configure the retention policy of my PBS correctly. I want the following setup:
- keep the last backup
- keep 72 hours of backups for some of the LXCs, which are backed up hourly
- keep 7 days of backups for all LXCs, including those backed up hourly. The others are backed up daily.
With this configuration I expected that, once the 72 hours have passed, only the last archive of each day would be kept. That is, after 72h there should be only one backup per day for the next 7 days. On day 14 only the last one should remain, the one marked by keep-last 1. That last one would have to be removed manually.
The prune simulator shows exactly this behavior.
What I see in practice is different.
All archives that are taken daily were marked as hourly archives. This means that the first cleanup will only happen after 72 + 7 archives, so the data will be kept for about 2.5 months.
Another thing I cannot figure out is whether it is possible to configure pruning per container or VM. Do I need to configure the prune options from PVE, or must everything happen on the PBS side?

Here is an example from the log
Code:
2023-06-15T07:18:21+02:00: retention options: --max-depth 0 --keep-last 1 --keep-hourly 72 --keep-daily 7
2023-06-15T07:18:21+02:00: Pruning group :"vm/74109"
2023-06-15T07:18:21+02:00: would keep vm/74109/2023-05-27T02:00:01Z
2023-06-15T07:18:21+02:00: would keep vm/74109/2023-05-28T02:00:04Z
2023-06-15T07:18:21+02:00: would keep vm/74109/2023-05-29T02:00:02Z
2023-06-15T07:18:21+02:00: would keep vm/74109/2023-05-30T02:01:37Z
2023-06-15T07:18:21+02:00: would keep vm/74109/2023-05-31T02:00:33Z
2023-06-15T07:18:21+02:00: would keep vm/74109/2023-06-01T02:00:28Z
2023-06-15T07:18:21+02:00: would keep vm/74109/2023-06-02T02:00:59Z
2023-06-15T07:18:21+02:00: would keep vm/74109/2023-06-03T02:53:59Z
2023-06-15T07:18:21+02:00: would keep vm/74109/2023-06-04T02:51:41Z
2023-06-15T07:18:21+02:00: would keep vm/74109/2023-06-05T02:52:32Z
2023-06-15T07:18:21+02:00: would keep vm/74109/2023-06-06T02:51:24Z
2023-06-15T07:18:21+02:00: would keep vm/74109/2023-06-07T02:53:09Z
2023-06-15T07:18:21+02:00: would keep vm/74109/2023-06-08T02:48:42Z
2023-06-15T07:18:21+02:00: would keep vm/74109/2023-06-09T02:49:15Z
2023-06-15T07:18:21+02:00: would keep vm/74109/2023-06-10T02:52:29Z
2023-06-15T07:18:21+02:00: would keep vm/74109/2023-06-11T02:50:05Z
2023-06-15T07:18:21+02:00: would keep vm/74109/2023-06-12T02:52:15Z
2023-06-15T07:18:21+02:00: would keep vm/74109/2023-06-13T02:49:20Z
2023-06-15T07:18:21+02:00: would keep vm/74109/2023-06-14T02:50:27Z
2023-06-15T07:18:21+02:00: would keep vm/74109/2023-06-15T02:50:40Z
2023-06-15T07:18:21+02:00: TASK OK

This log was generated by the Prune All simulation in the PBS datastore section.

The attached pictures show what a per-container prune would do with the 72h hourly retention set and with no hourly retention value.
Any advice on how to make this configuration work in my case would be appreciated.
 

Attachments

  • image_2023-06-15_08-43-29-466.jpg (218.5 KB)
  • image_2023-06-15_08-43-29-539.jpg (226.2 KB)
Hi,
yes, you need different prune schedules to be able to implement your solution. This can be done by using different datastores or configuring the retention on the Proxmox VE side as part of the backup job. See also the prune simulator for trying it out and for a description of the algorithm: https://pbs.proxmox.com/docs/prune-simulator/index.html

EDIT: as @VictorSTS suggested below, it's actually better to use the same datastore but different namespaces, because of deduplication.
 
Hey Fiona,

Yes, I already used this simulator, but it only models a single LXC. So I would have to separate all backup types into different datastores. This is a lot of work and a serious change.
Can I move already taken backups to a different datastore without backing everything up again from scratch?
Thanks for your answer of course :)
 
A side note: "keep last" does not mean "keep the oldest backup". What "keep last" does is "keep this many of the last taken backups", that is, the newest ones. This is useful to keep some amount of backups for some time regardless of when they were taken.

Also, if you don't add new backups (e.g. because you removed the LXC), current backups won't be pruned unless they really should be. That is, keep 10 daily means "keep the last backup taken each day for the last 10 days we have backups for" and not "keep the last backup taken each day of the last 10 days". I mention this because I don't get what you mean by "The last one should be removed manually."
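
To make the interaction of the keep options concrete, here is a small Python sketch. It is a simplified model written for this thread, not PBS code: the function name `simulate_prune` is made up, only keep-last, keep-hourly and keep-daily are modelled, and periods are computed in UTC. It replays the 20 daily snapshots from the log in the first post. Note that the function never looks at the current date, only at the snapshots that exist.

```python
from datetime import datetime, timedelta, timezone


def simulate_prune(snapshots, keep_last=0, keep_hourly=0, keep_daily=0):
    """Return the set of snapshot times a simplified prune would keep."""
    ordered = sorted(snapshots, reverse=True)  # newest first
    kept = set(ordered[:keep_last])            # keep-last: the N newest

    def select(count, period_format):
        # Periods that already contain a kept snapshot are not picked again.
        covered = {snap.strftime(period_format) for snap in kept}
        chosen = set()
        for snap in ordered:
            period = snap.strftime(period_format)
            if snap in kept or period in covered or period in chosen:
                continue
            if len(chosen) >= count:
                break  # this rule has used up its budget
            chosen.add(period)
            kept.add(snap)

    select(keep_hourly, "%Y-%m-%d %H")  # newest snapshot of each hour
    select(keep_daily, "%Y-%m-%d")      # newest snapshot of each day
    return kept


# One snapshot per day at 02:00, 2023-05-27 .. 2023-06-15, as in the log.
first = datetime(2023, 5, 27, 2, 0, tzinfo=timezone.utc)
daily_snapshots = [first + timedelta(days=i) for i in range(20)]

with_hourly = simulate_prune(daily_snapshots, keep_last=1, keep_hourly=72, keep_daily=7)
without_hourly = simulate_prune(daily_snapshots, keep_last=1, keep_daily=7)
print(len(with_hourly), len(without_hourly))  # prints: 20 8
```

With keep-hourly 72, every daily snapshot falls into its own hour, so each one uses up one of the 72 hourly slots and nothing is removed until more than 72 of them exist. Without the hourly rule, only the newest snapshot plus 7 further days remain.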

In your case I would just:
- Create one PBS datastore with two namespaces: hourly, daily
- Add both as PBS storages to Proxmox.
- Set up two backup tasks: an hourly one saving backups to the hourly PBS storage and a daily one saving to the daily PBS storage. Include all LXCs in the daily backup task, but place only the few LXCs that need hourly backups in the hourly backup task.
- In PBS, configure Prune like:
· hourly namespace: keep 72 hourly
· daily namespace: keep last 1, keep 14 daily

LXCs with hourly backups will have a "duplicated" daily + hourly backup for 3 days, but as data is deduplicated within all namespaces of the same datastore it should not be an issue from a backup storage usage perspective.
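
If you want to preview the effect of these settings per container before enabling the prune jobs, a dry run from the command line is one option. The sketch below only builds and prints the commands, it does not run them. The repository string, the group IDs `ct/101` and `ct/102`, and the helper name `prune_dry_run` are placeholders for illustration. The flags (`--ns`, `--keep-*`, `--dry-run`, `--repository`) are written as I understand them from the `proxmox-backup-client` documentation for a version with namespace support, so please check them against the built-in help of your installed client.

```python
import shlex

# Hypothetical values: replace with your own repository and backup groups.
REPOSITORY = "backup@pbs@pbs.example.com:store1"

# Retention per namespace, as suggested in the list above.
RETENTION = {
    "hourly": ["--keep-hourly", "72"],
    "daily": ["--keep-last", "1", "--keep-daily", "14"],
}


def prune_dry_run(namespace, group):
    """Build a dry-run prune command line for one backup group."""
    args = [
        "proxmox-backup-client", "prune", group,
        "--ns", namespace,
        *RETENTION[namespace],
        "--dry-run",
        "--repository", REPOSITORY,
    ]
    return shlex.join(args)


# ct/101 is backed up hourly and daily, ct/102 only daily (assumed IDs).
for namespace, group in [("hourly", "ct/101"), ("daily", "ct/101"), ("daily", "ct/102")]:
    print(prune_dry_run(namespace, group))
```

Because of `--dry-run`, the printed commands would only list what would be kept or removed, similar to the simulation log in the first post.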
 
Hi Victor,

Thanks for your explanation and suggestion. I will implement it exactly as you advised.
What I meant by "The last one should be removed manually" refers to "keep last". My assumption was that if you have 10 daily backups and remove the LXC, then after 10 days the prune will purge all backups of that LXC. I thought that with a "keep last" rule the very last backup taken would be preserved even after the retention period, and that you would have to check and remove it manually from PBS.

I am still wondering whether I can transfer backups from one datastore to another, or whether I need to start backing up from the beginning after creating a new datastore.
One more question, related to deduplication. If I have 2 datastores and take 1 daily backup and, later, 1 hourly backup, will the consumed physical space be the size of the LXC taken daily plus only the differences taken in the hourly backup? For example, if the container is 100 MB, the daily backup consumes 100 MB in the daily datastore, and if 5 MB have changed an hour later, the hourly backup takes only this 5 MB difference and puts it into the hourly datastore.
When restoring the hourly backup, will PBS then collect all data from hourly + daily and restore the LXC with the full data collected from both datastores?
 
No, if you remove an LXC, pruning will *not* remove any of its backups. Also, "keep last" does not work the way you think. I explained that before:
A side note: "keep last" does not mean "keep the oldest backup". What "keep last" does is "keep this many of the last taken backups", that is, the newest ones. This is useful to keep some amount of backups for some time regardless of when they were taken.

Also, if you don't add new backups (i.e. removed the LXC), current backups won't be purged unless they really should. That is, a keep 10 daily means "keep the last backup taken each day for the last 10 days we have backups for" and not "keep the last backup taken each day of the last 10 days".


Regarding deduplication, PBS does *not* deduplicate backups in different datastores. It does, however, deduplicate backups on different namespaces of the same datastore. This is what I suggested before:

- Create one PBS datastore with two namespaces: hourly, daily


Regarding backup transfers, fiona already gave you instructions on using local sync jobs for that:
There is currently no local pull support, but you should be able to add the local PBS itself as a remote and create a sync job between your local datastores: https://pbs.proxmox.com/docs/managing-remotes.html


Regarding the restores: all backups in PBS are "full" in the sense that they hold all data you backed up at that time regardless of how much of it is deduplicated in the datastore. So yes, if you restore any hourly backup it will restore all data as it was at that time.
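
A toy model may help to picture both points. This is only a sketch of the general idea of a content-addressed chunk store, not how PBS is implemented: the class `ToyDatastore`, the helper `make_image` and the 1 KiB chunk size are invented for the example, and it uses fixed-size chunks for simplicity. It reuses your numbers, scaled down to 100 chunks with 5 of them changed.

```python
import hashlib

CHUNK_SIZE = 1024  # toy value, chosen only to keep the example small


class ToyDatastore:
    """Content-addressed chunk store: identical chunks are stored once."""

    def __init__(self):
        self.chunks = {}  # digest -> chunk bytes

    def backup(self, data):
        """Store data and return its index (ordered list of chunk digests)."""
        index = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # no-op if already stored
            index.append(digest)
        return index

    def restore(self, index):
        """Rebuild the full data from a single snapshot's index."""
        return b"".join(self.chunks[digest] for digest in index)


def make_image(block_ids):
    """Build fake container data with one distinct chunk per block id."""
    return b"".join(i.to_bytes(4, "big") * (CHUNK_SIZE // 4) for i in block_ids)


daily_data = make_image(range(100))
# An hour later, the first 5 chunks have changed.
hourly_data = make_image(list(range(1000, 1005)) + list(range(5, 100)))

# One datastore (namespaces share the same chunk store in this model).
shared = ToyDatastore()
shared.backup(daily_data)
hourly_index = shared.backup(hourly_data)
print(len(shared.chunks))  # prints: 105

# Two separate datastores: nothing is shared between them.
store_daily, store_hourly = ToyDatastore(), ToyDatastore()
store_daily.backup(daily_data)
store_hourly.backup(hourly_data)
print(len(store_daily.chunks) + len(store_hourly.chunks))  # prints: 200

# The hourly snapshot's index references all 100 chunks, so it restores alone.
print(shared.restore(hourly_index) == hourly_data)  # prints: True
```

In this model the hourly snapshot is never a "diff" that needs the daily one: its index lists every chunk it needs, and the store simply avoids saving the same chunk twice.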

I suggest you install a test PBS, play around with a few small LXCs, and practice all the concepts we've discussed here.
 
Hi all,
Finally, all jobs passed and everything looks as described above.
Big thanks for your support.
 
