General problem of understanding

AST

Dear Colleagues.

Incremental backups.
You have a full backup followed by incremental backups. Each incremental backup builds on the previous backup; if they all built on the full backup instead, they would be differential backups.
Am I on the right track so far?

Now, the backup server is supposed to delete old backups automatically.
But when it does, what happens in the background? Are the old backups merged into the full backup?
If they are simply deleted, the later backups should become unusable because parts are missing.

Also: is one supposed to trigger full backups periodically?

Regards, Patrick
 
Hi LachCraft,

PBS is not incremental in the traditional way you probably expect. It is deduplicating and based on chunks. Let's look at an example:
During the first backup it creates chunks 1, 2, 3 and 4.
For the next backup, a day later, only chunks 1, 2 and 3 still exist; the data that was in chunk 4 has been deleted. PBS notes that only those three chunks are present in that backup, and it does not need to write any data because chunks 1, 2 and 3 already exist in the datastore. It still keeps all four chunks because the first "full" backup still references them.

If you now removed the first backup, traditional backup software would need to consolidate the first and second backups into a single backup. PBS just removes chunk 4, since it is no longer needed, and forgets about the first backup.

This is also why you can't trigger a full backup: there is no "full backup". PBS just checks which chunks currently exist and records them; it doesn't matter whether it's the first, the second or the n-th backup.
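
The four-chunk example above can be sketched as a toy model in Python. This is only an illustration: the names `datastore`, `backups`, `run_backup` and `forget_backup` are made up here and are not PBS internals, and the sketch removes orphaned chunks immediately when a backup is forgotten, which is a simplification.

```python
# Toy model: a backup is just the set of chunk IDs it references.
# All names here are hypothetical and only mirror the example in the post.
datastore = set()   # chunk IDs that are physically stored
backups = {}        # backup name -> set of chunk IDs it references


def run_backup(name, chunks):
    """Record the chunks a backup references; return the chunks that had to be written."""
    new_chunks = set(chunks) - datastore
    datastore.update(new_chunks)
    backups[name] = set(chunks)
    return new_chunks


def forget_backup(name):
    """Drop a backup; return the chunks that no remaining backup references."""
    del backups[name]
    still_used = set().union(*backups.values())
    orphaned = datastore - still_used
    datastore.difference_update(orphaned)
    return orphaned


written_day1 = run_backup("day1", {1, 2, 3, 4})  # everything is new
written_day2 = run_backup("day2", {1, 2, 3})     # nothing needs to be written
removed = forget_backup("day1")                  # only chunk 4 is orphaned
print(written_day1, written_day2, removed)
```

After `forget_backup("day1")`, the second backup still references chunks 1, 2 and 3, so it stays complete on its own.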
 
Phew, these are completely new methods I have to learn ^^

Okay. We have the first backup, which takes the longest. Then we add more, say 13 further backups. Now I delete the first backup. If the following backups only remember the chunks, what happens during the restore if I want to restore the VM completely?
 
The full data is there for a restore for every backup, totally independent of the others! PBS does not store deltas with respect to previous backups.

Let us assume that you have a file which, during the first backup, landed in a chunk i. Now the file changes completely and you perform another backup. Let us also assume the file has grown much bigger and no longer fits into one chunk, so it is now contained in chunks [j, k]. After this backup you append some further data to the file and back up once more. Let us assume that the data only changed at the end (no metadata update) and that the file now fits in chunks [j, k, l]. Note that chunks [j, k] are deduplicated, so the client now only needs to send chunk l and is therefore much faster. If you now delete the initial backup, chunk i is removed. If you then remove the second backup, no chunk is removed, since all of its chunks are still referenced by backup 3. Finally, if you remove the last backup, all remaining chunks are removed, since they are no longer referenced.

Hope this small example helps to understand a bit better what is going on.
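
For anyone who prefers to trace it, here is a minimal Python sketch of the i/j/k/l example. It uses reference counts purely to illustrate which chunks become unreferenced; that bookkeeping is an assumption of this sketch, not a statement about how PBS tracks chunks, and the function names are made up.

```python
from collections import Counter

# Chunk lists per backup, using the letters from the example in the post.
snapshots = {
    "backup1": ["i"],
    "backup2": ["j", "k"],
    "backup3": ["j", "k", "l"],
}


def uploads(snapshots):
    """For each backup, in order, list the chunks the client actually has to send."""
    known, sent = set(), {}
    for name, chunks in snapshots.items():
        sent[name] = [c for c in chunks if c not in known]
        known.update(chunks)
    return sent


def removal_order(snapshots, order):
    """Delete backups in the given order; report the chunks freed by each deletion."""
    refs = Counter(c for chunks in snapshots.values() for c in chunks)
    freed = {}
    for name in order:
        freed[name] = []
        for c in snapshots[name]:
            refs[c] -= 1
            if refs[c] == 0:  # no remaining backup references this chunk
                freed[name].append(c)
    return freed


sent = uploads(snapshots)
freed = removal_order(snapshots, ["backup1", "backup2", "backup3"])
print(sent)   # which chunks each backup had to transfer
print(freed)  # which chunks each deletion made removable
```

The third backup only transfers chunk l, and deleting the second backup frees nothing because backup 3 still references j and k.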
 
Well, when you remove a backup, only the chunks that are not in use by any other backup get deleted.

Let's break this down a bit more. The drive you back up is split into many thousands of 4 MiB chunks that are named after their checksum. You can see those in /datastorepath/.chunks/.
When you create a backup, PBS computes the checksums for all chunks that would be generated from the drive and writes them into an index file; IIRC that is /datastorepath/vm/ID/timestamp/index.json.blob. If any "new" chunks appear that don't exist in the datastore yet, it saves them to disk and writes their checksums to the index file; for already known chunks only the checksum is written to the index file. This way you end up with an index file that contains the checksums of all chunks that existed at backup time.
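
The chunk-and-index idea described above could be sketched roughly like this in Python. Several things are assumptions made for illustration: SHA-256 stands in for "the checksum", the chunk store is a plain dict instead of a directory, the index is a simple list of digests, and `backup` and `restore` are hypothetical helper names. The demo uses a 4-byte chunk size so the data stays readable.

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB, the chunk size mentioned in the post


def backup(image: bytes, chunk_store: dict, chunk_size: int = CHUNK_SIZE) -> list:
    """Split the image into fixed-size chunks, store unseen ones, return the index."""
    index = []
    for offset in range(0, len(image), chunk_size):
        chunk = image[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()  # assumed checksum function
        if digest not in chunk_store:
            chunk_store[digest] = chunk  # only "new" chunks are written
        index.append(digest)             # every chunk is listed in the index
    return index


def restore(index: list, chunk_store: dict) -> bytes:
    """Rebuild the full image from one index alone, without looking at other backups."""
    return b"".join(chunk_store[digest] for digest in index)


store = {}
disk_v1 = b"AAAABBBBCCCC"
disk_v2 = b"AAAABBBBDDDD"  # only the last chunk changed
index1 = backup(disk_v1, store, chunk_size=4)
index2 = backup(disk_v2, store, chunk_size=4)
print(len(store))                         # number of distinct chunks stored
print(restore(index2, store) == disk_v2)  # restore needs only index2
```

Each index lists every chunk of its own backup, which is why any single backup can be restored on its own even though only the changed chunk was stored the second time.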

When you delete a backup, PBS deletes the /datastorepath/vm/ID/timestamp folder; nothing more happens yet.
The "magic" happens when garbage collection runs. During garbage collection PBS reads all the index files and builds a list of all the checksums that appear in them. It then walks through the .chunks directory and removes every chunk that is not in the compiled list.
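
The two steps above (delete the backup folder, then garbage-collect) can be sketched on a throwaway directory. This is a deliberately simplified assumption: the plain-text `index.txt` file, the flat `.chunks` layout, the short chunk names and the `garbage_collect` function are all invented for the sketch and do not reflect the real PBS on-disk formats.

```python
import shutil
import tempfile
from pathlib import Path


def garbage_collect(datastore: Path) -> list:
    """Mark every checksum found in any index, then remove all unlisted chunks."""
    referenced = set()
    for index in datastore.glob("vm/*/*/index.txt"):  # hypothetical index file name
        referenced.update(index.read_text().split())
    removed = []
    for chunk in list((datastore / ".chunks").iterdir()):
        if chunk.name not in referenced:
            chunk.unlink()
            removed.append(chunk.name)
    return sorted(removed)


def write_index(datastore: Path, timestamp: str, digests: list) -> None:
    """Create a backup folder holding one index file that lists its chunks."""
    folder = datastore / "vm" / "100" / timestamp
    folder.mkdir(parents=True)
    (folder / "index.txt").write_text("\n".join(digests))


datastore = Path(tempfile.mkdtemp())
(datastore / ".chunks").mkdir()
for name in ["aa", "bb", "cc", "dd"]:  # stand-ins for real checksums
    (datastore / ".chunks" / name).write_bytes(b"data")
write_index(datastore, "day1", ["aa", "bb", "cc", "dd"])
write_index(datastore, "day2", ["aa", "bb", "cc"])

shutil.rmtree(datastore / "vm" / "100" / "day1")  # deleting the backup: index only
before = sorted(p.name for p in (datastore / ".chunks").iterdir())
removed = garbage_collect(datastore)              # chunks go away only now
after = sorted(p.name for p in (datastore / ".chunks").iterdir())
print(before, removed, after)
```

Deleting the backup folder leaves all four chunk files in place; only the later garbage collection run removes the one that no index lists anymore.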

Maybe another way to describe it: It does full backups, but only transfers incremental data changes.
 
Oh, I think I got it now :-D
I had been thinking of the backup entries as backup files, just like it is with PVE out of the box.
To repeat it in my own words:
In the chunk directory, my data is stored as chunks; the "file" I see in the list is practically just a table of contents.
By deleting the table I only delete the chunks that are not in any other table.
That is pretty cool!
 
exactly :D
 
