Database backup deduplication

Piotr K

Hi.

I'm running some VMs which are not on PVE, so I do not have direct access to snapshots etc.

Now for the backup task I mount an additional volume (via cloud API), then I dump the MySQL databases and compress them on the fly with gzip, resulting in about 10 GB of compressed archives. Then I start proxmox-backup-client like below and send the backup of this volume to my PBS server.

Code:
proxmox-backup-client backup backup.pxar:/mnt/volume --repository ..
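
For context, the dump step before that looks roughly like this (database and file names are just examples):

Code:
# one consistent dump per database, compressed on the fly
mysqldump --single-transaction mydb | gzip --fast > /mnt/volume/mydb.sql.gz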

Most of the database data stays the same between backups; the tables just get new log data appended.

How will deduplication work in this case?

Can I somehow check how much space each backup takes after deduplication?

Should I use a different process instead?
 
Dedup always works like that: identical chunks are only stored once.

You only see a dedup ratio for the whole datastore in the GUI, not per backup.
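
You can also query it from the client; something like this should work (the repository string is an example, and the exact subcommands depend on your client version):

Code:
# datastore-wide usage (total/used/available)
proxmox-backup-client status --repository root@pam@pbs.example.org:store1
# list snapshots; the sizes shown are logical, not after deduplication
proxmox-backup-client snapshot list --repository root@pam@pbs.example.org:store1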

Can you explain your setup in more detail?

Most tables in my databases are log-like: new rows only get appended at the end, and old data stays untouched. So I generate one dump per table, and these dumps are compressed with gzip --fast.

I am not sure how gzip (or another compressor) handles the files. Before compression the table dumps are ~95% identical, so deduplication could do a good job there. But I don't know how it looks after compression: are most parts of the archive still the same? If the compression algorithm also works in blocks, there is a good chance that the chunks of the archive stay identical and can be deduplicated successfully. Am I right or wrong?
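
One idea I found while researching, in case it helps: some gzip builds (GNU gzip 1.7+) and zstd have an --rsyncable flag that restarts the compression stream at content-defined points, so unchanged regions of the input should produce identical compressed output that the PBS chunker can then deduplicate. A sketch of what I mean (database and table names are placeholders):

Code:
# rsyncable compression keeps unchanged regions byte-identical across runs
mysqldump --single-transaction mydb logs_table \
    | gzip --fast --rsyncable > /mnt/volume/logs_table.sql.gz
# alternative: store plain dumps and let PBS compress each chunk itself
mysqldump --single-transaction mydb logs_table > /mnt/volume/logs_table.sql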