Database backup deduplication

Piotr K

Hi.

I'm running some VMs which are not on PVE, so I do not have direct access to snapshots etc.

Now for the backup task I mount an additional volume (via cloud API), then I dump the MySQL databases and compress them on the fly with gzip, resulting in about 10 GB of compressed archives. Then I start proxmox-backup-client like below and send the backup of this volume to my PBS server.

Code:
proxmox-backup-client backup backup.pxar:/mnt/volume --repository ..
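
For context, the dump step before that looks roughly like this (database and file names are just examples):

Code:
# one consistent dump per database, compressed on the fly
mysqldump --single-transaction mydb | gzip --fast > /mnt/volume/mydb.sql.gz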

Most of the database data stays the same between backups; the tables just get new log data appended.

How will deduplication work in this case?

Can I somehow check how much space each backup takes after deduplication?

Should I use a different process instead?
 
Dedup always works like that: identical chunks are only stored once.

You only see a dedup ratio for the whole datastore in the GUI, not per backup.
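
You can also query it from the client; something like this should work (the repository string is an example, and the exact subcommands depend on your client version):

Code:
# datastore-wide usage (total/used/available)
proxmox-backup-client status --repository root@pam@pbs.example.org:store1
# list snapshots; the sizes shown are logical, not after deduplication
proxmox-backup-client snapshot list --repository root@pam@pbs.example.org:store1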

Can you explain your setup in more detail?

Most tables in my databases are log-like: new rows only get appended at the end, and old data stays untouched. So I generate one dump per table, and these dumps are compressed with gzip --fast.

I am not sure how gzip (or another compressor) handles the files. Before compression the table dumps are ~95% identical, so deduplication could do a good job there. But I don't know how it looks after compression: are most parts of the archive still the same? If the compression algorithm also works in blocks, there is a good chance that the chunks of the archive stay identical and can be deduplicated successfully. Am I right or wrong?
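
One idea I found while researching, in case it helps: some gzip builds (GNU gzip 1.7+) and zstd have an --rsyncable flag that restarts the compression stream at content-defined points, so unchanged regions of the input should produce identical compressed output that the PBS chunker can then deduplicate. A sketch of what I mean (database and table names are placeholders):

Code:
# rsyncable compression keeps unchanged regions byte-identical across runs
mysqldump --single-transaction mydb logs_table \
    | gzip --fast --rsyncable > /mnt/volume/logs_table.sql.gz
# alternative: store plain dumps and let PBS compress each chunk itself
mysqldump --single-transaction mydb logs_table > /mnt/volume/logs_table.sql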