There is no dedup database.

For dynamic indices, a chunking algorithm splits the input stream into chunks, which are then hashed. If a chunk with hash A already exists, it can be re-used: depending on the exact circumstances, it is either not uploaded or not stored a second time. For fixed indices, chunking simply happens every X bytes (4 MiB for now), and the same deduplication mechanism (hash already known -> no need to store it twice) is employed.

A datastore, consisting of the .chunks directory containing the data and the vm/ct/host/.. directories containing the snapshot metadata and indices, is self-contained.
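To illustrate the fixed-index case, here is a minimal sketch (not the actual implementation) of fixed-size chunking with hash-based deduplication: the chunk store is keyed by digest, a chunk is only stored if its digest is not already present, and the index is just the ordered list of digests. The use of SHA-256, a dict as the chunk store, and the function names are assumptions for illustration only; the 4 MiB chunk size is taken from the text above.

```python
# Illustrative sketch only, not the real backup implementation.
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB fixed chunks, as described above

def store_fixed(data: bytes, chunk_store: dict) -> list:
    """Split `data` into fixed-size chunks and store each chunk under
    its digest, skipping chunks whose digest is already known.
    Returns the index: the ordered list of chunk digests."""
    index = []
    for off in range(0, len(data), CHUNK_SIZE):
        chunk = data[off:off + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in chunk_store:  # hash already known -> no second copy
            chunk_store[digest] = chunk
        index.append(digest)
    return index

store = {}
a = store_fixed(b"A" * CHUNK_SIZE * 3, store)                      # three identical chunks
b = store_fixed(b"A" * CHUNK_SIZE * 2 + b"B" * CHUNK_SIZE, store)  # two re-used, one new
print(len(store))  # 2 -> only two distinct chunks ever stored
```

A dynamic index works the same way once chunks exist; the only difference is that chunk boundaries come from a content-defined chunking algorithm instead of fixed offsets, so identical data is still found even when it shifts position in the stream.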