There is no dedup database.

For dynamic indices, a chunking algorithm splits the input stream into chunks, which are then hashed. If a chunk with hash A already exists, it can be re-used: depending on the exact circumstances, it is either not uploaded or not stored a second time. For fixed indices, chunking simply happens every X bytes (4 MiB for now), and the same deduplication mechanism (hash already known -> no need to store it twice) is employed.

A datastore, consisting of the .chunks directory containing the data and the vm/ct/host/.. directories containing the snapshot metadata and indices, is self-contained.
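To illustrate the fixed-index case, here is a minimal sketch (not the actual implementation) of fixed-size chunking with hash-based deduplication: the chunk store is keyed by digest, a chunk is only stored if its digest is not already present, and the index is just the ordered list of digests. The use of SHA-256, a dict as the chunk store, and the function names are assumptions for illustration only; the 4 MiB chunk size is taken from the text above.

```python
# Illustrative sketch only, not the real backup implementation.
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB fixed chunks, as described above

def store_fixed(data: bytes, chunk_store: dict) -> list:
    """Split `data` into fixed-size chunks and store each chunk under
    its digest, skipping chunks whose digest is already known.
    Returns the index: the ordered list of chunk digests."""
    index = []
    for off in range(0, len(data), CHUNK_SIZE):
        chunk = data[off:off + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in chunk_store:  # hash already known -> no second copy
            chunk_store[digest] = chunk
        index.append(digest)
    return index

store = {}
a = store_fixed(b"A" * CHUNK_SIZE * 3, store)                      # three identical chunks
b = store_fixed(b"A" * CHUNK_SIZE * 2 + b"B" * CHUNK_SIZE, store)  # two re-used, one new
print(len(store))  # 2 -> only two distinct chunks ever stored
```

A dynamic index works the same way once chunks exist; the only difference is that chunk boundaries come from a content-defined chunking algorithm instead of fixed offsets, so identical data is still found even when it shifts position in the stream.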