PBS: Less space on tape used than expected

phip

Aug 13, 2024
Hi all,

we're regularly taking backup snapshots of some VMs with very large disks (20TB), which go to PBS and then to daily and weekly rotated tapes. While everything appears to run smoothly, I'm a bit confused about the reported size:

  • The VM disk has a maximum size of 20T
  • Inside the VM, df reports 14T used
  • The disk is stored on ZFS with compression enabled and has a reported USEDDS of 8.4T
  • The PBS inventory reports a "Bytes used" of only 4.7T for the tape containing a snapshot of this and some other VMs' disks

The media pool has an allocation policy of "always" and a retention policy of "2 days". The backup job is configured with "Latest only", namespace "Root" and max depth "Full".

The data belongs to a MySQL database, which certainly contains a lot of redundancy, as the compression ratio on ZFS shows quite clearly. However, I find it hard to believe that PBS deduplication and compression could save roughly another 50% of the space on top of what ZFS compression already provides.
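
To put numbers on this, here is a quick back-of-the-envelope calculation using only the sizes listed above (the variable names are just mine). Keep in mind that the tape also holds other VMs' disks, so the ratio for this one disk alone would be even lower than what is computed here.

```python
# Sizes in TB, taken from the figures reported above.
df_used = 14.0    # used space according to df inside the VM
zfs_used = 8.4    # USEDDS on the compressed ZFS dataset
tape_used = 4.7   # "Bytes used" for the whole tape in the PBS inventory

zfs_ratio = zfs_used / df_used      # what ZFS compression leaves
tape_vs_zfs = tape_used / zfs_used  # what PBS leaves on top of ZFS
tape_vs_df = tape_used / df_used    # overall, relative to df

print(f"ZFS vs. df:   {zfs_ratio:.2f}")    # 0.60
print(f"tape vs. ZFS: {tape_vs_zfs:.2f}")  # 0.56
print(f"tape vs. df:  {tape_vs_df:.2f}")   # 0.34
```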

While the only reliable way to check whether everything is fine is to restore such a snapshot to a new PBS and then verify the image, I'd also like to learn about others' experiences in this regard:

  • Did others also observe such a ratio between compressed disk size and resulting deduplicated size on tape?
  • Could it be that, with the given media pool and backup job settings, the snapshot is somehow distributed over multiple tapes, which would all be needed for a successful restoration?
  • Is there a way to check in PBS how much space a snapshot takes after deduplication, i.e., the total size of all individual chunks that it contains? I think it wouldn't be too hard to extract this from the fidx file, but maybe such a tool is already available?
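
For the last point, this is roughly what I have in mind: a minimal, untested-against-real-data sketch, not an official tool. It assumes the fixed index layout as I understand it from the PBS file format documentation (a 4096-byte header with the image size and chunk size as little-endian u64 values at byte offset 64, followed by 32-byte SHA-256 digests) and that chunks live under `.chunks/<first four hex digits>/<hex digest>` in the datastore. The function names and paths are my own invention.

```python
import os
import struct

HEADER_SIZE = 4096  # assumed length of the .fidx header
DIGEST_SIZE = 32    # SHA-256 digest per chunk
SIZE_OFFSET = 64    # assumed: magic[8] + uuid[16] + ctime[8] + index_csum[32]


def read_fidx(path):
    """Return (image_size, chunk_size, digests) from a fixed index file."""
    with open(path, "rb") as f:
        header = f.read(HEADER_SIZE)
        body = f.read()
    if len(header) != HEADER_SIZE:
        raise ValueError("file is shorter than the assumed header size")
    if len(body) % DIGEST_SIZE != 0:
        raise ValueError("digest area is not a multiple of 32 bytes")
    image_size, chunk_size = struct.unpack_from("<QQ", header, SIZE_OFFSET)
    digests = [body[i:i + DIGEST_SIZE] for i in range(0, len(body), DIGEST_SIZE)]
    return image_size, chunk_size, digests


def snapshot_chunk_usage(fidx_path, datastore_root):
    """Sum the on-disk size of every distinct chunk the index references."""
    image_size, chunk_size, digests = read_fidx(fidx_path)
    unique = set(digests)  # a chunk referenced many times is stored once
    stored = 0
    for digest in unique:
        name = digest.hex()
        # Assumed chunk store layout: .chunks/<4 hex digits>/<full digest>
        stored += os.path.getsize(
            os.path.join(datastore_root, ".chunks", name[:4], name)
        )
    return {
        "image_size": image_size,
        "chunk_size": chunk_size,
        "referenced_chunks": len(digests),
        "unique_chunks": len(unique),
        "stored_bytes": stored,
    }
```

Calling `snapshot_chunk_usage("/path/to/drive-scsi0.img.fidx", "/path/to/datastore")` (both paths are placeholders) would give the size of the stored chunk files for a single snapshot. I would expect that to be close to, but not necessarily identical with, what ends up on tape for that snapshot.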
Thanks!