Are there any caveats on using "Metadata Detection Mode"?

Hi there,
I am running PBS to back up some CTs and VMs running on PVE. As the storage holds over 1.5 TB and sits on classical, rotating HDDs, a backup job runs nearly 6 hours. I used the default settings for the backup job, with "Default" as the "Change detection mode".
Now I tried "Metadata" as the change detection mode, and the backup was (from the second run onwards) much faster.

This brings me to the question of whether using "Metadata" as the change detection mode is safe, or whether there are any caveats that could kill my backups. As this seems much better from the perspective of speed, I wonder whether not reading all blocks could mean losing data.

Can you give me a hint on that? Thanks a lot!
 
Hi,
This brings me to the question of whether using "Metadata" as the change detection mode is safe, or whether there are any caveats that could kill my backups. As this seems much better from the perspective of speed, I wonder whether not reading all blocks could mean losing data.
there is no danger of "killing" your backup. But there are some caveats:
  • Files might be reused instead of re-encoded if the file data changed but the metadata did not. E.g. if a file was edited, but the file size remained the same and the mtime of the file was restored after the change, the change detection mode will see this as an unchanged file and reuse it (a minimal sketch follows this list). That is why we also provide the change detection mode data, which always reads all files again.
  • The change detection mode might reuse existing chunks partially, leading to some padding. E.g. a file contained within a single chunk vanished between backup runs, but the chunk is reused. This additional padding can become wasted space once the previous snapshot actually referencing that file is pruned. This also has implications for sensitive data; please see the notes in https://pbs.proxmox.com/docs/maintenance.html#pruning. The client tries to minimize this by aligning chunk boundaries with file boundaries when possible and by re-encoding smaller files in some situations, even though they might not have changed.
  • Restore times can be slower if there is additional padding, which the client has to download in any case because it is part of a chunk.
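
To make the first caveat concrete, here is a minimal sketch (the path and timestamps are made up for illustration): a file edited in place, whose size stays constant and whose mtime is restored afterwards, looks unchanged to metadata mode.

    # inside the CT: note the file's current mtime and size
    stat -c '%y %s' /data/app/config.bin

    # overwrite 16 bytes in place; the size does not change, but the mtime does
    dd if=/dev/urandom of=/data/app/config.bin bs=1 count=16 conv=notrunc

    # restore the previously noted mtime
    touch -d '2025-01-03 10:00:00' /data/app/config.bin

    # size and mtime now match the last snapshot, so metadata mode would
    # reuse the old file entry and the edit would not land in the new backup

If you suspect such in-place edits, running the job (or an occasional extra backup) with change detection mode data forces all file contents to be read again.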
 
One caveat I'm wondering about: my scheduled jobs run with metadata change detection mode, but when I hit "Backup now" for a manual backup, I believe that runs in default mode, which breaks the chain and causes the backups to take longer and use more space, correct? Is there some way to make manual backups on Proxmox use metadata change detection mode?
 
I do have that option when manually backing up a CT
Make sure you are using a recent PVE version and that the target storage is a PBS datastore.
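
For manual backups from the shell, a hedged sketch (assuming a recent PVE where vzdump supports the pbs-change-detection-mode option; check man vzdump on your version, and note the storage name here is made up):

    # one-off manual backup of CT 101 to a PBS storage with metadata mode
    vzdump 101 --storage my-pbs --pbs-change-detection-mode metadata

    # or make it the default for all vzdump runs by adding a line
    # to /etc/vzdump.conf:
    #   pbs-change-detection-mode: metadata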
 
Ah, I see it on containers but not on VMs. Is it just not implemented yet for VMs? And is there a workaround?
 
Exactly, this does not apply to VMs. VMs are backed up at the block level, not the filesystem level. For VMs there is the dirty bitmap maintained by the QEMU process, which indicates which blocks of the disk image have changed since the last backup. Therefore VM backups do not need this change detection mode; they are already pretty fast :)
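
You can see the dirty bitmap at work in the backup task log of a running VM: as long as the bitmap is still valid (the VM was not stopped or migrated in between), a recent PVE logs a line roughly like the following, and only the dirty blocks are read and uploaded.

    INFO: using fast incremental mode (dirty-bitmap), 1.2 GiB dirty of 32.0 GiB total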
 
Hi,

there is no danger of "killing" your backup. But there are some caveats:
  • Files might be reused instead of re-encoded if the file data changed but the metadata did not. E.g. if a file was edited, but the file size remained the same and the mtime of the file was restored after the change, the change detection mode will see this as an unchanged file and reuse it. That is why we also provide the change detection mode data, which always reads all files again.


Hi,
I can see this happening with regular files (editing a text file without changing its mtime), but is this even possible with backups of LXCs/VMs?
 
Hi,
I can see this happening with regular files (editing a text file without changing its mtime), but is this even possible with backups of LXCs/VMs?
Yes, this can happen if the mtime is deliberately restored to its previous value after changing a file's content (which updates the mtime), for a file located on the host or container filesystem being backed up. But this is not typically the case, unless done on purpose or by some scripting/tooling. Also, the size must be unchanged as well; differences in size are detected.

But to emphasize this again: the change detection mode only concerns host and LXC backups, not VM backups. VM backups are handled differently (at the block level), the client being completely agnostic to the contents of the block device. There might not even be a filesystem on the block device at all.
 
Ah, so when the backup job is set to metadata change detection in the advanced settings, that actually only applies to the containers in the job, not the VMs?