PBS client Change detection mode

Sky_

So, there are these new config flags since version 3.x

I just tested it with a big LXC container (backup via the Proxmox GUI) and the difference after the initial backup with metadata mode was night and day.
That LXC normally takes about 20 minutes to back up and now it's done in 14 seconds.

The metadata mode is obviously extremely fast, but the question for me is: why even offer the data mode then?

Is there anything to worry about in terms of safety? Is the chance of incomplete backups higher with the metadata mode?

I really couldn't find much info other than the official docs; I guess not many people have noticed it so far. I'd be interested if someone has some more insights on this.
 
Hi,
So, there are these new config flags since version 3.x
Allow me to be more precise here, so other users do not wonder why they do not see this mode: the change-detection-mode was introduced with proxmox-backup-client version 3.2.5 and is backwards compatible with PBS hosts running a previous 3.x version, although some features are not available for snapshots created with the new mode on older server versions (e.g. browsing the contents via the WebUI).
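
A minimal invocation sketch with the standalone client (repository and source path are placeholders here; the flag itself is the same one that shows up in vzdump's client invocation in the task logs further down):

Code:
# placeholder repository and source path; the mode can be legacy (the default), data or metadata
proxmox-backup-client backup root.pxar:/path/to/source \
    --repository user@pbs@pbs.example.com:datastore \
    --change-detection-mode metadata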

I just tested it with a big LXC container (backup via the Proxmox GUI) and the difference after the initial backup with metadata mode was night and day.
That LXC normally takes about 20 minutes to back up and now it's done in 14 seconds.
Glad to see that you get such a nice performance improvement. If you are willing, please share the task log for one of your backups with the new mode; this would provide us with some initial feedback regarding size, reused vs. re-encoded files, possible paddings etc.

The metadata mode is obviously extremely fast, but the question for me is: why even offer the data mode then?
There are mainly two reasons to provide this mode as well:
  • Hosts/CTs with high-frequency data changes will not see an improvement with the metadata change detection mode; in the worst case it might even perform worse than the current default mode. This stems from the fact that the metadata mode uses the previous metadata archive as reference to see if a file can be reused or if its contents have to be re-chunked (because the metadata changed). So if most or many of the files change frequently, there is the additional unwanted overhead of these lookups. The data mode lets you gain the benefits of the improved compressibility of the metadata archive and drops the additional catalog, which is no longer required, while re-chunking all files without the reference lookup overhead.
  • The metadata mode might introduce some wasted space (paddings) because of the reuse of already created and uploaded chunks on the server. This stems from the fact that a chunk boundary is not guaranteed to be aligned with a file boundary. So if a chunk can be re-indexed, but also contains contents from a file which did change and was therefore re-chunked, this chunk carries some data that is not relevant for this snapshot (although it is relevant for the previous one). Paddings are limited internally to a threshold, but can still take up unwanted space. The data mode therefore allows you to recreate a snapshot without paddings at any point in time, as sketched below.
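
For example, if paddings have accumulated over many metadata-mode runs, a single run in data mode re-chunks everything from scratch (placeholders again; only the flag value differs):

Code:
# placeholder repository and source path; data mode re-chunks all file contents,
# creating a padding-free snapshot without reference lookups
proxmox-backup-client backup root.pxar:/path/to/source \
    --repository user@pbs@pbs.example.com:datastore \
    --change-detection-mode data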

Please note that backup snapshots are still self-contained (for all 3 modes); the metadata mode only needs the previous snapshot's metadata archive during snapshot creation, not for restore etc.
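
A restore of a metadata-mode snapshot therefore looks like for any other snapshot and only reads the referenced snapshot itself; a sketch with placeholder snapshot, target and repository (I am assuming the client resolves the split mpxar/ppxar pair under the usual root.pxar archive name):

Code:
# placeholder snapshot/target/repository; the previous snapshot is not needed here
proxmox-backup-client restore "ct/100/2024-01-01T00:00:00Z" root.pxar /restore/target \
    --repository user@pbs@pbs.example.com:datastore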

Is there anything to worry about in terms of safety? Is the chance of incomplete backups higher with the metadata mode?
This feature is currently flagged as experimental, not because it does not work as advertised, but because there might be some edge cases with respect to performance and usability that we are not yet aware of and will only find out about with wider adoption. And backups are a critical part of your infrastructure, one you want to be able to fully rely on.

This is also why any form of testing and feedback is highly appreciated!

It is however strongly suggested to treat this as experimental for the time being; I recommend also keeping backups made with the current default mode at hand.

I really couldn't find much info other than the official docs; I guess not many people have noticed it so far. I'd be interested if someone has some more insights on this.
The documentation is currently still limited, but will be improved upon, also based on feedback, which might show what needs some more in-depth explanation.

I hope this clarifies your questions; I am happy about any further feedback!

Edit: fixed some typos
 
Thanks Chris for the detailed write-up.
I will paste the outputs for one LXC down below :)
So since you recommend doing some backups with the current default mode as well, it would be nice to have the possibility to choose the change-detection-mode individually for each backup job through the GUI (I guess I could still edit the jobs manually in /etc/pve/jobs.cfg?). Since I for example do a stop backup every Sunday anyway, I could then configure this job to use the legacy mode and have the backup job that runs through the week use the metadata mode.


The container seen here is a Debian 12 LXC with Prometheus/InfluxDB/Grafana (data here should just grow, but not really change).
I may be able to show examples for way bigger containers (~300 GiB | don't laugh, it's a homelab :p) in the next days, as I'm converting a fileserver from a VM to an LXC; I can share the results here too if wanted.

Legacy mode:
Code:
INFO: Starting Backup of VM 116 (lxc)
INFO: Backup started at 2024-06-26 01:06:12
INFO: status = running
INFO: CT Name: Grafana
INFO: including mount point rootfs ('/') in backup
INFO: including mount point mp0 ('/mnt/data') in backup
INFO: backup mode: snapshot
INFO: bandwidth limit: 200000 KB/s
INFO: ionice priority: 7
INFO: suspend vm to make snapshot
INFO: create storage snapshot 'vzdump'
  Logical volume "snap_vm-116-disk-0_vzdump" created.
  Logical volume "snap_vm-116-disk-2_vzdump" created.
INFO: resume vm
INFO: guest is online again after 1 seconds
INFO: creating Proxmox Backup Server archive 'ct/116/2024-06-25T23:06:12Z'
INFO: set max number of entries in memory for file-based backups to 1048576
INFO: run: lxc-usernsexec -m u:0:100000:65536 -m g:0:100000:65536 -- /usr/bin/proxmox-backup-client backup --crypt-mode=none pct.conf:/var/tmp/vzdumptmp1806495_116/etc/vzdump/pct.conf fw.conf:/var/tmp/vzdumptmp1806495_116/etc/vzdump/pct.fw root.pxar:/mnt/vzsnap0 --include-dev /mnt/vzsnap0/./ --include-dev /mnt/vzsnap0/./mnt/data --skip-lost-and-found --exclude=/tmp/?* --exclude=/var/tmp/?* --exclude=/var/run/?*.pid --backup-type ct --backup-id 116 --backup-time 1719356772 --entries-max 1048576 --repository dienst-backup-home@pbs@10.0.60.10:IRONWOLF_4TB_BOTTOM --ns HOME-SKY/Prod
INFO: Starting backup: [HOME-SKY/Prod]:ct/116/2024-06-25T23:06:12Z
INFO: Client name: Proxmox
INFO: Starting backup protocol: Wed Jun 26 01:06:13 2024
INFO: Downloading previous manifest (Tue Jun 25 01:05:45 2024)
INFO: Upload config file '/var/tmp/vzdumptmp1806495_116/etc/vzdump/pct.conf' to 'dienst-backup-home@pbs@10.0.60.10:8007:IRONWOLF_4TB_BOTTOM' as pct.conf.blob
INFO: Upload config file '/var/tmp/vzdumptmp1806495_116/etc/vzdump/pct.fw' to 'dienst-backup-home@pbs@10.0.60.10:8007:IRONWOLF_4TB_BOTTOM' as fw.conf.blob
INFO: Upload directory '/mnt/vzsnap0' to 'dienst-backup-home@pbs@10.0.60.10:8007:IRONWOLF_4TB_BOTTOM' as root.pxar.didx
INFO: root.pxar: had to backup 4.145 GiB of 78.65 GiB (compressed 2.168 GiB) in 433.24s
INFO: root.pxar: average backup speed: 9.798 MiB/s
INFO: root.pxar: backup was done incrementally, reused 74.505 GiB (94.7%)
INFO: Uploaded backup catalog (1000.342 KiB)
INFO: Duration: 446.27s
INFO: End Time: Wed Jun 26 01:13:39 2024
INFO: adding notes to backup
INFO: cleanup temporary 'vzdump' snapshot
  Logical volume "snap_vm-116-disk-0_vzdump" successfully removed.
  Logical volume "snap_vm-116-disk-2_vzdump" successfully removed.
INFO: Finished Backup of VM 116 (00:07:30)
INFO: Backup finished at 2024-06-26 01:13:42

Metadata mode (second backup with metadata mode to make use of the previous archive):
Code:
INFO: Starting Backup of VM 116 (lxc)
INFO: Backup started at 2024-07-04 01:04:10
INFO: status = running
INFO: CT Name: Grafana
INFO: including mount point rootfs ('/') in backup
INFO: including mount point mp0 ('/mnt/data') in backup
INFO: backup mode: snapshot
INFO: bandwidth limit: 200000 KB/s
INFO: ionice priority: 7
INFO: suspend vm to make snapshot
INFO: create storage snapshot 'vzdump'
  Logical volume "snap_vm-116-disk-0_vzdump" created.
  Logical volume "snap_vm-116-disk-2_vzdump" created.
INFO: resume vm
INFO: guest is online again after 1 seconds
INFO: creating Proxmox Backup Server archive 'ct/116/2024-07-03T23:04:10Z'
INFO: set max number of entries in memory for file-based backups to 1048576
INFO: run: lxc-usernsexec -m u:0:100000:65536 -m g:0:100000:65536 -- /usr/bin/proxmox-backup-client backup --crypt-mode=none pct.conf:/var/tmp/vzdumptmp2609800_116/etc/vzdump/pct.conf fw.conf:/var/tmp/vzdumptmp2609800_116/etc/vzdump/pct.fw root.pxar:/mnt/vzsnap0 --include-dev /mnt/vzsnap0/./ --include-dev /mnt/vzsnap0/./mnt/data --skip-lost-and-found --exclude=/tmp/?* --exclude=/var/tmp/?* --exclude=/var/run/?*.pid --backup-type ct --backup-id 116 --backup-time 1720047850 --change-detection-mode metadata --entries-max 1048576 --repository dienst-backup-home@pbs@10.0.60.10:IRONWOLF_4TB_BOTTOM --ns HOME-SKY/Prod
INFO: Starting backup: [HOME-SKY/Prod]:ct/116/2024-07-03T23:04:10Z
INFO: Client name: Proxmox
INFO: Starting backup protocol: Thu Jul  4 01:04:11 2024
INFO: Downloading previous manifest (Wed Jul  3 01:31:56 2024)
INFO: Upload config file '/var/tmp/vzdumptmp2609800_116/etc/vzdump/pct.conf' to 'dienst-backup-home@pbs@10.0.60.10:8007:IRONWOLF_4TB_BOTTOM' as pct.conf.blob
INFO: Upload config file '/var/tmp/vzdumptmp2609800_116/etc/vzdump/pct.fw' to 'dienst-backup-home@pbs@10.0.60.10:8007:IRONWOLF_4TB_BOTTOM' as fw.conf.blob
INFO: Upload directory '/mnt/vzsnap0' to 'dienst-backup-home@pbs@10.0.60.10:8007:IRONWOLF_4TB_BOTTOM' as root.mpxar.didx
INFO: Using previous index as metadata reference for 'root.mpxar.didx'
INFO: Change detection summary:
INFO:  - 38535 total files (6 hardlinks)
INFO:  - 37180 unchanged, reusable files with 80.889 GiB data
INFO:  - 1349 changed or non-reusable files with 1.093 GiB data
INFO:  - 34.074 MiB padding in 19 partially reused chunks
INFO: root.ppxar: reused 80.922 GiB from previous snapshot for unchanged files (17615 chunks)
INFO: root.ppxar: had to backup 1.039 GiB of 82.014 GiB (compressed 531.89 MiB) in 12.56 s (average 84.679 MiB/s)
INFO: root.ppxar: backup was done incrementally, reused 80.976 GiB (98.7%)
INFO: root.mpxar: had to backup 6.608 MiB of 6.608 MiB (compressed 1.138 MiB) in 12.61 s (average 536.399 KiB/s)
INFO: Duration: 20.78s
INFO: End Time: Thu Jul  4 01:04:32 2024
INFO: adding notes to backup
INFO: cleanup temporary 'vzdump' snapshot
  Logical volume "snap_vm-116-disk-0_vzdump" successfully removed.
  Logical volume "snap_vm-116-disk-2_vzdump" successfully removed.
INFO: Finished Backup of VM 116 (00:00:23)
INFO: Backup finished at 2024-07-04 01:04:33
 
So since you recommend doing some backups with the current default mode as well, it would be nice to have the possibility to choose the change-detection-mode individually for each backup job through the GUI (I guess I could still edit the jobs manually in /etc/pve/jobs.cfg?). Since I for example do a stop backup every Sunday anyway, I could then configure this job to use the legacy mode and have the backup job that runs through the week use the metadata mode.
Hi, thank you for sharing your task logs.

Regarding the backup jobs, you can already set the mode to be used for each individual job in the advanced settings in the PVE WebUI. Navigate to Datacenter > Backups > <select-job> > Edit > Advanced, or set this directly when creating a new backup job. To see this option, you need pve-manager version 8.2.4 or higher.
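
If you prefer editing /etc/pve/jobs.cfg directly, as you mentioned, the per-job entry should look roughly like this sketch (I am assuming the option is named pbs-change-detection-mode here, like the vzdump parameter; best compare with a job saved via the WebUI):

Code:
vzdump: backup-weekdays
    schedule mon..sat 01:00
    storage pbs-home
    mode snapshot
    pbs-change-detection-mode metadata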

Note that creating a default (legacy) backup in between the backups with metadata mode will break the chain if you back up into the same backup group (given by the container ID). This is because the previous backup must contain a metadata archive for the change detection lookups.
So it is best to add a dedicated namespace on the PBS host for each backup mode and expose it by adding it as an additional PBS storage on the PVE host, for example as sketched below.
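
On the PVE side this could look like the following sketch with pvesm (the storage ID and namespace name are made up; server and datastore are taken from your logs):

Code:
# hypothetical storage ID and namespace; create one PBS storage entry per backup mode
pvesm add pbs pbs-home-metadata \
    --server 10.0.60.10 \
    --datastore IRONWOLF_4TB_BOTTOM \
    --username dienst-backup-home@pbs \
    --namespace HOME-SKY/Prod-metadata \
    --fingerprint <pbs-fingerprint>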

Also, please do note that chunks cannot be deduplicated as efficiently when storing backups created with different modes on the same datastore, since the file format for the default (legacy) mode differs from the one used by the data and metadata modes (pxar file format version 2, which supports the split data/metadata archive).

So keep an eye on the storage usage and set up quotas so you will not run out of space.
 
Oh, you're totally right, I don't know how I missed the GUI option :D

Yeah, I noticed the dedup difference after the first run during the sync to my offsite; it took a bit longer than usual :)

I will make new namespaces for the different modes as suggested :)

Thanks again for the detailed reply :)
 
