PBS client Change detection mode

Sky_ · Jul 3, 2024

So, these new config flags have been around since version 3.x.

I just tested it with a big LXC container (backup via the Proxmox GUI), and the difference after the initial backup with metadata mode was like night and day.
That LXC normally takes about 20 minutes to back up, and now it's done in 14 seconds.

The metadata mode is obviously extremely fast but the question for me is: why even offer the data mode then?

Is there anything to worry about in terms of safety? Is the chance of incomplete backups higher with the metadata mode?

I really couldn't find much info other than the official docs; I guess not many people have noticed it so far. I would be interested if someone has more insights on this.

Chris · Jul 4, 2024

Hi,

Sky_ said:
So, these new config flags have been around since version 3.x.

Allow me to be more precise here, so other users do not wonder why they cannot see this mode: the change-detection-mode option was introduced with proxmox-backup-client version 3.2.5. It is backwards compatible with PBS hosts running a previous 3.x version, although some features (e.g. browsing the contents via the WebUI) are not available on older server versions for snapshots created with the new mode.

Sky_ said:
I just tested it with a big LXC container (backup via the Proxmox GUI), and the difference after the initial backup with metadata mode was like night and day.
That LXC normally takes about 20 minutes to back up, and now it's done in 14 seconds.

Glad to see that you get such a nice performance improvement. If you are willing, please share the task log of one of your backups with the new mode; this would give us some initial feedback regarding size, reused vs. re-encoded files, possible padding, etc.

Sky_ said:
The metadata mode is obviously extremely fast but the question for me is: why even offer the data mode then?

There are mainly two reasons to provide this mode as well:

1) Hosts/CTs with high-frequency data changes will not see an improvement with the metadata mode; in the worst case it might perform even worse than the current default mode. This is because the metadata mode uses the previous metadata archive as a reference to decide whether a file can be reused or whether its contents have to be rechunked (because the metadata changed). So if most or many of the files change frequently, these lookups are unwanted additional overhead. The data mode gives you the benefits of the better compressibility of the metadata archive and of no longer needing the additional catalog, while rechunking all files without the reference lookup overhead.
2) The metadata mode might introduce some wasted space (padding) because it reuses chunks already created and uploaded to the server. A chunk boundary is not guaranteed to be aligned with a file boundary. So if a chunk is reindexed but also contains contents of a file which did change and is therefore rechunked, that chunk carries some data not relevant for this snapshot (although relevant for the previous one). Padding is kept below an internal threshold, but can still take up unwanted space. The data mode therefore allows you to recreate a snapshot without padding at any point in time.
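The lookup described in the first reason can be pictured with a small Python sketch. This is not PBS code: `FileMeta` and `plan_backup` are hypothetical names chosen for illustration, and the real client compares more attributes and works on chunk indexes rather than dictionaries. It only shows the idea of deciding "reuse" vs. "rechunk" from metadata alone, without reading file contents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileMeta:
    """Hypothetical subset of the metadata stored per file."""
    size: int
    mtime: int
    mode: int


def plan_backup(previous, current):
    """Split paths into (reusable, rechunk) by comparing metadata only."""
    reusable, rechunk = [], []
    for path, meta in sorted(current.items()):
        if previous.get(path) == meta:
            # Metadata unchanged: the chunks of the previous snapshot can be re-indexed.
            reusable.append(path)
        else:
            # New file or changed metadata: the contents must be read and rechunked.
            rechunk.append(path)
    return reusable, rechunk


previous = {
    "/etc/hosts": FileMeta(120, 1000, 0o644),
    "/var/log/syslog": FileMeta(5000, 1000, 0o640),
}
current = {
    "/etc/hosts": FileMeta(120, 1000, 0o644),
    "/var/log/syslog": FileMeta(7000, 2000, 0o640),
    "/srv/new.bin": FileMeta(10, 2000, 0o644),
}

reusable, rechunk = plan_backup(previous, current)
print("reusable:", reusable)
print("rechunk:", rechunk)
```

If nearly every file ends up in `rechunk`, the comparison work buys nothing, which is the situation where the data mode is the better fit.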

Please note that backup snapshots are still self-contained (for all 3 modes); the metadata mode only needs the previous snapshot's metadata archive during snapshot creation, not for restore, etc.

Sky_ said:
Is there anything to worry about in terms of safety? Is the chance of incomplete backups higher with the metadata mode?

This feature is currently flagged as experimental, not because it does not work as advertised, but because there might be some edge cases with respect to performance and usability we might not yet be aware of and find out with wider adoption. And backups are a critical part of your infrastructure, which you want to fully be able to rely on.

This is also why any form of testing and feedback is highly appreciated!

It is, however, strongly suggested to treat this as experimental for the time being; I recommend also keeping backups made with the current default mode at hand.

Sky_ said:
I really couldn't find much info other than the official docs; I guess not many people have noticed it so far. I would be interested if someone has more insights on this.

The documentation is currently still limited, but it will be improved, also based on feedback that shows what needs a more in-depth explanation.

I hope this answers your questions, and I am happy to receive further feedback!

Edit: fixed some typos

Sky_ · Jul 4, 2024

Thanks, Chris, for the detailed write-up.
I will paste the outputs for one LXC down below.

So since you recommend doing some backups with the current default mode as well, it would be nice to be able to choose the change-detection-mode individually for each backup job through the GUI (I guess I could still edit the jobs manually in /etc/pve/jobs.cfg?). Since I, for example, do a stop backup every Sunday anyway, I could configure that job to use the legacy mode and have the backup job that runs through the week use the metadata mode.

The container seen here is a Debian 12 LXC with Prometheus/InfluxDB/Grafana (the data here should just grow but not really change).
I may be able to show examples for much bigger containers (~300 GiB, don't laugh, it's a homelab) in the next few days, as I'm converting a file server from a VM to an LXC. I can share the results here too if wanted.

Legacy mode:

Code:

INFO: Starting Backup of VM 116 (lxc)
INFO: Backup started at 2024-06-26 01:06:12
INFO: status = running
INFO: CT Name: Grafana
INFO: including mount point rootfs ('/') in backup
INFO: including mount point mp0 ('/mnt/data') in backup
INFO: backup mode: snapshot
INFO: bandwidth limit: 200000 KB/s
INFO: ionice priority: 7
INFO: suspend vm to make snapshot
INFO: create storage snapshot 'vzdump'
  Logical volume "snap_vm-116-disk-0_vzdump" created.
  Logical volume "snap_vm-116-disk-2_vzdump" created.
INFO: resume vm
INFO: guest is online again after 1 seconds
INFO: creating Proxmox Backup Server archive 'ct/116/2024-06-25T23:06:12Z'
INFO: set max number of entries in memory for file-based backups to 1048576
INFO: run: lxc-usernsexec -m u:0:100000:65536 -m g:0:100000:65536 -- /usr/bin/proxmox-backup-client backup --crypt-mode=none pct.conf:/var/tmp/vzdumptmp1806495_116/etc/vzdump/pct.conf fw.conf:/var/tmp/vzdumptmp1806495_116/etc/vzdump/pct.fw root.pxar:/mnt/vzsnap0 --include-dev /mnt/vzsnap0/./ --include-dev /mnt/vzsnap0/./mnt/data --skip-lost-and-found --exclude=/tmp/?* --exclude=/var/tmp/?* --exclude=/var/run/?*.pid --backup-type ct --backup-id 116 --backup-time 1719356772 --entries-max 1048576 --repository dienst-backup-home@pbs@10.0.60.10:IRONWOLF_4TB_BOTTOM --ns HOME-SKY/Prod
INFO: Starting backup: [HOME-SKY/Prod]:ct/116/2024-06-25T23:06:12Z
INFO: Client name: Proxmox
INFO: Starting backup protocol: Wed Jun 26 01:06:13 2024
INFO: Downloading previous manifest (Tue Jun 25 01:05:45 2024)
INFO: Upload config file '/var/tmp/vzdumptmp1806495_116/etc/vzdump/pct.conf' to 'dienst-backup-home@pbs@10.0.60.10:8007:IRONWOLF_4TB_BOTTOM' as pct.conf.blob
INFO: Upload config file '/var/tmp/vzdumptmp1806495_116/etc/vzdump/pct.fw' to 'dienst-backup-home@pbs@10.0.60.10:8007:IRONWOLF_4TB_BOTTOM' as fw.conf.blob
INFO: Upload directory '/mnt/vzsnap0' to 'dienst-backup-home@pbs@10.0.60.10:8007:IRONWOLF_4TB_BOTTOM' as root.pxar.didx
INFO: root.pxar: had to backup 4.145 GiB of 78.65 GiB (compressed 2.168 GiB) in 433.24s
INFO: root.pxar: average backup speed: 9.798 MiB/s
INFO: root.pxar: backup was done incrementally, reused 74.505 GiB (94.7%)
INFO: Uploaded backup catalog (1000.342 KiB)
INFO: Duration: 446.27s
INFO: End Time: Wed Jun 26 01:13:39 2024
INFO: adding notes to backup
INFO: cleanup temporary 'vzdump' snapshot
  Logical volume "snap_vm-116-disk-0_vzdump" successfully removed.
  Logical volume "snap_vm-116-disk-2_vzdump" successfully removed.
INFO: Finished Backup of VM 116 (00:07:30)
INFO: Backup finished at 2024-06-26 01:13:42

Metadata mode (second backup with metadata mode to make use of the previous archive):

Code:

INFO: Starting Backup of VM 116 (lxc)
INFO: Backup started at 2024-07-04 01:04:10
INFO: status = running
INFO: CT Name: Grafana
INFO: including mount point rootfs ('/') in backup
INFO: including mount point mp0 ('/mnt/data') in backup
INFO: backup mode: snapshot
INFO: bandwidth limit: 200000 KB/s
INFO: ionice priority: 7
INFO: suspend vm to make snapshot
INFO: create storage snapshot 'vzdump'
  Logical volume "snap_vm-116-disk-0_vzdump" created.
  Logical volume "snap_vm-116-disk-2_vzdump" created.
INFO: resume vm
INFO: guest is online again after 1 seconds
INFO: creating Proxmox Backup Server archive 'ct/116/2024-07-03T23:04:10Z'
INFO: set max number of entries in memory for file-based backups to 1048576
INFO: run: lxc-usernsexec -m u:0:100000:65536 -m g:0:100000:65536 -- /usr/bin/proxmox-backup-client backup --crypt-mode=none pct.conf:/var/tmp/vzdumptmp2609800_116/etc/vzdump/pct.conf fw.conf:/var/tmp/vzdumptmp2609800_116/etc/vzdump/pct.fw root.pxar:/mnt/vzsnap0 --include-dev /mnt/vzsnap0/./ --include-dev /mnt/vzsnap0/./mnt/data --skip-lost-and-found --exclude=/tmp/?* --exclude=/var/tmp/?* --exclude=/var/run/?*.pid --backup-type ct --backup-id 116 --backup-time 1720047850 --change-detection-mode metadata --entries-max 1048576 --repository dienst-backup-home@pbs@10.0.60.10:IRONWOLF_4TB_BOTTOM --ns HOME-SKY/Prod
INFO: Starting backup: [HOME-SKY/Prod]:ct/116/2024-07-03T23:04:10Z
INFO: Client name: Proxmox
INFO: Starting backup protocol: Thu Jul  4 01:04:11 2024
INFO: Downloading previous manifest (Wed Jul  3 01:31:56 2024)
INFO: Upload config file '/var/tmp/vzdumptmp2609800_116/etc/vzdump/pct.conf' to 'dienst-backup-home@pbs@10.0.60.10:8007:IRONWOLF_4TB_BOTTOM' as pct.conf.blob
INFO: Upload config file '/var/tmp/vzdumptmp2609800_116/etc/vzdump/pct.fw' to 'dienst-backup-home@pbs@10.0.60.10:8007:IRONWOLF_4TB_BOTTOM' as fw.conf.blob
INFO: Upload directory '/mnt/vzsnap0' to 'dienst-backup-home@pbs@10.0.60.10:8007:IRONWOLF_4TB_BOTTOM' as root.mpxar.didx
INFO: Using previous index as metadata reference for 'root.mpxar.didx'
INFO: Change detection summary:
INFO:  - 38535 total files (6 hardlinks)
INFO:  - 37180 unchanged, reusable files with 80.889 GiB data
INFO:  - 1349 changed or non-reusable files with 1.093 GiB data
INFO:  - 34.074 MiB padding in 19 partially reused chunks
INFO: root.ppxar: reused 80.922 GiB from previous snapshot for unchanged files (17615 chunks)
INFO: root.ppxar: had to backup 1.039 GiB of 82.014 GiB (compressed 531.89 MiB) in 12.56 s (average 84.679 MiB/s)
INFO: root.ppxar: backup was done incrementally, reused 80.976 GiB (98.7%)
INFO: root.mpxar: had to backup 6.608 MiB of 6.608 MiB (compressed 1.138 MiB) in 12.61 s (average 536.399 KiB/s)
INFO: Duration: 20.78s
INFO: End Time: Thu Jul  4 01:04:32 2024
INFO: adding notes to backup
INFO: cleanup temporary 'vzdump' snapshot
  Logical volume "snap_vm-116-disk-0_vzdump" successfully removed.
  Logical volume "snap_vm-116-disk-2_vzdump" successfully removed.
INFO: Finished Backup of VM 116 (00:00:23)
INFO: Backup finished at 2024-07-04 01:04:33
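To put the two task logs side by side, here is a short Python calculation using only figures copied from the logs above. Note that the two runs are a week apart and cover slightly different amounts of data (78.65 GiB vs. 82.014 GiB), so this is a rough comparison, not a benchmark.

```python
# Figures copied from the two task logs above.
legacy_duration_s = 446.27    # "Duration" line, legacy mode
metadata_duration_s = 20.78   # "Duration" line, metadata mode

padding_mib = 34.074          # "padding in 19 partially reused chunks"
reused_gib = 80.922           # "reused ... from previous snapshot for unchanged files"

# How many times shorter the client-side backup ran in metadata mode.
speedup = legacy_duration_s / metadata_duration_s

# Padding as a share of the data reused from the previous snapshot.
padding_pct = padding_mib / (reused_gib * 1024) * 100

print(f"speedup: {speedup:.1f}x")
print(f"padding share of reused data: {padding_pct:.3f}%")
```

For this container the client-side duration dropped by a factor of roughly 21, while the padding amounts to well under 0.1% of the reused data.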

Chris · Jul 4, 2024

Sky_ said:
So since you recommend doing some backups with the current default mode as well, it would be nice to be able to choose the change-detection-mode individually for each backup job through the GUI (I guess I could still edit the jobs manually in /etc/pve/jobs.cfg?). Since I, for example, do a stop backup every Sunday anyway, I could configure that job to use the legacy mode and have the backup job that runs through the week use the metadata mode.

Hi, thank you for sharing your task logs.

Regarding the backup jobs, you can already set the mode for each individual job in the Advanced settings in the PVE WebUI. Navigate to Datacenter > Backups > <select-job> > Edit > Advanced, or set it directly when creating a new backup job. To see this option, you need pve-manager version 8.2.4 or higher.

Note that creating a default (legacy) backup in between backups with metadata mode will break the chain if you back up into the same backup group (given by the container ID). This is because the previous backup must contain a metadata archive for the change detection lookups.
So it is best to add a dedicated namespace on the PBS host for each backup mode and expose it by adding it as an additional PBS storage on the PVE host.
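The effect of mixing modes in one backup group can be sketched in Python. This is a simplified model, not PBS logic: `has_metadata_reference` is a hypothetical helper, and it assumes that only the immediately preceding snapshot in the group is considered and that both data and metadata mode snapshots carry a usable metadata archive.

```python
def has_metadata_reference(modes):
    """For each backup in a group, report whether it can reuse the previous snapshot.

    Assumption: a metadata-mode backup can only do its lookups if the snapshot
    right before it was created with a split archive (data or metadata mode).
    """
    result = []
    previous = None
    for mode in modes:
        usable = mode == "metadata" and previous in ("data", "metadata")
        result.append(usable)
        previous = mode
    return result


# Weekday metadata backups with one legacy backup in between, same group.
week = ["metadata", "metadata", "legacy", "metadata", "metadata"]
print(has_metadata_reference(week))
```

In this model the metadata backup right after the legacy one has nothing to compare against and has to rechunk everything, which is what separate namespaces per mode avoid.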

Also, please note that chunks cannot be deduplicated as efficiently when backups created with different modes are stored on the same datastore, since the file format of the default (legacy) mode differs from that of the data and metadata modes (which use pxar file format version 2, supporting the split data/metadata archive).

So keep an eye on the storage usage and set up quotas so you will not run out of space.

Sky_ · Jul 4, 2024

Oh, you're totally right, I don't know how I missed the GUI option.

Yeah, I noticed the dedup difference after the first run during the sync to my offsite; it took a bit longer than usual.

I will make new namespaces for the different modes as suggested.

Thanks again for the detailed reply

esvee · Sep 5, 2024

Oh my, this new mode is tops. A daily backup of a CT's mount point used to take 45 minutes every day; now it's 8 seconds.

Should verifications be faster now, too?

Chris · Sep 5, 2024

esvee said:
Oh my, this new mode is tops. A daily backup of a CT's mount point used to take 45 minutes every day; now it's 8 seconds.

Should verifications be faster now, too?

No, verifications will not be faster. A verify task checks the integrity of all indexed chunks, and since the chunks for unchanged files have to be indexed in the snapshot as well, this does not change.

mitchell5433 · Sep 5, 2024

Hello, is there any way to do an interactive restore on host backups that were created using the metadata mode?

Code:

root@server:/mnt/backup# proxmox-backup-client catalog shell host/server/2024-09-05T17:05:01Z server-storage.ppxar --ns Local --keyfile /root/pbs_client/server-storage.key
Error: Unable to open dynamic index "/mnt/backup/ns/Local/host/server/2024-09-05T17:05:01Z/catalog.pcat1.didx" - No such file or directory (os error 2)

EDIT:
Ah, looks like I can mount the backup with FUSE. This seems to work well when cherry-picking files to copy/restore.

Code:

proxmox-backup-client mount host/server/2024-09-05T17:05:01Z server-storage.ppxar /mnt/restore --keyfile /root/pbs_client/server-storage.key --ns Local

Chris · Sep 6, 2024

mitchell5433 said:
Hello, is there any way to do an interactive restore on host backups that were created using the metadata mode?

Code:

root@server:/mnt/backup# proxmox-backup-client catalog shell host/server/2024-09-05T17:05:01Z server-storage.ppxar --ns Local --keyfile /root/pbs_client/server-storage.key
Error: Unable to open dynamic index "/mnt/backup/ns/Local/host/server/2024-09-05T17:05:01Z/catalog.pcat1.didx" - No such file or directory (os error 2)

EDIT:
Ah, looks like I can mount the backup with FUSE. This seems to work well when cherry-picking files to copy/restore.

Code:

proxmox-backup-client mount host/server/2024-09-05T17:05:01Z server-storage.ppxar /mnt/restore --keyfile /root/pbs_client/server-storage.key --ns Local

Hi,

thank you for the feedback regarding the change detection mode.

Currently, the catalog dump and catalog shell functionality is not available for split pxar archives created with change-detection-mode set as metadata or data. There is however a series of patches [0] on the developer mailing list which will restore this functionality once applied.

As you already pointed out correctly, mounting the archive via FUSE can be used as a workaround for the time being.

[0] https://lists.proxmox.com/pipermail/pbs-devel/2024-August/010504.html

kegloadam · Nov 26, 2024

Hi,
How about file restore for backups created with proxmox-backup-client?
I only see the mpxar.didx, ppxar.didx and index.json.blob files in the PBS GUI. File browsing does not work in the GUI yet, I get that, but with the CLI I cannot mount, map, or use the interactive shell, nothing...
I just get the following errors:
- Error: Can only mount/map pxar archives and drive images.
- Error: Can only mount pxar archives.

The only relevant result I got was from this command: proxmox-backup-client restore host/BV-PMX-01/2024-11-26T07:08:52Z index.json - --ns OtherHosts/hdd-Management --repository PVE-hdd-backup-PBS

Any recommendation on how to restore/browse files if the backup was created with --change-detection-mode=metadata using proxmox-backup-client?

Thank you!

fabian · Nov 26, 2024

kegloadam said:
Hi,
How about file restore for backups created with proxmox-backup-client?
I only see the mpxar.didx, ppxar.didx and index.json.blob files in the PBS GUI. File browsing does not work in the GUI yet, I get that, but with the CLI I cannot mount, map, or use the interactive shell, nothing...
I just get the following errors:
- Error: Can only mount/map pxar archives and drive images.
- Error: Can only mount pxar archives.

The only relevant result I got was from this command: proxmox-backup-client restore host/BV-PMX-01/2024-11-26T07:08:52Z index.json - --ns OtherHosts/hdd-Management --repository PVE-hdd-backup-PBS

Any recommendation on how to restore/browse files if the backup was created with --change-detection-mode=metadata using proxmox-backup-client?

Thank you!

the current PBS client and server packages should fully support the split archives across the board. please verify your client is up to date, then catalog shell, proxmox-file-restore and restore should all work. if your server packages are up to date, and the backup is not encrypted, file-restore using the web UI of the server should also work.

if that's not the case, please
- post "proxmox-backup-client version" (client side)
- post "proxmox-backup-manager version --verbose" (server side)
- the command you are running and its full output

kegloadam · Nov 26, 2024

Hi @fabian ,
client version: 3.2.9
server version: 3.2.2

And when I try to open the mpxar or ppxar from the GUI, I get this error:
Bad Request (400)
unable to read dynamic index '"/mnt/PVE-hdd-backup-PBS/ns/OtherHosts/ns/hdd-Management/host/BV-PMX-01/2024-11-26T08:28:29Z/catalog.pcat1.didx"' - Unable to open dynamic index "/mnt/PVE-hdd-backup-PBS/ns/OtherHosts/ns/hdd-Management/host/BV-PMX-01/2024-11-26T08:28:29Z/catalog.pcat1.didx" - No such file or directory (os error 2)

CLI backup command:
#!/bin/bash
PBS_PASSWORD="XXXX" \
PBS_FINGERPRINT="XXXX" \
proxmox-backup-client backup Management-M.pxar:/mnt/hdd-pool/Management-M \
--change-detection-mode=metadata \
--skip-e2big-xattr \
--repository bv-databakuser@pbs@192.168.100.22:8007:PVE-hdd-backup-PBS --ns OtherHosts/hdd-Management

Chris · Nov 26, 2024

kegloadam said:
Hi,
How about file restore for backups created with proxmox-backup-client?
I only see the mpxar.didx, ppxar.didx and index.json.blob files in the PBS GUI. File browsing does not work in the GUI yet, I get that, but with the CLI I cannot mount, map, or use the interactive shell, nothing...
I just get the following errors:
- Error: Can only mount/map pxar archives and drive images.
- Error: Can only mount pxar archives.

The only relevant result I got was from this command: proxmox-backup-client restore host/BV-PMX-01/2024-11-26T07:08:52Z index.json - --ns OtherHosts/hdd-Management --repository PVE-hdd-backup-PBS

Any recommendation on how to restore/browse files if the backup was created with --change-detection-mode=metadata using proxmox-backup-client?

Thank you!

Hi, your command invocation seems wrong. You have to provide the archive name, not the index file, for all restore and listing/mounting related operations.

E.g. proxmox-backup-client restore host/BV-PMX-01/2024-11-26T07:08:52Z <archive-name>.mpxar --ns ..., where <archive-name> is Management-M in your case I guess.
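As a small illustration of "archive name, not index file", the following Python sketch derives the archive name from a file name as listed in the PBS GUI by dropping the `.didx` index suffix. The helper `archive_name_from_index` is hypothetical, and the file name `Management-M.mpxar.didx` is an assumption based on the backup command posted above.

```python
def archive_name_from_index(index_file_name):
    """Drop the dynamic index suffix to get the name the client expects."""
    suffix = ".didx"
    if index_file_name.endswith(suffix):
        return index_file_name[: -len(suffix)]
    # Already an archive name (or something else): leave it untouched.
    return index_file_name


print(archive_name_from_index("Management-M.mpxar.didx"))
```

The result, `Management-M.mpxar`, is the kind of name that goes in the `<archive-name>.mpxar` position of the restore command.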

If this still does not work, please post the full command invocation.

In order to browse and restore contents via the PBS WebUI, the Proxmox Backup Server must have at least version 3.2.5. Please upgrade to the latest version if you want to use the server side features.
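The version requirement can be checked mechanically. Below is a minimal Python sketch that compares dotted version strings numerically; the helper names are hypothetical, and the only fact taken from this thread is the 3.2.5 minimum for WebUI browsing of split archives.

```python
def parse_version(text):
    """Turn '3.2.5' into (3, 2, 5) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


# Minimum server version for WebUI browsing/restore, as stated above.
MIN_SERVER_FOR_WEBUI_BROWSING = parse_version("3.2.5")


def server_can_browse_split_archives(server_version):
    return parse_version(server_version) >= MIN_SERVER_FOR_WEBUI_BROWSING


for version in ("3.2.2", "3.2.9"):
    print(version, server_can_browse_split_archives(version))
```

With the versions mentioned in this thread, a 3.2.2 server falls below the minimum and a 3.2.9 server meets it.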

kegloadam · Nov 26, 2024

Thank you @Chris , with server version 3.2.9 it works!

domwo · Nov 27, 2024

@Chris is the metadata mode still experimental in Proxmox 8.3 with Backup Client 3.2.9?

Chris · Nov 27, 2024

ositedw said:
@Chris is the metadata mode still experimental in Proxmox 8.3 with Backup Client 3.2.9?

No. Since no critical issues were found or reported during the prolonged internal and external testing period, the new change detection modes data and metadata are no longer considered experimental starting with Proxmox VE 8.3.

edit: fixed formatting issue

domwo · Nov 28, 2024

@Chris Thanks,

another question: I have read that I can configure it as the default on a node in /etc/vzdump.conf via pbs-change-detection-mode.

Is there a way to configure it cluster-wide?

Chris · Nov 28, 2024

ositedw said:
@Chris Thanks,

another question: I have read that I can configure it as the default on a node in /etc/vzdump.conf via pbs-change-detection-mode.

Is there a way to configure it cluster-wide?

Yes, you can set this in the node's vzdump config. The following settings are available:

Code:

#pbs-change-detection-mode: legacy|data|metadata

However, a cluster-wide default setting is not possible. It is planned to switch the default in the future, but as this is a breaking change, it will be done with a major version upgrade.
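Since the setting has to be maintained per node, a quick sanity check of the configured value can help. The following Python sketch reads the option from the text of a vzdump config; `read_change_detection_mode` is a hypothetical helper, and it assumes the simple `key: value` layout with `#` comments seen in the line above.

```python
ALLOWED_MODES = ("legacy", "data", "metadata")


def read_change_detection_mode(conf_text):
    """Return the configured mode, or None if the option is unset or commented out."""
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition(":")
        if sep and key.strip() == "pbs-change-detection-mode":
            mode = value.strip()
            if mode not in ALLOWED_MODES:
                raise ValueError(f"unsupported mode: {mode!r}")
            return mode
    return None


print(read_change_detection_mode("#pbs-change-detection-mode: legacy|data|metadata\n"))
print(read_change_detection_mode("pbs-change-detection-mode: metadata\n"))
```

A value such as `meta` raises a `ValueError` here instead of being silently accepted.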

omgs · Nov 28, 2024

Chris said:
There are mainly two reasons to provide this mode as well:

1) Hosts/CTs with high-frequency data changes will not see an improvement with the metadata mode; in the worst case it might perform even worse than the current default mode. This is because the metadata mode uses the previous metadata archive as a reference to decide whether a file can be reused or whether its contents have to be rechunked (because the metadata changed). So if most or many of the files change frequently, these lookups are unwanted additional overhead. The data mode gives you the benefits of the better compressibility of the metadata archive and of no longer needing the additional catalog, while rechunking all files without the reference lookup overhead.

2) The metadata mode might introduce some wasted space (padding) because it reuses chunks already created and uploaded to the server. A chunk boundary is not guaranteed to be aligned with a file boundary. So if a chunk is reindexed but also contains contents of a file which did change and is therefore rechunked, that chunk carries some data not relevant for this snapshot (although relevant for the previous one). Padding is kept below an internal threshold, but can still take up unwanted space. The data mode therefore allows you to recreate a snapshot without padding at any point in time.

I'm still unsure how to interpret that for my use cases. Let's say there are 2 cases:
1) About 95% or more of the data stays the same (the system part); the changing part is log files. The exception is when an upgrade takes place.
2) Data is "added"; existing files are seldom modified (maybe some are deleted). Think of the data storage for Nextcloud or a mail server.

What would you advise for these use cases, given that I still stick to the default mode?

Thanks in advance.

Chris · Nov 28, 2024

omgs said:
I'm still unsure how to interpret that for my use cases. Let's say there are 2 cases:
1) About 95% or more of the data stays the same (the system part); the changing part is log files. The exception is when an upgrade takes place.
2) Data is "added"; existing files are seldom modified (maybe some are deleted). Think of the data storage for Nextcloud or a mail server.

What would you advise for these use cases, given that I still stick to the default mode?

Thanks in advance.

For both of your cases, setting the change detection mode to metadata should give you a significant runtime reduction compared to the current default mode.
