[SOLVED] dirty/new bitmaps

May 14, 2020
persisting the bitmap is not the issue - ensuring that it is still valid when starting the VM again is. we have no control over who/what accesses and potentially modifies the volume in-between, and any modification would lead to an invalid backup (and by the nature of incremental backups, this would propagate until the invalid chunk(s) are no longer part of the backup).
Would this still be an issue when using backup + stop mode? (Because PVE knows it is stopping the VM and will restart it shortly after.)
Would you consider keeping the dirty bitmap only if it is used with a backup in stop mode?
This would help one use case: orderly shutdown, then immediate backup.

Also, would this even be possible on storage types such as ZFS volumes or RBDs, or only with qcow2 image files?
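As background on that last question: QEMU can persist a dirty bitmap into the image itself, but only the qcow2 driver implements on-disk bitmaps; raw files, zvols and RBD images have nowhere to store them. A minimal sketch of the QMP command involved (the node and bitmap names here are made up for illustration):

```python
import json

def qmp_add_persistent_bitmap(node: str, name: str, granularity: int = 65536) -> str:
    """Build the QMP 'block-dirty-bitmap-add' command as a JSON string.

    With 'persistent': True, QEMU writes the bitmap into the image on a
    clean shutdown and reloads it on the next open -- this only works for
    qcow2 images, which have an on-disk bitmap extension."""
    return json.dumps({
        "execute": "block-dirty-bitmap-add",
        "arguments": {
            "node": node,                 # block node of the guest disk (made-up name)
            "name": name,                 # bitmap name (made-up)
            "persistent": True,           # store the bitmap in the qcow2 image
            "granularity": granularity,   # bytes of guest data tracked per bit
        },
    })

print(qmp_add_persistent_bitmap("drive-scsi0", "pbs-incremental"))
```

Note that persistence only solves storing the bitmap; the validity question discussed in this thread (did anything touch the disk while the VM was off?) remains.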
 

Dunuin

Famous Member
Would this still be an issue when using backup + stop mode? (Because PVE knows it is stopping the VM and will restart it shortly after.)
Would you consider keeping the dirty bitmap only if it is used with a backup in stop mode?
This would help one use case: orderly shutdown, then immediate backup.

Also, would this even be possible on storage types such as ZFS volumes or RBDs, or only with qcow2 image files?
That sounds like a great idea, if it is possible. It's really annoying that "stop" mode backups can't make use of dirty bitmaps, even if the VM is only stopped for a second to create the snapshot for the backup.
 

spirit

Famous Member
no it will not, but the point of this thread was to keep the dirty bitmap for machines that were turned off. in that case we have no way of knowing if the bitmap is still valid, besides calculating it again and comparing. @JamesT argued that it would be faster to hash it at the beginning than to read the disk live during the backup.
Hi,
sorry to bump this old thread,
but would it be possible to save the bitmap to disk at VM stop + take a snapshot (if the storage supports it) of all disks + the bitmap? (on a clean VM shutdown, of course)
Then on the next start, reload the snapshot + bitmap.


I was also thinking that backups taken from a disk snapshot could help to avoid the impact of the backup on VMs with a lot of writes.

(take a snapshot + save the bitmap, then back up the snapshot offline, adding block backup support to proxmox-backup-client)
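A toy model of this proposal (hypothetical code, purely to illustrate the idea): snapshotting disk and bitmap together at clean shutdown, then rolling both back on start, keeps the pair consistent by construction -- at the cost of discarding anything written to the volume while the VM was off.

```python
import copy

class Volume:
    """Minimal model of one guest disk with a dirty bitmap."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)   # block index -> content
        self.bitmap = set()          # blocks dirtied since the last backup
        self._snap = None            # (blocks, bitmap) taken at clean shutdown

    def guest_write(self, idx, data):
        self.blocks[idx] = data
        self.bitmap.add(idx)

    def clean_shutdown(self):
        # capture disk and bitmap atomically: they describe one single state
        self._snap = (copy.deepcopy(self.blocks), set(self.bitmap))

    def start(self):
        # roll back to the shutdown snapshot; out-of-band writes made while
        # the VM was off are discarded (the accidental-data-loss caveat)
        self.blocks = copy.deepcopy(self._snap[0])
        self.bitmap = set(self._snap[1])

vol = Volume({0: "a", 1: "b"})
vol.guest_write(1, "B")
vol.clean_shutdown()
vol.blocks[0] = "tampered"       # storage-level modification while powered off
vol.start()
print(vol.blocks[0], sorted(vol.bitmap))   # -> a [1]
```

The rollback makes the bitmap trustworthy again, but it also silently undoes the "tampered" write, which is exactly the data-loss risk with this approach.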
 

fabian

Proxmox Staff Member
(take a snapshot + save the bitmap, then back up the snapshot offline, adding block backup support to proxmox-backup-client)
proxmox-backup-client already has support for block-based backups (it would just need a mechanism to say "skip reading these blocks if the previous snapshot XXX exists" to mimic what we do with the bitmap in our qemu lib). one issue with such a snapshot-based approach is that not all storages support directly accessing raw snapshot data (e.g., zvols don't offer an 'expose single snapshot' functionality - we'd either need to clone the snapshot, or expose all snapshots of a given zvol, of which there can be quite a few). basically, it would be limited to storages/volumes that support "full clone from snapshot" in PVE. also note that there is currently no vma-equivalent binary (one that has access to the PBS lib and the qemu block drivers and can read the guest volumes from arbitrary storages and write a backup snapshot), which is why we actually start a VM for backing up templates to PBS, for example - so we would need that if we want to avoid starting a (fake) VM for backing up such a snapshotted state.

but yeah, in theory this would work, and it's similar to what we do with live migration and replication, where we also use a bitmap to track which parts of the volume need to be mirrored on top of the already replicated state. similar to the replication case, we don't mind backing up a few blocks extra that the bitmap picked up even if they are already part of the previous snapshot (/replicated state). without that relaxed requirement we'd have the problem of ensuring consistency between external storage snapshots and qemu-internal bitmaps.

edit: I am not sure whether I like the 'snapshot on shutdown, rollback on start' approach - it would at least need some explicit marker to make obvious what is going on, and that any changes done to the disks while the VM is powered off will be discarded when it is started again. otherwise this could be quite problematic w.r.t. accidental data loss!
 

carsten2

Active Member
no it will not, but the point of this thread was to keep the dirty bitmap for machines that were turned off. in that case we have no way of knowing if the bitmap is still valid, besides calculating it again and comparing. @JamesT argued that it would be faster to hash it at the beginning than to read the disk live during the backup.
Proxmox should know whether the VM has been switched on or not. If it stays off the whole time, PBS should not reread the whole disk every time.
 

fabian

Proxmox Staff Member
proxmox cannot know that the disk is still the same as when the bitmap was persisted - anybody and anything could have accessed and modified the disk on the storage layer in the meantime. if we persist the bitmap anyway, this inconsistency (which would manifest as a corrupt backup - and not in the sense of "checksum error that verify would detect", but "this backup contains bogus data that might or might not be usable at all") is then potentially propagated to ALL subsequent backup snapshots (since they would all build on this wrong bitmap!).
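The propagation argument can be shown with a toy chunk model (hypothetical code, not how PBS is actually implemented): once one out-of-band write is missed by the bitmap, every later incremental snapshot silently re-uses the stale chunk.

```python
def incremental_backup(disk, dirty, previous):
    """Copy chunks flagged dirty from the disk; re-use everything else
    from the previous backup snapshot -- the core bitmap optimization."""
    return {i: disk[i] if i in dirty else previous[i] for i in disk}

disk = {0: "a", 1: "b"}
snap1 = dict(disk)              # initial full backup

disk[1] = "B"                   # out-of-band write while the VM was off;
                                # the persisted bitmap was not updated,
                                # so it stays empty
snap2 = incremental_backup(disk, set(), snap1)
snap3 = incremental_backup(disk, set(), snap2)

# both later snapshots contain the stale chunk "b" instead of "B",
# and verify cannot catch it -- the chunks themselves are intact
print(snap2[1], snap3[1])   # -> b b
```

The corruption is invisible to checksums because every stored chunk is valid; it is the mapping of chunks to the current disk state that is wrong.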
 

carsten2

Active Member
1) It's true that someone could modify the underlying storage. A malicious person can always corrupt data, but that is not the case in normal operation. An option to keep bitmaps would be a good thing.
2) Another solution would be an option for PBS to skip backups of VMs which have not been switched on since the last backup.

Currently PBS wastes a lot of time rereading the whole disks of switched-off VMs. I have several VMs which are only used occasionally and are switched off most of the time. VM templates are switched off as well.
 

fabian

Proxmox Staff Member
I am not talking about malicious interference - just regular sysops stuff like mounting a volume (without any write I/O afterwards) can already trigger writes if you are not very, very careful. keeping track of the last startup time and optionally comparing that to the last backup snapshot and skipping the backup task might work (obviously only if you don't do stuff that isn't recommended anyway, like having a common backup target for multiple clusters with clashing guest IDs ;)).
 

carsten2

Active Member
How does PBS back up templates? Is there any special handling for them? They also do not get started, so they should not need to be read on every backup unless they have been modified (by converting back to a normal VM, starting it, and reconverting it to a template).

The option to skip VMs not started since the last backup would solve the problem for templates and for those VMs which are switched off for other reasons.

Should I create a feature request?
 

fabian

Proxmox Staff Member
How does PBS back up templates? Is there any special handling for them? They also do not get started, so they should not need to be read on every backup unless they have been modified (by converting back to a normal VM, starting it, and reconverting it to a template).

The option to skip VMs not started since the last backup would solve the problem for templates and for those VMs which are switched off for other reasons.

Should I create a feature request?
templates are also started (in a read-only fashion) for backing up to PBS (there is no "vma"-like binary that can directly use Qemu block drivers to read the data, and talk to PBS to do the actual backup, although one could be created).

something I just realized (while thinking about the backup case) - we'd need to track both start of the guest and modification of the config, since a new disk can be added/imported/.. without starting the guest.
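A sketch of that combined heuristic (hypothetical helper names, not PVE code): skip the backup only when the guest has not been started *and* its config is unchanged since the last backup snapshot.

```python
import hashlib
import json

def config_digest(config: dict) -> str:
    """Stable digest over the guest config, so adding or removing a disk
    while the VM is off still invalidates the skip decision."""
    return hashlib.sha256(json.dumps(config, sort_keys=True).encode()).hexdigest()

def should_skip_backup(last_backup, last_start_time, config) -> bool:
    if last_backup is None:
        return False                               # no previous snapshot to fall back on
    if config_digest(config) != last_backup["config_digest"]:
        return False                               # e.g. a disk was added while powered off
    # skip only if never started, or not started since the last backup
    return last_start_time is None or last_start_time < last_backup["time"]

cfg = {"scsi0": "local:vm-100-disk-0"}
prev = {"time": 1000, "config_digest": config_digest(cfg)}
print(should_skip_backup(prev, 900, cfg))                        # -> True
print(should_skip_backup(prev, 1100, cfg))                       # -> False
print(should_skip_backup(prev, 900, {**cfg, "scsi1": "new"}))    # -> False
```

This only decides whether to skip a backup entirely (the "less risky" failure mode discussed in this thread); it deliberately does not try to justify re-using a persisted bitmap.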
 

fabian

Proxmox Staff Member
That means that PBS reads the whole disk of templates every time too?
if it is included in the backup - yes. most people would probably just not set up backup jobs for templates and manually back them up when they are changed, I guess ;)
 

carsten2

Active Member
That is error-prone. PBS should be able to efficiently back up unchanged, switched-off VMs and containers. Unfortunately, PBS is extremely bad at handling these cases, even though they are the simplest cases for most other backup systems. So we have three problems:

1) Extremely slow backups of large LXC containers (hours to days).
2) Very slow backups of templates, even though it would be easy to detect that they have not been changed and can be skipped.
3) Very slow backups of switched-off VMs, or VMs that have not been switched on since the last backup.

PBS should really improve on this.
 

fabian

Proxmox Staff Member
1) is very hard to improve, but we haven't forgotten about it
2 & 3) are what this thread is about - and as you can see by the discussion here, it is not as trivial as it looks from the outside ;)

even though a template cannot be started regularly, that doesn't mean it cannot be changed (you can add disks, including disks with content that should be backed up, and you can remove disks as long as there are no linked clones referencing them, ..). there is no magic way to detect that a VM (or rather, its disks/volumes) hasn't changed since the last backup, short of calculating and comparing a checksum (which would bring us back to square one), and any heuristic needs to be mostly bulletproof - else we risk either not backing up guests that have changed data (the less "risky" case of just skipping backups) or creating corrupt backups (the more risky case of basing a 're-use persisted bitmap' decision on it), both of which are far worse than the status quo. tracking last start + a config digest might be a good start, but it requires analyzing all the possible edge cases before basing such important decisions on it - so it's not a small one-line change that can just be rolled out within a week.
 

carsten2

Active Member
1) is very hard to improve, but we haven't forgotten about it
2 & 3) are what this thread is about - and as you can see by the discussion here, it is not as trivial as it looks from the outside ;)

even though a template cannot be started regularly, that doesn't mean it cannot be changed (you can add disks, including disks with content that should be backed up, and you can remove disks as long as there are no linked clones referencing them, ..). there is no magic way to detect that a VM (or rather, its disks/volumes) hasn't changed since the last backup, short of calculating and comparing a checksum (which would bring us back to square one), and any heuristic needs to be mostly bulletproof - else we risk either not backing up guests that have changed data (the less "risky" case of just skipping backups) or creating corrupt backups (the more risky case of basing a 're-use persisted bitmap' decision on it), both of which are far worse than the status quo. tracking last start + a config digest might be a good start, but it requires analyzing all the possible edge cases before basing such important decisions on it - so it's not a small one-line change that can just be rolled out within a week.
As already said, if you fear corruption because of undetected changes, the safe way is to have an option to skip the backup of unstarted VMs. This guarantees integrity. At most a backup could be missing if special manual changes occur, but in that case the admin just has to force a backup for this VM manually.
 
