[SOLVED] dirty/new bitmaps

tonci

I installed PBS with TrueNAS as an NFS datastore. Setup went well without any issues.
But after the first full backup, the second (incremental) one started by creating new bitmaps:

INFO: virtio0: dirty-bitmap status: created new
INFO: virtio1: dirty-bitmap status: created new

The full backup took 16h and this incremental one will take about 4h.

Obviously PBS has to read through the whole 1.5T VM image ... The read speed is about 100 MiB/s, which is not that "tragic", but why can't it switch to fast incremental mode?

After gathering some experience with the PBS/NFS combination I still cannot figure out when to expect the scenario above versus the one below:

INFO: virtio0: dirty-bitmap status: OK (88.0 MiB of 15.0 GiB dirty)
INFO: virtio1: dirty-bitmap status: OK (352.0 MiB of 250.0 GiB dirty)
INFO: virtio2: dirty-bitmap status: OK (48.0 MiB of 32.0 GiB dirty)
INFO: using fast incremental mode (dirty-bitmap), 488.0 MiB dirty of 297.0 GiB total

Here the system knows the difference "in advance" and transfers only that, very quickly.


I once noticed that running VMs are backed up with existing dirty bitmaps and stopped ones with new bitmaps ... or maybe I ran too few tests.


In exactly which situations can dirty bitmaps not be used?

Thank you in advance

BR
Tonci
 
The dirty bitmaps are only kept as long as the VM is running. Once it is shut down, everything needs to be read again. But thanks to deduplication, only new chunks are sent to the PBS.
 
Thank you, all clear! So shutting down the VM and powering it on again "resets" the bitmaps, but a restart does not?
OK, that makes sense. It would be ideal if the bitmaps were always up to date with the VM, but this is also OK.
 
If you reboot inside the VM the bitmaps should persist, but once the QEMU process is shut down the dirty bitmaps are discarded.
 
Is there a way to keep the bitmap persistent? My backups are offsite, over a link limited to 50 Mbit/s. It would be nice to have a solution that syncs only the changes over the network.
 
Even if the bitmaps are discarded because you stopped the QEMU process, it will only transfer chunks that changed and won't use your precious bandwidth for unnecessary transfers. The process takes longer but does not use more bandwidth.
 
OK, so it will transfer only the changed blocks, understood!
But if it has to build the dirty bitmaps again, how does it do that?
What does it compare: source vs. destination, or source vs. previous dirty bitmap? If the destination has been moved to another location and is connected over a slower link, this "comparison" could take a long time ...

So what really happens when we power a VM off and on again? How does it affect the time and bandwidth?
 
What does it compare: source vs. destination, or source vs. previous dirty bitmap? If the destination has been moved to another location and is connected over a slower link, this "comparison" could take a long time ...
When backing up, the data is split into 4 MiB "chunks". These chunks are named after their checksum. The backup also creates an index file listing all the chunks used in that backup.
Once it has calculated the checksum of a chunk it wants to back up, it checks the indexes of the backups it is allowed to know about (it can't get indexes from other backup groups, for privacy reasons). If the checksum matches, no chunk is transmitted. If not, it transmits the 4 MiB chunk. Transmitting the checksums does use some bandwidth, but very little.
Please correct me if I'm wrong here @proxmox-Team
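
To make the chunk logic above concrete, here is a rough Python sketch. It is not PBS code: the function names, the in-memory `known_digests` set and the use of SHA-256 as the checksum are all assumptions made for illustration.

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB, the chunk size mentioned above


def chunk_digests(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Yield (digest, chunk) pairs for fixed-size chunks of data."""
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        # Assumption: SHA-256 stands in for "the checksum" of a chunk.
        yield hashlib.sha256(chunk).hexdigest(), chunk


def backup(data: bytes, known_digests: set, chunk_size: int = CHUNK_SIZE):
    """Return (index, uploads).

    index   lists the digest of every chunk, in order (the "index file").
    uploads holds only the chunks whose digest was not already known,
    i.e. the only data that would have to cross the network.
    """
    index = []
    uploads = {}
    for digest, chunk in chunk_digests(data, chunk_size):
        index.append(digest)
        if digest not in known_digests and digest not in uploads:
            uploads[digest] = chunk
    return index, uploads
```

With a tiny chunk size for demonstration, calling `backup(b"A" * 8 + b"B" * 8, set(), 8)` uploads both chunks. A second call with `b"A" * 8 + b"C" * 8` and the first index as `known_digests` uploads only the changed chunk, even though the whole input was read and hashed again.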

But if it has to build the dirty bitmaps again, how does it do that?
It just starts again from scratch, as if it were your first backup: it creates a dirty bitmap and keeps track of the blocks that change. When the next backup is due, it has a list of the blocks that changed and can do a fast incremental backup again, then resets the dirty bitmap to track the changes for the next backup.
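
As a toy illustration of that cycle (not how QEMU implements it; the class and method names are made up for this sketch), a dirty bitmap can be modelled as one flag per block that lives only in memory:

```python
class DirtyBitmap:
    """Toy in-memory dirty bitmap: one flag per fixed-size block."""

    def __init__(self, disk_size: int, block_size: int):
        self.block_size = block_size
        n_blocks = (disk_size + block_size - 1) // block_size
        # A freshly created bitmap has no history, so every block
        # has to be treated as dirty for the first backup.
        self.dirty = [True] * n_blocks

    def record_write(self, offset: int, length: int) -> None:
        """Mark every block touched by a guest write as dirty."""
        if length <= 0:
            return
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        for block in range(first, last + 1):
            self.dirty[block] = True

    def take_dirty_blocks(self) -> list:
        """Return the dirty block numbers and reset the bitmap,
        as a backup run would do before tracking new changes."""
        blocks = [i for i, is_dirty in enumerate(self.dirty) if is_dirty]
        self.dirty = [False] * len(self.dirty)
        return blocks
```

Because the object exists only while the process runs, restarting the process means constructing a new `DirtyBitmap`, which again starts with every block marked dirty. That mirrors the "created new" case from the first post.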

How does it affect the time and bandwidth?
This really comes down to how fast you can read your data and calculate checksums, and how much of the disk image has changed.
 
Is there an open issue for keeping the bitmap persistent across restarts?
I have an interest in this.
(As a temporary solution, could hibernation keep the bitmap persistent, assuming no resizing, moving stuff, etc.?)
 
According to the qcow2 documentation, the qcow2 disk format supports persistent dirty bitmaps.
So I shyly ask whether there are any plans for persistent dirty bitmaps for the qcow2 format in Proxmox?
 
The annoying part with dirty bitmaps is that I like to use "stop" mode for my backups. If I understand it right, this completely shuts down the VM and the dirty bitmaps are lost. So dirty bitmaps can't be used at all with "stop" mode backups.
 
persisting the bitmap is not the issue - ensuring that it is still valid when starting the VM again is. we have no control over who/what accesses and potentially modifies the volume in-between, and any modification would lead to an invalid backup (and by the nature of incremental backups, this would propagate until the invalid chunk(s) are no longer part of the backup).
 
Just an idea: what if you hash the VM files, set a flag, and at the start of the next backup check the hash against the on-disk contents? Hashing large files doesn't take that long for simple comparisons these days (https://stackoverflow.com/questions...um-for-large-files-in-c-sharp/1177712#1177712), and even if it took 5-10 minutes or even longer at the start and end of a backup job to hash a large VM disk, I'd happily take that hit to avoid a 5-6 hour processing time caused by an unnecessary loss of the dirty bitmap. I have a couple of large fileservers that can take nearly 15 hours to process whenever I shut them down and lose the dirty bitmap. I know only the incremental data is transferred over the network, but the issue is the long running time of the backup job.

It would be good as an option to enable with a warning "hashing may take long on big VM disk files"
 
Just an idea: what if you hash the VM files, set a flag, and at the start of the next backup check the hash against the on-disk contents?
That does not make sense: you would now read the disk to avoid reading the disk?
 
Thanks for considering my idea =)
I'm not sure what type of disk reading the backup job does, but as I mentioned it can take many hours if the dirty bitmap is lost. Whereas a simple MD5 hash (as per the link I sent) can take seconds for a 1GB file, and by extrapolation a 1TB file would take maybe 16-20 minutes.
If you take an MD5 hash after shutting down the machine, then take another one just before startup, you can tell whether the VM disk contents have been modified in the interim. If the hash is the same, the dirty bitmap can still apply.

Obviously this will not suit everybody, since on large VMs it will add many minutes to the shutdown and boot times.
But I, and possibly some others, would happily use this feature if it were available, e.g. as an option "Reset | Reboot | Shutdown | Shutdown - Preserve Dirty Bitmap (warning - may add many minutes)", because I seldom shut down my VMs and would happily accept 15-20 minutes on startup and shutdown to avoid 15 hours of processing in the next backup job for large fileserver VMs.
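
A minimal sketch of what this proposal amounts to, assuming the disk is a single image file. The function names are hypothetical and nothing like this exists in Proxmox; SHA-256 is used here instead of MD5, which does not change the idea.

```python
import hashlib


def file_digest(path: str, buf_size: int = 4 * 1024 * 1024) -> str:
    """Hash a file in fixed-size reads so it never has to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            buf = f.read(buf_size)
            if not buf:
                break
            h.update(buf)
    return h.hexdigest()


def bitmap_still_valid(path: str, digest_at_shutdown: str) -> bool:
    """Hypothetical check at VM start: the saved bitmap would only be
    trusted if the image is byte-for-byte what it was at shutdown."""
    return file_digest(path) == digest_at_shutdown
```

Note that both `file_digest` calls (at shutdown and at startup) read the entire image, so the cost of each check scales with the disk size rather than with the amount of changed data.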
 
I'm not sure what type of disk reading the backup job does, but as I mentioned it can take many hours if the dirty bitmap is lost. Whereas a simple MD5 hash (as per the link I sent) can take seconds for a 1GB file, and by extrapolation a 1TB file would take maybe 16-20 minutes.
The backup simply reads the disk to be backed up in 4 MiB chunks. The only 'special' thing is that it reads the blocks the VM wants to write out of order (if they are not yet backed up), so that it has a consistent state.
So for many storage setups this is already as fast as you can go (reading in 4 MiB chunks), and reading the disk image at the beginning just moves the point in time at which that is done.

I think if your setup takes hours to back up without a dirty bitmap, it would also take hours to hash it.
But if you want, you can simply test this by running
'time sha256sum /path/to/disk/image'

Make sure to clear the page cache beforehand and that the VM is not running at the time.
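
For a rough feel of the numbers quoted in this thread, the time for one full sequential read can be estimated from image size and read throughput. The throughput figures below are assumptions picked to match the posts, not measurements, and the estimate ignores hashing CPU cost.

```python
MIB = 1024 ** 2
TIB = 1024 ** 4


def full_read_hours(size_bytes: float, throughput_bytes_per_s: float) -> float:
    """Hours needed to read size_bytes once at a constant throughput."""
    return size_bytes / throughput_bytes_per_s / 3600


# 1.5 TiB image at about 100 MiB/s (first post): roughly 4.4 hours.
print(round(full_read_hours(1.5 * TIB, 100 * MIB), 2))

# 1 TB at 1 GB/s ("seconds per GB"): roughly 17 minutes.
print(round(full_read_hours(10**12, 10**9) * 60, 1))

# The same 1 TB at about 100 MiB/s: roughly 2.6 hours.
print(round(full_read_hours(10**12, 100 * MIB), 2))
```

Under these assumptions, the 16-20 minute estimate only holds if the storage can sustain about 1 GB/s. At the roughly 100 MiB/s reported in the first post, hashing the image takes about as long as the full-read backup it is meant to avoid.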
 
I think seeing how much time it takes to hash is pointless, because it's not going to take that long every time a backup is run with dirty bitmaps?
 
No, it will not, but the point of this thread was keeping the dirty bitmap for machines that were turned off. In that case we have no way of knowing whether the bitmap is still valid other than calculating it again and comparing. @JamesT argued that it would be faster to hash the disk at the beginning than to read it live during the backup.