Some Windows Servers generate way too much dirty data in backups

keson

Dear Proxmox community members,
first, A HUGE THANK YOU for the absolutely amazing tools, including PBS. I have been testing it at two sites for some weeks now, and after understanding the retention settings I can say it is a life saver. My daily virtual machine backups are down from 4 hrs to 1 hr. Amazing technology.

I want to share one experience for which I have no explanation so far. This is NOT a complaint and I am aware it is kind of off topic here, but perhaps others can add their little bits...
While backing up many containers takes really just seconds, with Windows servers it is obviously a heavier job. For example, there is a 250 GB Windows 2012 R2 terminal server with several dozen users, and although they barely work on weekends, the dirty backups show some 44 GB of data as "changed".

INFO: Starting Backup of VM 201 (qemu)
INFO: Backup started at 2020-11-08 22:05:37
INFO: status = running
INFO: VM Name: ts01
INFO: include disk 'virtio0' 'local-zfs:vm-201-disk-1' 256G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/201/2020-11-08T21:05:37Z'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task 'a75976e4-b56b-48cd-a4d6-39ec5e565b0b'
INFO: resuming VM again
INFO: virtio0: dirty-bitmap status: OK (44.4 GiB of 256.0 GiB dirty)
INFO: using fast incremental mode (dirty-bitmap), 44.4 GiB dirty of 256.0 GiB total
INFO: 0% (364.0 MiB of 44.4 GiB) in 3s, read: 121.3 MiB/s, write: 120.0 MiB/s
...truncated....
INFO: 100% (44.4 GiB of 44.4 GiB) in 22m 41s, read: 41.1 MiB/s, write: 41.1 MiB/s
INFO: backup is sparse: 8.00 MiB (0%) total zero data
INFO: backup was done incrementally, reused 213.10 GiB (83%)
INFO: transferred 44.36 GiB in 1361 seconds (33.4 MiB/s)

INFO: Finished Backup of VM 201 (00:23:01)
INFO: Backup finished at 2020-11-08 22:28:38

I am not complaining, just trying to understand what can happen in a barely used Windows machine during 24 hrs to generate 44 GB of changed data. The user data is on a NAS, so the server only contains desktops, temp files and standard app data. Users do not use Outlook, so there should be no Outlook databases, and there is also no SQL or anything else creating large files with just one tiny change...

I am wondering if there is something that should be disabled on the Windows server (or perhaps enabled) to reduce the daily footprint?

For illustration, I have two domain controllers at the same location, where one takes 1/10th of the time and backs up 1/10th of the data compared to the other, while they both basically do nothing. Both have a 32 GB partition and contain only AD and DNS. No DHCP, no print server...


INFO: Starting Backup of VM 100 (qemu)
INFO: Backup started at 2020-11-08 22:00:02
INFO: status = running
INFO: VM Name: dc01
INFO: include disk 'virtio0' 'local-zfs:vm-100-disk-1' 32G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/100/2020-11-08T21:00:02Z'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task '97c1a0b9-6f84-4b8e-817e-bde10b6667fb'
INFO: resuming VM again
INFO: virtio0: dirty-bitmap status: OK (11.7 GiB of 32.0 GiB dirty)
INFO: using fast incremental mode (dirty-bitmap), 11.7 GiB dirty of 32.0 GiB total

INFO: 3% (448.0 MiB of 11.7 GiB) in 3s, read: 149.3 MiB/s, write: 146.7 MiB/s
...truncated
INFO: 100% (11.7 GiB of 11.7 GiB) in 4m 57s, read: 54.7 MiB/s, write: 53.3 MiB/s
INFO: backup is sparse: 4.00 MiB (0%) total zero data
INFO: backup was done incrementally, reused 20.60 GiB (64%)
INFO: transferred 11.67 GiB in 297 seconds (40.2 MiB/s)
INFO: Finished Backup of VM 100 (00:05:07)
INFO: Backup finished at 2020-11-08 22:05:09

INFO: Starting Backup of VM 101 (qemu)
INFO: Backup started at 2020-11-08 22:05:09
INFO: status = running
INFO: VM Name: dc02
INFO: include disk 'virtio0' 'local-zfs:vm-101-disk-0' 32G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/101/2020-11-08T21:05:09Z'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task '51e7bc07-8947-453f-8755-f5f74a50234b'
INFO: resuming VM again
INFO: virtio0: dirty-bitmap status: OK (1.2 GiB of 32.0 GiB dirty)
INFO: using fast incremental mode (dirty-bitmap), 1.2 GiB dirty of 32.0 GiB total

INFO: 38% (492.0 MiB of 1.2 GiB) in 3s, read: 164.0 MiB/s, write: 152.0 MiB/s
...truncated
INFO: 100% (1.2 GiB of 1.2 GiB) in 20s, read: 36.0 MiB/s, write: 34.0 MiB/s
INFO: backup is sparse: 4.00 MiB (0%) total zero data
INFO: backup was done incrementally, reused 30.82 GiB (96%)
INFO: transferred 1.24 GiB in 20 seconds (63.6 MiB/s)
INFO: Finished Backup of VM 101 (00:00:28)
INFO: Backup finished at 2020-11-08 22:05:37
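
For anyone who wants to compare several VMs at a glance, here is a rough Python sketch that pulls the dirty-bitmap figures out of task logs like the ones above. It is not part of any Proxmox tooling; the regular expressions only assume the log line format quoted in this thread, and it assumes one disk per VM (a second disk would overwrite the first entry).

```python
import re

# Patterns match the vzdump log lines quoted in this thread.
VM_RE = re.compile(r"INFO: Starting Backup of VM (\d+)")
DIRTY_RE = re.compile(
    r"dirty-bitmap status: OK "
    r"\(([\d.]+) (MiB|GiB|TiB) of ([\d.]+) (MiB|GiB|TiB) dirty\)"
)
UNIT = {"MiB": 1, "GiB": 1024, "TiB": 1024 * 1024}  # convert everything to MiB


def dirty_summary(log_text):
    """Return {vmid: (dirty_mib, total_mib, dirty_fraction)} for each VM in the log."""
    result = {}
    vmid = None
    for line in log_text.splitlines():
        m = VM_RE.search(line)
        if m:
            vmid = m.group(1)
            continue
        m = DIRTY_RE.search(line)
        if m and vmid is not None:
            dirty = float(m.group(1)) * UNIT[m.group(2)]
            total = float(m.group(3)) * UNIT[m.group(4)]
            result[vmid] = (dirty, total, dirty / total)
    return result


# Two lines per VM, copied from the logs of dc01 and dc02 above.
SAMPLE = """\
INFO: Starting Backup of VM 100 (qemu)
INFO: virtio0: dirty-bitmap status: OK (11.7 GiB of 32.0 GiB dirty)
INFO: Starting Backup of VM 101 (qemu)
INFO: virtio0: dirty-bitmap status: OK (1.2 GiB of 32.0 GiB dirty)
"""

if __name__ == "__main__":
    for vmid, (dirty, total, frac) in dirty_summary(SAMPLE).items():
        print(f"VM {vmid}: {dirty / 1024:.1f} GiB of {total / 1024:.1f} GiB dirty ({frac:.0%})")
```

Feeding it a full task log instead of `SAMPLE` should work the same way, as long as the lines look like the ones posted here.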
 
Hi,

the problem is that a block in the dirty bitmap can get dirty without the data actually changing.
E.g. if the timestamp gets updated. Or locks get set and removed.
I guess Windows does some allocation without data generation.
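
To make this more concrete: the bitmap tracks fixed-size blocks, not bytes, so even a tiny write marks a whole block as dirty. Below is a toy Python model of that effect. The 4 MiB block size is an assumption chosen for illustration, and `DirtyBitmap` is a made-up class, not a QEMU or PBS API.

```python
BLOCK = 4 * 1024 * 1024  # assumed tracking granularity (4 MiB), illustration only


class DirtyBitmap:
    """Toy model: remembers which fixed-size blocks were touched by a write."""

    def __init__(self, block_size=BLOCK):
        self.block_size = block_size
        self.dirty = set()

    def write(self, offset, length):
        # Every block overlapped by [offset, offset + length) becomes dirty,
        # no matter how few bytes inside it actually changed.
        if length <= 0:
            return
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        self.dirty.update(range(first, last + 1))

    def dirty_bytes(self):
        return len(self.dirty) * self.block_size


if __name__ == "__main__":
    bm = DirtyBitmap()
    # 1000 scattered 4 KiB writes (think timestamp updates), one every 16 MiB
    for i in range(1000):
        bm.write(i * 16 * 1024 * 1024, 4096)
    written = 1000 * 4096
    print(f"bytes written: {written / 2**20:.1f} MiB")
    print(f"marked dirty:  {bm.dirty_bytes() / 2**30:.1f} GiB")
```

In this model roughly 3.9 MiB of scattered writes marks roughly 3.9 GiB as dirty, which is the kind of ratio that could explain an "idle" server with tens of GB of dirty data.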
 
This is exactly what I think is happening... I will just accept it, it is Microsoft, they do not want us to ask...
 
we had reports early on with windows users that had some sort of autotrim feature active. from Qemu's point of view, trimming free space is akin to writing to it, so it dirties that part of the disk. might be worth a look ;)
 
Seems a valid opinion to me. I searched for something related to this topic and found basically only one thing, an example here: https://support.purestorage.com/Sol...es_and_Integrations/Windows_Space_Reclamation
I have compared all my servers and they are all at the default (value = 0), so it seems to me I should not touch it. Still, I will give it a try on a test server...
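
If you need to check this value on many servers, a small helper can interpret the output of `fsutil behavior query DisableDeleteNotify`. This is only a sketch in Python: the exact wording of the output differs between Windows versions, so the regular expression just looks for the `DisableDeleteNotify = N` part, and the sample strings in the comments are assumptions about what the output looks like.

```python
import re
import subprocess


def trim_enabled(fsutil_output):
    """Interpret the text printed by `fsutil behavior query DisableDeleteNotify`.

    Returns True if delete notifications (TRIM) are enabled (value 0),
    False if they are disabled (value 1), None if no value was found.
    If several file systems are listed, only the first value is used.
    """
    m = re.search(r"DisableDeleteNotify\s*=\s*(\d)", fsutil_output)
    if m is None:
        return None
    return m.group(1) == "0"


def query_local_server():
    """Run fsutil on the local Windows machine (needs an elevated prompt)."""
    out = subprocess.run(
        ["fsutil", "behavior", "query", "DisableDeleteNotify"],
        capture_output=True,
        text=True,
        check=True,
    )
    return trim_enabled(out.stdout)


if __name__ == "__main__":
    # Assumed sample outputs, for illustration only.
    print(trim_enabled("DisableDeleteNotify = 0"))       # TRIM enabled
    print(trim_enabled("NTFS DisableDeleteNotify = 1"))  # TRIM disabled
```

Note the inverted logic: a value of 0 means the notifications are not disabled, i.e. TRIM is active.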
 
Also check for defrag running.
Thanks, also a valid opinion. I didn't find any regularly scheduled defrag task. Although the task exists, it does not have any trigger, and it was last run a week ago, which would not match the daily "dirty backups".
 
I have the same problem. On a Windows Server 2003 R2 with just a 65 GB hard disk, 14.5 GB are "changed" every day even though the machine is NOT used at all! Even the deduplication doesn't seem to reduce the amount. Where do these tremendous changes come from?

Code:
INFO: creating Proxmox Backup Server archive 'vm/105/2020-11-25T01:23:51Z'
INFO: issuing guest-agent 'fs-freeze' command
INFO: enabling encryption
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task '7dd142e0-5484-4cf8-bb3b-24f8fc77db8a'
INFO: resuming VM again
INFO: virtio0: dirty-bitmap status: OK (13.5 GiB of 65.0 GiB dirty)
INFO: using fast incremental mode (dirty-bitmap), 13.5 GiB dirty of 65.0 GiB total
INFO:   0% (104.0 MiB of 13.5 GiB) in  3s, read: 34.7 MiB/s, write: 34.7 MiB/s
INFO:   1% (172.0 MiB of 13.5 GiB) in  6s, read: 22.7 MiB/s, write: 22.7 MiB/s
...

INFO:  99% (13.4 GiB of 13.5 GiB) in  8m 57s, read: 35.0 MiB/s, write: 35.0 MiB/s
INFO: 100% (13.5 GiB of 13.5 GiB) in  9m  2s, read: 27.2 MiB/s, write: 27.2 MiB/s
INFO: backup was done incrementally, reused 51.56 GiB (79%)
INFO: transferred 13.49 GiB in 542 seconds (25.5 MiB/s)
INFO: Finished Backup of VM 105 (00:09:02)
INFO: Backup finished at 2020-11-25 02:32:53
 
did you check the likely culprits mentioned earlier in this thread? something is triggering writes from within the guest..
 
I'm seeing the same issue as OP.
My Windows server, which is basically idle, has accumulated 44 GB of dirty bitmap on one drive within 1 hour.
I have disabled the delete notification as mentioned by keson above and I have also disabled the scheduled task for defrag.
As a last Hail Mary I also moved the pagefile to a separate drive that is excluded from the backup, and the same is still happening.

Anyone got any tips?
 
well, you could check the resulting images for differences - but that probably requires some knowledge of file system internals to gain any meaningful information..
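
A first, very coarse step that needs no file system knowledge is to find out where on the disk the differences are. Here is a minimal Python sketch, assuming you have restored two consecutive backups to raw image files; the file names and the 4 MiB chunk size are placeholders, not anything PBS prescribes.

```python
import sys

CHUNK = 4 * 1024 * 1024  # compare in 4 MiB pieces (assumed size, adjust freely)


def changed_chunks(path_a, path_b, chunk_size=CHUNK):
    """Yield the index of every chunk that differs between two raw images."""
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        index = 0
        while True:
            block_a = a.read(chunk_size)
            block_b = b.read(chunk_size)
            if not block_a and not block_b:
                break  # both images fully read
            if block_a != block_b:
                yield index
            index += 1


if __name__ == "__main__":
    # usage: python diff_images.py old.raw new.raw   (hypothetical file names)
    old, new = sys.argv[1], sys.argv[2]
    diff = list(changed_chunks(old, new))
    print(f"{len(diff)} chunks differ ({len(diff) * CHUNK / 2**30:.2f} GiB)")
    print("first differing offsets:", [i * CHUNK for i in diff[:10]])
```

If the number of chunks that really differ is much smaller than what the dirty bitmap reported, the writes are mostly rewriting identical data or trimming free space. Mapping the offsets back to files is the part that needs the file system internals.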
 
I was maybe hoping for some insight that does not require a PhD in filesystems :-D
The server is a bog-standard Server 2016 domain controller, there is nothing special installed.

The VM's disks are virtio, set with iothread on and discard on, as the backing storage is ZFS.
 
I am no expert w.r.t. Windows AD, maybe it has some sort of WAL or log mechanism that causes lots of churn?
 
Hi, I gave up on Windows totally, it simply is a virus in the IT world. So most servers are Linux and those two ADs are accepted as a necessary evil.
The schedule is set so that it does not interfere with users working (due to the performance drop during backups). Same as with updates on Windows servers: 30 GB of absolutely wasted space on all Windows servers which can't be deleted (tried all methods except a hard delete). Not worth the time wasted.
 
Hi,

We have 6 Windows Server 2012R2 VMs which do the same: they generate a high amount of dirty bitmap by the end of the day on daily backups.

Example:
DC 01 generates 41 GB of 61 GB,
File server generates 350 GB of 850 GB (Shared Folder, WS2012R2 deduplication active)
File server generates 1000 GB of 1500 GB Video files (nothing changes in days...)
WSUS generates 300 GB of 1000 GB (700 GB free space...).

The logs show that the free space is regenerated, at 2-3 Gigabits/s read and 0-5 Mb/s write. The backups did not take up much space in the end on the PBS datastore, but they ran very slowly.

All the VMs come from ESXi 5.5, converted from VMDK to RAW (in LVM-Thin).

My "maybe" solution for the "problem":

Open PowerShell as administrator and run:

1st: Optimize-Volume -DriveLetter C -Defrag -Verbose

Run this on all available partitions (C, D, E, F, etc.)

Then shut down the VM and power it on again.

Run PowerShell again:

2nd: Optimize-Volume -DriveLetter C -Defrag -Retrim -Verbose

Run this on all available partitions (C, D, E, F, etc.)


3rd: Set-ItemProperty -Path "HKLM:\System\CurrentControlSet\Control\FileSystem" -Name DisableDeleteNotification -Value 1

(Check the value: Get-ItemProperty -Path "HKLM:\System\CurrentControlSet\Control\FileSystem" -Name DisableDeleteNotification)

Then power off and power on again.


After these steps, the first backup is a slow full one, but here is my experience after 2 days of backups:

DC 01 generates 1.8 GB and 1.3 GB of 61 GB,
File server generates 142 GB and 179 GB of 850 GB (Shared Folder, WS2012R2 deduplication active)
File server generates 8 MB and 8 MB of 1500 GB Video files (nothing changes in days...)
WSUS generates 14.9 GB and 14.4 GB of 1000 GB

The daily backup has been reduced from 6 hours to 1 hour, and there are still 2 servers left to defragment.
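
To put the numbers from this post side by side, here is a quick back-of-the-envelope calculation in Python. The values are the ones reported above; taking the mean of the two "after" days is my own simplification, not something measured separately.

```python
# (before, after) dirty data per daily backup in GB, taken from this post.
# "after" is the mean of the two reported days; 8 MB is written as 0.008 GB.
REPORTED = {
    "DC 01": (41, (1.8 + 1.3) / 2),
    "File server (shared folder)": (350, (142 + 179) / 2),
    "File server (video files)": (1000, 0.008),
    "WSUS": (300, (14.9 + 14.4) / 2),
}


def reduction(before, after):
    """Fraction by which the dirty data shrank (0.9 means 90% less)."""
    return 1 - after / before


if __name__ == "__main__":
    for name, (before, after) in REPORTED.items():
        print(f"{name}: {before} GB -> {after:.3g} GB ({reduction(before, after):.1%} less)")
```

The shared-folder file server improves the least, which would fit with it being the only one of the four where files really change every day.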

So it seems that the free space was very fragmented after the migration, so any small change generated a huge dirty bitmap in the free space as well.

Please test the above method I suggested to make sure it's not just placebo :)

One more thing as a footnote... The optimization (defrag) started from the HDD menu seems to finish very fast, which made me suspicious. In the event log, in the Application section under event ID 257, it seems that it never actually ran (error 0x8900002D), although it reports a successful run under event ID 258 (you know, kind of a Microsoft thing...). Anyway, running it from PowerShell does do something. In light of this, I have turned off the default weekly optimization/defrag.

(For Windows 7 CMD: 1st: defrag C: -u 2nd: defrag C: -u -x 3rd: fsutil behavior set disabledeletenotify 1)


Sincerely: szelezola
 
I will certainly give it a try, and if this works, you saved a kitten!
Sorry for my previous negative feedback, but I just feel that every day I find some bad Windows surprise, so I was about to give up the fight. This procedure, if it works, will be a life saver! I will report back in a few days after the backups are done.
 
First thing to try: Disable the Trim on NTFS. It is enabled by default. “= 0” means enabled. See docs: https://docs.microsoft.com/en-us/pr...ows-server-2012-R2-and-2012/cc785435(v=ws.11)

Second thing to try: Disable NTFS compression if it’s enabled. Reads and writes then happen in 64k blocks, which are “trimmed” and become highly fragmented when changed.

Third, I would test and compare snapshots from a shutdown state. Not that it’s a feasible solution for your config, but I’ve seen linux run into all sorts of NTFS errors if the f/s wasn’t a clean shutdown.
 
Any news on your side?

Our 6 hour backup time has been reduced to 20-35 min. (depending on the file changes/usage of the big boys, the file server and the database server).
 
