Full backups after qcow snapshot?

gvs

New Member
Apr 4, 2025
I have a VM with large drives (2 TB) which takes a long time to back up.
Lately, I've noticed that if I create and then delete a snapshot of the VM (I do this for upgrades), the next backup in PBS runs as a full backup instead of an incremental one, and that doesn't finish inside our backup window on weekdays...

Is this correct behavior or did I hit a bug? PBS 3.3 / PVE 8.3
 
Hi,
failed to reproduce your issue here. Taking a snapshot while the VM is running leaves the dirty bitmap OK, and only differential backups are made. Do you stop the VM at some point in the process? Are you always backing up to the same target (including namespace)? Changing the target would invalidate the bitmap.
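For reference, one quick way to check whether the bitmap was actually reused is to grep the per-guest vzdump log on the PVE node. This is only a sketch: it assumes the default vzdump log location, and VMID 105 is just an example.
Code:
# On the PVE node; replace 105 with your VMID
grep 'dirty-bitmap' /var/log/vzdump/qemu-105.log
# An incremental run shows:  scsi0: dirty-bitmap status: OK (... dirty)
# After a VM stop/start or a target change you will instead see
# "created new" or "existing bitmap was invalid and has been cleared"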
 
I tested a VM where I created and removed a snapshot before doing the backup; it reported using "fast incremental mode".
Code:
INFO: starting new backup job: vzdump 90011 --notification-mode auto --notes-template '{{cluster}}, {{guestname}}, {{node}}, {{vmid}}' --remove 0 --storage PBS --node pve65 --mode snapshot
INFO: Starting Backup of VM 90011 (qemu)
INFO: Backup started at 2025-04-04 17:06:01
INFO: status = running
INFO: VM Name: WinSrv2025Std-x64
INFO: include disk 'scsi0' 'local-zfs2:vm-90011-disk-0' 100G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/90011/2025-04-04T09:06:01Z'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task '9b909c87-2a91-4dc3-9999-35ba87fedc50'
INFO: resuming VM again
INFO: scsi0: dirty-bitmap status: OK (156.0 MiB of 100.0 GiB dirty)
INFO: using fast incremental mode (dirty-bitmap), 156.0 MiB dirty of 100.0 GiB total
INFO: 100% (156.0 MiB of 156.0 MiB) in 3s, read: 52.0 MiB/s, write: 46.7 MiB/s
INFO: Waiting for server to finish backup validation...
INFO: backup was done incrementally, reused 99.86 GiB (99%)
INFO: transferred 156.00 MiB in 5 seconds (31.2 MiB/s)
INFO: adding notes to backup
INFO: Finished Backup of VM 90011 (00:00:06)
INFO: Backup finished at 2025-04-04 17:06:07
INFO: Backup job finished successfully
INFO: notified via target `mail-to-root`
TASK OK
I guess your VM may have lost some backups between those two full backup jobs. In my case, after the test ended I removed 5 (incremental) backups belonging to the VM, and the next time I ran the backup, the job started as a full backup and the log reported "dirty-bitmap status: existing bitmap was invalid and has been cleared".
Code:
INFO: starting new backup job: vzdump 90011 --remove 0 --notification-mode auto --notes-template '{{cluster}}, {{guestname}}, {{node}}, {{vmid}}' --node pve65 --storage PBS --mode snapshot
INFO: Starting Backup of VM 90011 (qemu)
INFO: Backup started at 2025-04-04 17:08:52
INFO: status = running
INFO: VM Name: WinSrv2025Std-x64
INFO: include disk 'scsi0' 'local-zfs2:vm-90011-disk-0' 100G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/90011/2025-04-04T09:08:52Z'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task '7589188e-413c-43ab-97be-51d9a1c56926'
INFO: resuming VM again
INFO: scsi0: dirty-bitmap status: existing bitmap was invalid and has been cleared
INFO:   2% (2.8 GiB of 100.0 GiB) in 3s, read: 956.0 MiB/s, write: 262.7 MiB/s
   .
   .
   .
INFO:  92% (92.2 GiB of 100.0 GiB) in 27s, read: 8.6 GiB/s, write: 0 B/s
INFO: 100% (100.0 GiB of 100.0 GiB) in 30s, read: 2.6 GiB/s, write: 1.3 MiB/s
INFO: Waiting for server to finish backup validation...
INFO: backup is sparse: 77.57 GiB (77%) total zero data
INFO: backup was done incrementally, reused 97.43 GiB (97%)
INFO: transferred 100.00 GiB in 31 seconds (3.2 GiB/s)
INFO: adding notes to backup
INFO: Finished Backup of VM 90011 (00:00:32)
INFO: Backup finished at 2025-04-04 17:09:24
INFO: Backup job finished successfully
INFO: notified via target `mail-to-root`
TASK OK
What does your VM backup log output show?
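If you want to verify on the server side that the previous backup snapshot (the one the bitmap was registered against) still exists, you can list the group's snapshots with proxmox-backup-client. A sketch only: the repository string and server address below are example values.
Code:
# Repository format is user@realm@server:datastore (example values below)
proxmox-backup-client snapshot list vm/90011 --repository root@pam@192.0.2.10:eindhoven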
 
Hi,
failed to reproduce your issue here. Taking a snapshot while the VM is running leaves the dirty bitmap OK, and only differential backups are made. Do you stop the VM at some point in the process? Are you always backing up to the same target (including namespace)? Changing the target would invalidate the bitmap.
Yes! I do stop the VM. Removing the snapshot of the 2 TB disk times out if I don't.
Is that the cause? Will stopping a VM always trigger a full backup?

And yes, it's always the same target, same settings.
 
I get:
Code:
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/105/2025-04-03T22:00:03Z'
INFO: started backup task '45b88572-f119-4353-861d-64b7f3fd8165'
INFO: resuming VM again
INFO: scsi0: dirty-bitmap status: created new
INFO: scsi1: dirty-bitmap status: created new
 
Then my core problem becomes: what is going wrong that I cannot remove a snapshot from a large running VM within the time limit?
Please share the snapshot delete task log, the VM config obtained via qm config <VMID> --current, as well as the storage config (cat /etc/pve/storage.cfg).
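For completeness, a sketch of how that information can be gathered on the PVE node. VMID 105 is taken from the config posted below, and the task index path is the default one; adjust as needed.
Code:
# VM and storage configuration
qm config 105 --current
cat /etc/pve/storage.cfg
# One way to find the snapshot delete task in the node's task index
grep qmdelsnapshot /var/log/pve/tasks/index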
 
Please share the snapshot delete task log, the VM config obtained via qm config <VMID> --current, as well as the storage config (cat /etc/pve/storage.cfg).
Config:
Code:
balloon: 0
boot: order=scsi0;ide2;net0
cores: 8
ide2: none,media=cdrom
memory: 49152
meta: creation-qemu=6.1.0,ctime=1639999078
name: zimbra
net0: virtio=B6:79:01:BF:8E:E6,bridge=vmbr0
numa: 1
onboot: 1
ostype: l26
parent: Ticket22893
scsi0: Pool1:105/vm-105-disk-0.qcow2,cache=writeback,size=20G
scsi1: Pool1:105/vm-105-disk-1.qcow2,cache=writeback,size=2T
scsihw: virtio-scsi-pci
smbios1: uuid=c3380739-06c2-4741-9fb3-866f07ea8285
sockets: 1
vmgenid: 135e71cb-bed8-419f-9b2d-343528b453c2

Storage:
Code:
dir: local
        path /var/lib/vz
        content vztmpl,iso,backup

lvmthin: local-lvm
        thinpool data
        vgname pve
        content rootdir,images

nfs: Pool2
        export /mnt/Pool1/VM
        path /mnt/pve/Pool2
        server 10.0.210.12
        content iso,images
        options vers=4.1
        prune-backups keep-all=1

zfspool: LocalZFS
        disable
        pool VM
        content rootdir,images
        mountpoint /VM
        nodes prox00
        sparse 0

nfs: Pool1
        export /mnt/Pool1/VM
        path /mnt/pve/Pool1
        server 10.0.210.11
        content iso,images
        options vers=4.1
        prune-backups keep-all=1

nfs: backup
        disable
        export /export/backup
        path /mnt/pve/backup
        server 192.168.210.71
        content iso,backup
        prune-backups keep-all=1

pbs: PBS
        datastore eindhoven
        server 142.132.150.159
        content backup
        fingerprint xxx
        prune-backups keep-all=1
        username xxxx

zfspool: localZFSpool
        pool dpool
        content images,rootdir
        nodes prox00
        sparse 1

Log:
TASK ERROR: VM 105 qmp command 'blockdev-snapshot-delete-internal-sync' failed - got timeout
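To see what the deletion actually has to work through, the internal snapshots inside the large qcow2 can be inspected read-only. A sketch: the path assumes the standard directory layout of the Pool1 NFS mount shown above, and -U (--force-share) only opens the image for reading without taking the write lock.
Code:
# Read-only inspection of the 2T disk; the "Snapshot list:" section shows
# the internal snapshot(s), e.g. Ticket22893, and their sizes
qemu-img info -U /mnt/pve/Pool1/images/105/vm-105-disk-1.qcow2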
 
So, as I feared, you are running the VM with the qcow2 on NFS. A possible workaround is https://forum.proxmox.com/threads/g...est-disks-on-local-storage.113854/post-494013
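If you go that route, a hedged sketch of moving the large disk off NFS with qm: the target storage local-lvm is only an example, the existing snapshot may need to be removed first, and the exact options should be checked against the qm manpage on your version.
Code:
# Example only: move the 2T disk to a local storage and drop the NFS copy
# (remove any remaining snapshots of VM 105 beforehand)
qm disk move 105 scsi1 local-lvm --delete 1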