[SOLVED] Snapshot function does not clean up properly.

Jul 20, 2022
I have had different situations where a failed VM snapshot does not clean up and prevents further snapshots. Here is the latest.
Code:
Formatting '/mnt/xfs-12tb-usb/images/105/vm-105-state-Maint3.raw', fmt=raw size=34884026368 preallocation=off
snapshot create failed: starting cleanup
TASK ERROR: VM 105 qmp command 'savevm-start' failed - VM snapshot already started
This happened after I had started a snapshot and then stopped it in the GUI. I also have one or two "deleted" snapshots where the deletion failed for no apparent reason; they show a status of "delete" in the GUI and I can't get rid of them. How should I recover from this?
Thanks.
 
Hi,
What version of QEMU is the VM running? You can check with qm status 105 --verbose | grep running-qemu. There was a commit a while ago to improve the handling of aborted and failed snapshots. Does it work if you retry taking the snapshot, or do you get the same error again?

If you are stuck in snapshot-delete, use qm unlock <ID> and then qm delsnapshot <ID> <snapshot name> --force. The --force flag makes the command try to remove snapshots that were already partially removed.
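
As a rough sketch, the two commands above could be wrapped in a small shell helper. The function names run and force_delete_snapshot and the DRY_RUN switch are made up for illustration and are not part of Proxmox VE; only qm unlock and qm delsnapshot ... --force come from the advice above.

```shell
#!/bin/sh
# Sketch only: clear a snapshot that is stuck in the "delete" state.
# run and force_delete_snapshot are hypothetical helper names.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = VM ID, $2 = snapshot name
force_delete_snapshot() {
    vmid=$1
    snap=$2
    # Remove the lock left behind by the failed snapshot task.
    run qm unlock "$vmid" || return 1
    # Retry the deletion; --force retries snapshots that are already
    # partially removed and cleans up the VM config entry.
    run qm delsnapshot "$vmid" "$snap" --force
}
```

For the case in this thread, that would be called as force_delete_snapshot 105 Maint1. Setting DRY_RUN=1 first only prints the commands, which is a cheap way to check them before touching the VM.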
 
Code:
root@leghorn:~# qm status 105 --verbose | grep running-qemu
running-qemu: 7.1.0
root@leghorn:~#

The delsnapshot command worked (it no longer appears in the GUI), but gave an error:
Code:
root@leghorn:~# qm delsnapshot 105 Maint1 --force
VM 105 qmp command 'blockdev-snapshot-delete-internal-sync' failed - Snapshot with id 'null' and name 'Maint1' does not exist on device 'drive-scsi2'
root@leghorn:~#

This is what happens when trying to run the snapshot again, naming it "Maint3":
Code:
. . .
5.30 GiB in 28s
5.53 GiB in 29s
5.72 GiB in 30s
completed saving the VM state in 32s, saved 5.91 GiB
snapshotting 'drive-scsi0' (Zvol:vm-105-disk-0)
snapshotting 'drive-scsi1' (Zvol:vm-105-disk-1)
snapshotting 'drive-scsi2' (xfs-12tb-usb:105/vm-105-disk-0.qcow2)
snapshot create failed: starting cleanup
TASK ERROR: VM 105 qmp command 'blockdev-snapshot-internal-sync' failed - Snapshot with name 'Maint3' already exists on device 'drive-scsi2'

But, using "Maint4" appears to be working. (It's still running.) So, there seems to be a Maint3 file hanging around somewhere. Can you tell me where it might be?

Thanks.

By the way -- I've been thinking about enabling the pve-no-subscription repository. Do you think this is warranted in this situation? I'm running a home server, but it is depended upon. Thanks again.
 
Code:
root@leghorn:~# qm status 105 --verbose | grep running-qemu
running-qemu: 7.1.0
root@leghorn:~#
That version should already include the improvement. Most likely, the cancel command did not get through when the snapshot operation was aborted, so the next attempt found the earlier snapshot still marked as started. Stopping a task is currently rather aggressive in Proxmox VE: it terminates the process group and kills everything after 5 seconds, which does not always leave enough time for proper cleanup.

The delsnapshot command worked (it no longer appears in the GUI), but gave an error:
Code:
root@leghorn:~# qm delsnapshot 105 Maint1 --force
VM 105 qmp command 'blockdev-snapshot-delete-internal-sync' failed - Snapshot with id 'null' and name 'Maint1' does not exist on device 'drive-scsi2'
root@leghorn:~#
That just means that the snapshot was already partially deleted. Proxmox VE still tries to remove all parts of the snapshot again for good measure.

This is what happens when trying to run the snapshot again, naming it "Maint3":
Code:
. . .
5.30 GiB in 28s
5.53 GiB in 29s
5.72 GiB in 30s
completed saving the VM state in 32s, saved 5.91 GiB
snapshotting 'drive-scsi0' (Zvol:vm-105-disk-0)
snapshotting 'drive-scsi1' (Zvol:vm-105-disk-1)
snapshotting 'drive-scsi2' (xfs-12tb-usb:105/vm-105-disk-0.qcow2)
snapshot create failed: starting cleanup
TASK ERROR: VM 105 qmp command 'blockdev-snapshot-internal-sync' failed - Snapshot with name 'Maint3' already exists on device 'drive-scsi2'

But, using "Maint4" appears to be working. (It's still running.) So, there seems to be a Maint3 file hanging around somewhere. Can you tell me where it might be?
It's inside the qcow2 file: qcow2 stores internal snapshots in the image itself, not as separate files. While the VM is running, you can use qm mon 105 and then enter snapshot_delete_blkdev_internal drive-scsi2 Maint3 to remove the snapshot from the qcow2-backed drive. Alternatively, when the VM is shut down, use pvesm path xfs-12tb-usb:105/vm-105-disk-0.qcow2 to get the path and then qemu-img snapshot /path/to/105/vm-105-disk-0.qcow2 -d Maint3.
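
A minimal sketch of the offline variant, for when the VM is shut down. The helper names run and remove_internal_snapshot and the DRY_RUN switch are hypothetical. The qemu-img snapshot -l call is an addition that lists the internal snapshots before and after, so you can confirm that Maint3 was there and is gone.

```shell
#!/bin/sh
# Sketch only: delete a leftover internal snapshot from a qcow2 image.
# Run this only while the VM is shut down.
# run and remove_internal_snapshot are hypothetical helper names.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = path to the qcow2 image, $2 = snapshot name
remove_internal_snapshot() {
    image=$1
    snap=$2
    # List the internal snapshots stored in the image.
    run qemu-img snapshot -l "$image" || return 1
    # Delete the leftover snapshot from the image.
    run qemu-img snapshot -d "$snap" "$image" || return 1
    # List again to confirm that it is gone.
    run qemu-img snapshot -l "$image"
}
```

The image path can come from pvesm path, for example remove_internal_snapshot "$(pvesm path xfs-12tb-usb:105/vm-105-disk-0.qcow2)" Maint3.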

Thanks.

By the way -- I've been thinking about enabling the pve-no-subscription repository. Do you think this is warranted in this situation? I'm running a home server, but it is depended upon. Thanks again.
The enterprise repository contains better-tested packages and is the one intended for production use. If you need a specific newer package version for a fix or a new feature, you can temporarily enable the no-subscription or test repository, install the desired package, and disable the repository again afterwards.
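
One way to do the temporary enable/disable is sketched below, assuming the classic one-line APT source format. The function names and the list file name are made up for illustration. The Debian codename (for example bullseye for Proxmox VE 7) is an assumption and must match your installed release.

```shell
#!/bin/sh
# Sketch only: temporarily enable the pve-no-subscription repository.
# Function names and the default list file name are hypothetical.

# $1 = Debian codename of the installed release (assumption: you look it up)
# $2 = optional path of the list file to write
enable_no_subscription() {
    codename=$1
    list_file=${2:-/etc/apt/sources.list.d/pve-no-subscription.list}
    echo "deb http://download.proxmox.com/debian/pve $codename pve-no-subscription" > "$list_file"
}

# $1 = optional path of the list file to remove
disable_no_subscription() {
    list_file=${1:-/etc/apt/sources.list.d/pve-no-subscription.list}
    rm -f "$list_file"
}
```

Between the two calls you would run apt update and install only the package you need, then run apt update once more after disabling so that APT forgets the extra repository.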
 
