Snapshot removal "jams" the VM

hac3ru

Member
Mar 6, 2021
Hello,

I'm running Jenkins inside a VM on Proxmox VE 7.1.8. The issue is that when I delete a snapshot, the VM enters a locked state and stops responding to requests. I don't know whether this is normal behavior and, if it is, why it works this way. In my opinion, removing a snapshot should not freeze the VM until the removal finishes.
P.S. I'm not talking about snapshots taken weeks or months ago. I had just taken a snapshot, updated the OS and Jenkins, and when I removed the snapshot the VM froze for about 1 to 1.5 minutes, until the snapshot was removed.

Any advice is welcome.

Thank you!
 
You didn't tell us what storage you are using or whether you included a RAM dump. There is no universal snapshotting. PVE uses the snapshot features of the underlying storage, so depending on whether you are using LVM, qcow2, ZFS, Ceph and so on, you get completely different snapshot implementations with different features and limitations.
 
Hello,

I have what looks to be the exact same issue.

Usually, when we create a snapshot before doing some updates in a VM and then remove it once everything has gone well, the removal takes a few seconds.

Today I did the same routine again but when deleting the snapshot, the VM froze for 2 minutes until the snapshot was finally removed.

I found this strange so I did a test on another VM and the issue was even worse. The VM froze until the snapshot delete command timed out.
As the snapshot delete command timed out the snapshot was still listed and the VM was in a locked state.

I was able to unlock it, but when I tried to delete the snapshot again, it reported that an error occurred and the VM became locked again.
When I tried to remove the snapshot from the command line, qemu reported that the snapshot doesn't exist in the related qcow2 file.

But in the <vmid>.conf there was still an entry for the snapshot, so I cleaned it manually and everything went back to normal.
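
For anyone ending up in the same state, the recovery described above can be sketched roughly as follows. This is only a sketch under assumptions: the VM ID and disk name are taken from the sample config in this post, the snapshot name `pre-update` is hypothetical, the `images/<vmid>/` path assumes the default layout of a directory/NFS storage, and the `run` wrapper with `DRY_RUN` is just an illustration helper so that nothing is executed by accident. Using `qm delsnapshot` with `--force` is an alternative to editing the config file by hand, not what was actually done here.

```shell
#!/bin/sh
# Sketch of a manual recovery after a timed-out snapshot removal.
# Prints the commands by default; set DRY_RUN=0 to really execute them.
# SNAP is a hypothetical snapshot name, IMG assumes the default
# images/<vmid>/ layout of a directory/NFS storage.
VMID="${VMID:-121}"
SNAP="${SNAP:-pre-update}"
IMG="${IMG:-/mnt/pve/Huawei_NFS/images/$VMID/vm-$VMID-disk-1.qcow2}"
DRY_RUN="${DRY_RUN:-1}"

# run CMD...: echo the command in dry-run mode, execute it otherwise
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

recover_snapshot() {
    # Clear the lock left behind by the timed-out task
    run qm unlock "$VMID"
    # Snapshots PVE still has in the VM config
    run qm listsnapshot "$VMID"
    # Internal snapshots really present in the qcow2 file
    # (-U allows a read-only look while the VM has the image open)
    run qemu-img snapshot -l -U "$IMG"
    # Drop the stale config entry even if the disk snapshot is already gone
    run qm delsnapshot "$VMID" "$SNAP" --force 1
}

recover_snapshot
```

If the two listings disagree (the config still has the snapshot, the qcow2 file does not), the last command should have the same effect as cleaning the `<vmid>.conf` entry manually.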

It looks like there is something wrong with snapshots in the latest releases.

The storage is an NFS share on a fast Huawei SAN, and we never had such an issue with snapshots before.

Sample VM config:

Code:
agent: 1
bios: ovmf
boot: order=scsi0
cores: 12
cpu: SandyBridge
efidisk0: Huawei_NFS:121/vm-121-disk-0.qcow2,size=128K
machine: q35
memory: 12228
name: es-node03
net0: virtio=5E:32:B0:13:AB:50,bridge=vmbr0,tag=10
numa: 1
ostype: l26
protection: 1
scsi0: Huawei_NFS:121/vm-121-disk-1.qcow2,discard=on,iothread=1,size=40G,ssd=1
scsi1: Huawei_NFS:121/vm-121-disk-2.qcow2,discard=on,iothread=1,size=300G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=5c3d90f9-90cf-4761-9c7e-f3aa7d1ae38a
sockets: 1
tablet: 0
vga: qxl
vmgenid: 44b49263-be54-46d4-b5ef-953f8f80b3f7


Quick and dirty write test on the storage:

Code:
root@pm04:/mnt/pve/Huawei_NFS# dd if=/dev/zero of=test bs=4k oflag=direct count=1000
1000+0 records in
1000+0 records out
4096000 bytes (4.1 MB, 3.9 MiB) copied, 0.244436 s, 16.8 MB/s

Code:
root@pm04:/mnt/pve/Huawei_NFS# dd if=/dev/zero of=test bs=1024k oflag=direct count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 2.03062 s, 516 MB/s
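
A note on reading these two results: with `oflag=direct`, dd issues the writes one after another, so the 4k run probably says more about the per-write round trip to the NFS server than about bandwidth. The small calculation below only re-derives figures from the numbers dd printed above; the helper name `per_write` is made up for this sketch.

```shell
#!/bin/sh
# per_write COUNT SECONDS BLOCK_BYTES
# Derive writes per second, time per write and MB/s from a dd summary.
# (hypothetical helper; the inputs are copied from the dd output above)
per_write() {
    awk -v n="$1" -v t="$2" -v bs="$3" 'BEGIN {
        printf "%.0f writes/s, %.3f ms per write, %.1f MB/s\n",
               n / t, t / n * 1000, n * bs / t / 1000000
    }'
}

per_write 1000 0.244436 4096      # 4k test
per_write 1000 2.03062  1048576   # 1 MiB test
```

So the 4k test works out to roughly 4,100 writes per second, or about 0.24 ms per write, which matters if a snapshot removal turns into many small metadata updates on the qcow2 file.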

Code:
pve-manager 7.2-5
Linux pm04 5.15.35-3-pve #1 SMP PVE 5.15.35-6 (Fri, 17 Jun 2022 13:42:35 +0200) x86_64 GNU/Linux

Any ideas? :)
 
I'm sorry for the really late reply.
I have the same issue no matter what the storage is. I tried with NFS, local and LVM-iSCSI. When taking the snapshot, I did not take a RAM dump.

@tanfoglionet, unfortunately I have no info and never got to the bottom of this...
 
Last night I tried it without a RAM dump, and I noticed it seems faster and seamless. I've only tried it on test machines so far.
Thank you
 
Well, at least for me, it only happened after the snapshot had existed for a few hours or days. So when you delete the snapshot, I would guess a consolidation happens, and that is what freezes the machine.

Please try to take a snapshot, give it 1-2 days and then remove it.
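
To put a number on the freeze when repeating this test, something along these lines might help. It is only a sketch: the guest IP, VM ID and snapshot name in the example run are placeholders, `probe` and `longest_outage` are made-up helper names, and ping is just one possible probe (an HTTP check against Jenkins may be closer to what users actually see).

```shell
#!/bin/sh
# longest_outage: read "<epoch> ok|fail" lines on stdin and print the
# longest stretch, in seconds, from the first failed probe of an outage
# to the next successful one (or to the last failed probe in the log).
longest_outage() {
    awk '
        $2 == "fail" {
            if (!down) { down = 1; start = $1 }
            last = $1
            next
        }
        {
            if (down) { d = $1 - start; if (d > max) max = d }
            down = 0
        }
        END {
            if (down) { d = last - start; if (d > max) max = d }
            print max + 0
        }'
}

# probe HOST LOGFILE: ping HOST about once per second until killed,
# appending "<epoch> ok" or "<epoch> fail" to LOGFILE
probe() {
    while :; do
        if ping -c 1 -W 1 "$1" >/dev/null 2>&1; then s=ok; else s=fail; fi
        echo "$(date +%s) $s" >> "$2"
        sleep 1
    done
}

# Example run on the PVE host (placeholders, nothing is executed here):
#   probe 192.0.2.10 /tmp/probe.log & PROBE_PID=$!
#   time qm delsnapshot 121 pre-update
#   kill "$PROBE_PID"
#   longest_outage < /tmp/probe.log
```

Comparing the result for a fresh snapshot with one that is 1-2 days old would show whether the freeze really grows with the age of the snapshot.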
 
