Some issues with snapshots on PVE 7.2

Oct 8, 2021
The other day, I had to update several VMs across our network and wanted to take snapshots including RAM beforehand. The machines run on different PVE clusters, each consisting of three nodes with PVE 7.2-4 and hyperconverged Ceph 16.2.9 underneath. All VMs have 32GB of RAM. When the snapshot tasks of the machines with high RAM utilization didn't complete after some time, I looked at the logs and noticed that they were still trying to save their RAM, with little to no progress over several minutes. The Ceph logs revealed that at some point during this process, IOPS went up to about 3000 while write speed dropped to ~3MB/s. When I stopped the snapshot tasks, the VMs crashed, leaving them stopped and locked.
Although the snapshot processes didn't finish, the snapshots were still listed, but removing them through the GUI didn't work. The only way to remove the faulty snapshots was through the CLI with `qm delsnapshot <vmid> <snapshotname> --force`.
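
In case it helps anyone hitting the same thing, here is a rough sketch of the cleanup as a small Python helper that only builds the `qm` command lines and prints them, without running anything. The `qm listsnapshot` and `qm unlock` steps are my assumption about what is needed around the `qm delsnapshot ... --force` call mentioned above; the VMID `100` and the snapshot name `pre-update` are made-up placeholders.

```python
import shlex


def cleanup_commands(vmid: int, snapname: str) -> list[list[str]]:
    """Build (but do not run) the qm commands for cleaning up a failed snapshot.

    Assumption: the VM is left locked, so it is unlocked before the
    forced snapshot removal. Adjust the order if that does not apply.
    """
    vm = str(vmid)
    return [
        ["qm", "listsnapshot", vm],                      # check which snapshots are listed
        ["qm", "unlock", vm],                            # clear the stale lock
        ["qm", "delsnapshot", vm, snapname, "--force"],  # remove the faulty snapshot
    ]


if __name__ == "__main__":
    # Placeholder values, replace with the real VMID and snapshot name.
    for cmd in cleanup_commands(100, "pre-update"):
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to review them before pasting them into a shell on the node that hosts the VM.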

So, I'm facing two major issues at the moment:

1. Stopping a snapshot process under PVE 7.2-4 crashes the VM and leaves it locked
2. While snapshotting the RAM of a VM, Ceph starts to underperform heavily after some time
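
As a rough sanity check on the second issue, the figures mentioned above (about 3000 IOPS at ~3MB/s) can be turned into an average write size. This is only a back-of-the-envelope sketch: it assumes both numbers are averages over the same interval and that 1 MB means 10^6 bytes; the function name is just for illustration.

```python
def avg_write_size_bytes(throughput_mb_s: float, iops: float) -> float:
    """Average bytes per write, assuming 1 MB = 1,000,000 bytes."""
    return throughput_mb_s * 1_000_000 / iops


# Approximate values from the Ceph logs: ~3 MB/s at about 3000 IOPS.
print(avg_write_size_bytes(3, 3000))  # 1000.0
```

Roughly 1 KB per write would suggest that the RAM state is being written in many very small chunks, although that is only a coarse reading of two approximate numbers.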

I did some testing with another PVE cluster in our network, still running PVE 6.4 and Ceph 14.2.22, and didn't encounter either of those issues. Furthermore, I was unable to replicate issue #2 with any storage backend other than Ceph.
 
The patch hasn't been accepted yet. Once it is accepted, a new package has to be built containing the fix.
This will then be distributed to the internal staging repository, where we test it.
Once we deem it stable, it is released to the `pvetest` repository, then to `pve-no-subscription`, and if no issues arise during that time, it will be moved to `pve-enterprise`.

So assume at least 2-3 weeks after it was accepted, if not more, before it is released on `pve-enterprise`.
 
