Snapshot causes VM to become unresponsive.

FYI, the fix is included in pve-qemu-kvm >= 9.0.2-4 currently available in the testing repository: https://pve.proxmox.com/wiki/Package_Repositories#sysadmin_test_repo

If you'd like to install the package, you can temporarily enable the repository (e.g. via the Repositories section in the UI), run apt update and apt install pve-qemu-kvm, then disable the repository again and run apt update once more.

EDIT: A VM has to be shut down and started again (the Reboot button in the UI also works, but a reboot inside the guest is not enough) or migrated to an upgraded node to start using the new version.
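
The steps above could be scripted roughly as follows. This is only a sketch, not an official procedure: the file name pvetest-temp.list and the helper function names are made up for illustration, and the repository line assumes PVE 8 on Debian bookworm as listed on the wiki page linked above. By default the script only prints what it would do; set DRY_RUN=0 and run it as root to actually perform the steps.

```shell
#!/bin/sh
# Hypothetical file used to enable the testing repository temporarily.
LIST="${LIST:-/etc/apt/sources.list.d/pvetest-temp.list}"
# Assumption: PVE 8.x (Debian bookworm) testing repository entry.
REPO_LINE="deb http://download.proxmox.com/debian/pve bookworm pvetest"

# Print the command in dry-run mode (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

enable_test_repo()  { printf '%s\n' "$REPO_LINE" > "$LIST"; }
disable_test_repo() { rm -f "$LIST"; }

upgrade_qemu_from_test() {
    run enable_test_repo || return 1
    # Refresh the package lists, then upgrade only the QEMU package.
    run apt update && run apt install pve-qemu-kvm
    status=$?
    # Always disable the repository again, even if the install failed.
    run disable_test_repo
    run apt update
    return "$status"
}

# Usage (dry run):  upgrade_qemu_from_test
# Usage (for real): DRY_RUN=0 upgrade_qemu_from_test
```

Running VMs keep using the old QEMU binary after this, so the shutdown/start or migration mentioned in the EDIT is still required.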
 
Hi Fiona, not to be a pain, but I can't run test repos in a production environment. I'm presuming the fix will make its way to the production repositories at some point. I've only recently moved over to Proxmox from VMware, so I am unfamiliar with the timescales for releases.

Thanks.
 
You don't need to run the whole testing repository. Just install the single package and disable the repository again. Otherwise, you can also downgrade the QEMU package for the time being. I can't give you an ETA for when the fix will be in the enterprise repository, because I won't be the one moving it there, but I'd expect it to take about 1-2 weeks.
 
Hi @fiona.
We have this problem as well.
Host CPU AMD EPYC 7313 running PVE 8.2.7 from Enterprise repo with pve-qemu-kvm 9.0.2-3.
The VM we are experiencing this with is a q35 machine with x86-64-v2-AES CPU running Rocky Linux 8.
Virtual disks (virtio-scsi-single, discard, iothread, io_uring) are backed by Ceph pools (NVME and SATA).
At some point during the snapshot (with memory) we observe (with atop in the VM) incredibly high I/O pressure (utilization & wait), rendering the VM unusable.
Machine state returns to normal after reboot (which naturally takes ages).
No problem when snapshotting without memory.
Can provide further details if needed.
 
Hi,
if you can't hold off on snapshots with memory until a fix is available in the enterprise repository, I'd suggest downgrading QEMU with apt install pve-qemu-kvm=8.1.5-6. You can also try the proposed fix by upgrading just the QEMU package to the version from the testing repository: https://forum.proxmox.com/threads/snapshot-causes-vm-to-become-unresponsive.153483/post-719882
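
To see whether a node still needs one of these two options, one could compare the installed package version against 9.0.2-4, the version named earlier in this thread as containing the fix. A minimal sketch: the function names are hypothetical, and the sort -V fallback only approximates Debian version ordering on systems without dpkg.

```shell
#!/bin/sh
# First package version reported in this thread to contain the fix.
FIXED_VERSION="9.0.2-4"

# version_ge A B: succeeds if version A is greater than or equal to B.
version_ge() {
    if command -v dpkg >/dev/null 2>&1; then
        dpkg --compare-versions "$1" ge "$2"
    else
        # Fallback (assumption): GNU-style version sort is close enough
        # for simple versions such as 9.0.2-3 vs. 9.0.2-4.
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
    fi
}

# has_fix VERSION: succeeds if VERSION is the fixed version or newer.
has_fix() {
    version_ge "$1" "$FIXED_VERSION"
}

# Usage on a PVE node:
#   installed="$(dpkg-query -W -f='${Version}' pve-qemu-kvm)"
#   if has_fix "$installed"; then echo "fix installed"; else echo "no fix: $installed"; fi
```

Note that the downgrade target 8.1.5-6 is reported as "no fix" by this check, even though it is the suggested workaround; the check only tells you whether the fixed version or a newer one is installed.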
 
Thanks Fiona
The update to pve-qemu-kvm 9.0.2-4 fixed the problem for us.
Best regards
Stefan
 
Hi,
I tested pve-qemu-kvm 9.0.2-4 on my second cluster machine and saw no more saturation or stalling in my test cases.
It seems OK now.

I have a question for @fiona concerning this fixed issue. The bug seems to have been propagated via updated packages to the enterprise repository, as mentioned by @dave10x and @Stefan Radman. Could you explain the timeline between the package updates in test/pve-no-subscription/enterprise, the first detection of the problem (this thread), and the resolution via the fixed package?
The goal is to know how long the bug was present and active in the different repositories. This would give us more insight than https://pve.proxmox.com/wiki/Package_Repositories into how your update system works.
Regards.
 
Hi,
unfortunately, the cause of the issue was identified (October 29) only after the affected pve-qemu-kvm package was already in the enterprise repository (October 21). There was not much noise about the issue, just a couple of reports in the community forum, and given the sheer number of users, we get such reports all the time (in relation to qcow2 snapshots on slow NFS storages, for example). So it was not clear that it was a general issue, and it was not even clear that it was an issue in the QEMU package (it could have been the kernel too, for example). That only became clear after reports that downgrading the QEMU package helped (also October 29). If you have further questions, please refer to the enterprise support.

EDIT: Forgot to mention that the package with the fix should be available in the enterprise repository in the following days. I can't give you an exact time, because I'm not the one deciding it.
 