The fix is included in pve-qemu-kvm >= 9.0.2-4, currently available in the testing repository (see https://pve.proxmox.com/wiki/Package_Repositories#sysadmin_test_repo). Enable the repository (e.g. via the Repositories section in the UI), run apt update, run apt install pve-qemu-kvm, then disable the repository again and run apt update again. Running VMs need to be shut down and started again (the Reboot button in the UI also works; a reboot inside the guest is not enough) or migrated to an upgraded node to start using the new version.

Hi Fiona,
Not to be a pain, but I can't run test repos in a production environment. I presume the fix will reach the production repositories at some point. I've only recently moved over to Proxmox from VMware, so I'm unfamiliar with the timescales for releases.
Thanks.

You don't need to run the whole testing repository. Just install the single package and disable the repository again. Otherwise, you can also downgrade the QEMU package for the time being. I can't give you an ETA for when the fixed package will be in the enterprise repository, because I won't be the one moving it there, but I'd expect it to take about 1-2 weeks.
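
The "install the single package, then disable the repository again" procedure described above could be scripted roughly as follows. This is only a sketch under stated assumptions, not an official procedure: the list file name, the repository line (for a Debian bookworm based PVE 8 host), and the `run` helper with its `DRY_RUN` switch are not from the thread. Check the Package Repositories wiki page for the line that matches your release.

```shell
#!/bin/sh
# Sketch: pull only pve-qemu-kvm from the testing repository, then disable it.
# Assumptions (not from the thread): file name, repository line, DRY_RUN helper.
# With DRY_RUN=0 this must run as root and overwrites/removes PVETEST_LIST.

PVETEST_LIST="${PVETEST_LIST:-/etc/apt/sources.list.d/pvetest.list}"
PVETEST_LINE="deb http://download.proxmox.com/debian/pve bookworm pvetest"
DRY_RUN="${DRY_RUN:-1}"   # default: only print the commands

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

upgrade_qemu_from_testing() {
    # 1. enable the testing repository
    run sh -c "echo '$PVETEST_LINE' > '$PVETEST_LIST'"
    # 2. refresh the package index and install the single package
    run apt update
    run apt install pve-qemu-kvm
    # 3. disable the repository again and refresh the index
    run rm -f "$PVETEST_LIST"
    run apt update
}

# Usage: review the printed commands first, then rerun with DRY_RUN=0:
#   upgrade_qemu_from_testing
```

Afterwards, running VMs still need a full stop and start (or a migration to an upgraded node) before they use the new QEMU binary.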
Hi @fiona.
We have this problem as well.
The host CPU is an AMD EPYC 7313, running PVE 8.2.7 from the Enterprise repo with pve-qemu-kvm 9.0.2-3.
The VM we are experiencing this with is a q35 machine with an x86-64-v2-AES CPU running Rocky Linux 8.
Virtual disks (virtio-scsi-single, discard, iothread, io_uring) are backed by Ceph pools (NVMe and SATA).
At some point during the snapshot (with memory) we observe (with atop in the VM) incredibly high I/O pressure (utilization and wait), rendering the VM unusable.
Machine state returns to normal after a reboot (which naturally takes ages).
No problem when snapshotting without memory.
Can provide further details if needed.

Hi,
if you can't hold off on snapshots with memory until a fix is available in the enterprise repository, I'd suggest downgrading QEMU:
apt install pve-qemu-kvm=8.1.5-6
You can also try the proposed fix by upgrading just the QEMU package to the version from the testing repository: https://forum.proxmox.com/threads/snapshot-causes-vm-to-become-unresponsive.153483/post-719882

Thanks Fiona
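
Before choosing between the downgrade and the testing package, it can help to check whether a node's installed pve-qemu-kvm already contains the fix. The following is a minimal sketch, assuming a Debian-based host with `dpkg-query` and `dpkg --compare-versions`; the function names and the `sort -V` fallback are illustrative and not from the thread. Note that a version below 9.0.2-4 is not necessarily affected (8.1.5-6 is the suggested downgrade target).

```shell
#!/bin/sh
# Sketch: does a given pve-qemu-kvm version include the snapshot fix?
FIXED_VERSION="9.0.2-4"   # first fixed version, per the thread

# Return 0 if version $1 >= version $2.
version_ge() {
    if command -v dpkg >/dev/null 2>&1; then
        dpkg --compare-versions "$1" ge "$2"
    else
        # Rough fallback (GNU sort -V); not full Debian version semantics.
        [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
    fi
}

# Print the installed package version (empty if not installed).
installed_qemu_version() {
    dpkg-query -W -f='${Version}' pve-qemu-kvm 2>/dev/null
}

# Return 0 if version $1 is at or above the first fixed version.
has_snapshot_fix() {
    version_ge "$1" "$FIXED_VERSION"
}

# Usage on a PVE node:
#   if has_snapshot_fix "$(installed_qemu_version)"; then
#       echo "fix included"
#   else
#       echo "fix not included"
#   fi
```

This only inspects the installed package; a VM that was started before an upgrade or downgrade keeps running the old QEMU binary until it is stopped and started again or migrated.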
Hi,
I tested pve-qemu-kvm 9.0.2-4 on my second cluster machine and there is no more saturation/stalling with my test cases.
It seems OK now.
I have a question, @fiona, concerning this fixed issue. The bug seems to have been propagated via updated packages to the Enterprise repository, as mentioned by @dave10x and @Stefan Radman. Could you explain the timeline between the updated packages in test/pve-no-subscription/enterprise, the first detection of the problem (this thread), and the resolution via a package?
The goal is to know how long the bug was present and active in the different repositories. This will give us more insight than https://pve.proxmox.com/wiki/Package_Repositories into how your update system works.
Regards.

Unfortunately, the cause of the issue was identified (October 29) only after the affected pve-qemu-kvm package was already in the enterprise repository (October 21). There was not much noise about the issue, just a couple of reports in the community forum, and given the sheer number of users, we get such reports all the time (in relation to qcow2 snapshots on slow NFS storages, for example). So it was not clear that it was a general issue, and it was not even clear that it was an issue in the QEMU package (it could have been the kernel too, for example). That only became clear after reports that downgrading the QEMU package helped (also October 29). If you have further questions, please refer to the enterprise support.