I can provide the same data for a VM that just got frozen:
# strace -c -p $(cat /var/run/qemu-server/375.pid)
strace: Process 3239800 attached
^Cstrace: Process 3239800 detached
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- ---------...
On our setup we use the opt-in kernel, so when we moved some of the clusters from 7.1 to 7.2, we also switched to kernel 5.19.
That combination, PVE 7.2 and kernel 5.19, had these freezes. We then upgraded all the way to the current PVE 7 and kernel 6.2.
Still having freezes either way.
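For reference, the opt-in kernels on PVE 7 are ordinary packages, so the switch is just something like this (these are the PVE 7 opt-in metapackage names; check apt search pve-kernel for the current ones):
# apt update && apt install pve-kernel-5.19
# reboot
and later the same with pve-kernel-6.2.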
On the other hand one of the...
just my 2 cents.
I found this post because we're having similar issues.
I wanted to comment that we have 5 different clusters, and one of them is on PVE 7.1 and it's not having any problems.
Our problems started with 7.2.
Again, a migration unfreezes the VM instantly, like nothing happened.
edit: typo
Hi @walacio, yes, we normally assign multiple CPUs to VMs, and yes, we can see CPU usage go up when it happens.
Our last few issues were on a cluster that still doesn't have intel-microcode installed; we suspect that might be related.
We have another cluster that has the microcode installed and...
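For anyone wanting to check this on their own nodes: an applied microcode update shows up in the kernel log, and the package comes from Debian's non-free repo, roughly:
# dmesg | grep -i microcode
# apt install intel-microcode
# reboot
(intel-microcode needs non-free enabled in sources.list, and the update only applies at the next boot.)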
Hi
Thanks for your response.
In the syslog we have no events at all for the entire time the VM is frozen; there is a gap in the timestamps.
It can be minutes or hours, depending on how long we wait before migrating it.
On the host node we have this:
Apr 29 06:59:40 pvea44 pvedaemon[2015313]: VM 1099...
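(To see the gap, pulling the journal for the window around the freeze works well; the timestamps here are just examples matching the log line above:)
# journalctl -u pvedaemon --since "2023-04-29 06:00" --until "2023-04-29 08:00"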
We went without a single issue for a few weeks, but it just happened again.
This is a PVE 7.3 cluster; we upgraded it a few weeks ago and also applied the limits.conf settings.
This is the gdb command output:
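(For anyone following along, the gdb invocation is the usual attach-and-dump-all-backtraces one against the QEMU PID; the VMID here is only an example:)
# gdb -p $(cat /var/run/qemu-server/1099.pid) -batch -ex 'thread apply all bt'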
Hi, thank you for your response.
We made a series of changes: upgrading some hosts to the 6.1 kernel, and adding the custom limits suggested here on others.
So far the issue has recurred only on 6.1 kernels without the custom limits config, but it might be a coincidence.
We are now in the process of...
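(For context, the custom limits are entries in /etc/security/limits.conf; purely as an illustration, with example values, raising the open-file limits looks like:)
* soft nofile 1048576
* hard nofile 1048576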
Yes, we don't use lvm-thin on local storage, just plain LVM. We saw that aio=native works, so we are going to change that setting on the running VMs before updating QEMU.
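Changing it is done per disk with qm set, and the new AIO mode only takes effect after a full VM stop/start; VMID, disk and storage names here are examples:
# qm set 1099 --scsi0 local-lvm:vm-1099-disk-0,aio=native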
Hello.
I think we have an issue with this QEMU version.
We tried it on one cluster where we use NFS and LVM-over-iSCSI storages.
We can't move disks from NFS to LVM storage (not even to local-lvm) while the VM is running and AIO is the default io_uring.
We get:
TASK ERROR: storage migration failed...
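Since aio=native behaves for us (see our other reply in this thread), a retry that avoids io_uring would look roughly like this; VMID, disk and storage names are examples, and the aio change only applies after a stop/start:
# qm set 1099 --scsi0 nfs-store:1099/vm-1099-disk-0.qcow2,aio=native
# qm move_disk 1099 scsi0 lvm-iscsi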
Hello everyone.
We are having this issue on a couple of clusters.
VMs randomly freeze: no network response, CPU stuck between 100 and 102%.
When we start a live migration to a different node, the migration works flawlessly
and the VM starts working just fine on the new node.
While VM is frozen, there is...
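For anyone else hitting this: the migration that unfreezes the VM is just a standard online migration; VMID and target node are examples:
# qm migrate 1099 pvea45 --online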