[SOLVED] VMs freeze with 100% CPU

Well, you didn't open a new thread, so you're forcing everybody to follow this discussion.

That's an odd number of CPUs. While it shouldn't be an issue, I think guests are more comfortable with an even number.

How much memory does the host have?

What kind of storage is this? If it's a network storage, is the connection stable?

Please also check the host's system log/journal when the issue occurs next time.


That's an odd number of CPUs. While it shouldn't be an issue, I think guests are more comfortable with an even number.
The VM is set to 2 sockets with 25 cores each (50 in total) :)


How much memory does the host have?
The host has 512GB of RAM installed.


What kind of storage is this? If it's a network storage, is the connection stable?

At the moment, all is OK with the network storage (NFS): NVMe disks, stable 20Gbps connection.
As I mentioned, on Proxmox 7.4-16 everything is fine, but on Proxmox 8.0.4 it is not.
Initially the VM ran on a local NVMe disk (ZFS, mirror, lz4 compression), which is when the problem started.


Please also check the host's system log/journal when the issue occurs next time.
I will return with details in the following days.
 
I can confirm the bug on my cluster as well, with kernel 6.2.11-1. I'm not sure exactly how long it took to trigger, but it was less than 7 hours. That is, for the Debian VMs, when I keep switching ballooning between 4GB and 100GB of RAM.

Sorry it took me a while, but I am happy to confirm that switching to kernel version 6.2.16-11-bpo11-pve does indeed seem to have resolved the problem for me. I was able to reproduce the bug before, but not after applying the upgrade. The mmu_invalidate_seq counter is now well past 3 billion, and the VM is still running perfectly as it should. It finally looks like this problem is resolved.

Thank you, everybody, for your support.
 
Hello,

I'm running a Home Lab Proxmox setup, which I use to learn different technologies (OS, DB, Docker, etc.).

I'm running Proxmox 8.0.4 (running kernel: 6.2.16-12-pve) + Ceph 17.2.6 (quincy)

I have exactly the same problem: a few moments after booting Windows Server 2022, the CPU goes to 100% and the remote desktop and console stop responding! I force-stop the VM and try to reboot with no success; the VM does not boot anymore and drops to the EFI Shell.

The VM is configured as follows:

agent: 1
args: -cpu SandyBridge,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time,+vmx,+hv-evmcs
bios: ovmf
boot: order=sata0;net0
cores: 4
efidisk0: ec-nvme:vm-101-disk-0,efitype=4m,pre-enrolled-keys=1,size=1M
machine: pc-q35-8.0
memory: 8192
meta: creation-qemu=8.0.2,ctime=1694273908
name: Hyper-V
net0: virtio=2E:31:9A:13:45:15,bridge=vmbr0
numa: 0
onboot: 1
ostype: win11
sata0: ec-nvme:vm-101-disk-1,size=64G
scsihw: virtio-scsi-pci
smbios1: uuid=a1197739-c915-423c-b9dd-7301e20b3137
sockets: 1
tpmstate0: ec-nvme:vm-101-disk-2,size=4M,version=v2.0
vga: qxl,memory=128
vmgenid: 5c1e8c65-86f3-4248-967f-2012726fbb4f

Ballooning and KSM are activated. The VM uses nested virtualization, so it can also run Hyper-V for testing/learning.

The VM was part of an HA group when it crashed!

I have now removed the VM from the HA group, and so far it has not crashed with 100% CPU again.

I'm running 3 nodes with the same computer model, CPU, and configuration; the hardware is 100% identical across all 3 nodes!

What could be causing these crashes?
 
Hi!
I tested PVE 8.0.4 with kernel 6.2.16-11-bpo11-pve for the last 10 days and the VMs did not freeze.
Can someone tell me whether this patch has now been moved to production?
Thanks a lot
Yes, the kernel with the fix for this was rolled out to the enterprise repository for both Proxmox VE 7 and Proxmox VE 8 last week.
 
I've marked this thread as solved, since 6.2.16-11-bpo11-pve fixed the issue.
Thanks!
I hope the cherry-pick is not forgotten in future kernels.
We keep such patches checked into the git repository in a separate directory, so they will always be applied to newer kernels until either the patch lands in the kernel version we update to (and can be dropped), or the source changes such that applying it fails, which we actively notice and can then investigate and react to accordingly (rebase on top of the new source, check whether it is obsolete due to some change in another part, ...).
 
Is it possible to re-enable KSM & Ballooning with the latest kernel without problems?

Or is the problem currently only fixed for Enterprise, or also without a subscription?
 
I hope I'm not out of line here. I've been reading through this thread and the symptoms are practically identical to ones that I have been having on an AlmaLinux 8 system. The ppoll consumes 100% during these freezes.

I'm wondering if the same mmu_invalidate_seq issue is present in the AlmaLinux 8 kernel?

I am not able to run the bpftrace script, I'm given an error:

ERROR: Struct/union of type 'struct kvm' does not contain a field named 'mmu_invalidate_seq'

I do understand that Proxmox and AlmaLinux are two different things. I'm just not finding a lot of information in my Google searches for this issue. This thread seems to be the most recently active, and it seems to describe problems almost identical to the ones I am having on AlmaLinux 8.

I'm just wondering if there is something that AlmaLinux 8 kernel developers need to be made aware of that might fix this issue in their kernel?
 
I hope I'm not out of line here. I've been reading through this thread and the symptoms are practically identical to ones that I have been having on an AlmaLinux 8 system. [...]
I think it's this commit

https://lists.proxmox.com/pipermail/pve-devel/2023-September/058995.html
 
Hi,
I hope I'm not out of line here. I've been reading through this thread and the symptoms are practically identical to ones that I have been having on an AlmaLinux 8 system. The ppoll consumes 100% during these freezes.

I'm wondering if the same mmu_invalidate_seq issue is present in the AlmaLinux 8 kernel?

I am not able to run the bpftrace script, I'm given an error:

ERROR: Struct/union of type 'struct kvm' does not contain a field named 'mmu_invalidate_seq'
Maybe you can try this version (with kernel headers installed): https://forum.proxmox.com/threads/vms-freeze-with-100-cpu.127459/post-586756

I do understand that Proxmox and AlmaLinux are two different things. I'm just not finding a lot of information in my Google searches for this issue. This thread seems to be the most recently active and seems to describe almost identical to the problems I am having on AlmaLinux 8.

I'm just wondering if there is something that AlmaLinux 8 kernel developers need to be made aware of that might fix this issue in their kernel?
What kernel version do they have? The patch fixing the issue is 82d811ff566594de3676f35808e8a9e19c5c864c in stable v6.1.51. The commit introducing the issue was a955cad84cda ("KVM: x86/mmu: Retry page fault if root is invalidated by memslot update")
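The effect of that bug can be sketched in a few lines of Python. This is purely illustrative: the function and variable names below are made up, and it only models the C integer conversions involved, not the actual KVM code.

```python
INT32_MAX = 2**31 - 1

def to_int32(value):
    """Truncate an unsigned counter to a signed 32-bit int, as storing an
    unsigned long into a plain C int would."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value > INT32_MAX else value

def fault_is_stale(cur_seq, snapshot):
    """Illustrative staleness check: in C, comparing an int against an
    unsigned long converts (sign-extends) the int first, so a wrapped,
    negative snapshot can never equal the real counter again."""
    return cur_seq != snapshot % 2**64

# Below 2**31 - 1 the truncated snapshot still matches: the fault is not
# considered stale and the guest makes progress.
assert not fault_is_stale(1000, to_int32(1000))

# Once the counter passes 2**31 - 1, the snapshot wraps negative, every
# fault is reported as stale, and the page fault is retried forever:
# the vCPU threads spin at 100% and the guest appears frozen.
assert fault_is_stale(INT32_MAX + 1, to_int32(INT32_MAX + 1))
```

This also matches the observation elsewhere in the thread that freezes start shortly after the sequence counter crosses 2,147,483,647.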
 
I'm having random freezes with the latest version of Proxmox (8.0.4 on Linux 6.2.16-15).
I've tested with an existing VM and a new VM, both with Windows 10 installs.

I was not having freezing issues prior to Windows updates on the fresh install.
Running 22H2 (10.0.19045.3516).

A trace run exclusively during a long (~10-second) freeze:
Code:
strace -c -p $(cat /var/run/qemu-server/100.pid)
strace: Process 59280 attached
^Cstrace: Process 59280 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 96.48  721.428003      137913      5231           ppoll
  2.16   16.148197         796     20272           write
  0.62    4.669689         894      5220           read
  0.49    3.628124         731      4958           recvmsg
  0.20    1.522290       15222       100           sendmsg
  0.04    0.323616       16180        20           accept4
  0.00    0.001229          30        40           fcntl
  0.00    0.000155           7        20           close
  0.00    0.000037           1        20           getsockname
  0.00    0.000010           2         4           futex
------ ----------- ----------- --------- --------- ----------------
100.00  747.721350       20836     35885           total


A trace run during a few short freezes and general stuttering:
Code:
strace -c -p $(cat /var/run/qemu-server/100.pid)
strace: Process 59280 attached
^Cstrace: Process 59280 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 91.98 2645.416200       20348    130007           ppoll
  3.21   92.196400        3555     25932      3993 futex
  3.03   87.047001         224    387331           write
  0.85   24.409985         183    133377           read
  0.64   18.437890         205     89820           recvmsg
  0.18    5.219277         274     18996           io_submit
  0.09    2.483004        1379      1800           sendmsg
  0.01    0.247045         686       360           accept4
  0.01    0.202085         561       360           getsockname
  0.01    0.170705         237       720           fcntl
  0.00    0.053465         169       316           mmap
  0.00    0.042156          66       632           rt_sigprocmask
  0.00    0.033994         107       316           mprotect
  0.00    0.032163          89       360           close
------ ----------- ----------- --------- --------- ----------------
100.00 2875.991370        3638    790327      3993 total

The freezes will occur randomly while doing mundane tasks, or during heavy loads. Doesn't seem to matter.
The duration is random: some last half a second, some last up to 30 seconds.

During this period all threads allocated to the VM are at 100% and the VM is completely unresponsive and unusable.
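As a side note, instead of attaching strace and detaching by hand with Ctrl-C, the sampling window can be fixed with `timeout` (the 10-second duration below is just an arbitrary choice):

```shell
# SIGINT on expiry makes strace detach and print the per-syscall summary,
# exactly as pressing Ctrl-C would.
timeout -s INT 10 strace -c -p "$(cat /var/run/qemu-server/100.pid)"
```

This makes it easier to compare summaries from equal-length windows during and outside a freeze.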
 
Hi,

Maybe you can try this version (with kernel headers installed): https://forum.proxmox.com/threads/vms-freeze-with-100-cpu.127459/post-586756


What kernel version do they have? The patch fixing the issue is 82d811ff566594de3676f35808e8a9e19c5c864c in stable v6.1.51. The commit introducing the issue was a955cad84cda ("KVM: x86/mmu: Retry page fault if root is invalidated by memslot update")

I did have the kernel-headers and kernel-devel packages installed for AlmaLinux.

The key... at least I think... was changing mmu_invalidate_seq to mmu_notifier_seq in the bpftrace script.

The AlmaLinux 8 kernel is based on the 4.18 kernel... but AlmaLinux (and RHEL before it) was known to backport a lot of stuff from other packages and keep the version number the same, so it's somewhat difficult to know what may be included in this kernel.

The kernel-devel package for the latest kernel - kernel-devel-4.18.0-477.27.2.el8_8.x86_64 - does not seem to have a is_page_fault_stale() function. So that's why I'm not sure if all of this discussion really relates to my AlmaLinux 8 issue - except for the fact that this whole thread is describing my symptoms to a tee.

There is a mention of mmu_notifier_seq in the /usr/src/kernels/4.18.0-477.27.2.el8_8.x86_64/include/linux/kvm_host.h file, but it declares mmu_notifier_seq as an unsigned long.

And in fact, I have run the modified bpftrace script looking at mmu_notifier_seq counts, and one server is showing this value to be 3,405,720,771, which is north of 2,147,483,647, the maximum value of a signed int. But this particular server has never had this VM freezing issue.

I suppose one question would be if mmu_notifier_seq correlates directly to mmu_invalidate_seq?

The other question would be if the CPU in use plays a role in this somehow.

The server that has never had this VM freezing issue (the one with a mmu_notifier_seq count of 3,405,720,771) is using an AMD Ryzen 9 3900X CPU.

The other servers that are experiencing this VM freezing issue are using CPUs:

Intel Xeon E3-1230v2
Intel Xeon E3-1270v2
Intel Core i9-11900

The VM freezes happen randomly and I've never been able to find any cause. The last VM froze up after 2 days of uptime. Another froze up after 130+ days of uptime.

When the freeze-ups happen, the qemu-kvm process is running at 100% CPU. All of the CPUs dedicated to that VM (these are all single-tenant node servers; only one VM running on the server) are showing 0% idle and just get stuck.

Again, sorry for muddying up this thread - as I said, I'm not using Proxmox - but I've been pulling my hair out for months trying to figure this out. I found this thread through a Google search and other than being AlmaLinux and not Proxmox, everything this thread describes seems to be happening during my freeze ups. A Google search doesn't seem to reveal any other AlmaLinux users experiencing this issue, which is puzzling itself.
 
Hi,
I'm having random freezes with the latest version of Proxmox (8.0.4 on Linux 6.2.16-15). [...]
IIRC, that sounds more like the issue reported here: https://forum.proxmox.com/threads/p...pu-issue-with-windows-server-2019-vms.130727/

Most of the freezes reported in this thread were permanent and are fixed in newer kernels. It's better to open up new threads for new issues, otherwise it will just be hard to follow.
 
I did have the kernel-headers and kernel-devel packages installed for AlmaLinux. [...]
You know the relevant commits that caused and fixed the issue in Proxmox VE. You'll have to ask the AlmaLinux devs/community or look at their kernel repository to see if the relevant commits are there or not (tip: search by commit title instead of ID, because a backport will get a new ID).
 
Just to follow up with this, in case any other AlmaLinux users experience this issue and come across this thread.

The is_page_fault_stale() function is indeed in the AlmaLinux kernel, taking mmu_seq as an integer parameter. It was added in the 4.18.0-425.3.1 kernel for AlmaLinux 8, and as of this writing it is still in the latest kernel, 4.18.0-477.27.2, as an integer.

The 4.18.0-425.3.1 kernel and every kernel after it up to 4.18.0-477.27.2 (perhaps newer ones too; that's just the latest kernel as of this post) still references mmu_notifier_seq as an integer in the is_page_fault_stale() function. This results in the qemu-kvm guest freezing when the mmu_notifier_seq counter reaches the maximum integer value, 2,147,483,647.

My fix for this was to downgrade the kernel on the node back down to 4.18.0-372.26.1, although I'm fairly certain that 4.18.0-372.32.1 is OK as well. Neither of those kernels - and presumably the ones before them - have the is_page_fault_stale() function and thus no freezing when mmu_notifier_seq reaches max integer. I currently have an AlmaLinux 8 node running 4.18.0-372.26.1 with a qemu-kvm guest and the mmu_notifier_seq is at 2,848,642,148 and it is still running. Previously this node was running 4.18.0-477.27.2 and the qemu-kvm guest froze shortly after mmu_notifier_seq reached 2,147,483,647.

All of my use is really with AlmaLinux 8, but I did check AlmaLinux 9. This issue is also present in the latest AlmaLinux 9 kernel - which at the time of this posting I believe to be 5.14.0-284.11.1. My best guess is that this was added to the AlmaLinux 9 kernel in version 5.14.0-162.6.1. The last AlmaLinux 9 kernel not to have the is_page_fault_stale() would be 5.14.0-70.30.1. So if you are experiencing random VM freezes on an AlmaLinux 9 node, I would suggest downgrading to the 5.14.0-70.30.1 kernel to fix this.

The mainline kernel, version 6.6, appears to have removed mmu_seq as a parameter to the is_page_fault_stale() function (or perhaps it never had it?). The long-term 6.1 kernel has mmu_seq as an unsigned long in is_page_fault_stale(). How AlmaLinux goes about resolving this I do not know. But downgrading to a kernel version from before is_page_fault_stale() was added seems to resolve the issue on my AlmaLinux 8 systems.

I am really puzzled as to why this has not gotten more attention within the AlmaLinux circles (Presumably this issue is also in RHEL). Maybe nobody is using AlmaLinux as a node OS?

Just wanted to add all of this, because this thread helped immensely in allowing me to track down where the issue was at.
 
Just to follow up with this, in case any other AlmaLinux users experience this issue and come across this thread. [...]
Hi!

Nice job! I use AlmaLinux, but only as a guest. Did you post this at AlmaLinux Forum?

https://bugs.almalinux.org/my_view_page.php

I wonder whether fixes added to the Proxmox kernel are upstreamed to the mainline kernel, or whether the Proxmox kernel is only patched locally?
 
Hi,
Nice job! I use AlmaLinux, but only as a guest. Did you post this at AlmaLinux Forum?

https://bugs.almalinux.org/my_view_page.php
Yes, asking/reporting via their channels is the way to go ;)

I wonder whether fixes added to the Proxmox kernel are upstreamed to the mainline kernel, or whether the Proxmox kernel is only patched locally?
All of this information can be found by looking at our patch file: https://git.proxmox.com/?p=pve-kern...c;hb=6810c247a180f3bb1492873cc571c3edd517d8a3

The mainline kernel accidentally fixed the issue in 6.3 with a refactoring of the code:
Code:
Upstream commit ba6e3fe25543 ("KVM: x86/mmu: Grab mmu_invalidate_seq in
kvm_faultin_pfn()") unknowingly fixed the bug in v6.3 when refactoring
how KVM tracks the sequence counter snapshot.

And the stable kernel v6.1 also has the fix, that's where we picked it from:
Code:
(cherry-picked from commit 82d811ff566594de3676f35808e8a9e19c5c864c in stable v6.1.51)
 
