[SOLVED] Proxmox 8.0 / Kernel 6.2.x 100%CPU issue with Windows Server 2019 VMs

As the original author of this thread, I can confirm that after updating to the latest Proxmox 8.2.x with kernel 6.8, I was able to re-enable KSM, and the Windows terminal server VM that previously ended up at 100% CPU usage is working nicely again on the same hardware and environment.

So Proxmox 8.2.x with kernel 6.8 seems to have finally solved this issue. Thanks to everyone who contributed to this thread, and special thanks to the Proxmox crew/devs for taking our issues seriously and listening to us.

And if someone from the Proxmox team comes across this thread, I think you can finally flag it as solved.
 
Thank you all for testing the 6.8 kernel and reporting back. Judging from the positive feedback, it indeed seems very likely that the 6.8 kernel finally resolves the issue.

Thanks everyone, especially @jens-maus @Whatever @Jorge Teixeira @spirit @Ramalama, for your help tracking this down!

For anyone who finds this thread, a summary of the issue and resolution:

Symptoms: Proxmox VE is running kernel 5.19, 6.2 or 6.5 on a host with multiple NUMA nodes (you can check this using lscpu). VMs frequently become unresponsive (freeze) with high CPU usage, for periods ranging from ~1 second to more than 60 seconds. During that time, the VMs do not respond to pings. After the freeze, the VM recovers on its own and continues to run without manual intervention. All guest OSs are affected in principle, though Windows VMs seem to be most affected. On Windows VMs, the freezes are often long enough to provoke RDP session timeouts. On Linux VMs, the guest OS may report "watchdog: BUG: soft lockup". The freezes can happen regardless of whether KSM is enabled or disabled, but become more frequent if KSM is enabled.
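
To check whether a host matches the conditions above, something like the following might help. It is only a rough sketch: the helper name numa_node_count is made up for this example, and it assumes lscpu prints an English "NUMA node(s):" line (hence LC_ALL=C).

```shell
#!/bin/sh
# Hypothetical helper: reads lscpu-style text on stdin and prints the
# number of NUMA nodes, or 0 if no "NUMA node(s):" line is present.
numa_node_count() {
    awk -F: '
        $1 == "NUMA node(s)" { v = $2; gsub(/[ \t]/, "", v); print v; found = 1 }
        END { if (!found) print 0 }
    '
}

# Number of NUMA nodes on this host (more than 1 means the host can be affected).
if command -v lscpu >/dev/null 2>&1; then
    echo "NUMA nodes: $(LC_ALL=C lscpu | numa_node_count)"
fi

# Running kernel (5.19, 6.2 and 6.5 are the affected series named above).
echo "Running kernel: $(uname -r)"

# Current state of the NUMA balancer (0 = disabled, non-zero = enabled).
if [ -r /proc/sys/kernel/numa_balancing ]; then
    echo "numa_balancing: $(cat /proc/sys/kernel/numa_balancing)"
fi
```

If this reports more than one NUMA node, an affected kernel and an enabled NUMA balancer, the freezes described here are a plausible suspect, though the output alone does not prove it.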

Resolution:
  • Preferred solution on Proxmox VE 8.x: Upgrade to at least kernel 6.8, which includes an upstream patch [1] that appears to resolve the issue.
    • The easiest way is to upgrade to at least Proxmox VE 8.2, which includes kernel 6.8. Make sure to read the "known issues" section of the release notes [2] before you upgrade.
    • If you cannot upgrade to Proxmox VE 8.2 completely yet, you can install the opt-in kernel 6.8 [3].
  • Workaround if you cannot upgrade to kernel 6.8: In most cases, the freezes can be avoided by disabling the NUMA balancer [4]. You can disable the NUMA balancer for the current boot by running the following command:
    Code:
    echo 0 > /proc/sys/kernel/numa_balancing
    After a reboot, the NUMA balancer will be active again.

    If you want to disable the NUMA balancer permanently, you need to add numa_balancing=disable to the kernel command line and reboot. See the admin guide [5] for information on how to modify the kernel command line.
[1] https://git.kernel.org/pub/scm/linu.../?id=d02c357e5bfa7dfd618b7b3015624beb71f58f1f
[2] https://pve.proxmox.com/wiki/Roadmap#Known_Issues_&_Breaking_Changes
[3] https://forum.proxmox.com/threads/144557/
[4] https://doc.opensuse.org/documentation/leap/tuning/html/book-tuning/cha-tuning-numactl.html
[5] https://pve.proxmox.com/pve-docs/pve-admin-guide.html#sysboot_edit_kernel_cmdline
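
After upgrading or applying the workaround, it may be worth confirming which of the two is actually in effect on the running system. The following is a minimal sketch, not an official tool: the function names kernel_at_least and has_numa_balancing_disabled are invented for this example, and the version parsing assumes a release string of the form "major.minor..." such as "6.8.4-2-pve".

```shell
#!/bin/sh
# Hypothetical helper: succeeds if release string $1 (e.g. "6.8.4-2-pve")
# is at least version $2.$3.
kernel_at_least() {
    major=${1%%.*}              # text before the first dot
    rest=${1#*.}                # text after the first dot
    minor=${rest%%[!0-9]*}      # leading digits of the remainder
    [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}

# Hypothetical helper: succeeds if the kernel command line in $1
# contains the parameter numa_balancing=disable.
has_numa_balancing_disabled() {
    case " $1 " in
        *" numa_balancing=disable "*) return 0 ;;
        *) return 1 ;;
    esac
}

if kernel_at_least "$(uname -r)" 6 8; then
    echo "Kernel $(uname -r) is 6.8 or newer."
else
    echo "Kernel $(uname -r) is older than 6.8."
fi

if [ -r /proc/cmdline ] && has_numa_balancing_disabled "$(cat /proc/cmdline)"; then
    echo "numa_balancing=disable is set on the kernel command line."
fi

if [ -r /proc/sys/kernel/numa_balancing ]; then
    echo "Current numa_balancing value: $(cat /proc/sys/kernel/numa_balancing)"
fi
```

Note that /proc/cmdline reflects the command line of the currently booted kernel, so a change to the boot configuration only shows up here after a reboot.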
 
Confirmed. I had the same problem, and this helped:
Bash:
echo 0 > /proc/sys/kernel/numa_balancing
Upgrade when you get a chance, especially if you're using something like ZFS RAID10 on multiple sockets. The NUMA balancer helps a lot there; it just happens to hit this bug. With the latest kernel it can stay enabled, and it's much more efficient too. In my case, IO utilization went from 100% down to 40%.
 
