[SOLVED] Proxmox 8.0 / Kernel 6.2.x 100%CPU issue with Windows Server 2019 VMs

karor · Jan 17, 2024

fweber said:
Out of curiosity, can you post the output of lscpu?

Sure thing, here's the host still with mitigations=off:

Code:

# lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         46 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  32
  On-line CPU(s) list:   0-31
Vendor ID:               GenuineIntel
  BIOS Vendor ID:        Intel
  Model name:            Intel(R) Xeon(R) CPU E5-2667 v2 @ 3.30GHz
    BIOS Model name:     Intel(R) Xeon(R) CPU E5-2667 v2 @ 3.30GHz  CPU @ 3.3GHz
    BIOS CPU family:     179
    CPU family:          6
    Model:               62
    Thread(s) per core:  2
    Core(s) per socket:  8
    Socket(s):           2
    Stepping:            4
    CPU(s) scaling MHz:  88%
    CPU max MHz:         4000.0000
    CPU min MHz:         1200.0000
    BogoMIPS:            6600.37
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm
                         constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16
                         xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault epb ssbd ibrs ibpb stibp tpr_shadow
                         flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts vnmi md_clear flush_l1d
Virtualization features:
  Virtualization:        VT-x
Caches (sum of all):
  L1d:                   512 KiB (16 instances)
  L1i:                   512 KiB (16 instances)
  L2:                    4 MiB (16 instances)
  L3:                    50 MiB (2 instances)
NUMA:
  NUMA node(s):          2
  NUMA node0 CPU(s):     0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30
  NUMA node1 CPU(s):     1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31
Vulnerabilities:
  Gather data sampling:  Not affected
  Itlb multihit:         KVM: Vulnerable
  L1tf:                  Mitigation; PTE Inversion; VMX vulnerable
  Mds:                   Vulnerable; SMT vulnerable
  Meltdown:              Vulnerable
  Mmio stale data:       Unknown: No mitigations
  Retbleed:              Not affected
  Spec rstack overflow:  Not affected
  Spec store bypass:     Vulnerable
  Spectre v1:            Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers
  Spectre v2:            Vulnerable, IBPB: disabled, STIBP: disabled, PBRSB-eIBRS: Not affected
  Srbds:                 Not affected
  Tsx async abort:       Not affected

And the guest with mitigations disabled as well (I generally use the "cpu: host" flag):

Code:

# lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         46 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  30
  On-line CPU(s) list:   0-29
Vendor ID:               GenuineIntel
  BIOS Vendor ID:        QEMU
  Model name:            Intel(R) Xeon(R) CPU E5-2667 v2 @ 3.30GHz
    BIOS Model name:     pc-q35-8.1  CPU @ 2.0GHz
    BIOS CPU family:     1
    CPU family:          6
    Model:               62
    Thread(s) per core:  1
    Core(s) per socket:  15
    Socket(s):           2
    Stepping:            4
    BogoMIPS:            6599.99
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc
                         arch_perfmon rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq vmx ssse3 cx16 pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave
                         avx f16c rdrand hypervisor lahf_lm cpuid_fault ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust smep erms xsaveopt arat
                         umip md_clear flush_l1d arch_capabilities
Virtualization features:
  Virtualization:        VT-x
  Hypervisor vendor:     KVM
  Virtualization type:   full
Caches (sum of all):
  L1d:                   960 KiB (30 instances)
  L1i:                   960 KiB (30 instances)
  L2:                    120 MiB (30 instances)
  L3:                    32 MiB (2 instances)
NUMA:
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-29
Vulnerabilities:
  Gather data sampling:  Not affected
  Itlb multihit:         Not affected
  L1tf:                  Mitigation; PTE Inversion; VMX vulnerable, SMT disabled
  Mds:                   Vulnerable; SMT Host state unknown
  Meltdown:              Vulnerable
  Mmio stale data:       Unknown: No mitigations
  Retbleed:              Not affected
  Spec rstack overflow:  Not affected
  Spec store bypass:     Vulnerable
  Spectre v1:            Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers
  Spectre v2:            Vulnerable, IBPB: disabled, STIBP: disabled, PBRSB-eIBRS: Not affected
  Srbds:                 Not affected
  Tsx async abort:       Not affected

fweber said:
Note that one needs to differentiate between NUMA emulation for VMs [1], which is enabled on a per-VM basis (the "Enable NUMA" checkbox in the GUI), and the NUMA balancer, which is a kernel task running on the host [2] and thus affects all VMs running on that host. In my tests with the reproducer [3], it looks like a VM can freeze regardless of whether NUMA emulation is enabled or disabled for that VM, as long as the NUMA balancer is enabled on the (NUMA) host. So in other words, I don't think the value of the "Enable NUMA" checkbox for a VM makes any difference for the freezes.

[0] https://lore.kernel.org/all/20240110012045.505046-1-seanjc@google.com/
[1] https://pve.proxmox.com/pve-docs/pve-admin-guide.html#_numa
[2] https://doc.opensuse.org/documentation/leap/tuning/html/book-tuning/cha-tuning-numactl.html
[3] https://lore.kernel.org/kvm/832697b9-3652-422d-a019-8c0574a188ac@proxmox.com/T/#u
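To check whether a given host falls into the category described above (a NUMA host with the NUMA balancer enabled), something like the following sketch might help. The function names and their optional path arguments are hypothetical, added only so the logic can be tried against a copy of the files; the defaults are the usual sysfs and procfs locations.

```shell
#!/bin/sh
# Sketch only: helper names are made up for illustration.

# Count NUMA nodes by counting nodeN directories under the given sysfs dir.
count_numa_nodes() {
    node_dir="${1:-/sys/devices/system/node}"
    count=0
    for d in "$node_dir"/node[0-9]*; do
        [ -d "$d" ] && count=$((count + 1))
    done
    echo "$count"
}

# Succeed (exit 0) only if the host has more than one NUMA node
# AND the NUMA balancer is currently enabled (value other than 0).
numa_balancer_active_on_numa_host() {
    node_dir="${1:-/sys/devices/system/node}"
    balancing_file="${2:-/proc/sys/kernel/numa_balancing}"
    nodes=$(count_numa_nodes "$node_dir")
    # A missing or unreadable file is treated as "balancer off".
    state=$(cat "$balancing_file" 2>/dev/null || echo 0)
    [ "$nodes" -gt 1 ] && [ "$state" != "0" ]
}
```

Note that this looks only at the host; as explained above, the per-VM "Enable NUMA" checkbox does not enter into it.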

You are very probably right. When I disabled NUMA balancing / emulation, I did it for both host and guest at the same time. I also had other settings enabled, like CPU pinning and hugepages, and turned things off until the issue disappeared.

This is the CPU graph on the guest right after the upgrade of the host from 7 to 8 and until it settled again.

Whatever · Jan 23, 2024

@fweber

I've managed to dedicate one node with mitigations=on and a single-client RDS server (a client who uses it free of charge and should not complain too much).
So I'm ready to test a new kernel with the patch if you provide one.

Right now NUMA balancing is switched off and the RDS server runs smoothly.

zeuxprox · Feb 9, 2024

Hello,

is there any update on this issue?

Thank you

fweber · Feb 9, 2024

A v3 of one of the patches was posted upstream last week, I tested it and it also fixes the hangs for our reproducer [1]. Unfortunately, none of the patches has been merged into the upstream kernel yet. Considering the areas these patches apply to are somewhat complex (scheduler and KVM MMU) we'd prefer to only backport them to our kernel once they have been merged (or at least reviewed) upstream. So nothing too significant to report so far, but I'll continue to keep track of this upstream and let you know here once a fix has been merged upstream.

[1] https://lore.kernel.org/all/b9f6e379-f517-4c24-8df4-1e5484515324@proxmox.com/

admin-genpartners-ltd · Feb 13, 2024

Is there any hope for a fix?
Windows server 2022

48 x Intel(R) Xeon(R) Silver 4410Y (2 Sockets)
RAM 256 GB

Kernel Version

Linux 6.5.11-4-pve (2023-11-20T10:19Z)

Boot Mode

EFI

Manager Version

pve-manager/8.1.3/b46aac3b42da5d15

fiona · Feb 14, 2024

Hi,

admin-genpartners-ltd said:
Is there any hope for a fix?

48 x Intel(R) Xeon(R) Silver 4410Y (2 Sockets)
RAM 256 GB
Kernel Version

Linux 6.5.11-4-pve (2023-11-20T10:19Z)
Boot Mode

EFI
Manager Version

pve-manager/8.1.3/b46aac3b42da5d15

Yes, the fix is being worked on upstream, see @fweber's post. In the meantime, you can try disabling the NUMA balancer as a workaround.

admin-genpartners-ltd · Feb 14, 2024

I saw the message, but I don't understand how to do it. How do I configure NUMA?
So far I have only set mitigations=off.

fweber · Feb 14, 2024

admin-genpartners-ltd said:
I saw the message, but I don't understand how to do it. How do I configure NUMA?
So far I have only set mitigations=off.

You can disable the NUMA balancer [1] for the current boot by running the following command:

Code:

echo 0 > /proc/sys/kernel/numa_balancing

After a reboot, the NUMA balancer will be active again.

If you want to disable the NUMA balancer permanently, you need to add numa_balancing=disable to the kernel command line and reboot. See the admin guide [2] for information on how to modify the kernel command line.

[1] https://doc.opensuse.org/documentation/leap/tuning/html/book-tuning/cha-tuning-numactl.html
[2] https://pve.proxmox.com/pve-docs/pve-admin-guide.html#sysboot_edit_kernel_cmdline
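To tell apart the two cases described above (disabled only for the current boot versus disabled permanently via the kernel command line), a small check along these lines could be used. This is only a sketch: the function name and its optional file arguments are hypothetical, and it inspects nothing but the procfs value and the command line of the running kernel.

```shell
#!/bin/sh
# Sketch only: numa_balancing_report is a hypothetical helper name.
# Prints one of:
#   enabled
#   disabled (current boot only)
#   disabled (kernel command line)
numa_balancing_report() {
    balancing_file="${1:-/proc/sys/kernel/numa_balancing}"
    cmdline_file="${2:-/proc/cmdline}"
    if [ ! -r "$balancing_file" ]; then
        echo "unknown ($balancing_file not readable)"
        return 1
    fi
    state=$(cat "$balancing_file")
    if [ "$state" != "0" ]; then
        echo "enabled"
    elif grep -qw 'numa_balancing=disable' "$cmdline_file"; then
        # The running kernel was booted with the parameter set.
        echo "disabled (kernel command line)"
    else
        # Disabled at runtime; this will not survive a reboot.
        echo "disabled (current boot only)"
    fi
}
```

If it prints "disabled (current boot only)", the balancer will be active again after the next reboot unless the kernel command line is changed as described in the admin guide.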

admin-genpartners-ltd · Feb 14, 2024

fweber said:
You can disable the NUMA balancer [1] for the current boot by running the following command:

Code:

echo 0 > /proc/sys/kernel/numa_balancing

After a reboot, the NUMA balancer will be active again.

If you want to disable the NUMA balancer permanently, you need to add numa_balancing=disable to the kernel command line and reboot. See the admin guide [2] for information on how to modify the kernel command line.

[1] https://doc.opensuse.org/documentation/leap/tuning/html/book-tuning/cha-tuning-numactl.html
[2] https://pve.proxmox.com/pve-docs/pve-admin-guide.html#sysboot_edit_kernel_cmdline

Thank you!
echo 0 > /proc/sys/kernel/numa_balancing
How can I check that it succeeded?

fiona · Feb 15, 2024

admin-genpartners-ltd said:
Thank you!
echo 0 > /proc/sys/kernel/numa_balancing
How can I check that it succeeded?

The proc file system should be a reliable indicator, so just:

Code:

cat /proc/sys/kernel/numa_balancing

Also, I think you would get an error if it doesn't work, e.g. when setting an invalid value:

Code:

root@pve8a1 ~ # echo foo > /proc/sys/kernel/numa_balancing
echo: write error: invalid argument
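Combining both checks (the write failing with an error, and reading the value back), a hedged sketch could look like this. `set_and_verify` is a hypothetical helper, not an existing tool; the file argument defaults to the procfs path used above, and writing to that path requires root.

```shell
#!/bin/sh
# Sketch only: set_and_verify is a hypothetical helper name.
# Writes VALUE to FILE, then reads the file back and compares.
set_and_verify() {
    value="$1"
    file="${2:-/proc/sys/kernel/numa_balancing}"
    # A rejected write (e.g. an invalid value) makes echo fail.
    if ! echo "$value" > "$file"; then
        echo "write failed" >&2
        return 1
    fi
    current=$(cat "$file")
    if [ "$current" = "$value" ]; then
        echo "ok: $file is now $current"
    else
        echo "mismatch: wanted $value, got $current" >&2
        return 1
    fi
}

# Usage (as root): set_and_verify 0
```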

fweber · Feb 19, 2024

~~A new kernel package proxmox-kernel-6.5.13-1-pve-signed is now available on pvetest [1]~~. (EDIT: This patch was reverted in proxmox-kernel-6.5.13-2-pve, see [3] for more details). This kernel integrates a patch to the Linux scheduler that resolves the temporary freezes on our test system (with KSM enabled on a NUMA host), see [2] for more details.

It would be great if you could test this kernel and report back whether you still see temporary freezes (even if the NUMA balancer and KSM are both active), and whether you see any (positive or negative) impact on host/guest performance. Please include the output of the following commands:

Code:

lscpu
uname -a
grep "" /proc/sys/kernel/numa_*
grep "" /sys/kernel/debug/sched/preempt
grep "" /sys/kernel/mm/ksm/*

[1] https://pve.proxmox.com/pve-docs/pve-admin-guide.html#sysadmin_test_repo
[2] https://git.proxmox.com/?p=pve-kernel.git;a=commit;h=29cb6fcbb78e0d2b0b585783031402cc8d4ca148
[3] https://forum.proxmox.com/threads/130727/page-10#post-643000
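For anyone who wants to collect the requested output in one go, here is a minimal sketch of a wrapper. The `section` and `collect_report` helpers and the report layout are assumptions for illustration, not something this thread prescribes; the commands themselves are the ones listed above. Reading `/sys/kernel/debug` normally requires root.

```shell
#!/bin/sh
# Sketch only: section and collect_report are hypothetical helper names.

# Print a header, then run the given command line. A failing command is
# noted in the report instead of aborting the whole collection.
section() {
    echo "### $1"
    sh -c "$1" 2>&1 || echo "(command failed: $1)"
    echo
}

collect_report() {
    section 'lscpu'
    section 'uname -a'
    section 'grep "" /proc/sys/kernel/numa_*'
    section 'grep "" /sys/kernel/debug/sched/preempt'
    section 'grep "" /sys/kernel/mm/ksm/*'
}

# Usage (as root): collect_report > freeze-report.txt
```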

spirit · Feb 21, 2024

BTW, I just noticed a bug when stopping the ksmtuned service: it seems that "/sys/kernel/mm/ksm/run" is still 1 after the ksmtuned service is stopped.

(There is no ExecStop in the service.)

fweber · Feb 21, 2024

spirit said:
BTW, I just noticed a bug when stopping the ksmtuned service: it seems that "/sys/kernel/mm/ksm/run" is still 1 after the ksmtuned service is stopped.

(There is no ExecStop in the service.)

That's true -- if KSM was active when ksmtuned was stopped, it will stay active after ksmtuned was stopped. The docs recommend to not only stop/disable ksmtuned, but also run echo 2 > /sys/kernel/mm/ksm/run [1]. This will take care of unmerging all shared pages, and KSM will be inactive afterwards.

[1] https://pve.proxmox.com/pve-docs/pve-admin-guide.html#_disabling_ksm
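As a rough illustration of the values involved: per the kernel's KSM documentation, `run` is 0 when ksmd is stopped but merged pages are kept, 1 when ksmd is running, and 2 when ksmd is stopped and all merged pages are unmerged. The helper below is a hypothetical sketch that interprets those values; the optional directory argument exists only so it can be tried against a copy of the files.

```shell
#!/bin/sh
# Sketch only: ksm_status is a hypothetical helper name.
ksm_status() {
    ksm_dir="${1:-/sys/kernel/mm/ksm}"
    run=$(cat "$ksm_dir/run")
    sharing=$(cat "$ksm_dir/pages_sharing")
    case "$run" in
        0) echo "ksmd stopped, $sharing pages still sharing" ;;
        1) echo "ksmd running, $sharing pages sharing" ;;
        2) echo "ksmd stopped, pages unmerged ($sharing pages sharing)" ;;
        *) echo "unexpected run value: $run"; return 1 ;;
    esac
}

# The steps described above, to be run as root on the host:
#   systemctl disable --now ksmtuned
#   echo 2 > /sys/kernel/mm/ksm/run
```

Running `ksm_status` before and after those two steps should show whether ksmd was still active after ksmtuned had been stopped.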

spirit · Feb 21, 2024

fweber said:
That's true -- if KSM was active when ksmtuned was stopped, it will stay active after ksmtuned was stopped. The docs recommend to not only stop/disable ksmtuned, but also run echo 2 > /sys/kernel/mm/ksm/run [1]. This will take care of unmerging all shared pages, and KSM will be inactive afterwards.

[1] https://pve.proxmox.com/pve-docs/pve-admin-guide.html#_disabling_ksm

I see that the original ksmtuned.service from Red Hat depends on another ksm.service (which uses a ksmctl command that does the echo 0 > ).
https://github.com/ksmtuned/ksmtuned/blob/master/data/ksmtuned.service.in

I think this has been the case since the systemd integration, as previously, with SysV init, /usr/bin/ksmtuned really did set 0.

(I have the Windows hang problem randomly on all my Windows VMs. Disabling numa_balancing is not enough; ksm/run needs to be disabled too.)

Jorge Teixeira · Feb 22, 2024

fweber said:
A new kernel package proxmox-kernel-6.5.13-1-pve-signed is now available on pvetest [1]. This kernel integrates a patch to the Linux scheduler that resolves the temporary freezes on our test system (with KSM enabled on a NUMA host), see [2] for more details.

It would be great if you could test this kernel and report back whether you still see temporary freezes (even if the NUMA balancer and KSM are both active), and whether you see any (positive or negative) impact on host/guest performance. Please include the output of the following commands:

Code:

uname -a
grep "" /proc/sys/kernel/numa_*
grep "" /sys/kernel/debug/sched/preempt
grep "" /sys/kernel/mm/ksm/*

[1] https://pve.proxmox.com/pve-docs/pve-admin-guide.html#sysadmin_test_repo
[2] https://git.proxmox.com/?p=pve-kernel.git;a=commit;h=29cb6fcbb78e0d2b0b585783031402cc8d4ca148

Hello fweber.
I installed this kernel yesterday and, despite keeping KSM disabled because it has no benefits in my case, I have the NUMA balancer active and I notice that the server does not have such high peaks in usage. Until then there were peaks in the CPU graph up to 9 and since yesterday the maximum reached 5.5 (the peaks were mainly reached when backing up to PBS). In my case, most of the resources are occupied by a Windows Server 2022 VM with around 80 active RDP users but I have another 4 VMs with Linux on the same server.

root@pve:~# uname -a
Linux pve 6.5.13-1-pve #1 SMP PREEMPT_DYNAMIC PMX 6.5.13-1 (2024-02-05T13:50Z) x86_64 GNU/Linux

root@pve:~# grep "" /proc/sys/kernel/numa_*
/proc/sys/kernel/numa_balancing:1
/proc/sys/kernel/numa_balancing_promote_rate_limit_MBps:65536

root@pve:~# grep "" /sys/kernel/debug/sched/preempt
none (voluntary) full

root@pve:~# grep "" /sys/kernel/mm/ksm/*
/sys/kernel/mm/ksm/full_scans:0
/sys/kernel/mm/ksm/general_profit:0
/sys/kernel/mm/ksm/max_page_sharing:256
/sys/kernel/mm/ksm/merge_across_nodes:1
/sys/kernel/mm/ksm/pages_shared:0
/sys/kernel/mm/ksm/pages_sharing:0
/sys/kernel/mm/ksm/pages_to_scan:100
/sys/kernel/mm/ksm/pages_unshared:0
/sys/kernel/mm/ksm/pages_volatile:0
/sys/kernel/mm/ksm/run:0
/sys/kernel/mm/ksm/sleep_millisecs:20
/sys/kernel/mm/ksm/stable_node_chains:0
/sys/kernel/mm/ksm/stable_node_chains_prune_millisecs:2000
/sys/kernel/mm/ksm/stable_node_dups:0
/sys/kernel/mm/ksm/use_zero_pages:0

EDIT: Well...it seems I spoke too soon...the server just became unresponsive. I tried to open htop and it just wouldn't open. As soon as I stopped the NUMA balancer everything went back to normal. There was no KSM active, just the NUMA balancer. But it is important to mention the following: with the previous kernel, without KSM and with the NUMA balancer active, the VM with the windows server was configured with 64GB of ram and worked well without any problems. With this new kernel I changed the VM's RAM to 80GB and the freeze happened with everything else being the same. I had already noticed that the more memory I allocated to the VM, the faster the freeze occurred. In the image I marked in red the moment when I stopped the NUMA balancer and in blue the moment when the server started to slow down.

fweber · Feb 23, 2024

Jorge Teixeira said:
EDIT: Well...it seems I spoke too soon...the server just became unresponsive. I tried to open htop and it just wouldn't open. As soon as I stopped the NUMA balancer everything went back to normal. There was no KSM active, just the NUMA balancer. But it is important to mention the following: with the previous kernel, without KSM and with the NUMA balancer active, the VM with the windows server was configured with 64GB of ram and worked well without any problems. With this new kernel I changed the VM's RAM to 80GB and the freeze happened with everything else being the same. I had already noticed that the more memory I allocated to the VM, the faster the freeze occurred. In the image I marked in red the moment when I stopped the NUMA balancer and in blue the moment when the server started to slow down.

Thank you for testing and reporting back! So it seems that in your case, there are still freezes with the new kernel 6.5.13-1-pve, but they no longer happen after disabling the NUMA balancer. That's useful to know. Could you please also attach the output of lscpu? (I just edited my earlier post to also include that command.)

spirit said:
I see that the original ksmtuned.service from Red Hat depends on another ksm.service (which uses a ksmctl command that does the echo 0 > ).
https://github.com/ksmtuned/ksmtuned/blob/master/data/ksmtuned.service.in

I think this has been the case since the systemd integration, as previously, with SysV init, /usr/bin/ksmtuned really did set 0.

Ah, good to know, thanks for checking this!

spirit said:
(I have the Windows hang problem randomly on all my Windows VMs. Disabling numa_balancing is not enough; ksm/run needs to be disabled too.)

Interesting, I don't think anyone else reported that the freezes still happen after disabling the NUMA balancer so far. Could you please attach the output of the commands I posted above [1]?

[1] https://forum.proxmox.com/threads/130727/page-9#post-635854

Jorge Teixeira · Feb 23, 2024

fweber said:
Thank you for testing and reporting back! So it seems that in your case, there are still freezes with the new kernel 6.5.13-1-pve, but they no longer happen after disabling the NUMA balancer. That's useful to know. Could you please also attach the output of lscpu? (I just edited my earlier post to also include that command.)

Ah, good to know, thanks for checking this!

Interesting, I don't think anyone else reported that the freezes still happen after disabling the NUMA balancer so far. Could you please attach the output of the commands I posted above [1]?

[1] https://forum.proxmox.com/threads/130727/page-9#post-635854

The result of lscpu:

root@pve:~# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 88
On-line CPU(s) list: 0-87
Vendor ID: GenuineIntel
BIOS Vendor ID: Intel(R) Corporation
Model name: Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
BIOS Model name: Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz CPU @ 2.2GHz
BIOS CPU family: 179
CPU family: 6
Model: 79
Thread(s) per core: 2
Core(s) per socket: 22
Socket(s): 2
Stepping: 1
CPU(s) scaling MHz: 91%
CPU max MHz: 3600.0000
CPU min MHz: 1200.0000
BogoMIPS: 4394.71
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb
rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor
ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c
rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single intel_ppin ssbd ibrs ibpb stibp tpr_shadow flexpriority
ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a rdseed adx smap intel_pt xsaveopt cqm_llc cqm_occup_llc
cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts vnmi md_clear flush_l1d
Virtualization features:
Virtualization: VT-x
Caches (sum of all):
L1d: 1.4 MiB (44 instances)
L1i: 1.4 MiB (44 instances)
L2: 11 MiB (44 instances)
L3: 110 MiB (2 instances)
NUMA:
NUMA node(s): 2
NUMA node0 CPU(s): 0-21,44-65
NUMA node1 CPU(s): 22-43,66-87
Vulnerabilities:
Gather data sampling: Not affected
Itlb multihit: KVM: Vulnerable
L1tf: Mitigation; PTE Inversion; VMX vulnerable
Mds: Vulnerable; SMT vulnerable
Meltdown: Vulnerable
Mmio stale data: Vulnerable
Retbleed: Not affected
Spec rstack overflow: Not affected
Spec store bypass: Vulnerable
Spectre v1: Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers
Spectre v2: Vulnerable, IBPB: disabled, STIBP: disabled, PBRSB-eIBRS: Not affected
Srbds: Not affected
Tsx async abort: Vulnerable

fweber · Mar 4, 2024

Was anyone else able to test whether proxmox-kernel-6.5.13-1-pve-signed solves the freezes for them? From what @Jorge Teixeira reported, it looks like it does not help with the freezes. If that's the case, we might need to revert the scheduler patch [1] in the next version of our kernel and instead go for the KVM fix (latest version v5 at [2]) once it is merged into the mainline kernel -- but, as explained in [1], it would be hard to backport that one to our 6.5 kernel.

[1] https://git.proxmox.com/?p=pve-kernel.git;a=commit;h=29cb6fcbb78e0d2b0b585783031402cc8d4ca148
[2] https://lore.kernel.org/all/20240222012640.2820927-1-seanjc@google.com/

Jorge Teixeira · Mar 6, 2024

fweber said:
Was anyone else able to test whether proxmox-kernel-6.5.13-1-pve-signed solves the freezes for them? From what @Jorge Teixeira reported, it looks like it does not help with the freezes. If that's the case, we might need to revert the scheduler patch [1] in the next version of our kernel and instead go for the KVM fix (latest version v5 at [2]) once it is merged into the mainline kernel -- but, as explained in [1], it would be hard to backport that one to our 6.5 kernel.

[1] https://git.proxmox.com/?p=pve-kernel.git;a=commit;h=29cb6fcbb78e0d2b0b585783031402cc8d4ca148
[2] https://lore.kernel.org/all/20240222012640.2820927-1-seanjc@google.com/

Hi fweber.
I have run some more tests, and for me the current kernel does not solve the problem: the freezes continue.

jens-maus · Mar 6, 2024

fweber said:
Was anyone else able to test whether proxmox-kernel-6.5.13-1-pve-signed solves the freezes for them? From what @Jorge Teixeira reported, it looks like it does not help with the freezes. If that's the case, we might need to revert the scheduler patch [1] in the next version of our kernel and instead go for the KVM fix (latest version v5 at [2]) once it is merged into the mainline kernel -- but, as explained in [1], it would be hard to backport that one to our 6.5 kernel.

[1] https://git.proxmox.com/?p=pve-kernel.git;a=commit;h=29cb6fcbb78e0d2b0b585783031402cc8d4ca148
[2] https://lore.kernel.org/all/20240222012640.2820927-1-seanjc@google.com/

Let me please state that I haven't yet been able to try out this new `proxmox-kernel-6.5.13-1-pve-signed` kernel, simply because I haven't had the time. Sorry for that. I hope to test it soon and provide my own feedback. Perhaps @Whatever can try it, since he also reported this kind of 100% CPU freeze very early and suggested solutions like disabling mitigations and kvm, which in fact worked for me: our 8.1.4 PVE production cluster has been running flawlessly since then.
