High VM-EXIT and Host CPU usage on idle with Windows Server 2025

@_gabriel exactly as @RoCE-geek said.


Hardware: 2x Xeon 6230R; both VMs had this config.

Code:
agent: 1
bios: ovmf
boot: order=scsi0;ide2;ide0;net0
cores: 52
cpu: Cascadelake-Server-v5,flags=+md-clear;+pcid;+spec-ctrl;+pdpe1gb;+hv-tlbflush;+hv-evmcs
efidisk0: linstor_nvme_1:pm-33087c22_114,efitype=4m,ms-cert=2023,pre-enrolled-keys=1,size=3080K
hotplug: 0
ide0: ISO-1:iso/virtio-win-0.1.285.iso,media=cdrom,size=771138K
ide2: ISO-1:iso/Win11_23H2_English_x64.iso,media=cdrom,size=6548134K
machine: pc-q35-10.1
memory: 64000
meta: creation-qemu=10.1.2,ctime=1764535929
name: win11-23h2
net0: virtio=BC:24:11:E4:D1:51,bridge=vmbr0,firewall=1
numa: 1
ostype: win11
scsi0: linstor_nvme_1:pm-a0c6d8aa_114,discard=on,iothread=1,size=159383560K,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=xxxxxxxxxxxxxxxxxxxxxxxxxxxx
sockets: 2
tpmstate0: linstor_nvme_1:pm-29d19dc3_114,size=4M,version=v2.0
vga: virtio
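
For completeness, the same CPU type and flags can also be set from the CLI - a rough sketch, assuming VMID 114 (only guessed from the volume names above; adjust to your VM):

Code:
# set the CPU type and enlightenment flags in one go (single quotes keep the semicolons away from the shell)
qm set 114 --cpu 'Cascadelake-Server-v5,flags=+md-clear;+pcid;+spec-ctrl;+pdpe1gb;+hv-tlbflush;+hv-evmcs'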

Win11 23H2 is only marginally faster (within the margin of error) on all cores, but the single-core speedup is 18%, which roughly reflects the max boost frequency reached during the single-core test run.

The Xeon 6230R was able to boost up to 3.75 GHz under Win11 23H2, vs. a 3.1 GHz turbo boost under Win11 25H2.

[Screenshots: 2025-12-01_00-20.png, 2025-12-01_00-10.png]
 
Just for reference, updating PVE to 9.1 / kernel 6.17 did reduce my CPU usage for VMs, including Windows Server 2025.
Node CPU: AMD Epyc 9355P, VM Processor Type: x86-64-v4, virtio-win-0.1.271
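
To confirm what a node is actually running after the upgrade, the standard commands are enough (nothing specific to this setup):

Code:
# report the installed PVE packages and the running kernel
pveversion
uname -r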

Node1 Windows Server 2025 VM CPU usage:
[screenshot]


Node2 Windows Server 2025 VM CPU usage:
[screenshot]
 
Thanks, I'll check the impact. What's your VM version? Was the PVE upgrade the only change you made?
 
Here is a reference to the documentation: https://www.qemu.org/docs/master/system/i386/hyperv.html

The fallback to other timers that people noticed is intended behavior. Yes, Windows 2025/11 introduced more timers because some Windows software is broken and may not work well in a VM, so this is a kernel change in Windows that causes more interrupts. You can play with the switches above to see if you get better Hyper-V emulation, but even then CPU usage is higher.

Run qm showcmd, see if hv_frequencies is in there, and post the output here. Add hv_frequencies to args if necessary.
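
Something like this should be enough to check (the VMID is a placeholder here):

Code:
# dump the full QEMU command line Proxmox generates and look for the flag
qm showcmd <vmid> --pretty | grep -i hv_frequencies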
 
I've been experimenting with many of these flags - no positive change with hv_frequencies either. There's no (clear) solution via HVE flags.
 
No huge difference for me after updating from kernel 6.16.8 to 6.17.8 (plain Debian) unfortunately. Still the same 2-3x higher idle load and no turbo boost...
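
For anyone who wants to compare the boost behavior on the host, a minimal sketch (turbostat ships with the kernel tools; the fallback just reads /proc):

Code:
# print actual per-core frequencies every 5 seconds
turbostat --quiet --interval 5
# fallback without extra packages
grep "cpu MHz" /proc/cpuinfo | sort -u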
 
I'm kind of wondering what people are expecting here, after reading some more into the details above. You have 8-16 vCPUs, and according to the documentation Hyper-V guests are expected to use less than 1% of a CPU per core when 'idle'. So the total host CPU usage should be under 8-16% of one CPU at 'idle' (tasks may not get distributed across multiple cores at idle). I get ~5% for a 4-core Windows 11 guest, and that seems to scale up as I add more cores. Note that setting the guest type to Linux in Proxmox brings that idle usage to ~15-20%; if you set the guest type to Windows 11/2025 it drops to ~4-5%, and this is on a relatively busy network with AD etc., so lots of things are happening 'constantly' to Windows machines. Look at the logs - you can disable a lot of services to get 'more idle', but Windows is quite chatty compared to even 'fat' Linux distros. I get 0.23% on idle containers and ~1% on idle Linux (RHEL, Ubuntu).
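
If you want to double-check which guest type a VM is using (placeholder VMID; in recent PVE the win11 ostype corresponds to the Windows 11/2022/2025 selection in the GUI, if I read it right):

Code:
# check the current guest OS type and switch it to the Windows 11/2025 profile
qm config <vmid> | grep ostype
qm set <vmid> --ostype win11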
 
We know quite well what is happening and what changed in the 24H2 core (aka WS2025/Win11-24H2). Just check the recent evidence-based posts if you're really interested in the root cause. There is a stable WS2022/Win11-23H2 idle baseline, then a sudden change in the 24H2 core, caused by a sudden abuse of specific synthetic timers (and/or other substitute VM calls if those are disabled). This is all we're talking about - and how to mitigate it, because otherwise this change will seriously impact any existing dense VM environment. This is not about theorizing Windows vs Linux or about disabling Windows services, etc.
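
For anyone who wants to reproduce the measurement, a minimal sketch of counting VM exits for a running guest (VMID 114 is just an example; the pidfile path is where qemu-server normally keeps it):

Code:
# grab the QEMU PID of the VM and record KVM exit events for 10 seconds
pid=$(cat /run/qemu-server/114.pid)
perf kvm stat record -p "$pid" sleep 10
# summarize exits per reason - an MSR_WRITE spike would be consistent with the synthetic timer traffic discussed here
perf kvm stat report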
 
@RoCE-geek: I'm not seeing the above as a performance issue on my hardware (rather modern Xeon Gold) in production environments with current Win11-25H2 with mostly default recommended settings.

Even looking at the other posts, I see a 'normal'-looking Windows 11/2025 system. I can see that if you set the guest type to Linux, which disables the Hyper-V emulation, the idle load jumps to 20%, which would be significant. I can understand that AMD virtualization may be worse than Intel here. But I'm looking at the documentation from Microsoft where this is discussed, and Microsoft says this is 'normal' even on Hyper-V.

But this is a VMware cluster for an idle Windows Server host (the average it reports is ~5%):
[screenshot]

Hyper-V conveniently 'hides' VMs from its host statistics, so without some Prometheus-fu I'm not even going to try to understand it, but this is bare-metal Windows (on a PowerEdge): 'idle' CPU usage bounces up to ~10% max according to iDRAC, with an average of, again, ~4%.
[screenshot]

If you're seeing 4-5% CPU usage, or a little higher on older models, I think we're chasing ghosts trying to get to the 1-2% we're "used to" in Linux-land, because even on bare metal, Windows is constantly triggering timers, which costs CPU time and has to be handled somewhere. And KVM is not alone in finding this out: audio gear, real-time software, and games are all "noticing" significant interrupt-related performance issues on current iterations of Windows 11. It's not something "we" could fix without Microsoft.
 

And here is the root of the "misunderstanding". Most of the contributors here are looking for a mitigation of the 24H2 side effects, simply because a 3x increase in idle load is hardly acceptable for us, and we do not want to resign ourselves to the fact that this is the "new standard" (if you do, good for you). We've analyzed the root cause and we're definitely not alone - other platforms like FreeBSD/bhyve seem to be affected to an even higher degree.

So why are we so interested? At least for these main reasons: 1) with the WS2022 baseline, there is no real/impactful added idle CPU load; 2) other KVM-based systems have already mitigated this problem (Nutanix, for one); 3) we're very well aware of the impact of any massive WS2022->WS2025 upgrade on our infrastructure, hence we want to be prepared in advance; and 4) if something effectively deactivates Turbo Boost, that is definitely a bug, not an acceptable feature.

That's all - it won't help any of us here to pat each other on the back and say that it's actually okay and that we should just leave it alone.
 
Mentioning the big boys: VMware ESXi 8 is also fine...

Can somebody test Hyper-V please? Windows is not my cup of tea...
 
@RoCE-geek - I definitely understand what you're going after. Can those calls be optimized? Perhaps. I'm just trying to understand why you would consider this a KVM/Proxmox problem. Some people mention 18% CPU usage, some people mention 2-3% CPU usage. What is considered an acceptable 'baseline'?

Let's start with: how do you get your bare-metal Windows 11/2025 under 5% average usage (the stats I provided above are from a relatively clean Windows build)? I've since looked at Grafana for other computers; the lowest bare-metal figure I get is ~3% for a quad-core, and a 64C/128T box spends 8.84% in "privileged" (aka kernel) mode, so there does seem to be a relationship with the number of cores/threads. Which makes sense given what you demonstrated: there seems to be a timer attached to each thread, they are continuously firing, and I believe that is true regardless of whether Windows detects it is running in a VM.
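
For the bare-metal comparison, this is the kind of counter I'd sample (standard Windows performance counter, run from an elevated prompt; 30 one-second samples is an arbitrary choice):

Code:
:: sample kernel-mode ("privileged") CPU time once per second, 30 times
typeperf "\Processor(_Total)\% Privileged Time" -si 1 -sc 30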

From what I can read, these timers were added so the kernel can detect a live migration or a CPU frequency change and update certain timers/counters accordingly. Like so many things in modern Windows, this seems more like a 'vibe coded' workaround than a fix.

@SchorschSK - can you please qualify what is 'fine'? See above for my vSphere environment and plain "bare metal" idle. Make sure you do not look at stats inside the guest; the guest may report 0% CPU usage, so you have to look at the underlying hardware.
 
I'm afraid I've already expressed all the key points and relevant ideas I had so far, so there's no need to repeat them. To be clear, I'm not saying that this is a KVM/QEMU/Proxmox problem; I'm just in the camp of those saying "look, something bad happened in the 24H2 core, so let's try to mitigate it". We all know that if MS crippled something, MS should resolve it. But that usually doesn't happen. On the other hand, in some cases the community has already exerted much more pressure than expected, and MS was forced to re-evaluate something (minor, but helpful). I'm not saying this will happen here, but there's always a chance. If a bare-metal WS2025 instance is crippled as well, fine, but in a VM it's caused by excessive MSR/HVE STIMER calls (or by their substitutes, if those are disabled). Some platforms have already mitigated it, some are probably working on it right now, but in any case I can hardly imagine that this complex problem will be left untouched, i.e. "silently" accepted. Last but not least, the crippled Turbo Boost activation on an idle OS in particular is simply something the industry will not accept (after the initial WTF phase). But I understand that everyone may have different expectations and goals, and that's okay.
 
Your claim that this has been fixed on other platforms is odd, because I'm not running anything particularly old; all my stuff is relatively up to date, including Hyper-V and bare-metal Windows. As for a vague acknowledgment that 'similar issues' were fixed 'at one point' - I agree, this was a problem in 2017, but I see nothing from 2024/2025.

I can't see anything from Microsoft, VMware, Nutanix, RHEL, Canonical or other providers that even acknowledges this problem; feel free to link it if you do.

You are probably the first one to even bring this specific issue up in various forums, but I don't see any acknowledgment, other testers confirming this is the problem, or fixes from upstream KVM. I have an interest in 'fixing' it because I do a lot with low-latency audio that stutters and pops because of excessive interrupts even on bare metal; at this point, no vendor has recommended a fix or a patch other than rolling back to Windows 10 or switching to Linux (feel free to report what you find with this tool: https://www.resplendence.com/latencymon)
 