yawgpp (Yet Another Windows Guest Performance Post)

alexskysilk

Distinguished Member
Oct 16, 2015
Chatsworth, CA
www.skysilk.com
In the many years I've been using PVE, I haven't had much call for Windows guests, and when I did it was usually Windows 2016 (and older before that), with reasonably good results. In the last few weeks I needed a Windows guest for a specific purpose, so I figured it was a good time to set up a W11 guest and see how it behaves. The following are my lessons learned. This is not a complete list of recommendations by any stretch; you can and should follow the PVE Windows guest documentation before any of this becomes relevant.

1. Most threads I read (and personal experience) suggest staying away from the host CPU type. While that remains true, the usual alternative is x86-64-v2-AES. It turns out THIS IS AN ABSOLUTELY TERRIBLE choice for Windows 11, even though it worked perfectly fine for W10/2016. The minimum model required for acceptable performance is x86-64-v3 (W11 makes use of AVX2 instructions, which v2-AES lacks.)
2. Memory integrity must be turned off inside the guest (this appears to be a known issue.)
3. Windows 11 behaves quite a bit better if it knows it's a VM. Thankfully, this can be achieved simply enough, like so:
Code:
qm set 101 --args "-cpu x86-64-v3,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time,hv_synic,hv_stimer,hv_vpindex,hv_reset,hv_frequencies,hv_runtime,hv_tlbflush,hv_ipi,aes,+kvm_pv_unhalt"
(change the CPU type to a newer model if your CPU has newer features available)
4. NUMA should be enabled regardless of the number of CPUs on anything made in the last 10+ years.
5. The NTFS guest filesystem benefits noticeably from vdisk caching. If your underlying storage allows it, cache=writeback helped quite a bit (I didn't bother benchmarking since I don't have a use case, just a noticeable behavioral improvement.)
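The tips above (apart from the hv_* args already shown in point 3) can be applied through qm as well. A hedged sketch, assuming a hypothetical VM ID 101 and a scsi0 disk on "local-lvm" — substitute your own IDs and disk definition, and verify the option names against `man qm` on your PVE version:

```shell
# Tip 1: x86-64-v3 model instead of x86-64-v2-AES
qm set 101 --cpu x86-64-v3

# Tip 4: enable NUMA
qm set 101 --numa 1

# Tip 5: writeback caching on the system disk
# (disk spec "local-lvm:vm-101-disk-0" is an assumption for this example)
qm set 101 --scsi0 local-lvm:vm-101-disk-0,cache=writeback
```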

I hope this helps someone :)
 
-cpu x86-64-v3,... and -cpu host,... are not the same thing. x86-64-v3 is a named CPU model / ABI baseline, while host is host passthrough. In QEMU terms, named models expose a predefined, stable feature set, and host passthrough exposes the host CPU’s model, stepping, and supported features as closely as virtualization allows. QEMU explicitly documents these as two different modes, not mere syntax sugar.

So:

-cpu host = “make the guest CPU look like the real host CPU as much as possible”. QEMU says this passes the host CPU model, features, and stepping to the guest, subject to KVM filtering. It is the closest thing to bare metal, but it is bad for live migration across unlike hosts.

-cpu x86-64-v3 = “present a standardized virtual CPU that meets the x86-64-v3 baseline”. That is a compatibility target, not your actual host CPU. QEMU’s docs treat these named models as a way to isolate the guest from host differences and improve migration compatibility.

x86-64-v3 is not fundamentally the same as host plus a few obvious flags. It is closer to a curated CPU definition with a stable contract. In practice it does boil down to a specific set of exposed CPU capabilities, but it is still a different CPU modeling mode from host passthrough.
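The distinction shows up directly on the QEMU command line; a sketch (the trailing options are elided, and the exact model/flag inventory depends on your QEMU build):

```shell
# Named model: a curated, stable feature set; migratable between unlike hosts
qemu-system-x86_64 -accel kvm -cpu x86-64-v3 ...

# Host passthrough: mirrors the host CPU as closely as KVM allows
qemu-system-x86_64 -accel kvm -cpu host ...

# List the named CPU models and recognized CPUID flags your build knows about
qemu-system-x86_64 -cpu help
```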

Hyper-V flags such as hv_relaxed, hv_spinlocks=0x1fff, hv_vapic are a separate layer. QEMU documents Hyper-V enlightenments as individual CPU flags that are not enabled by default, and they can be added on top of either host or a named model.

- host + hv_* = maximum fidelity/performance for that exact machine, least portable (but it needs to be tested; it is totally dependent on the host CPU)
- x86-64-v3 + hv_* = portable baseline plus your chosen Hyper-V enlightenments.
- host + manually adding the same hv_* flags still does not become equivalent to x86-64-v3, because the base CPU exposed to the guest is different.
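The layering in the list above looks identical on the command line regardless of the base model; a sketch (using the same enlightenments as the first post):

```shell
# Hyper-V enlightenments layered on a named model...
-cpu x86-64-v3,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time

# ...or the very same enlightenments layered on host passthrough;
# only the base CPU identity underneath differs
-cpu host,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time
```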

There is one more subtle point. With -cpu host, QEMU also passes through all supported paravirtualized KVM CPU features to the guest. That means even if you manually add a bunch of flags, host can still expose things that a named model would not expose by default.
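Those paravirtualized features can also be masked individually if you want host minus one of them; a sketch (check the exact feature names against `-cpu help` on your QEMU build):

```shell
# Host passthrough, but with one KVM paravirt feature explicitly removed
-cpu host,-kvm-pv-unhalt
```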

In Proxmox terms, this matches their guidance too: host is the exact host CPU and can break live migration to different CPUs, while x86-64-v3 is a compatibility-oriented model intended to be usable across CPUs that meet that level. Proxmox describes x86-64-v3 as compatible with Intel Haswell-or-newer and AMD EPYC-or-newer class CPUs, with a defined added flag set over lower baselines.

-cpu x86-64-v3 is not just a shortcut for -cpu host with some flags.
It is a different base CPU definition. You can add flags to both, but the underlying CPU identity and default exposed feature set remain different.

Rule of thumb:
- Use host when you want best performance and do not care about migrating that VM to dissimilar nodes. *
- Use x86-64-v3 when you want a safer, more portable baseline across multiple hosts. **
- Add hv_* flags on top if you are tuning a Windows guest.
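In Proxmox terms the rule of thumb maps onto qm like this; a hedged sketch for a hypothetical VM 101 (the full hv_* tuning still goes through --args as in the first post, since qm's own flags= list only covers a small set):

```shell
# Portable baseline across the cluster:
qm set 101 --cpu x86-64-v3

# Single-node, maximum fidelity (expect trouble migrating to unlike hosts):
qm set 101 --cpu host
```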

* You have to run performance tests. It is totally dependent on the host CPU, and there are 30+ QEMU flags you can set.
** On older CPUs you might not be able to use x86-64-v3, so you need to fall back to x86-64-v2.
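Whether a host can even offer x86-64-v3 comes down to a handful of instruction-set flags beyond v2 (AVX, AVX2, BMI1/2, F16C, FMA, LZCNT, MOVBE, XSAVE, per the x86-64 psABI levels). A small hypothetical helper to check a flag list on a Linux host — note Linux reports LZCNT as "abm" in /proc/cpuinfo:

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if the given flag list covers the
# instruction sets x86-64-v3 requires beyond v2, 1 otherwise.
has_v3() {
    for f in avx avx2 bmi1 bmi2 f16c fma abm movbe xsave; do
        case " $1 " in
            *" $f "*) ;;      # flag present, keep checking
            *) return 1 ;;    # flag missing: this host cannot offer v3
        esac
    done
    return 0
}

# On a real host you would feed it the live flag list:
#   has_v3 "$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)" && echo "v3 ok"
```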

That said, host CPU CAN (but doesn't have to) give you better performance with the right QEMU flags if you don't care about live migration between hosts; it depends on the host CPU type and the flags set. CPU core pinning, huge memory pages, NUMA: that all needs to be tested and profiled.

Interestingly, I tuned (not finished yet) my Windows 11 guest on an old Xeon E5-2697 v2 and found that Geekbench results did not reflect the actual performance, while Novabench 5 gave more realistic results. On that Windows 11 guest I run 12 Hyper-V instances hosting private GitHub Action runners, which works surprisingly well. (OK, only a maximum of 6 are actually busy at a time; there are 6 for each of my two GitHub organisations.)

The safe choice is to use x86-64-v3/v2, but if you really want to tune things you might consider testing host CPU passthrough; it might give you better results with the right flags set. Memory integrity must be turned off inside the guest, that's crucial.
 
If I recall correctly, you should also be able to use host with the nested-virtualization flag disabled to get good performance and the maximum features of your CPU.
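A sketch of what that could look like, passed via --args as in the first post (the nested-virt flag is vmx on Intel and svm on AMD; VM ID 101 is illustrative):

```shell
# Host passthrough with nested virtualization masked out (Intel host assumed;
# use -svm instead on AMD)
qm set 101 --args "-cpu host,-vmx"
```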
 
Not an option, since I run Hyper-V guests ;-) There is just no "one size fits all" solution: either stick to the defaults, or be prepared to try a LOT of flags to get it running on the host CPU setting (if at all).
 