[TUTORIAL] Windows VM Performance and Softlocking in Nintendo Switch Emulation - fixed with +invtsc

aphirst

Jun 13, 2022
This is an issue that I've actually already solved, but because it took me so much time and effort to isolate the cause, I wanted to document the problem in a few places so that if anyone else experiences a similar issue it can hopefully save them a lot of stress.

My Proxmox box (ASRock B450M Pro4, 5800X3D, RX6700XT, 4x16GB 3200MT/s ECC, 8TB U.2 SSD) is primarily used to host a Windows gaming VM with 12 of the 16 threads, about half of the system memory, and GPU passthrough. Passthrough of the GPU works without issue. I can also pass through other devices, including the motherboard audio and one of my USB controllers, by giving their PCIe IDs to one of the ACS kernel command-line options so that they end up in their own IOMMU groups (I didn't want to use the option that breaks EVERYTHING into its own group, just the devices I cared about).

Recently I swapped out the older CPU (5700X) for the current one (5800X3D), and was trying out all kinds of software to compare the before/after performance, when I eventually noticed two things:
  • single-threaded benchmark performance (e.g. PassMark PerformanceTest) was significantly lower than I expected: under 2700pts instead of over 3300pts
  • a VERY pernicious and diagnosis-resistant issue that affected Nintendo Switch emulation
Basically, in some Switch games (specifically Breath of the Wild), loading screens would randomly softlock partway through, no matter what settings I used in the emulator. Very surprisingly, it happened in two different emulators, Yuzu and Ryujinx.

Now, over the last 2 years I've fiddled around with the CPU flags in the cpu: line as well as an args: -cpu section. I suspected that I'd inadvertently broken something so I cleared out the relevant options and reduced it to just the line
Code:
cpu: host
Lo and behold, the softlocking went away!

For various reasons, I'd like to customise my CPU flags (including but not limited to disabling the hypervisor flag and setting topoext=on), which tends to necessitate specifying the flags via the args: line. I noticed that if I so much as specified just
Code:
args: -cpu host
the softlocking came back! After some diagnosis I discovered that reproducing the exact set of flags that Proxmox/QEMU generates by default (which you can see by running qm showcmd <VMID>) makes the softlocking go away again, but as soon as I make further changes to those flags, it returns. In my case, those "stock" flags were
Code:
args: -cpu 'host,hv_ipi,hv_relaxed,hv_reset,hv_runtime,hv_spinlocks=0x1fff,hv_stimer,hv_synic,hv_time,hv_vapic,hv_vendor_id=proxmox,hv_vpindex,kvm=off,+kvm_pv_eoi,+kvm_pv_unhalt'
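To copy the stock flags without retyping them, the -cpu value can be pulled out of the qm showcmd output. The following is a minimal Python sketch, not a Proxmox tool: the function names and the heavily shortened sample command line are made up for illustration, and it assumes the VM has no custom args: line yet, so only one -cpu argument appears in the output.

```python
import shlex


def extract_cpu_flags(showcmd_output: str) -> list[str]:
    """Return the value of the first -cpu argument, split on commas."""
    # shlex.split handles the shell-style single quotes around each value.
    tokens = shlex.split(showcmd_output)
    for i, tok in enumerate(tokens[:-1]):
        if tok == "-cpu":
            return tokens[i + 1].split(",")
    raise ValueError("no -cpu argument found")


def to_args_line(flags: list[str]) -> str:
    """Format a flag list as an args: line for the VM config file."""
    return "args: -cpu '" + ",".join(flags) + "'"


# Hypothetical sample; real qm showcmd output is much longer.
sample = (
    "/usr/bin/kvm -id 100 -name win10 "
    "-smp '12,sockets=1,cores=6,threads=2' "
    "-cpu 'host,hv_ipi,hv_relaxed,kvm=off,+kvm_pv_eoi,+kvm_pv_unhalt' "
    "-m 32768"
)

flags = extract_cpu_flags(sample)
print(to_args_line(flags))
```

Starting from the generated list and appending to it keeps the flags that Proxmox would otherwise add behind the scenes.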

Basically, I learned the hard way that the options you place into args: -cpu <FLAGS> often have side effects, because they can implicitly enable or disable other options such as Hyper-V enlightenments. Thankfully, I also figured out a way to keep the softlocks away even when I made my own additional customisations to the CPU flags:

I needed to also explicitly specify the CPU flag +invtsc!

Part of the inspiration for setting that flag was a discussion in the Yuzu Discord, where a developer suspected VM-related foul play in their main loop's timings. Apparently, they explicitly check for the presence of an invariant TSC and try to use it directly, otherwise falling back to the system wall clock. Judging by my results, that fallback path misbehaves inside VMs. Since I have played that game at length before, I must have previously had this setting enabled without understanding its significance (probably copied blindly from somewhere), and then later removed it without thinking.
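For reference, on x86 the invariant TSC is advertised in CPUID leaf 0x80000007, EDX bit 8, which is the bit that +invtsc exposes to the guest. The sketch below only illustrates the "use the TSC if it is invariant, otherwise fall back" decision as described above. It is not Yuzu's actual code, the function names are hypothetical, and the EDX value is passed in by hand because Python cannot execute CPUID itself.

```python
# CPUID leaf 0x80000007, EDX bit 8 signals an invariant TSC.
INVARIANT_TSC_BIT = 1 << 8


def has_invariant_tsc(edx: int) -> bool:
    """Check the invariant-TSC bit in a raw EDX value from leaf 0x80000007."""
    return bool(edx & INVARIANT_TSC_BIT)


def pick_clock_source(edx: int) -> str:
    """Illustrative choice between the TSC and a wall-clock fallback."""
    return "tsc" if has_invariant_tsc(edx) else "wall_clock"


# Made-up EDX values: bit 8 set versus bit 8 clear.
print(pick_clock_source(0x100))
print(pick_clock_source(0x000))
```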

Armed with this new knowledge, not only do I no longer see that softlocking issue, but I repeated various CPU benchmarks and, even with the hypervisor flag disabled (a setting notorious for tanking performance), I still get near-baremetal results in benchmarking and overall system feel. The final set of CPU-related flags for my purposes boils down to:
Code:
args: -cpu 'host,host-cache-info=on,topoext=on,hv_ipi,hv_relaxed,hv_reset,hv_runtime,hv_spinlocks=0x1fff,hv_stimer,hv_synic,hv_time,hv_vapic,hv_vendor_id=0123756792CD,hv_vpindex,kvm=off,+kvm_pv_eoi,+kvm_pv_unhalt,+invtsc,hypervisor=off' -smp '12,sockets=1,cores=6,threads=2'
and a nice little side-effect is that I can even see amounts of L1-L3 cache in Windows Task Manager that make sense given the number of threads I've enabled - basically it resembles a 5600X3D.
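Since it is easy to drop +invtsc again while editing such a long line, a small sanity check can help. This is a rough Python sketch with hypothetical helper names. It only does literal string matching on the flag list (so an alternative spelling of the same flag would not be recognised), and it checks that the -smp thread count equals sockets x cores x threads.

```python
def missing_flags(cpu_arg: str, required=("+invtsc",)) -> list[str]:
    """Return the required flags that are absent from a -cpu value."""
    present = set(cpu_arg.split(","))
    return [flag for flag in required if flag not in present]


def smp_is_consistent(smp_arg: str) -> bool:
    """Check that the -smp total equals sockets * cores * threads."""
    head, *rest = smp_arg.split(",")
    opts = dict(item.split("=", 1) for item in rest)
    product = int(opts["sockets"]) * int(opts["cores"]) * int(opts["threads"])
    return int(head) == product


# The -cpu and -smp values from the final args: line above.
final_cpu = (
    "host,host-cache-info=on,topoext=on,hv_ipi,hv_relaxed,hv_reset,"
    "hv_runtime,hv_spinlocks=0x1fff,hv_stimer,hv_synic,hv_time,hv_vapic,"
    "hv_vendor_id=0123756792CD,hv_vpindex,kvm=off,+kvm_pv_eoi,"
    "+kvm_pv_unhalt,+invtsc,hypervisor=off"
)
final_smp = "12,sockets=1,cores=6,threads=2"

print(missing_flags(final_cpu))
print(smp_is_consistent(final_smp))
```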

Anyway, I hope this summary can be of some use to someone, whether right now or in a few months or even years. I'd appreciate hearing back from anyone affected! Thanks for reading.

P.S. Which tags should I set on this thread, given the nature of the topic?
 

Attachments

  • taskmanager.png