Passed-through VM stuttering

stewart

New Member
Aug 1, 2022
Hi.

I posted a similar thread a month ago, and whilst I now understand my problem a lot better, and the suggestions I got from posters were really helpful, I'm still no closer to resolving the issue.

I've got an instance of Proxmox VE running 7.2-3 and, amongst other VMs, run a Windows 11 virtual machine as my desktop. Amongst other peripherals, I've passed my GPU through and I'm able to use it on my VM by plugging a monitor into it. However, I find that when I run higher-spec games, I suffer from stuttering -- visual, audio, and USB lag. I know my machine can run these without problem, as I have an SSD with a Windows OS on it to test with, and there are no issues when not running Proxmox. I use GTA V to test as it's an old game and I can be sure it's not due to a lack of resources. Checking performance stats on the VM and hypervisor shows nothing being highly utilised.

What I have tried to date:
1) Reducing the resources of the VM from 16 to 12GB RAM, and 8 cores to 4 cores. No impact.
2) Originally I had passed through my USB controller via PCI, but I have tried passing through USB ports and devices individually. This makes the stuttering much worse.
3) I have pcie_acs_override=downstream,multifunction enabled to split my IOMMU groups. No impact.
4) As my USB controller and RAM are in the same IOMMU group, I purchased a separate USB controller and passed that through instead. No impact.
5) I've run 'MSI_util_v3' and updated both the audio controller and GPU to use 'msi' with no effect.
6) I've kept my NVIDIA drivers (and all others) up to date, and have tried previous NVIDIA drivers.

These are the specs of my server:
Code:
Intel i7-10700KF @ 3.80GHz, 16 cores
32GB DDR4 Memory
500GB Kingston SSD
4 x 4TB WD Red HDD
NVIDIA GeForce GTX 1050 Ti (GigaByte) - Passed-through GPU
NVIDIA GeForce GTX 750 Ti (GigaByte) - VE GPU
MSI Z590 PRO WiFi
PCI Dual-NIC
PCI USB Controller

These are my IOMMU groupings:
Code:
IOMMU Group 0:
        00:00.0 Host bridge [0600]: Intel Corporation Device [8086:9b43] (rev 05)
IOMMU Group 1:
        00:01.0 PCI bridge [0604]: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) [8086:1901] (rev 05)
IOMMU Group 2:
        00:08.0 System peripheral [0880]: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model [8086:1911]
IOMMU Group 3:
        00:14.0 USB controller [0c03]: Intel Corporation Device [8086:43ed] (rev 11)
        00:14.2 RAM memory [0500]: Intel Corporation Device [8086:43ef] (rev 11)
IOMMU Group 4:
        00:16.0 Communication controller [0780]: Intel Corporation Device [8086:43e0] (rev 11)
IOMMU Group 5:
        00:17.0 SATA controller [0106]: Intel Corporation Device [8086:43d2] (rev 11)
IOMMU Group 6:
        00:1b.0 PCI bridge [0604]: Intel Corporation Device [8086:43c0] (rev 11)
IOMMU Group 7:
        00:1b.4 PCI bridge [0604]: Intel Corporation Device [8086:43c4] (rev 11)
IOMMU Group 8:
        00:1c.0 PCI bridge [0604]: Intel Corporation Device [8086:43b8] (rev 11)
IOMMU Group 9:
        00:1c.4 PCI bridge [0604]: Intel Corporation Device [8086:43bc] (rev 11)
IOMMU Group 10:
        00:1c.5 PCI bridge [0604]: Intel Corporation Device [8086:43bd] (rev 11)
IOMMU Group 11:
        00:1c.6 PCI bridge [0604]: Intel Corporation Device [8086:43be] (rev 11)
IOMMU Group 12:
        00:1c.7 PCI bridge [0604]: Intel Corporation Device [8086:43bf] (rev 11)
IOMMU Group 13:
        00:1f.0 ISA bridge [0601]: Intel Corporation Device [8086:4385] (rev 11)
        00:1f.3 Audio device [0403]: Intel Corporation Device [8086:f0c8] (rev 11)
        00:1f.4 SMBus [0c05]: Intel Corporation Device [8086:43a3] (rev 11)
        00:1f.5 Serial bus controller [0c80]: Intel Corporation Device [8086:43a4] (rev 11)
IOMMU Group 14:
        01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP107 [GeForce GTX 1050 Ti] [10de:1c82] (rev a1)
IOMMU Group 15:
        01:00.1 Audio device [0403]: NVIDIA Corporation GP107GL High Definition Audio Controller [10de:0fb9] (rev a1)
IOMMU Group 16:
        03:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM107 [GeForce GTX 750 Ti] [10de:1380] (rev a2)
IOMMU Group 17:
        03:00.1 Audio device [0403]: NVIDIA Corporation GM107 High Definition Audio Controller [GeForce 940MX] [10de:0fbc] (rev a1)
IOMMU Group 18:
        05:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller I225-V [8086:15f3] (rev 03)
IOMMU Group 19:
        06:00.0 Ethernet controller [0200]: Intel Corporation 82576 Gigabit Network Connection [8086:10c9] (rev 01)
IOMMU Group 20:
        06:00.1 Ethernet controller [0200]: Intel Corporation 82576 Gigabit Network Connection [8086:10c9] (rev 01)
IOMMU Group 21:
        07:00.0 Network controller [0280]: Intel Corporation Device [8086:2725] (rev 1a)
IOMMU Group 22:
        08:00.0 USB controller [0c03]: Renesas Technology Corp. uPD720201 USB 3.0 Host Controller [1912:0014] (rev 03)

This is the config of my Win11 VM:
Code:
agent: 1,fstrim_cloned_disks=1
args: -machine type=pc-q35-6.2,kernel_irqchip=on
balloon: 0
bios: ovmf
boot: order=virtio0
cores: 4
cpu: host,hidden=1
efidisk0: local-lvm:vm-100-disk-0,efitype=4m,pre-enrolled-keys=1,size=4M
hostpci0: 0000:01:00,pcie=1,x-vga=1
hostpci2: 0000:08:00.0
hotplug: disk,network,usb,cpu
ide2: none,media=cdrom
machine: pc-q35-6.2
memory: 12288
meta: creation-qemu=6.2.0,ctime=1661523193
name: Win11
net0: virtio=0E:96:FE:D2:E1:B6,bridge=vmbr0
numa: 0
onboot: 1
ostype: win11
smbios1: uuid=91e9e5d8-af8c-4631-ab95-acd5a6998865
sockets: 1
tpmstate0: local-lvm:vm-100-disk-1,size=4M,version=v2.0
vga: none
virtio0: local-lvm:vm-100-disk-2,size=128G
virtio1: ZFSData01:100/vm-100-disk-0.qcow2,size=5588G
vmgenid: 366c6917-db04-419e-aebb-b559fd9c4ccb

The other VMs/CTs I have on my server:
Win19 Server VM -- A server with 8GB RAM and 4 cores. Nothing too intense running on here yet, just Plex.
pfSense Firewall VM -- Performs my home routing and security. 2GB RAM and 2 cores.
Fileserver CT -- A file server container with 512MB RAM, 2 cores (overkill I know).

I would appreciate any help in troubleshooting this problem. I'm starting to hit a brick wall at this point.
 
Don't be misled by the number of threads vs. the number of cores.
Your system has 8 cores and 16 threads.
https://ark.intel.com/content/www/d...700kf-processor-16m-cache-up-to-5-10-ghz.html

Hence your math is a bit off when assigning 8 cores ;)
Even 4 cores might not be enough from time to time:
- your other VMs want cycles too
- so does the host.
In the end, your comparison with running Windows on physical hardware doesn't apply here.
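
A rough sketch of that arithmetic in Python, using only the figures given in this thread (8 cores / 16 threads, and the core counts listed for each guest in the first post). The dictionary labels and the function name are made up for illustration, and the ratios are only a sanity check, not a sizing rule.

```python
# Host figures from the reply above; guest vCPU counts from the first post.
HOST_CORES = 8
HOST_THREADS = 16

# Labels are shorthand for the guests described in the thread.
guests = {"Win11": 4, "Win19 Server": 4, "pfSense": 2, "Fileserver CT": 2}


def vcpu_summary(guests, host_cores, host_threads):
    """Total vCPUs handed out, and how that compares to physical cores/threads."""
    total = sum(guests.values())
    return {
        "total_vcpus": total,
        "per_core": total / host_cores,
        "per_thread": total / host_threads,
    }


summary = vcpu_summary(guests, HOST_CORES, HOST_THREADS)
print(summary)

# The earlier setup with 8 vCPUs on the Windows 11 VM, for comparison.
earlier = vcpu_summary({**guests, "Win11": 8}, HOST_CORES, HOST_THREADS)
print(earlier)
```

With 4 vCPUs on the gaming VM the guests together claim 12 vCPUs against 8 physical cores; with 8 vCPUs they claim all 16 threads, leaving nothing uncontended for the host.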

I'd try two things:
1. Make sure your memory is not overloaded. Aside from your VMs, the host needs resources as well, especially with ZFS.
2. Set your guest CPU type to "host". That allows all the modern CPU features to be used in the VM. I hadn't seen this directive in your config; on my end it made a huge difference.
Damn it, I scrolled past it.
Still, 1. might be an issue.

Good luck
 
I originally had 8 cores assigned to my Win11 VM but it was recommended I lower this to 4 cores. I didn't notice any discernible difference between the two setups.

When looking at the server's memory, it does look overloaded, but I convinced myself this was due to Windows gobbling it up for idle memory.
[Attachment: 1674042535965.png]
Is this something I should be concerned about, do you think? The OS currently is showing 5.8/12GB in use, but the summary in Proxmox shows 10/12GB.

EDIT: I've read that ZFS may cause this?
[Attachment: 1674055288371.png]
 
If not configured otherwise, ZFS will by default use up to 50% of memory for its ARC cache.
That puts your host under heavy memory pressure, pushing it into swap, etc.
This leads to pressure in your VMs and all sorts of side effects.

Try to see with arcstat and arc_summary how your ZFS behaves.
After a reboot it should be reasonable, and then after some time grow to 16GB.
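
For anyone who wants the same numbers without the arcstat tools, here is a minimal Python sketch that reads the kernel's ARC counters directly. It assumes the ZFS module is loaded and exposes /proc/spl/kstat/zfs/arcstats in the usual "name type data" layout; the function names are made up for illustration.

```python
from pathlib import Path

# Exposed by the ZFS kernel module on Linux when it is loaded.
ARCSTATS = Path("/proc/spl/kstat/zfs/arcstats")


def parse_arcstats(text):
    """Return {name: value} for each 'name type data' row, skipping header lines."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows have exactly three fields: name, numeric type, numeric value.
        if len(parts) == 3 and parts[1].isdigit() and parts[2].lstrip("-").isdigit():
            stats[parts[0]] = int(parts[2])
    return stats


def gib(n_bytes):
    """Convert bytes to GiB."""
    return n_bytes / 2**30


if ARCSTATS.exists():
    s = parse_arcstats(ARCSTATS.read_text())
    # 'size' is the current ARC size, 'c_max' the ceiling it may grow to.
    print(f"ARC size: {gib(s.get('size', 0)):.1f} GiB")
    print(f"ARC max:  {gib(s.get('c_max', 0)):.1f} GiB")
else:
    print("No arcstats file found; is ZFS loaded on this host?")
```

Running it shortly after a reboot and again after a few hours of use should show whether the ARC is growing toward its ceiling.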

My advice:
Plan 2GB for the host
2-4GB for ZFS
The rest is for your VMs.
Also consider leaving some buffer (1-2GB).
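
As a worked example of that budget, here is a small Python sketch using the numbers from this thread: 32GB in the host and 12 + 8 + 2 + 0.5GB assigned to the guests. The function name and guest labels are made up for illustration. The last line prints a `zfs_arc_max` module option for a 4GiB cap; this is usually placed in /etc/modprobe.d/zfs.conf, but treat the exact file and the steps to apply it as something to confirm against the Proxmox ZFS documentation for your version.

```python
GIB = 2**30

HOST_RAM_GIB = 32
# RAM assigned to each guest, as listed in the first post.
guest_ram_gib = {"Win11": 12, "Win19 Server": 8, "pfSense": 2, "Fileserver CT": 0.5}


def headroom_gib(host_ram, guests, arc, host_os=2):
    """GiB left after guests, ZFS ARC and the host OS; negative means overcommitted."""
    return host_ram - sum(guests.values()) - arc - host_os


# Default behaviour described above: ARC may grow to 50% of RAM.
default_arc = HOST_RAM_GIB * 0.5
print("headroom with default ARC:", headroom_gib(HOST_RAM_GIB, guest_ram_gib, default_arc))

# With the ARC capped at 4 GiB, as suggested in the advice above.
capped_arc = 4
print("headroom with 4 GiB ARC:  ", headroom_gib(HOST_RAM_GIB, guest_ram_gib, capped_arc))

# zfs_arc_max takes a byte count.
arc_max_bytes = capped_arc * GIB
print(f"options zfs zfs_arc_max={arc_max_bytes}")
```

With the default ARC ceiling the host comes out 8.5GiB short once the cache fills; with a 4GiB cap there is about 3.5GiB of buffer left.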
 
Thank you for your suggestions. It's obvious ZFS is the culprit here, and I've seen the recommendation of having 1GB of memory per 1TB of ZFS storage, and I have 16TB of storage. Even if that is an overestimate, I've ordered another 32GB of memory that I'll be installing this weekend. I'll update you on the results!
 
An update: got the memory, installed it, and it's solved my lack of memory issue. But that doesn't seem to have been what was causing the stuttering. Doing some more verbose monitoring of the performance, it seems it could be the CPU causing the problem.

I'm trying to do some testing with CPU limits and trying to understand how that works after seeing this Reddit post.

I'm not sure if anyone is still watching this thread, but if you have any tips on tuning the CPU, I'd be grateful.
 
I'm also getting stuttering when passing through a GPU and playing games. Everything is working great except for the stuttering, which lasts less than a second each time but produces noticeable glitches when playing a game.

I have a super decked-out machine too: 2x 2699-v4 (so 88 threads total), 512GB of ECC DRAM, and a Tesla P40 fully passed through to Windows. Like you, I've tried a lot of different things to reduce the stuttering, but nothing has gotten rid of it. The Windows VM is currently configured with 16GB RAM and 16 cores (threads), so nothing outlandish relative to my host system resources.

Could turning off Spectre mitigations have any impact?
 
2x 2699-v4
That gives you only a fraction (around 2/5) of the single-threaded performance that games need, compared to a new consumer desktop CPU. So it's basically like running modern games on a 13-year-old desktop PC with an i7 3XXX. Did you check whether your games cause individual cores to stay at 100% utilization?
 
That's a bit overstated. I was just playing The Witcher 3 at ultra settings. Framerate is excellent and no cores are pegged at 100%. It's probably GPU-limited. Everything runs fantastically except for the stuttering.

By the way, those cores turbo boost to 3.6GHz.
 
Yes, and if you compare the single-threaded Cinebench benchmarks, the 2699v4 is on par with the 13-year-old i7 3770. It's a great CPU for running lots of VMs (I got myself a 2687v4), but not for gaming or similar tasks that can't benefit much from parallelization.
Also, make sure to enable NUMA and that the gaming VM is running on the CPU that the GPU is attached to. Performance can drop terribly if a process running on one CPU needs to access resources/hardware only available to the other CPU.
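
To find out which socket the GPU hangs off, the host's sysfs can be queried. The Python sketch below is one way to do it, assuming a Linux host; the PCI address is a placeholder (the poster's actual address isn't given in this thread) and the function names are made up for illustration. The resulting CPU list is what you would pin the VM to, for example via the VM's CPU affinity setting in newer Proxmox VE releases (an assumption to verify for your version).

```python
from pathlib import Path


def parse_cpulist(text):
    """Expand a kernel cpulist string such as '0-3,8,10-11' into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def gpu_numa_cpus(pci_addr, sysfs=Path("/sys")):
    """Return (numa_node, cpus_on_that_node) for a PCI device."""
    node = int((sysfs / "bus/pci/devices" / pci_addr / "numa_node").read_text().strip())
    if node < 0:
        # -1 means the kernel reports no NUMA affinity for this device.
        return node, []
    cpulist = (sysfs / "devices/system/node" / f"node{node}" / "cpulist").read_text()
    return node, parse_cpulist(cpulist)


if __name__ == "__main__":
    GPU_ADDR = "0000:01:00.0"  # placeholder; look up the real address with lspci
    try:
        node, cpus = gpu_numa_cpus(GPU_ADDR)
        print(f"GPU {GPU_ADDR} is on NUMA node {node}; host CPUs: {cpus}")
    except FileNotFoundError:
        print(f"No sysfs entry for {GPU_ADDR} on this machine")
```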
 
Yes, I know it's not "great" for gaming, but it's fine for what a P40 can do (about equal to a 1080 Ti in performance). I will try the NUMA awareness and CPU pinning.
 
TL;DR: Delete hidden=1

Hi,

It seems like you've been wrestling with some frustrating performance issues in your Proxmox setup, specifically when running higher-spec games in your Windows 11 virtual machine. It's great that you've tried various troubleshooting steps, and I believe I can provide a potential solution for you.

The issue you're experiencing could indeed be linked to the "hidden=1" setting in the CPU section of the VM config file. This setting essentially hides the fact that your virtual machine is running in a virtualized environment from the guest OS (Windows 11 in your case).

The purpose of "hidden=1" is to make the guest OS believe it's running on bare metal, but this can sometimes backfire, as it may prevent the guest OS from optimizing its performance for a virtualized environment, leading to the problems you've described.

So, simply remove "hidden=1" in the CPU section of your config file, and you should notice an improvement in the performance of your virtualized Windows 11 environment, especially when running high-demand applications and games.

Remember to monitor your system and check if the stuttering issues have been resolved. If necessary, you can fine-tune the CPU settings to best suit your workload, but removing "hidden=1" is a good starting point. I hope this solution helps you get the most out of your Proxmox setup for gaming and other tasks. Good luck!
 
Successful update: using kernel option:
Code:
mitigations=off
removed the stuttering. Just a warning, this option would only be suitable for a private/trusted scenario.
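
To confirm the option actually took effect after a reboot, the kernel reports its mitigation state under sysfs. A minimal Python sketch, assuming a Linux host; the function names are made up for illustration:

```python
from pathlib import Path

VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def mitigation_status(vuln_dir=VULN_DIR):
    """Return {vulnerability_name: kernel_status_string}; empty if the dir is absent."""
    vuln_dir = Path(vuln_dir)
    if not vuln_dir.is_dir():
        return {}
    return {
        f.name: f.read_text().strip()
        for f in sorted(vuln_dir.iterdir())
        if f.is_file()
    }


def cmdline_has_mitigations_off(cmdline):
    """True if the kernel command line contains the mitigations=off token."""
    return "mitigations=off" in cmdline.split()


if __name__ == "__main__":
    cmdline_file = Path("/proc/cmdline")
    if cmdline_file.exists():
        print("mitigations=off on cmdline:",
              cmdline_has_mitigations_off(cmdline_file.read_text()))
    for name, status in mitigation_status().items():
        print(f"{name}: {status}")
```

With mitigations disabled, entries for affected CPUs typically read "Vulnerable" rather than "Mitigation: ...".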
 
My problem was to do with the way I was storing the games. I had them stored on a ZFS array on WD NAS disks, which have slower-than-recommended read/write speeds, and that caused my spikes. I'm now running games off SSDs and I'm not experiencing the issue anymore.
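
For anyone wanting to compare storage locations in a similar situation, here is a very rough Python sketch that times a sequential read of one large file. It is an illustration only: the function name is made up, it measures sequential throughput rather than the random-access latency games also depend on, and repeated runs on the same file will be inflated by the page cache or ZFS ARC.

```python
import sys
import time


def timed_read(path, block_size=1 << 20):
    """Read a file sequentially in 1 MiB blocks; return (total_bytes, elapsed_seconds)."""
    total = 0
    start = time.perf_counter()
    # buffering=0 avoids Python's own buffer; OS-level caching still applies.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


if __name__ == "__main__":
    # Usage: pass one or more large files, e.g. one on the HDD pool and one on the SSD.
    for path in sys.argv[1:]:
        try:
            size, seconds = timed_read(path)
        except OSError as exc:
            # Skip arguments that are not readable files.
            print(f"{path}: skipped ({exc})")
            continue
        rate = size / 2**20 / seconds if seconds > 0 else float("inf")
        print(f"{path}: {size / 2**20:.0f} MiB in {seconds:.2f}s ({rate:.0f} MiB/s)")
```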
 
