Looking for help identifying a performance issue

stevenwh

Member
Mar 16, 2024
So, I've had another couple of threads that are similar to this, but decided to create a new thread on this one because I'm not sure it's related to either of those.

I'm working on setting up a new homelab server. One of the things I would like to be able to do with it is just have some light cloud gaming VMs for personal use when I'm away from home. Those VMs are my specific focus here.
Host Hardware:
PowerEdge R730
Dual E5-2690 V4
768 GB RAM
Dual 10 Gbps LAN link
16 x 4 TB SSD drives in hardware RAID10 (PERC H730P) - consumer TLC drives
500 GB SSD boot drive
Tesla P40 used for vGPU
RTX 4060 Ti used for passthrough


Gaming Guest VM:
Windows 11
6 Cores
32 GB RAM
250 GB Disk
P40 with 8 GB RAM profile selected
Using Parsec to remote into the VM


I'll start by saying I know from another thread of mine that consumer SSDs are less than ideal, but they are what I have right now. I can't afford to dump thousands into enterprise drives currently. Most of them I already had and have used for other things before now, so I know the drives themselves are good. I only picked up a few more to max out the drive bays on this 16-bay server. If they end up being the problem then I'll have to figure something out, but I don't think they are the problem, at least not currently. I'm running a single VM right now while running tests, and I'm sure these drives in this configuration can handle a single VM lol.

Also, I know this CPU isn't the best for gaming given its lower single-threaded performance.

As for the performance: I've gotten everything running great for the most part, honestly. The OS responds snappily and everything seems to be running well. But when I open up a game and start playing, I start seeing problems. I've been testing with a couple of different games, mostly Doom Eternal and Cyberpunk, so one that is a very efficient, easy-to-run game and one that is more demanding, just to see the differences. In Doom Eternal everything will run great for a while, and I'm maxing out the 60 FPS limit (I forget where that limit comes from, something to do with the vGPU I think?). Running at 1440p in this game, GPU usage hovers around 38% and CPU usage around 30% (the highest single core may reach 50%) according to the guest usage statistics. According to Proxmox on the host, CPU usage is about the same if I'm looking at the VM; if I'm looking at the node it's 5 - 7%. nvidia-smi shows the Tesla P40 sitting at around 100 W / 250 W, ~72 C while the game is running.

So... the problem happens periodically while playing. The VM will go through a few minutes of "hitching". It's not just a graphics hitch; it's like the entire VM freezes for a split second (audio cuts out, and if I switch out of the game to Windows the mouse is stuttery there as well). It will do this every 1 - 3 seconds for a minute or two, and then everything goes back to running perfectly for maybe another 5 minutes, or I've had it go as long as 25 minutes before the issue occurred again.

I cannot figure out what is causing this problem. While the problem is occurring, everything I look at seems fine: nothing is maxing out, CPU usage doesn't show a spike anywhere, iostat shows steady disk usage, and P40 GPU usage stays constant. I don't see any other processes on the VM spiking in usage or anything. The only thing that gives me any indication of a problem (other than the game / VM itself) is that the Parsec metrics show a short spike in frame time when the problem occurs. I'm assuming, though, that this is because the Parsec process on the VM is freezing along with the rest of the VM and not sending frame updates to the Parsec client.
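Since short spikes are easy to miss when watching by eye, one option is to log timestamped GPU samples on the host and line them up afterwards with the times the stutters happen. The snippet below is only a rough sketch: it assumes `nvidia-smi` is on the PATH on the host, that the P40 is the first GPU it lists, and that all three fields come back as numbers (a field reported as `[N/A]` would make the parser fail). The function names are made up for illustration.

```python
import subprocess
import time

# Fields requested from nvidia-smi, in this order.
QUERY = "utilization.gpu,power.draw,temperature.gpu"


def parse_sample(line):
    """Turn one CSV line such as '38, 100.25, 72' into a dict of floats."""
    util, power, temp = (float(field.strip()) for field in line.split(","))
    return {"util_pct": util, "power_w": power, "temp_c": temp}


def read_gpu():
    """Ask nvidia-smi for one sample and parse the first GPU's line."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return parse_sample(out.splitlines()[0])


def log_gpu(duration_s, interval_s=0.5):
    """Print one timestamped sample every interval_s seconds for duration_s seconds."""
    end = time.monotonic() + duration_s
    while time.monotonic() < end:
        print(time.strftime("%H:%M:%S"), read_gpu())
        time.sleep(interval_s)
```

Calling `log_gpu(1800)` during a play session would give a half-hour log to compare against the Parsec frame time spikes.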

The same problem occurs in Cyberpunk, so it's not isolated to Doom Eternal. In Cyberpunk I'm able to maintain 45+ FPS on this GPU with Medium graphics settings at 1080p. But it has the same hitching that occurs randomly for a few minutes.

Also, I'll say the problem can occur with either game just sitting at a menu, not even with the game world loaded. I think the problem can occur even when I'm not in a game, honestly, but I haven't found a good way to verify that. Since it's random when it happens and these are pretty short stutters, they are just a lot harder to spot when doing general tasks in Windows.
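One possible way to verify it without a game running is a small script inside the guest that sleeps in short steps and records whenever a wakeup comes back much later than requested. This is a minimal sketch, not a proper latency tool: the 10 ms step and 100 ms threshold are arbitrary values picked for illustration, and it assumes a whole-VM freeze shows up as a gap in the guest's `time.perf_counter()` clock.

```python
import time


def find_stalls(timestamps, threshold_s):
    """Return (start, gap) for every gap between consecutive samples above threshold_s."""
    stalls = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        gap = cur - prev
        if gap > threshold_s:
            stalls.append((prev, gap))
    return stalls


def watch(duration_s, interval_s=0.01, threshold_s=0.1):
    """Sample the clock every interval_s seconds, then report the late wakeups."""
    timestamps = [time.perf_counter()]
    end = timestamps[0] + duration_s
    while timestamps[-1] < end:
        time.sleep(interval_s)
        timestamps.append(time.perf_counter())
    return find_stalls(timestamps, threshold_s)
```

Leaving `watch(1800)` running on an otherwise idle desktop and printing the result afterwards would show whether the freezes happen outside of games too.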

I'm just not even sure what to look at next to figure this problem out... Even with the low single-threaded performance of these CPUs, I'd expect to see a high CPU usage spike if it was a CPU problem. If it was a disk problem, I would think it would be more repeatable. I have transferred very large files around on the VM and over the network and don't see any issues crop up while doing that. When testing disk IO performance I was able to saturate both 10 Gbps uplinks for 15 minutes straight with no dips. When using fio to test IOPS, I'm seeing sustained 150k+ IOPS for even a 10 minute test (16k async write with 32 jobs).
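As a quick sanity check that those two numbers are in the same ballpark, the fio result can be converted to a bandwidth figure. This assumes "16k" means a 16 KiB block size and ignores any protocol or filesystem overhead, so treat it as rough arithmetic only.

```python
def iops_to_gbps(iops, block_bytes):
    """Convert an IOPS figure at a fixed block size to gigabits per second."""
    return iops * block_bytes * 8 / 1e9


# 150k IOPS at an assumed 16 KiB block size.
rate = iops_to_gbps(150_000, 16 * 1024)
print(f"{rate:.1f} Gbit/s")  # prints "19.7 Gbit/s"
```

So 150k IOPS at that block size works out to roughly the capacity of two 10 Gbps links, which at least doesn't contradict the network transfer test.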

Like I said, the only place I see any evidence of the problem is the Parsec frame time metrics, which show a spike when it occurs. Attached is an example screenshot of it occurring just now. It had these little stutters maybe 7 or 8 times total and then went back to being fine. I don't feel like it's a Parsec problem; I'm just using this to show how short the little stutters are.
1713725486816.png


Thanks for any help / advice.
 
Hmm, this post is still awaiting approval before being displayed publicly, but I found something that *might* have corrected the issue, although I don't understand exactly why. I already had NUMA checked when I set up the VM, as I read that was good to have on a multi-CPU system. I went in and set CPU affinity cores and haven't seen a stutter since. If this is the only VM running, how could having it run on specific cores prevent this stuttering effect? I mean, it's possible it's not fixed and just working right now, but it has been about 25 minutes with no stutter so far =/
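For anyone wanting to try the same thing, here is a rough sketch of how an affinity list could be built so that all of the VM's cores come from one NUMA node of the host. It reads the standard Linux sysfs file `/sys/devices/system/node/nodeN/cpulist`; the helper names are made up for illustration, and the example numbering `0-13,28-41` is only an assumption about how one socket might be laid out, since the actual numbering on this R730 isn't shown here. Taking the first N entries also doesn't account for hyperthread siblings, so check the result against `lscpu` before using it.

```python
from pathlib import Path


def parse_cpulist(text):
    """Expand a kernel cpulist string such as '0-13,28-41' into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def format_cpulist(cpus):
    """Collapse a list of ints back into range notation, e.g. [0, 1, 2, 5] -> '0-2,5'."""
    cpus = sorted(cpus)
    parts = []
    i = 0
    while i < len(cpus):
        j = i
        # Extend j while the next CPU number is consecutive.
        while j + 1 < len(cpus) and cpus[j + 1] == cpus[j] + 1:
            j += 1
        parts.append(str(cpus[i]) if i == j else f"{cpus[i]}-{cpus[j]}")
        i = j + 1
    return ",".join(parts)


def node_cpus(node):
    """Read the CPUs that belong to one NUMA node from sysfs (Linux host only)."""
    path = Path(f"/sys/devices/system/node/node{node}/cpulist")
    return parse_cpulist(path.read_text())


def pick_affinity(cpus, count):
    """Take the first `count` CPUs of a node and format them as an affinity string."""
    return format_cpulist(cpus[:count])
```

On the host, `pick_affinity(node_cpus(0), 6)` would give a string for the 6-core guest that can go into the VM's CPU affinity field (the `affinity` option in the VM config).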
 
