W2022 poor performance

Rickb

New Member
Jan 16, 2024
We have built two new servers for customers that each need two W2022 VMs on brand-new Dell PowerEdge servers, and we have been having major performance issues in the VMs for both customers. After the OS is loaded the VMs seem fine, but once you start placing a load on the servers (file share and a small SQL Express DB) they almost freeze, then are good for a brief time. If you are on the VM and just open Windows Explorer, you get the spinning circle on the mouse cursor, wait, and then it starts responding. The VM does not show the server under load at any time, either on the VM itself or when data is accessed remotely.

I did find some posts about the current VirtIO SCSI driver having issues with W2022, so I downgraded to the 1.208 version; that helped a little, but it is still really bad. We are not seeing this on our own servers in the data center, but those clusters use Ceph with no RAID.

Has anyone seen anything like this, or have any suggestions?

Dell PowerEdge T560
dual Xeon Silver 4309Y 2.80GHz
PERC H760, RAID 5, Nearline SAS, 12TB storage
128GB RAM

VM
VirtIO on SCSI and NIC
pc-q35-9.0 (tried 8.2 as well)
Windows 2022 std
24 core
64GB
500GB disk (os)
4TB (file share data)
lvm-thin datastore 11TB
UEFI


PVE 8.3 no-subscription (current as of today)
UEFI
Memory usage never exceeds 60%
IO delay never exceeds 0.40%
SWAP avg 5%
KSM Sharing is 0 B
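Since the guests stall while host IO delay stays near zero, it may be worth benchmarking the PERC-backed array directly from the PVE host to rule the storage out. A rough sketch, assuming `fio` is installed; the target path is an example, not from my config:

```shell
# 4k random-write latency test against the RAID5-backed storage.
# The filename is a placeholder: point it at a scratch file or spare LV
# on the PERC datastore, never at a volume a VM is using.
fio --name=randwrite-lat \
    --filename=/mnt/perc-test/fio.tmp --size=4G \
    --rw=randwrite --bs=4k --iodepth=32 --numjobs=1 \
    --direct=1 --ioengine=libaio \
    --runtime=60 --time_based --group_reporting
```

Completion latencies in the tens of milliseconds under this load would point at the array itself (cache policy, rebuild, patrol read) rather than the VM config.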
 
Hello Rickb! Just to be sure, did you follow the Windows 2022 guest best practices? There you can find some tips on how to improve performance. Also, please take a look at what is using so much CPU. Do you notice the issues only when doing storage-related tasks? I'm just trying to get an overview of the situation.
 
Is your "RAID5" storage holding the virtual HDDs? Rotating disks are always a bottleneck, especially under concurrent access.
 
If you downgraded the virtio driver, make sure the older version is really in use; uninstalling and reinstalling from the installer is not enough. You need to manually select the older driver for the virtio-scsi controller in Windows Device Manager.
Anyway, with recent Windows I think it is better to update to 0.1.266.

Please also post the full VM config using:
Code:
qm config <vmid>
It's possible there are other config options to fix or improve, but without that data it is impossible to know.
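For comparison, these are the disk-related knobs usually worth checking first on a spinning RAID backend. A hedged example only; the VM ID, storage name, and volume name below are placeholders, so check them against your own `qm config` output first:

```shell
# Use a dedicated SCSI controller per disk so iothread is effective,
# then enable an IO thread and native async IO on the system disk.
# VM ID 100 and the volume name are examples only.
qm set 100 --scsihw virtio-scsi-single
qm set 100 --scsi0 local-lvm:vm-100-disk-0,iothread=1,aio=native,discard=on
```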
 
Good Morning,

So I spent some time on these two servers this weekend and have some additional info and answers to the questions asked.

The two affected servers have the same hardware configuration except for the processors. Server #1 is 32 x Intel(R) Xeon(R) Silver 4309Y CPU @ 2.80GHz (2 sockets) and server #2 is 112 x Intel(R) Xeon(R) Gold 5420+ (2 sockets).

On server #1 I changed all the VirtIO drivers to the current 1.226, changed the SCSI controller to "Red Hat VirtIO SCSI", and confirmed the driver versions in Windows Device Manager, with the machine version set to 7.2 instead of the previously tested 8.x and 9 versions. This seems to have fixed the issue, at least based on the responsiveness of the server and of some of the workstations on the network. I want to see what the feedback from the staff is under real use.

I tried the same thing on server #2, but if I use the 1.266 drivers and switch to "Red Hat VirtIO SCSI" instead of the "Red Hat VirtIO SCSI pass-through controller", the VMs will not boot; they just get stuck at the black Windows loading screen with the spinning circle (no BSOD).

The only difference I can see between the VM configs is that server #2 is configured/installed as BIOS whereas server #1 is UEFI. I can't see how this would matter, but that is the only difference I can find at this point.

I reviewed the Windows 2022 guest best practices and don't see anything that stood out or was not done.

I know RAID5 has overhead, but this falls into completely unusable performance. We have built other servers with either Windows directly on the hardware or VMware and have never seen performance anywhere near this bad unless the OS was completely hosed. As an example, if you try to launch Chrome it can take 15 seconds to get a white screen and then another 3-4 seconds to get the Chrome default home page (not a web page). The mouse cursor just sits there and spins. During that time the CPU, IO delay, and memory usage are all pretty much zero on both Proxmox and the VM. I even checked the individual drive queue lengths and they were zero.
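One host-side check that can explain exactly this pattern (terrible latency while every load counter reads zero) is the PERC's write-cache policy: RAID5 on nearline SAS running in write-through mode, for example after a battery/capacitor fault, behaves like this. A sketch using Broadcom's perccli utility, assuming it is installed on the host and the controller enumerates as /c0:

```shell
# Show cache settings for all virtual drives on controller 0.
# Look for WT (write-through) where WB (write-back) is expected.
# The controller index /c0 and the perccli64 binary name are assumptions.
perccli64 /c0/vall show all | grep -i cache
```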

vm configs attached.

I am hoping we don't have to move these back to direct hardware and just use Hyper-V, but the customers are losing patience.

Let me know if anyone has any other suggestions.

Thanks for everyone's input.
 

Attachments

It has been reported that server #2 is still having the issue as well. As I mentioned, there were huge increases in speed after changing the VM SCSI driver to the current version, but I could only test with two workstations after we made the changes after hours. I have been monitoring the host and the VM, but everything is telling me that the server is not under load. I know it is bad when they report that it was faster on the old server, which was 11 years old with 1/4 the specs.
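To catch the stall from the host side the next time it happens, per-device latency may tell more than the summary graphs. A minimal sketch, assuming the sysstat package is installed on the PVE host; the device name is an example:

```shell
# Print extended per-device stats every 2 seconds while the guest stalls;
# watch the r_await/w_await (latency) and %util columns.
# /dev/sda is a placeholder for the PERC-backed device.
iostat -x 2 /dev/sda
```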