Windows Server guest CPU stuck every few seconds

Sep 13, 2022
Hi all,

I have a strange issue with a Windows Server 2022 VM: every few seconds the whole system gets stuck for a second or so. Apparently the higher the load, the more often it happens, but always only for a short moment, usually a second or less, sometimes up to 2 seconds, but not more. It is an RDP server, and working with these "lags" is really bad. The issue shows up over RDP, but also in the web console (noVNC), and it can be observed inside the VM without any network involved.

I've been chasing this for several days, made a lot of tests and read for many hours, but have not been able to solve the issue yet. The VM runs on a two-socket server (2 x 16C/32T elderly Xeons) with 2 x PM1653 SAS SSDs (ZFS mirror) plus some spinning disks (for backup, not used by the VM), 256 GB RAM, all ZFS only. On the PVE node summary page, RAM usage is below 50% (~115 GB) and the CPU daily maximum is 40%. The node has no swap space.

The VM has ~30% RAM usage, but high CPU during office hours, 60-80% at peak. Inside the VM, I see a few processes at 2-5% and a total of up to 80% (although adding up all processes gives about 30%, not 80%). At the top I often saw "interrupts" ("Systemunterbrechungen") with ~2% CPU load.

The VM itself seems not to notice the lags; it is as if the whole VM hangs for a moment, a "world freeze". When I ping it from outside, I see high ping RTTs (500-2000 ms) that perfectly correlate with the input lags / hangs. When I ping from inside the VM, I can see the ping hang, but it still claims <=3 ms, as if the "clock" for the ping hangs as well. However, when I use HD Tune Pro, a disk benchmark tool, I see not only the tool hang for a second or two, but also a "down-spike" in the read speed afterwards, again with 100% correlation to the hangs. So Windows ping does not see the issue, but HD Tune Pro does. From the latter I conclude that I don't have a network-related issue.
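
For reference, this is roughly how I log the pings from another Linux box to correlate them with the hangs (just a sketch; the IP and log file name are placeholders):
Code:
# log timestamped pings to the VM; -D prefixes each reply with a Unix timestamp
ping -D -i 0.5 192.0.2.107 | tee ping-vm107.log

# afterwards, list only the slow replies (>100 ms) to line them up with the observed lags
awk -F'time=' '/time=/ { split($2, a, " "); if (a[1]+0 > 100) print }' ping-vm107.log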

The problem is much worse during office hours than at night, but I have failed to provoke it artificially (so that I could test at night in a maintenance window): I see no relation to running prime95 on all vCores or running a disk read benchmark; the problem does not get any worse from that.

What I already tried:
  • 1 socket 10 cores
  • 1 socket 24 cores
  • 2 socket 12 cores + NUMA
  • 48 GB RAM
  • add swap inside the VM (the host has no swap space, but uses only ~115 of 256 GB RAM)
  • Microsoft\TIP\TestResults check (nothing big there)
  • netsh int tcp set global rsc=disabled
  • Get-NetAdapterRsc | Disable-NetAdapterRsc
  • disable energy saving (maximum power profile, disable whatever else I could find) and set display sleep to never
  • check MSI IRQs (all are MSI except Balloon, SM and USB; each of these has its own IRQ number)
  • there is no Hyper-V, there is no WSL (Disable-WindowsOptionalFeature -Online -FeatureName Microsoft-Windows-Subsystem-Linux)

First I thought zpool iostat would show bigger numbers 2-3 seconds after each lag, but if there is a correlation, it is a weak one and not always present. Generating artificial I/O using benchmark tools does not seem to affect the problem (a CPU burn test does not make anything worse either).
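
In case it matters, this is roughly how I watched the pool (a sketch; -T d just adds a date/time stamp to each sample so I can line it up with the lags, the log name is arbitrary):
Code:
# per-vdev ZFS I/O statistics, one sample per second, each block prefixed with a timestamp
zpool iostat -v -T d 1 | tee zpool-iostat.log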

What does correlate well is CPU wait. I run vmstat 1 on the PVE host and see:
Code:
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 9  0      0 5545684 138644288 2925236    0    0    16  5044 20044 148127  0  1 82  0  0
 5  0      0 5545684 138644288 2925236    0    0     0  8544 23378 151690  0  6 78  0  0
16  0      0 5545684 138644288 2925236    0    0     4   536 21245 153818  0  1 80  0  0
28  0      0 5545684 138644288 2925236    0    0   136  7684 23492 158319  0  3 77  0  0
# (lag here)
23  0      0 5545684 138644288 2925236    0    0     0  9948 29364 95193  0 28 57  0  0
20  0      0 5545684 138644288 2925236    0    0     0 51228 20392 182788  0  2 86  0  0
10  0      0 5545684 138644288 2925236    0    0     4 22852 18058 120430  0  1 90  0  0
11  0      0 5545684 138644288 2925236    0    0     0   376 16074 98264  0  1 91  0  0
 4  0      0 5545684 138644288 2925236    0    0     0   104 15975 98421  0  1 91  0  0
26  0      0 5545684 138644288 2925236    0    0     0  7392 27543 58830  0 23 69  0  0
# (lag here)
15  0      0 5545684 138644288 2925236    0    0    40 37276 24181 83393  0 13 80  0  0
13  0      0 5545684 138644288 2925236    0    0   248   572 15575 94255  0  1 92  0  0
 7  0      0 5545684 138644288 2925236    0    0     8   808 17581 92197  0  1 89  0  0
16  0      0 5545684 138644288 2925236    0    0    96 10072 16511 97244  0  1 88  0  0
17  0      0 5545684 138644288 2925236    0    0    28 16204 16982 119799  0  1 85  0  0
 8  0      0 5545684 138644288 2925236    0    0     8 53624 23660 134408  0  8 76  0  0
21  1      0 5545684 138644288 2925236    0    0     8 21352 17478 147227  0  2 79  0  0
17  0      0 5545684 138644288 2925236    0    0     4 32704 20080 153549  0  2 74  0  0
29  0      0 5545684 138644288 2925236    0    0     0 44948 46321 111858  0 22 59  0  0
# (lag here)
13  0      0 5545684 138644288 2925236    0    0     0 16372 29618 129393  0  9 78  0  0
14  0      0 5545684 138644288 2925236    0    0     0 150120 23411 176360  0  5 81  0  0
18  0      0 5545684 138644288 2925236    0    0    40  9496 39449 84687  0 29 61  0  0
# (lag here)
 5  0      0 5545684 138644288 2925236    0    0    24   900 14759 96520  0  1 92  0  0
10  0      0 5545684 138644288 2925236    0    0    40   740 14762 92890  0  1 87  0  0
 8  0      0 5545684 138644288 2925236    0    0     0    44 76008 67731  0  8 75  0  0
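
(The "# (lag here)" markers were added manually while watching the screen. To correlate more precisely, one could let vmstat print timestamps, something like this sketch; the log file name is arbitrary:)
Code:
# same sampling as above, but with a timestamp column (-t) and logged to a file
vmstat -t 1 | tee vmstat-host.log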

In the VM, usually neither CPU nor disk I/O looks bad, and the Windows performance report marks everything green. In Task Manager there are only a few tasks at 2-5%, but at the top I often saw "Systemunterbrechungen", which I think relates to interrupts.

Unfortunately the problem is most visible when several people work on the server (and I fail to provoke it with stress tests), and I cannot reboot without a maintenance window (night shifts work on it too).

I'm out of ideas and hope someone can point me in a direction of what I could try next, please!

Code:
pve-manager/8.1.4/ec5affc9e41f1d79 (running kernel: 6.5.11-8-pve)

Code:
root@pve-2:~# cat /etc/pve/qemu-server/107.conf
agent: 1
bios: ovmf
boot: order=virtio0;ide2;net0
cores: 12
cpu: host
efidisk0: local-zfs:vm-107-disk-0,efitype=4m,pre-enrolled-keys=1,size=1M
ide2: none,media=cdrom
machine: pc-q35-8.0
memory: 49152
meta: creation-qemu=8.0.2,ctime=1695813306
name: w2k22-ts
net0: virtio=0E:4B:CB:cc:bb:cc,bridge=vmbr0,firewall=1
numa: 1
onboot: 1
ostype: win11
scsihw: virtio-scsi-single
smbios1: uuid=28f74c6e-bde3-49d5-b215-68a4031512803
sockets: 2
virtio0: local-zfs:vm-102-disk-1,cache=writethrough,iothread=1,size=432G
vmgenid: d16b6ad8-226f-4baf-a4d8-564331511392f

[PENDING]
balloon: 0
vga: virtio
 

My problem is the same as yours. I couldn't find any relevant information anywhere on the net; I thought I was the only one with this problem.
My motherboard is: EP2C621D12 WS
CPU: 8222L x 2 (two CPUs)
Memory: 64 GB x 6
GPU: P40 x 2 + RX 6600 XT
HDD: Seagate ST20000NM007D 20 TB
SSD: Samsung 990 Pro 4 TB
PSU: 1200 W
 
cpu: host
Not sure what your problem is, but since you probably don't use live migration (or you use identical CPU systems), maybe you could try a different CPU type & see the results.

Maybe also make sure Virtio drivers are up to date in Windows Server VM.

Concerning CPU types: see here for best practices.
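
For example, something like this should switch the VM to a generic model (a sketch; 107 is your VMID from the config above, and the change only takes effect after a full stop/start of the VM):
Code:
# switch VM 107 to a generic CPU model (x86-64-v2-AES is, I believe, the default for new VMs on PVE 8)
qm set 107 --cpu x86-64-v2-AES

# to go back to the current setting later
qm set 107 --cpu host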
 
Hi,
thank you for your quick reply.
I use "host" CPU type, because I don't use live migration (otherwise, AFAIK, a different CPU type should be choosen).
Do you think there could be any relation between CPU time and my short random freezes? Which type should I use to test with?

I already have the latest stable (virtio-win-0.1.240.iso).

Any ideas what I could try next?
 
This sounds similar to this issue [1] that has been around for months despite many efforts to sort it out. Try using opt-in kernel 6.8 [2] as it seems to be solved with it.


[1] https://forum.proxmox.com/threads/p...pu-issue-with-windows-server-2019-vms.130727/
[2] https://forum.proxmox.com/threads/o...le-on-test-no-subscription.144557/post-652354
Thank you very much for the pointers!
Actually, I had already read that thread but thought it would not match my issue, as I don't see 100% CPU. Several user reports do match very well though, so I think I could be facing the same issue. Thanks for pointing it out; I'll plan the upgrade to the 6.8 kernel (today is the perfect day to do so, I think).
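
If I read the opt-in thread [2] correctly, the upgrade itself should just be something like this (assuming the package name from that announcement; please correct me if it differs):
Code:
# install the opt-in 6.8 kernel on the PVE host and reboot into it
apt update
apt install proxmox-kernel-6.8
reboot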
 
Thank you so much, that's it! The mitigation (a-sdettmer@pve2:~$ echo 0 | sudo tee /proc/sys/kernel/numa_balancing) indeed seems to solve the issue!
 
Happy you got it solved. I believe your setting above will not be reboot-persistent.
You probably need to add numa_balancing=disable to the kernel command line (AFAIK); alternatively you could make a script for your modification above and have it activate on boot.
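
For example, a sysctl drop-in should survive reboots (a sketch; the file name is arbitrary, and kernel.numa_balancing is the sysctl behind /proc/sys/kernel/numa_balancing):
Code:
# make the NUMA balancing setting persistent across reboots
echo 'kernel.numa_balancing = 0' > /etc/sysctl.d/99-disable-numa-balancing.conf
sysctl --system                 # apply now without rebooting
sysctl kernel.numa_balancing    # verify, should print 0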
 
I use systemd to make such things permanent:

Code:
cat <<EOT >> /etc/systemd/system/sysfs_disable_ksm_numa_merge.service
[Unit]
Description=Disable KSM merge across NUMA nodes
After=multi-user.target
StartLimitBurst=0

[Service]
Type=oneshot
Restart=on-failure
ExecStart=/bin/bash -c 'echo 0 > /sys/kernel/mm/ksm/merge_across_nodes'

[Install]
WantedBy=multi-user.target
EOT

And then:
Code:
systemctl daemon-reload
systemctl enable --now sysfs_disable_ksm_numa_merge.service

This survives reboots and system updates.
 

systemd makes things so easy! In the old days someone had to write
Code:
echo 'echo 0 > /sys/kernel/mm/ksm/merge_across_nodes' >> /etc/rc.local
and it would even work for a single user. What luck that we don't need this anymore.
SCNR.
 
