Windows Server 2025 CPU suddenly at 100%

rlopez

Member
Apr 18, 2024
Hello,
I have a brand new Proxmox VE 9.2.3 installation with all updates applied.
I have some VMs, and one of them is a recently installed Windows Server 2025. It has no software or special configuration: only the operating system, the latest VirtIO guest tools, and all updates.
Suddenly, while working with Windows Explorer inside the VM, the entire VM gets stuck and stops responding. In the Proxmox VM Summary I can see that the CPU is running at 99.99%. It stays like this for a minute or two and then returns to normal. This happens with no pattern, apparently at random.
The VM has 6 GB RAM, 4 vCPUs, and an 80 GB disk on SSD, with the VirtIO SCSI single controller for the hard disk...
The underlying storage is ZFS RAIDZ1 with brand new disks.
I have the same configuration in a lot of servers with no problems at all.
Any ideas?

[Attachment: 1782977508926.png]
Thank you
 
qm config:
Code:
agent: 1
bios: ovmf
boot: order=virtio0;ide2;ide0;net0
cores: 4
cpu: host
efidisk0: RAID1-960-SSD:vm-100-disk-0,efitype=4m,ms-cert=2023k,pre-enrolled-keys=1,size=1M
ide0: none,media=cdrom
ide2: none,media=cdrom
machine: pc-q35-11.0
memory: 6144
meta: creation-qemu=11.0.0,ctime=1779222342
name: SRVMYLIDEAS
net0: virtio=BC:24:11:A2:F9:69,bridge=vmbr1,firewall=1
numa: 0
ostype: win11
scsihw: virtio-scsi-single
smbios1: uuid=7dc06f2d-8bc2-4d89-9926-089e0f2ad618
sockets: 1
vga: virtio
virtio0: RAID1-960-SSD:vm-100-disk-1,cache=writeback,discard=on,iothread=1,size=80G
vmgenid: 1f8a98f8-8d0a-4d0c-86d2-c2dc71b6ed5c

pveversion:
Code:
proxmox-ve: 9.2.0 (running kernel: 7.0.6-2-pve)
pve-manager: 9.2.3 (running version: 9.2.3/d0fde103346cf89a)
proxmox-kernel-helper: 9.2.0
proxmox-kernel-7.0: 7.0.6-2
proxmox-kernel-7.0.6-2-pve-signed: 7.0.6-2
proxmox-kernel-7.0.2-6-pve-signed: 7.0.2-6
proxmox-kernel-7.0.2-4-pve-signed: 7.0.2-4
proxmox-kernel-7.0.2-3-pve-signed: 7.0.2-3
proxmox-kernel-7.0.2-2-pve-signed: 7.0.2-2
proxmox-kernel-6.17: 6.17.13-13
proxmox-kernel-6.17.13-13-pve-signed: 6.17.13-13
proxmox-kernel-6.17.13-11-pve-signed: 6.17.13-11
proxmox-kernel-6.17.13-9-pve-signed: 6.17.13-9
proxmox-kernel-6.17.13-8-pve-signed: 6.17.13-8
proxmox-kernel-6.17.13-7-pve-signed: 6.17.13-7
proxmox-kernel-6.17.2-1-pve-signed: 6.17.2-1
ceph-fuse: 19.2.3-pve2
corosync: 3.1.10-pve2
criu: 4.1.1-1
frr-pythontools: 10.6.1-1+pve2
ifupdown2: 3.3.0-1+pmx12
intel-microcode: 3.20251111.1~deb13u1
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-5
libproxmox-acme-perl: 1.7.1
libproxmox-backup-qemu0: 2.0.2
libproxmox-rs-perl: 0.4.1
libpve-access-control: 9.1.1
libpve-apiclient-perl: 3.4.2
libpve-cluster-api-perl: 9.1.6
libpve-cluster-perl: 9.1.6
libpve-common-perl: 9.1.13
libpve-guest-common-perl: 6.0.3
libpve-http-server-perl: 6.0.5
libpve-network-perl: 1.6.6
libpve-notify-perl: 9.1.6
libpve-rs-perl: 0.15.3
libpve-storage-perl: 9.1.5
libspice-server1: 0.15.2-1+b1
lvm2: 2.03.31-2+pmx1
lxc-pve: 7.0.0-2
lxcfs: 7.0.0-pve1
novnc-pve: 1.7.0-1
proxmox-backup-client: 4.2.1-1
proxmox-backup-file-restore: 4.2.1-1
proxmox-backup-restore-image: 1.0.0
proxmox-firewall: 1.2.3
proxmox-kernel-helper: 9.2.0
proxmox-mail-forward: 1.0.3
proxmox-mini-journalreader: 1.6
proxmox-offline-mirror-helper: 0.7.4
proxmox-widget-toolkit: 5.2.3
pve-cluster: 9.1.6
pve-container: 6.1.10
pve-docs: 9.2.2
pve-edk2-firmware: 4.2025.05-2
pve-esxi-import-tools: 1.0.1
pve-firewall: 6.0.4
pve-firmware: 3.18-4
pve-ha-manager: 5.2.4
pve-i18n: 3.7.5
pve-qemu-kvm: 11.0.0-4
pve-xtermjs: 6.0.0-1
qemu-server: 9.1.16
smartmontools: 7.5-pve2
spiceterm: 3.4.2
swtpm: 0.8.0+pve3
vncterm: 1.9.2
zfsutils-linux: 2.4.2-pve1
 
Thanks for your answer.

Could you please set the guest CPU to a “non-host” type as a test, such as the default “x86-64-v2-AES”? Does it behave normally again afterwards?
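
For reference, a minimal sketch of that test in Python. It parses the text printed by `qm config` and builds the `qm set` command that would switch the CPU type. The VM ID 100 is inferred from the disk names in the first post, and the helper names `parse_qm_config` and `cpu_change_command` are made up for this example. As far as I know, a CPU type change only takes effect after the VM is fully shut down and started again, not after a reboot from inside the guest.

```python
from typing import Dict, Optional


def parse_qm_config(text: str) -> Dict[str, str]:
    """Parse the `key: value` lines printed by `qm config <vmid>`."""
    config = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as MAC addresses stay intact.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            config[key.strip()] = value.strip()
    return config


def cpu_change_command(vmid: int, config: Dict[str, str],
                       new_type: str = "x86-64-v2-AES") -> Optional[str]:
    """Return the `qm set` command for the new CPU type, or None if already set."""
    # The cpu value may carry extra options after a comma; the type comes first.
    current = config.get("cpu", "").split(",")[0]
    if current == new_type:
        return None
    return f"qm set {vmid} --cpu {new_type}"


if __name__ == "__main__":
    sample = "cores: 4\ncpu: host\nmemory: 6144\n"
    print(cpu_change_command(100, parse_qm_config(sample)))
```

The script only prints the command instead of running it, so it can be reviewed before being executed on the host.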
 
Hi!

This sounds rather similar to an issue we faced in this thread:
The thread is in German, but maybe your browser's translation can help you :)

In short, we downgraded the VirtIO drivers from 285 to 271 and changed the CPU type to x86-64-v3. The host type with the nested-virt flag removed should be sufficient as well.
We also switched the disks from VirtIO Block to SCSI.

Afterwards the issues went away. Not sure which modification was the correct one here, but we assume the CPU type.

Cheers!
 
"Afterwards the issues went away. Not sure which modification was the correct one here, but we assume the CPU type."
Exactly. So @rlopez, please start by changing only the CPU type so we can see whether that causes any change in behavior with your setup. Thank you.
 
Might I suggest you also review the caching you have set on the VM disk? In the OP I read that the underlying storage is ZFS.

It's my understanding that the recommendation is not to set any caching on the VM disks themselves when using ZFS, to avoid precisely what you're seeing: CPU lockups, duplicate data writes, etc.
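
To check which disks have an explicit cache mode, something like the following Python sketch could be run over the `qm config` output. The function name `disks_with_cache` is hypothetical, and it assumes the usual `key: volume,opt=value,...` layout shown in the first post. Disks without a `cache=` option are left out of the result and fall back to the Proxmox default.

```python
import re
from typing import Dict

# Keys that describe guest disks, e.g. virtio0, scsi1, sata0, ide2.
DISK_KEY = re.compile(r"^(virtio|scsi|sata|ide)\d+$")


def disks_with_cache(config_text: str) -> Dict[str, str]:
    """Return {disk_key: cache_mode} for disks that set an explicit cache mode."""
    result = {}
    for line in config_text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if not sep or not DISK_KEY.match(key):
            continue
        # The first comma-separated field is the volume; the rest are options.
        for option in value.strip().split(",")[1:]:
            name, _, setting = option.partition("=")
            if name == "cache":
                result[key] = setting
    return result


if __name__ == "__main__":
    sample = (
        "ide2: none,media=cdrom\n"
        "virtio0: RAID1-960-SSD:vm-100-disk-1,cache=writeback,discard=on,"
        "iothread=1,size=80G\n"
    )
    print(disks_with_cache(sample))
```

For the config in the first post this would report `virtio0` with `writeback`, which is the setting worth reviewing.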

Happy Proxmox!
 
Hi! Could you cite the source where this is mentioned?
If the source is trustworthy and reproducible, this might be useful to mention here:
https://pve.proxmox.com/wiki/Performance_Tweaks#Disk_Cache
Keep in mind what ARC is when it comes to ZFS - "Adaptive Replacement Cache". Thus caching already happens in main memory.

I dug through my previous notes; it is mentioned here: https://forum.proxmox.com/threads/vm-virtual-hdd-cache-settings-for-ssd-backed-zfs.114928/. I also recall seeing it stated elsewhere in more general terms for QEMU systems, not just for Proxmox.

Happy Proxmox!
 
Hello,
After a few hours of running, I can confirm that changing the CPU type seems to solve the problem. The question is: why? Is this a bug or normal behavior? I have other servers running with the same config and didn't notice this before.
Should I change this for all my WS2025 machines?
Thank you so much.
 
Could this setting have an impact on performance?
On the other hand, I didn't enable any virtualization features on the Windows Server.
 
VBS (Virtualization-Based Security) silently enables virtualization, even if Hyper-V or WSL is not enabled.