Windows Server 2016 frequent lockups with 100% CPU

casalicomputers

Hello,
for the past few days we have been experiencing frequent lockups on ALL Windows Server 2016 VMs (I have 3).
These VMs had been running without any issue for months, but now all of them show the same problem.

As you can see from the attached screenshot

Selezione_057.png

CPU usage spikes to 100% and there's no way to access the affected VM via SPICE/noVNC, or even to reset it. When this happens I can only STOP the VM and start it again, then wait for the next occurrence (usually within a few hours).

Here's the VM configuration:
Code:
boot: cdn
bootdisk: scsi0
cores: 4
ide2: none,media=cdrom
memory: 4096
name: dea-cnc
net0: virtio=5A:0B:3C:8E:B4:C4,bridge=vmbr0
numa: 0
onboot: 1
ostype: win8
parent: new
scsi0: raid10-ssd:vm-104-disk-1,size=200G
scsi1: raid10-ssd:vm-104-disk-2,size=5G
scsihw: virtio-scsi-pci
smbios1: uuid=a835c388-2f72-4540-9b81-50c464178624
sockets: 1
startup: order=10
vga: qxl

I really, really, really have no idea what the cause could be, considering that I have absolutely no problems with Linux VMs on the same server. Here's what I have tried so far:

1) Upgraded PVE to the latest version via APT
2) Powered the server off and on again after the system upgrade
3) Disks were set to virtio-blk (virtioN) with the default LSI SCSI controller. Migrated them to virtio-scsi (scsiN) with the VirtIO SCSI controller.
4) Updated spice-tools with the latest package from spice-space.org

I suspect this was caused by some Microsoft update because, I repeat, the systems were working properly until a few days ago, and the weird thing is that only Windows VMs are affected... randomly, of course.

I hope someone can give me some hints on how to troubleshoot this.

Thanks
Michele
 
As of today the issue is still present and happens really often.
I just updated the BIOS and the storage controller's firmware to the latest versions from the manufacturer.

Let's see how it goes.
 
The system seems to behave better, but lockups still happen, even if less frequently.
I really don't know what to do next.

Please help!

EDIT:
I rule out a hardware defect because
  1. only Windows Server 2016 VMs are affected
  2. the iDRAC on the server does not report any issue
 
I don't know if this helps, but I found some errors in the event log:

upload_2018-4-18_12-12-26.png

The computer has rebooted from a bugcheck. The bugcheck was: 0x00000109 (0xa39fd8da6d32975e, 0xb3b6e560bfb48ccf, 0x0000032000000000, 0x0000000000000017). A dump was saved in: C:\Windows\MEMORY.DMP. Report Id: a755cf19-7b99-4c6c-a56c-b34dcbbb1f9a.

Bugcheck 0x00000109 with 0x17 as the 4th parameter means Local APIC modifications
(from https://docs.microsoft.com/en-us/wi...g-check-0x109---critical-structure-corruption)
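For anyone comparing their own event log entries against this one, here is a minimal Python sketch that pulls the bugcheck code and its four parameters out of such a message. The function names and the lookup table are made up for illustration, and the table contains only the single 0x17 mapping quoted above; it is not a complete list of 0x109 region types.

```python
import re

# Only the mapping quoted in this thread is listed here (assumption: anything
# else must be looked up in Microsoft's bug check documentation).
KNOWN_0X109_REGIONS = {0x17: "Local APIC modification"}

BUGCHECK_RE = re.compile(
    r"bugcheck was:\s*(0x[0-9a-fA-F]+)\s*\(([^)]*)\)", re.IGNORECASE
)


def parse_bugcheck(message):
    """Return (code, [params]) parsed from a Windows bugcheck event message."""
    match = BUGCHECK_RE.search(message)
    if match is None:
        raise ValueError("no bugcheck found in message")
    code = int(match.group(1), 16)
    params = [int(p.strip(), 16) for p in match.group(2).split(",")]
    return code, params


def describe(code, params):
    """Give a short description, using the 4th parameter for bug check 0x109."""
    if code == 0x109 and len(params) >= 4:
        region = KNOWN_0X109_REGIONS.get(params[3], "not listed in this sketch")
        return f"0x{code:X}: corrupted region type 0x{params[3]:X} ({region})"
    return f"0x{code:X}: no decoding in this sketch"
```

Feeding it the event text above and passing the result to describe() should report region type 0x17 for bug check 0x109.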

MEMORY.DMP is roughly 80 MB compressed; if someone needs it, I can send it.
 
No luck. :(

The VMs still experience lockups, so I had to create a script that runs every 5 minutes, pings the guest agent, and power-cycles the VM when it doesn't reply. It's not the best solution, I know, but at least it reduces the downtime.
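The original script was not posted, so the following is only a rough Python sketch of the same idea, not the script actually used. It assumes it runs on the PVE host, that the guest agent is installed and enabled for the VM, and that `qm agent <vmid> ping` exits non-zero when the agent does not answer. The function names are made up for illustration.

```python
import subprocess


def run(cmd, timeout=30):
    """Run a command; return True on exit status 0, False on failure or timeout."""
    try:
        result = subprocess.run(
            cmd,
            timeout=timeout,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return result.returncode == 0
    except (subprocess.TimeoutExpired, OSError):
        return False


def check_and_recover(vmid, runner=run):
    """Ping the guest agent; if it does not reply, stop and start the VM."""
    if runner(["qm", "agent", str(vmid), "ping"]):
        return "ok"
    # A locked-up guest cannot shut down cleanly, so this is a hard stop.
    runner(["qm", "stop", str(vmid)])
    runner(["qm", "start", str(vmid)])
    return "power-cycled"
```

Calling check_and_recover(104) from a cron job every 5 minutes would match the behaviour described above (104 is the VMID from the configuration in the first post). A hard stop can lose unwritten data in the guest, so this is a stopgap rather than a fix.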
 
Hi casalicomputers,
Were you able to resolve this issue with your Windows Server 2016 VMs?
I found this post because I am having the same issue as you with a Windows Server 2016 Domain Controller. I just started 5 days ago and there seem to have been two blue screens. The funny thing is that I have another Windows Server 2016 F&P and that one seems to be OK. I am monitoring it now to make sure that there are no stop errors.
Here is the bug check from the DC:
upload_2019-3-13_11-21-37.png
and this one:
upload_2019-3-13_11-22-10.png

If you were able to solve this, would you please let me know how you did it?
I'd really appreciate it.
Thanks!
OXIB.
 
Hi Oxib,
yes, I was able to resolve it with the help of Proxmox support.
Basically, I had installed a package (vlan) via APT, which removed the "proxmox-ve" package and prevented the PVE kernels from being updated.
After I reinstalled the "proxmox-ve" package, ran apt dist-upgrade, and rebooted, everything started working again.
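To check whether a host is in the same state, a small Python sketch like the one below can confirm that "proxmox-ve" is still installed and that the running kernel looks like a PVE one. The function names are made up for illustration, and treating "pve" in the kernel release string as the sign of a PVE kernel is an assumption based on the usual naming of those kernels.

```python
import platform
import subprocess


def query_status(package):
    """Return dpkg's status string for a package, or "" if it is unknown."""
    try:
        result = subprocess.run(
            ["dpkg-query", "-W", "-f=${Status}", package],
            capture_output=True,
            text=True,
        )
    except OSError:
        return ""  # dpkg-query not available on this system
    return result.stdout.strip() if result.returncode == 0 else ""


def is_installed(package, query=query_status):
    """True only if dpkg reports the package as fully installed."""
    return query(package) == "install ok installed"


def running_pve_kernel(release=None):
    """Assumption: PVE kernels carry "pve" in their release string."""
    if release is None:
        release = platform.release()
    return "pve" in release
```

If is_installed("proxmox-ve") returns False on a PVE host, the meta-package has probably been removed as described above, and kernel updates may have stopped along with it.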

Hope this helps you as well.

Michele
 
