New Windows VM keeps dying

Cecil

Sep 22, 2017
I just set up a brand new, clean Windows Server 2016 VM (cloned from a template).

It runs great, but the next day the console just shows the "Start boot option" BIOS screen. The VM is no longer working but still shows a running status.

I cannot stop, reset, or shut it down; after a very long wait I just get errors like:
trying to acquire lock...TASK ERROR: can't lock file '/var/lock/qemu-server/lock-1025.conf' - got timeout
TASK ERROR: VM quit/powerdown failed

Every time, I have to go into /var/lock/qemu-server/ and delete the lock file before I can start it again.

When I look at the lock file I see:
root@pve:/var/lock/qemu-server# lsof lock-1025.conf
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
task\x20U 111255 root 6wW REG 0,25 0 15 lock-1025.conf
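
Reading that lsof output: the PID column (111255) is the process that still has the lock file open, and the `W` in `6wW` means it holds a write lock on the whole file. So deleting the lock file only works around a task that is stuck. A minimal sketch for pulling the holder PIDs out of lsof output so the stuck task can be looked at first; the helper name `lock_holder_pids` is made up for illustration:

```shell
# Hypothetical helper: read `lsof <file>` output on stdin and print the PID
# of every process that has the file open (column 2, header line skipped).
lock_holder_pids() {
    awk 'NR > 1 { print $2 }'
}

# Example use on the Proxmox host (path taken from the error message above):
#   lsof /var/lock/qemu-server/lock-1025.conf | lock_holder_pids
#   ps -o pid,stat,etime,args -p 111255   # what is that task, and how long has it been running?
```

If the PID turns out to be a hung stop/shutdown task, that points at the VM process itself not responding rather than at a stale lock.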

Does anyone have ideas on how I can diagnose what is going wrong?
 
I have some more info: it turns out that Windows is crashing with:
The bugcheck was: 0x00000109 (0xa39fdadeac097720, 0xb3b6e764fe8a8a51, 0x0000034000000000, 0x0000000000000017)

I googled that and found:

This problem occurs because the system detects a Critical MSR modification, and then it crashes.

Specifically for VMs, they recommend:
This is a known issue that affects ESXi 5.0.x. For more information, contact VMWare.
To work around this issue, manually create a CPUID mask for the affected virtual machines.
(see: https://support.microsoft.com/en-us...-structure-corruption-on-a-vmware-virtual-mac)
Is there a Proxmox equivalent, or something else that I might have set wrong?

I have the config exactly the same as my other Server 2016 VM, except that one has 2 sockets with 6 cores each, where this one has 1 socket and 4 cores.
I turned on NUMA for both VMs, but maybe I should disable it for a single socket?
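
To double-check that the two configs really only differ in sockets and cores, they can be compared line by line. A minimal sketch, assuming the configs live under /etc/pve/qemu-server/ as on a stock Proxmox VE host; `conf_diff` is a made-up helper name and 1001 is a placeholder for the other VM's ID, which isn't given here:

```shell
# Hypothetical helper: show lines that differ between two VM config files,
# ignoring line order. Exit status follows diff (0 = same, 1 = different).
conf_diff() {
    a=$(mktemp) || return 2
    b=$(mktemp) || { rm -f "$a"; return 2; }
    sort "$1" > "$a"
    sort "$2" > "$b"
    diff "$a" "$b"
    rc=$?
    rm -f "$a" "$b"
    return "$rc"
}

# Example use (1001 is a placeholder for the other Server 2016 VM's ID):
#   conf_diff /etc/pve/qemu-server/1001.conf /etc/pve/qemu-server/1025.conf
```

Per-VM lines such as the MAC address and disk names will always show up; anything else beyond sockets and cores (cpu type, numa, machine type) is worth a closer look.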
 
OK, more updates, haha. There is already a thread: https://forum.proxmox.com/threads/blue-screen-with-5-1.37664/page-10

And it seems the latest kernel, 4.13.8-26, fixes this?
What is weird is that I have 4 Server 2016 VMs (only 1 is in production so far; the others are test), and the first one runs perfectly with no BSOD or crashes, while the 4th one, which I'm trying to do testing with, is crashing.

I guess I'll try changing my sources to pve-no-subscription and see if I can get the kernel updated and test again (hopefully it doesn't break things!).
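
For reference, a rough sketch of that repository switch, assuming PVE 5.x on Debian stretch (which matches the linked 5.1 thread). The file name under sources.list.d is an arbitrary choice, and `pve_repo_line` is just an illustrative helper:

```shell
# Hypothetical helper: print the APT source line for a Proxmox VE repository.
#   $1 = Debian suite (assumed "stretch" for PVE 5.x)
#   $2 = repository component, e.g. pve-no-subscription or pvetest
pve_repo_line() {
    printf 'deb http://download.proxmox.com/debian/pve %s %s\n' "$1" "$2"
}

# Example use on the host, as root (file name is an arbitrary choice):
#   pve_repo_line stretch pve-no-subscription > /etc/apt/sources.list.d/pve-no-subscription.list
#   apt update && apt dist-upgrade
#   reboot            # needed to boot into the new kernel
#   pveversion -v     # confirm which pve-kernel packages are now installed
```

Without a subscription, the pve-enterprise source will make `apt update` complain until it is commented out.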

I updated using the pvetest repo and had 3 Server 2016 VMs running fine all night long :)
 