Proxmox VM random crash / freeze

eazit86

Member
May 1, 2021
Hi,

I'm currently having issues with Proxmox. Everything ran fine until VMs started crashing at random.
I already moved the VM to another server, but the problem moved to that server with it.

When it freezes, the VM shows an exclamation mark. The only fix, which works about half the time, is to reset the VM; after that the resume button appears and the VM should boot again.
But sometimes that does not work either, and the VM then freezes during start (this also affects other VMs at the same moment, and only a host reboot fixes it).

This is what I found in the syslog:

May 1 14:48:40 pve06 kernel: [198892.051426] set kvm_intel.dump_invalid_vmcs=1 to dump internal KVM state.
May 1 14:48:40 pve06 QEMU[2584]: KVM: entry failed, hardware error 0x80000021
May 1 14:48:40 pve06 QEMU[2584]: If you're running a guest on an Intel machine without unrestricted mode
May 1 14:48:40 pve06 QEMU[2584]: support, the failure can be most likely due to the guest entering an invalid
May 1 14:48:40 pve06 QEMU[2584]: state for Intel VT. For example, the guest maybe running in big real mode
May 1 14:48:40 pve06 QEMU[2584]: which is not supported on less recent Intel processors.
May 1 14:48:40 pve06 QEMU[2584]: RAX=00000000000c00d2 RBX=fffff80123152b40 RCX=0000000040000071 RDX=0000000000000000
May 1 14:48:40 pve06 QEMU[2584]: RSI=00000000000c00d2 RDI=0000000000000000 RBP=fffff80123152b00 RSP=fffff80123152a28
May 1 14:48:40 pve06 QEMU[2584]: R8 =00000000000c00d2 R9 =00000000000000d2 R10=fffff80121f8e5b0 R11=fffff80123152d98
May 1 14:48:40 pve06 QEMU[2584]: R12=0000000000000001 R13=0000000000000001 R14=fffff80123152af0 R15=00000000000000d2
May 1 14:48:40 pve06 QEMU[2584]: RIP=fffff8012195ce19 RFL=00000046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
May 1 14:48:40 pve06 QEMU[2584]: ES =002b 0000000000000000 ffffffff 00c0f300 DPL=3 DS [-WA]
May 1 14:48:40 pve06 QEMU[2584]: CS =0010 0000000000000000 00000000 00209b00 DPL=0 CS64 [-RA]
May 1 14:48:40 pve06 QEMU[2584]: SS =0018 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
May 1 14:48:40 pve06 QEMU[2584]: DS =002b 0000000000000000 ffffffff 00c0f300 DPL=3 DS [-WA]
May 1 14:48:40 pve06 QEMU[2584]: FS =0053 0000000032928000 00003c00 0040f300 DPL=3 DS [-WA]
May 1 14:48:40 pve06 QEMU[2584]: GS =002b fffff80121aec000 ffffffff 00c0f300 DPL=3 DS [-WA]
May 1 14:48:40 pve06 QEMU[2584]: LDT=0000 0000000000000000 ffffffff 00c00000
May 1 14:48:40 pve06 QEMU[2584]: TR =0040 fffff8012313a000 00000067 00008b00 DPL=0 TSS64-busy
May 1 14:48:40 pve06 QEMU[2584]: GDT= fffff8012313b000 0000007f
May 1 14:48:40 pve06 QEMU[2584]: IDT= fffff80123139000 0000ffff
May 1 14:48:40 pve06 QEMU[2584]: CR0=80050031 CR2=000000000185dbf0 CR3=00000000001aa000 CR4=000406f8
May 1 14:48:40 pve06 QEMU[2584]: DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
May 1 14:48:40 pve06 QEMU[2584]: DR6=00000000ffff0ff0 DR7=0000000000000400
May 1 14:48:40 pve06 QEMU[2584]: EFER=0000000000000d01
May 1 14:48:40 pve06 QEMU[2584]: Code=40 49 c1 e0 20 4c 0b c0 49 8b d0 49 8b c0 48 c1 ea 20 0f 30 <c3> cc cc f0 83 0d 1c 08 18 00 01 f0 83 25 44 d2 1f 00 fe f0 83 25 ac d4 1f 00 fe c3 cc cc
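For anyone who wants to compare several of these dumps, here is a rough Python sketch that pulls the hardware error code and the general-purpose/control registers out of syslog lines like the ones above. The function name `parse_kvm_failures` and the regular expressions are my own assumptions based only on the format shown in this post, not anything shipped with Proxmox or QEMU. One reason to extract RCX in particular: the bytes `0f 30` right before the marked `<c3>` in the Code line decode as WRMSR, which takes its MSR index from ECX.

```python
import re

# Patterns are assumptions derived from the log excerpt in this thread.
ENTRY_FAILED = re.compile(
    r"QEMU\[(\d+)\]: KVM: entry failed, hardware error (0x[0-9a-fA-F]+)"
)
# Matches e.g. "RAX=00000000000c00d2", "R8 =00000000000c00d2", "CR0=80050031".
REGISTER = re.compile(r"\b([A-Z][A-Z0-9]{1,3}) ?=([0-9a-fA-F]{8,16})\b")


def parse_kvm_failures(lines):
    """Return one dict per 'KVM: entry failed' event found in the given lines."""
    events = []
    current = None
    for line in lines:
        match = ENTRY_FAILED.search(line)
        if match:
            current = {
                "pid": int(match.group(1)),
                "error": match.group(2),
                "registers": {},
            }
            events.append(current)
            continue
        if current is None:
            continue
        marker = f"QEMU[{current['pid']}]: "
        if marker in line:
            # Only look at the text after the syslog prefix of the same process.
            payload = line.split(marker, 1)[1]
            for name, value in REGISTER.findall(payload):
                current["registers"][name] = int(value, 16)
    return events
```

On the host you would feed it the lines of the syslog file (often `/var/log/syslog`, but that path is an assumption about your setup) and then look at `event["error"]` and `hex(event["registers"]["RCX"])` for each event.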

The affected VM runs Windows Server 2019.

Any help?
 
Update: the crash originates from the one VM above. Even after I migrated the VM to another cluster, the host in that cluster now crashes (the crash takes the host down with it).

So I tried migrating the VM to a Hyper-V cluster, and now the Hyper-V host crashes every 12 hours with that VM on it.
 
Can you post your VM config and hardware config? Please post the output of the following commands:

1. cat /etc/pve/qemu-server/<vmid>.conf

2. lscpu
3. pveversion -v
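
If it helps, the three commands above can be collected into one paste with a small Python wrapper. This is only a convenience sketch: the helper names `diagnostic_commands` and `collect` are made up for illustration, and it assumes it is run on the Proxmox host itself where `lscpu` and `pveversion` are on the PATH.

```python
import subprocess


def diagnostic_commands(vmid):
    """Build the three commands requested above for a given VM ID."""
    return [
        ["cat", f"/etc/pve/qemu-server/{vmid}.conf"],
        ["lscpu"],
        ["pveversion", "-v"],
    ]


def collect(vmid):
    """Run each command and return its output under a small header."""
    sections = []
    for cmd in diagnostic_commands(vmid):
        # capture_output keeps stdout/stderr so both end up in the paste.
        result = subprocess.run(cmd, capture_output=True, text=True)
        header = "=== " + " ".join(cmd) + " ==="
        sections.append(header + "\n" + result.stdout + result.stderr)
    return "\n".join(sections)
```

Calling `print(collect(720))` would print everything in one go; replace 720 with your own VM ID.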
 
agent: 1
balloon: 0
boot: order=sata0;net0
cores: 6
cpu: host
machine: pc-i440fx-5.2
memory: 24576
name: TS62
net0: virtio=12:AC:A8:D2:0F:37,bridge=vmbr1
numa: 0
ostype: win10
sata0: SATASSD:720/vm-720-disk-0.raw,cache=writeback,discard=on,size=175G,ssd=1
scsihw: virtio-scsi-pci
smbios1: uuid=735f50e4-8ec6-42f4-939c-823ea5ed610c
sockets: 1
vmgenid: 0e7c78aa-09fd-4182-964d-8763c296cd8b
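
To compare settings such as `cpu`, `machine` or the disk options between several VMs, a config like the one above can be read into a dictionary. The following is a minimal sketch, assuming the simple `key: value` layout shown here, with `#` comment lines and `[name]` snapshot sections skipped; `parse_vm_config` and `parse_drive` are hypothetical helper names, not part of any Proxmox tooling.

```python
def parse_vm_config(text):
    """Parse 'key: value' lines of a VM config into a dict."""
    config = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or description/comment
        if line.startswith("["):
            break  # stop at the first snapshot section
        # Split on the first colon only, so MAC addresses stay intact.
        key, sep, value = line.partition(":")
        if sep:
            config[key.strip()] = value.strip()
    return config


def parse_drive(value):
    """Split a drive entry into its volume and a dict of key=value options."""
    volume, *opts = value.split(",")
    options = dict(opt.split("=", 1) for opt in opts if "=" in opt)
    return volume, options
```

For the config posted above, `parse_vm_config(text)["cpu"]` gives `host`, and `parse_drive` on the `sata0` entry separates the volume name from options like `cache=writeback`.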


===CPU===
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 46 bits physical, 48 bits virtual
CPU(s): 32
On-line CPU(s) list: 0-31
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 45
Model name: Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz
Stepping: 7
CPU MHz: 2216.461
CPU max MHz: 3800.0000
CPU min MHz: 1200.0000
BogoMIPS: 5785.52
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 20480K
NUMA node0 CPU(s): 0-7,16-23
NUMA node1 CPU(s): 8-15,24-31
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm ida arat pln pts md_clear flush_l1d

===PVEVERSION===
pve-manager/6.4-4/337d6701 (running kernel: 5.4.106-1-pve)
 
I already tried that; it didn't make a difference.
I did find something on another site (a blog) that said to switch off CFG (Control Flow Guard) in the Windows VM.
I did that and it hasn't crashed yet: 20 hours online now, whereas before it would crash every 5 to 6 hours.
 
OK, nothing is crashing anymore after disabling CFG in the VMs.

I also installed the intel-microcode package on those host servers today as a precaution.
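
For anyone wanting to confirm which microcode revision a host is actually running after installing the package (it typically only takes effect at the next boot), here is a small sketch that reads the `microcode` field from `/proc/cpuinfo`. The helper name `microcode_revisions` is made up for illustration, and which revision counts as current for a given CPU is not something this snippet can tell you.

```python
def microcode_revisions(cpuinfo_text):
    """Return the set of distinct microcode revisions listed in cpuinfo text."""
    revisions = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "microcode":
            revisions.add(value.strip())
    return revisions


def read_host_revisions(path="/proc/cpuinfo"):
    """Read the running host's cpuinfo (Linux x86 only) and extract revisions."""
    with open(path, encoding="utf-8") as handle:
        return microcode_revisions(handle.read())
```

Running `read_host_revisions()` before and after the reboot shows whether the revision changed; more than one value in the set would mean the cores are not all on the same revision.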
 
Hello,

I have the same problem with a Linux VM. CFG is not available on Linux, if I am not mistaken.
Do you have any ideas how I can resolve this issue?

Thanks.
 
