VM shutdown, KVM: entry failed, hardware error 0x80000021

Micinka · May 20, 2022

daros said:
I upgraded to the latest -3 when I disabled nested virtualization, and there has been no crash since then.
It could have been fixed in -3, or the nested option works.

For me it did not work; the VM crashed after approx. 2.5 days.

UntouchedWagons · May 20, 2022

mira said:
Can you try disabling SMM?
To do so you'll have to run the VM manually. First run qm showcmd <VMID> --pretty and copy the content to a file.
Modify the -machine line by adding ,smm=off.
Then run that command.

One more question, do all of you use UEFI with pre-enrolled keys and perhaps even secure boot?

The VM is running Server 2022 so yes UEFI with pre-enrolled keys.

Micinka · May 20, 2022

UntouchedWagons said:
The VM is running Server 2022 so yes UEFI with pre-enrolled keys.

Same here: the VM was created through the Proxmox "wizard" for Win Server 2022.

alfe · May 20, 2022

Same here with Windows 11 Insider preview, of course with UEFI and pre-enrolled keys.

mira · May 20, 2022

We're now trying to reproduce it here with Windows Server 2022 and pre-enrolled keys.
Is there any specific software you're running in the VMs that run into this assertion?

Any load we could try to reproduce?

Binary Bandit · May 20, 2022

Throwing my hat in the ring for this issue ... well it looks like the same issue to me.

Let me know what I can contribute.

We're running a single Windows server 2022 VM on a new Dell T350 and just experienced this issue this morning, early AM.

A backup job finished at 3:39:52 and then the 2022 VM crashed at 3:46:58.

Code:

May 20 03:29:52 proxmox postfix/qmgr[2909]: 9B83980D67: removed
May 20 03:46:01 proxmox smartd[2401]: Device: /dev/sdg [SAT], CHECK POWER STATUS spins up disk (0x82 -> 0xff)
May 20 03:46:01 proxmox smartd[2401]: Device: /dev/sdg [SAT], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 84 to 74
May 20 03:46:01 proxmox smartd[2401]: Device: /dev/sdg [SAT], SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 71 to 70
May 20 03:46:01 proxmox smartd[2401]: Device: /dev/sdg [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 29 to 30
May 20 03:46:58 proxmox kernel: [68465.912927] set kvm_intel.dump_invalid_vmcs=1 to dump internal KVM state.
May 20 03:46:58 proxmox QEMU[3054]: KVM: entry failed, hardware error 0x80000021
May 20 03:46:58 proxmox QEMU[3054]: If you're running a guest on an Intel machine without unrestricted mode
May 20 03:46:58 proxmox QEMU[3054]: support, the failure can be most likely due to the guest entering an invalid
May 20 03:46:58 proxmox QEMU[3054]: state for Intel VT. For example, the guest maybe running in big real mode
May 20 03:46:58 proxmox QEMU[3054]: which is not supported on less recent Intel processors.
May 20 03:46:58 proxmox QEMU[3054]: EAX=00127c6a EBX=7d183180 ECX=00000000 EDX=00000000
May 20 03:46:58 proxmox QEMU[3054]: ESI=7d18f240 EDI=dafd80c0 EBP=68dc6470 ESP=68dc6290
May 20 03:46:58 proxmox QEMU[3054]: EIP=00008000 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=1 HLT

Other notes:
- The server isn't in production yet and is only running our standard management software package ... remote control, patch management, backup, etc.
- The server ran with the same software in our office for two weeks with no issue.
- Two days ago it was placed into a small office environment with considerable traffic on the network.
- Yesterday at 9:17AM memory use within Windows reached 24GB, the size that is set in the VM. Windows was attempting to defragment the VM disks. I disabled this task and the utilization immediately dropped to less than 8GB.

best,

James

AntInf · May 20, 2022

Hi,
I have the same problem on Proxmox 7.2.4 with a Windows Server 2022 VM: it suddenly shuts down. The Hyper-V service is not active, and I find the error in the log. No particular service is installed; it is all very standard, just a few shared folders and management software.
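
For anyone who wants to check how often a host has hit this, here is a minimal sketch that searches a syslog-style file for the error string quoted in this thread. The function names are made up for illustration, and `/var/log/syslog` in the usage comment is an assumption about where the host keeps its log.

```shell
# Error string as it appears in the syslog excerpts in this thread.
ENTRY_FAILED='KVM: entry failed, hardware error 0x80000021'

# Print the number of matching lines in the given log file.
# grep -c prints 0 (and returns non-zero) when nothing matches.
count_entry_failures() {
    grep -c -F "$ENTRY_FAILED" "$1"
}

# Print month, day, time and host name (the first four fields) of each hit.
list_entry_failures() {
    grep -F "$ENTRY_FAILED" "$1" | awk '{print $1, $2, $3, $4}'
}

# Example (the log path is an assumption):
#   count_entry_failures /var/log/syslog
#   list_entry_failures /var/log/syslog
```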

basteagow · May 20, 2022

mira said:
Can you try disabling SMM?
To do so you'll have to run the VM manually. First run qm showcmd <VMID> --pretty and copy the content to a file.
Modify the -machine line by adding ,smm=off.
Then run that command.
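
A minimal sketch of those steps as a shell helper, in case someone wants to script the edit. The function name `add_smm_off`, the VMID 100 and the file paths are placeholders that do not come from this thread, and the sed pattern assumes that the `--pretty` output puts `-machine` at the start of its own line, with the value either bare or in single quotes.

```shell
# Append ",smm=off" to the value of the -machine option in a saved
# `qm showcmd <VMID> --pretty` command (read on stdin, written to stdout).
# Assumption: -machine starts its own line and its value is either bare
# or wrapped in single quotes.
add_smm_off() {
    sed -E "s/^([[:space:]]*-machine[[:space:]]+)('?)([^'[:space:]]+)('?)/\1\2\3,smm=off\4/"
}

# Example use on the host, with the VM stopped first (VMID 100 and the
# paths are placeholders):
#   qm showcmd 100 --pretty > /root/vm100.sh
#   add_smm_off < /root/vm100.sh > /root/vm100-smm-off.sh
#   sh /root/vm100-smm-off.sh
```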

Turning off SMM causes the VM to start but not POST (yes, I did launch swtpm manually), so I'm afraid this is not a viable test for VMs that require secure boot.

If it helps, here are potentially relevant kvm arguments you may want to use in trying to reproduce this problem:

-machine type=pc-q35-6.1+pve0
-smp 1,sockets=1,cores=12,maxcpus=12
-cpu host,+hv-tlbflush,hv_ipi,hv_relaxed,hv_reset,hv_runtime,hv_spinlocks=0x1fff,hv_stimer,hv_synic,hv_time,hv_vapic,hv_vpindex,+kvm_pv_eoi,+kvm_pv_unhalt
-numa node,nodeid=0,cpus=0-11,memdev=ram-node0
-drive if=pflash,unit=0,format=raw,readonly=on,file=/usr/share/pve-edk2-firmware//OVMF_CODE_4M.secboot.fd
-tpmdev emulator,id=tpmdev,chardev=tpmchar
-device tpm-tis,tpmdev=tpmdev

stefal · May 20, 2022

Is there any way to automatically restart VM on failure until this gets resolved?
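
One possible stopgap, sketched under assumptions: a small watchdog that asks `qm status` whether the VM is stopped and, if so, runs `qm start`. The function names and the `vm-watchdog` log tag are made up for illustration. Note that it cannot tell a crash from a deliberate shutdown, so it would also restart a VM that was powered off on purpose.

```shell
# True when `qm status <vmid>` reports the VM as stopped.
# `qm status` prints a line such as "status: running" or "status: stopped".
vm_is_stopped() {
    [ "$(qm status "$1" 2>/dev/null)" = "status: stopped" ]
}

# Start the VM again if it is stopped, and leave a note in syslog.
restart_if_stopped() {
    vmid=$1
    if vm_is_stopped "$vmid"; then
        logger -t vm-watchdog "VM $vmid is stopped, starting it again"
        qm start "$vmid"
    fi
}

# Example (VMID 100 is a placeholder):
#   restart_if_stopped 100
```

Saved as a script with the example call at the end, this could be run every minute from cron on the host until a fixed kernel is available.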

stefal · May 20, 2022

mira said:
We're now trying to reproduce it here with Windows Server 2022 and pre-enrolled keys.
Is there any specific software you're running in the VMs that run into this assertion?

Any load we could try to reproduce?

Nothing special. Even OS installation gets irrecoverably interrupted; I had to repeat it 3 times. Disabling CFG (Control Flow Guard) in the guest seems to make crashes less frequent, maybe by a factor of three. There may be more types of illegal instructions causing the VM crash.
Steps to reproduce: just install the OS, connect it to the internet, and download and install updates. I experienced several crashes during this process.

basteagow · May 20, 2022

I just confirmed that simply downgrading the kernel to 5.13.19-6—with all other things unchanged—completely resolves the issue for me.

On kernel 5.15.35-1 my VM would invariably crash during Windows startup or shortly thereafter (before I could even log in), so in my particular case it's very quick and easy to know whether the problem is still there or not.
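
A rough sketch of how one might check which kernel series a host is running before deciding to go back. The function names are hypothetical, and treating "5.15" as affected rests only on the reports in this thread. The commented install and pin commands assume the older kernel package is still in the configured repositories and that the installed proxmox-boot-tool offers the `kernel pin` subcommand.

```shell
# Extract the major.minor series from a kernel release such as 5.15.35-1-pve.
kernel_series() {
    printf '%s\n' "$1" | cut -d. -f1,2
}

# True when the release belongs to the 5.15 series. The only basis for this
# check is the reports in this thread (crashes on 5.15, none on 5.13).
reported_affected() {
    [ "$(kernel_series "$1")" = "5.15" ]
}

# Example:
#   reported_affected "$(uname -r)" && echo "running a 5.15 kernel"
#
# Going back to the older kernel might then look like this (see the
# assumptions above):
#   apt install pve-kernel-5.13.19-6-pve
#   proxmox-boot-tool kernel pin 5.13.19-6-pve
```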

UntouchedWagons · May 20, 2022

mira said:
We're now trying to reproduce it here with Windows Server 2022 and pre-enrolled keys.
Is there any specific software you're running in the VMs that run into this assertion?

Any load we could try to reproduce?

Now that I think about it, I had vanilla Windows Server 2019 VMs crash on me too, but I don't know if those crashes were caused by the same thing. I'll spin up a couple and see if any of them crash.

[Edit] I tried starting up my two domain controller testing VMs, both running Server 2019 (UEFI, no secure boot), and one of them crashed within 10 seconds of starting up with the error "hardware error 0x80000021" in the syslog. That VM won't even start up anymore; the crash bricked the VM that hard. Thankfully I've got a backup. I'll try using a 5.13 kernel.

eider · May 21, 2022

This has been reported by others for 5.15.13 before here: https://old.reddit.com/r/VFIO/comments/s1k5yg/win10_guest_crashes_after_a_few_minutes/

Sadly, this happened to me too. Interestingly, it has only happened on a single Windows 11 VM (21H2, 22000.xxx) but not on a Server 2019 VM (1809, 17763.xxxx), suggesting that it occurs on newer builds only. Both VMs use the same CPU configuration/args, and both have a TPM attached along with VBS enabled and running. In both cases Core Isolation is disabled, as it is too slow when running on older hardware within a VM.

There was nothing special happening on the VM that crashed; it was idling with OpenVPN and a single RDP session open. The crash came a few minutes after Veeam B&R scanned the entire AD network. While this VM has no backups configured, it does have the agent installed. I was able to crash it a second time later on by manually scanning it (which basically only checks whether B&R can ping the device and whether the agent is installed and up to date). I'm very unsure whether this is related, as the crashes happened minutes after the fact, while the actual scan takes a few seconds at most. Possibly it is related to something the Windows kernel tries to do when switching from idling to working and back to idling (newer builds, and especially Windows 11, have more advanced kernel scheduling; also, the Server 2019 VM pretty much always does something, so it never really idles below 1%)?

Downgrading to 5.13.19 is a solution that works for me (been over two weeks now since going back).

In case you need more information, here's my setup:

Code:

May  4 19:01:15 riko kernel: [    0.000000] microcode: microcode updated early to revision 0xec, date = 2021-04-29
May  4 19:01:15 riko kernel: [    0.000000] Linux version 5.15.30-2-pve (build@proxmox) (gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP PVE 5.15.30-3 (Fri, 22 Apr 2022 18:08:27 +0200) ()
May  4 19:01:15 riko kernel: [    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.15.30-2-pve root=/dev/mapper/pve-root ro quiet console=tty0 console=ttyS1,115200n8 fsck.mode=force fsck.repair=preen nomodeset nmi_watchdog=0 intel_iommu=on l1tf=off mds=off tsx=on tsx_async_abort=off aacraid.expose_physicals=0 aacraid.dacmode=1 cpufreq.default_governor=schedutil

Code:

May  4 21:05:35 riko QEMU[4093]: KVM: entry failed, hardware error 0x80000021
May  4 21:05:35 riko QEMU[4093]: If you're running a guest on an Intel machine without unrestricted mode
May  4 21:05:35 riko QEMU[4093]: support, the failure can be most likely due to the guest entering an invalid
May  4 21:05:35 riko QEMU[4093]: state for Intel VT. For example, the guest maybe running in big real mode
May  4 21:05:35 riko QEMU[4093]: which is not supported on less recent Intel processors.
May  4 21:05:35 riko QEMU[4093]: EAX=00000000 EBX=00000000 ECX=40000070 EDX=00000000
May  4 21:05:35 riko QEMU[4093]: ESI=00000000 EDI=00326000 EBP=414bc476 ESP=00605920
May  4 21:05:35 riko QEMU[4093]: EIP=00008000 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=1 HLT=0
May  4 21:05:35 riko QEMU[4093]: ES =0000 00000000 ffffffff 00809300
May  4 21:05:35 riko QEMU[4093]: CS =c400 7ffc4000 ffffffff 00809300
May  4 21:05:35 riko QEMU[4093]: SS =0000 00000000 ffffffff 00809300
May  4 21:05:35 riko QEMU[4093]: DS =0000 00000000 ffffffff 00809300
May  4 21:05:35 riko QEMU[4093]: FS =0000 00000000 ffffffff 00809300
May  4 21:05:35 riko QEMU[4093]: GS =0000 00000000 ffffffff 00809300
May  4 21:05:35 riko QEMU[4093]: LDT=0000 00000000 00000000 00000000
May  4 21:05:35 riko QEMU[4093]: TR =0030 00356040 00000067 00008b00
May  4 21:05:35 riko QEMU[4093]: GDT=     00356000 0000ffff
May  4 21:05:35 riko QEMU[4093]: IDT=     00000000 00000000
May  4 21:05:35 riko QEMU[4093]: CR0=00010030 CR2=385dda78 CR3=f9cc3000 CR4=00000000
May  4 21:05:35 riko QEMU[4093]: DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
May  4 21:05:35 riko QEMU[4093]: DR6=00000000ffff0ff0 DR7=0000000000000400
May  4 21:05:35 riko QEMU[4093]: EFER=0000000000000000
May  4 21:05:35 riko QEMU[4093]: Code=kvm: ../hw/core/cpu-sysemu.c:77: cpu_asidx_from_attrs: Assertion `ret < cpu->num_ases && ret >= 0' failed.

Code:

# lscpu
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   39 bits physical, 48 bits virtual
CPU(s):                          8
On-line CPU(s) list:             0-7
Thread(s) per core:              2
Core(s) per socket:              4
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           158
Model name:                      Intel(R) Xeon(R) CPU E3-1275 v6 @ 3.80GHz
Stepping:                        9
CPU MHz:                         3800.000
CPU max MHz:                     3800.0000
CPU min MHz:                     800.0000
BogoMIPS:                        7599.80
Virtualization:                  VT-x
L1d cache:                       128 KiB
L1i cache:                       128 KiB
L2 cache:                        1 MiB
L3 cache:                        8 MiB
NUMA node0 CPU(s):               0-7
Vulnerability Itlb multihit:     KVM: Mitigation: Split huge pages
Vulnerability L1tf:              Mitigation; PTE Inversion; VMX vulnerable
Vulnerability Mds:               Vulnerable; SMT vulnerable
Vulnerability Meltdown:          Mitigation; PTI
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling
Vulnerability Srbds:             Mitigation; Microcode
Vulnerability Tsx async abort:   Vulnerable
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d arch_capabilities

Code:

# cat /sys/module/kvm_intel/parameters/nested
Y

Code:

# cat /etc/pve/qemu-server/109.conf
agent: 1
args: -cpu host,+aes,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,hv-vpindex,hv-runtime,hv-time,hv-synic,hv-stimer,+hv-tlbflush,hv-ipi,hv-frequencies,hv-stimer-direct,hv-reenlightenment,hv-no-nonarch-coresharing=on,+kvm_pv_unhalt,+pcid,+pdpe1gb,+spec-ctrl,+ssbd
balloon: 4096
bios: ovmf
bootdisk: scsi0
cores: 4
cpu: host,flags=+pcid;+spec-ctrl;+ssbd;+pdpe1gb;+hv-tlbflush;+hv-evmcs;+aes
efidisk0: local:109/vm-109-disk-3.raw,efitype=4m,pre-enrolled-keys=1,size=528K
hotplug: disk,network,usb
ide0: none,media=cdrom
localtime: 1
machine: pc-q35-6.1
memory: 6144
name: nanoha
net0: virtio=66:04:FA:3A:15:D1,bridge=vmbr0
numa: 0
onboot: 1
ostype: win10
scsi0: local:109/vm-109-disk-1.qcow2,discard=on,size=128G,ssd=1,aio=native
scsihw: virtio-scsi-pci
smbios1: uuid=9cc7744e-098e-488c-a175-37a4a36d6135
sockets: 1
tablet: 0
tpmstate0: local:109/vm-109-disk-2.raw,size=4M,version=v2.0
usb0: spice,usb3=1
vga: qxl
vmgenid: d2f5ea32-9be4-4c64-96d6-09d49cbe6487

# cat /etc/pve/qemu-server/100.conf 
agent: 1
args: -machine kernel_irqchip=on -cpu host,+aes,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,hv-vpindex,hv-runtime,hv-time,hv-synic,hv-stimer,+hv-tlbflush,hv-ipi,hv-frequencies,hv-stimer-direct,hv-reenlightenment,hv-no-nonarch-coresharing=on,+kvm_pv_unhalt,+pcid,+pdpe1gb,+spec-ctrl,+ssbd
balloon: 8192
bios: ovmf
boot: cdn
bootdisk: scsi0
cores: 4
cpu: host,flags=+pcid;+spec-ctrl;+ssbd;+pdpe1gb;+hv-tlbflush;+hv-evmcs;+aes
efidisk0: local:100/vm-100-disk-3.raw,efitype=4m,pre-enrolled-keys=1,size=528K
hostpci0: 06:00,pcie=1
hotplug: disk,network,usb,cpu
ide0: none,media=cdrom
localtime: 1
machine: pc-q35-6.1
memory: 12288
name: azusa
net0: virtio=72:9F:D9:C6:42:AA,bridge=vmbr0
numa: 0
onboot: 1
ostype: win10
scsi0: local:100/vm-100-disk-0.qcow2,discard=on,iothread=1,size=128G,ssd=1,aio=native
scsi1: /dev/mapper/tank-veeam,backup=0,iothread=1,size=4T
scsihw: virtio-scsi-single
serial0: socket
smbios1: uuid=bd604984-7547-449e-8d30-10b815c8bf22
sockets: 1
startup: order=1
tablet: 0
tpmstate0: local:100/vm-100-disk-2.raw,size=4M,version=v2.0
usb0: spice,usb3=1
vga: qxl
vmgenid: a3fbf8ce-21e9-4f85-ae05-d036f173a51e

VM 109 is Windows 11 that was crashing, VM 100 is Server 2019.

Micinka · May 21, 2022

mira said:
We're now trying to reproduce it here with Windows Server 2022 and pre-enrolled keys.
Is there any specific software you're running in the VMs that run into this assertion?

Any load we could try to reproduce?

I'm running WinServer 2022 with MS SQL Server 2019 and a small database under 10G, and that VM crashed regularly. Another WinServer 2022 VM on the same Proxmox instance, serving as a Windows file server, does not crash.

mcfly9 · May 21, 2022

I am also experiencing the issue once every few days on a Dell T630 with Win2022/Win11 VMs. It's usually a domain controller or an Exchange server that crashes, but I have had other machines crash too. The VMs have no PCIe passthrough and nothing special, really.

Now I downgraded to 5.13.19-6-pve and disabled nested virtualization. Will monitor the situation.
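
For reference, a sketch of how the nested setting could be checked from a script, based on the same sysfs file that is read elsewhere in this thread. The function name is hypothetical, and the modprobe file name in the comment is an assumption; the option itself only takes effect after the module is reloaded or the host rebooted.

```shell
# True when nested virtualization is enabled for kvm_intel.
# The optional argument overrides the sysfs path (handy for testing).
nested_enabled() {
    param_file=${1:-/sys/module/kvm_intel/parameters/nested}
    value=$(cat "$param_file" 2>/dev/null)
    [ "$value" = "Y" ] || [ "$value" = "1" ]
}

# Example:
#   nested_enabled && echo "nested virtualization is on"
#
# One common way to switch it off persistently (file name is an assumption):
#   echo "options kvm-intel nested=N" > /etc/modprobe.d/kvm-intel.conf
```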

fokustech · May 22, 2022

As I wrote in the German forum: I have two Windows Server 2022 VMs on the PVE host with the 5.15.35-1 kernel, one with and one without the Microsoft May updates. The VM with the updates crashes; the other one is stable. Could someone confirm that the crashes only occur with the Microsoft May updates?

stefal · May 22, 2022

fokustech said:
As I wrote in the German forum: I have two Windows Server 2022 VMs on the PVE host with the 5.15.35-1 kernel, one with and one without the Microsoft May updates. The VM with the updates crashes; the other one is stable. Could someone confirm that the crashes only occur with the Microsoft May updates?

The following ISO has this issue even during OS installation, with network not configured and internet not yet reachable. The only thing to select/customize is Datacenter with GUI (last in the list). It's the 180 day evaluation version of WS 2022. I doubt it already has any May updates.
https://software-download.microsoft...-1500.fe_release_SERVER_EVAL_x64FRE_en-us.iso

nick.kopas · May 22, 2022

Adding my crashing to the pile...

I have a Windows 11 VM that passes through an NVME drive, GPU, (2) USB controllers and a SATA controller. I run a WindowsImageBackup to a Samba share that lives on a ZFS pool on the Proxmox host. This backup runs at 4:00AM on Sunday and this morning I woke to an offline guest.

Some notes:

I had previously pinned 5.13.19-6-pve to resolve some issues with GPU passthrough.
I got a workaround functional, unpinned the kernel and rebooted on 5/10. (At this point I was running kernel version 5.15.35-2.)
The backup on 5/15 didn't fail.
On 5/17 I updated the kernel to version 5.15.35-2.

With the 5.13.19-6-pve kernel unpinned, running the WindowsImageBackup will always make it crash.

My syslog snippet follows, let me know if I can provide any more information to help get this resolved.

Code:

May 22 11:38:07 pve QEMU[3291]: KVM: entry failed, hardware error 0x80000021
May 22 11:38:07 pve kernel: [  380.096623] set kvm_intel.dump_invalid_vmcs=1 to dump internal KVM state.
May 22 11:38:07 pve QEMU[3291]: If you're running a guest on an Intel machine without unrestricted mode
May 22 11:38:07 pve QEMU[3291]: support, the failure can be most likely due to the guest entering an invalid
May 22 11:38:07 pve QEMU[3291]: state for Intel VT. For example, the guest maybe running in big real mode
May 22 11:38:07 pve QEMU[3291]: which is not supported on less recent Intel processors.
May 22 11:38:07 pve QEMU[3291]: EAX=00000000 EBX=00000000 ECX=40000070 EDX=00000000
May 22 11:38:07 pve QEMU[3291]: ESI=00000000 EDI=0037c000 EBP=003b2d59 ESP=003b2cb0
May 22 11:38:07 pve QEMU[3291]: EIP=00008000 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=1 HLT=0
May 22 11:38:07 pve QEMU[3291]: ES =0000 00000000 ffffffff 00809300
May 22 11:38:07 pve QEMU[3291]: CS =be00 7ffbe000 ffffffff 00809300
May 22 11:38:07 pve QEMU[3291]: SS =0000 00000000 ffffffff 00809300
May 22 11:38:07 pve QEMU[3291]: DS =0000 00000000 ffffffff 00809300
May 22 11:38:07 pve QEMU[3291]: FS =0000 00000000 ffffffff 00809300
May 22 11:38:07 pve QEMU[3291]: GS =0000 00000000 ffffffff 00809300
May 22 11:38:07 pve QEMU[3291]: LDT=0000 00000000 00000000 00000000
May 22 11:38:07 pve QEMU[3291]: TR =0030 003ac040 00000067 00008b00
May 22 11:38:07 pve QEMU[3291]: GDT=     003ac000 0000ffff
May 22 11:38:07 pve QEMU[3291]: IDT=     00000000 00000000
May 22 11:38:07 pve QEMU[3291]: CR0=00010030 CR2=00000000 CR3=17786000 CR4=00000000
May 22 11:38:07 pve QEMU[3291]: DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
May 22 11:38:07 pve QEMU[3291]: DR6=00000000ffff0ff0 DR7=0000000000000400
May 22 11:38:07 pve QEMU[3291]: EFER=0000000000000000
May 22 11:38:07 pve QEMU[3291]: Code=kvm: ../hw/core/cpu-sysemu.c:77: cpu_asidx_from_attrs: Assertion `ret < cpu->num_ases && ret >= 0' failed.
May 22 11:38:08 pve kernel: [  380.262131] vmbr0: port 2(tap100i0) entered disabled state
May 22 11:38:08 pve qmeventd[14863]: Starting cleanup for 100
May 22 11:38:08 pve qmeventd[14863]: Finished cleanup for 100
May 22 11:38:11 pve systemd[1]: 100.scope: Succeeded.
May 22 11:38:11 pve systemd[1]: 100.scope: Consumed 18min 48.635s CPU time.

Kotuku52 · May 22, 2022

stefal said:
Is there any way to automatically restart VM on failure until this gets resolved?

Can anyone point me in the right direction of this?

Weirdly enough I have 2x Proxmox Hosts - duplicate hardware & 2x 2022 VM's with the same configuration, and only 1 VM that has this issue regardless of host.

stefal · May 23, 2022

Did some tests with WS 2022 installation.
Made it to the "select partition for install" screen with the VirtIO SCSI driver loaded. Formatting the partition went OK. I left this screen open for 24 hours. As soon as I confirmed and the installation started, the VM crashed within about 5 seconds. It seems to be related to disk I/O.
