Some KVM VMs crash shortly after boot when using UEFI

Keyrate

New Member
Feb 22, 2022
I admit up front that this is a very specific set of conditions that I suspect few other users are running into (or perhaps they are, but the bug(s) are being triggered slightly differently). I found a number of other posts that share the same errors in /var/log/syslog: one thread that describes exactly what I am experiencing, and a post in that thread referencing yet another thread about backups failing. Interestingly, following the suggested fix from the second thread, editing the /usr/share/perl5/PVE/QemuServer.pm module to always disable SMM, made things not work at all for me: the VMs just sit at the console saying the guest hasn't initialized the display.
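In case it helps anyone retrace my steps, this is roughly how I poked at that workaround and then backed it out. The exact line that handles SMM differs between qemu-server versions, so treat this as a sketch rather than the precise patch:
Code:
# find where SMM is handled for q35 machines (the exact code varies by qemu-server version)
grep -n -i 'smm' /usr/share/perl5/PVE/QemuServer.pm

# after editing the module, restart the services that spawn VMs so the new code is loaded
systemctl restart pvedaemon pveproxy

# reinstalling qemu-server restores the stock module if the edit makes things worse
apt reinstall qemu-server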

Anyway, I have a fully functional multi-node Proxmox VE cluster that's working fine.[1] However, I wanted to set up another cluster for experimental purposes: testing out Infrastructure as Code automation, etc. I installed Proxmox VE under Hyper-V with nested virtualization turned on. It works surprisingly well. I'm sure there can't be more than a handful of nuts like me running Proxmox VE this way, but here we are.
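For anyone curious about the nested setup, these are the sanity checks I run inside the nested Proxmox host to confirm that Hyper-V is actually exposing AMD-V to it (nothing exotic, just the usual checks):
Code:
# AMD-V shows up as the 'svm' flag inside the nested PVE host
grep -c svm /proc/cpuinfo

# kvm_amd should be loaded, and nested support reported
lsmod | grep kvm
cat /sys/module/kvm_amd/parameters/nested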

Initially I installed a few containers in Proxmox under Hyper-V by manually creating a template from debian-11-genericcloud-amd64.qcow2 (roughly the steps sketched below). That worked fine. Then I created a VM using Alpine 3.15, and that also works totally fine. I'm using UEFI for the LXC containers and the Alpine VM, and things are swell.
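For context, the template came from the usual cloud-image import workflow, more or less like this (the VMID 9000 and the storage name are placeholders rather than exactly what I ran):
Code:
# sketch of building a template from the cloud image; IDs and storage names are placeholders
qm create 9000 --name debian-11-template --memory 2048 --net0 virtio,bridge=vmbr0 --scsihw virtio-scsi-pci
qm importdisk 9000 debian-11-genericcloud-amd64.qcow2 local-zfs
qm set 9000 --scsi0 local-zfs:vm-9000-disk-0 --ide2 local-zfs:cloudinit --boot order=scsi0 --serial0 socket
qm template 9000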

However, just about every other VM I've tried to create fails shortly after the bootloader hands off to the Linux kernel. This is exactly what's described in the first thread mentioned above. Distribution installers I've tested before posting here:
  • Debian Bookworm (testing) amd64 Net Install
  • Endeavour OS Atlantis Neo 21.5
  • NixOS 21.11 (GNOME, Minimal) and 22.05-daily builds
  • Guix 1.3.0 x86_64
Here's a sample config of a VM that crashes immediately after GRUB tries to load the kernel:
Code:
# qm config 100
agent: 1
balloon: 2048
bios: ovmf
boot: order=scsi0;ide2;net0
cores: 4
efidisk0: local-zfs:vm-100-disk-1,efitype=4m,pre-enrolled-keys=1,size=1M
ide2: local:iso/nixos-gnome-22.05pre356180.d5f23787297-x86_64-linux.iso,media=cdrom,size=2034M
machine: q35
memory: 4096
meta: creation-qemu=6.1.1,ctime=1645491584
name: nixos
net0: virtio=E2:B4:18:20:9A:D6,bridge=vmbr0,firewall=1
numa: 0
ostype: l26
scsi0: local-zfs:vm-100-disk-0,discard=on,size=32G,ssd=1
scsihw: virtio-scsi-pci
smbios1: uuid=9e3077ec-998c-4ac5-a801-d0df7b241bd2
sockets: 1
vga: virtio
vmgenid: 7beac6d8-273b-4f77-90ea-bc3ed32f2c2d

Here's a VM config for Alpine 3.15 that works with no problems:
Code:
# qm config 101
balloon: 1024
bios: ovmf
boot: order=scsi0;ide2;net0
cores: 1
efidisk0: local-zfs:vm-101-disk-1,efitype=4m,pre-enrolled-keys=1,size=1M
ide2: local:iso/alpine-virt-3.15.0-x86_64.iso,media=cdrom
machine: q35
memory: 2048
meta: creation-qemu=6.1.1,ctime=1645492984
name: alpine-vm
net0: virtio=AE:DB:A1:8E:A5:DD,bridge=vmbr0,firewall=1
numa: 0
ostype: l26
scsi0: local-zfs:vm-101-disk-0,discard=on,size=8G,ssd=1
scsihw: virtio-scsi-pci
smbios1: uuid=419f0b82-6810-4064-8747-771685795589
sockets: 1
vga: virtio
vmgenid: 59a417a2-2f20-4dd6-8e66-342719f86c67

If I switch from UEFI to SeaBIOS, everything seems to work fine. I can boot, I can run the installers, I get Desktop environments, etc. However, switching back to UEFI, I immediately get this kind of error in /var/log/syslog:
Code:
Feb 22 00:22:18 pve-hv QEMU[94657]: KVM: entry failed, hardware error 0xffffffff
Feb 22 00:22:18 pve-hv kernel: [ 3272.944686] SVM: set kvm_amd.dump_invalid_vmcb=1 to dump internal KVM state.
Feb 22 00:22:18 pve-hv QEMU[94657]: EAX=00000000 EBX=8be03d68 ECX=00000000 EDX=000000b2
Feb 22 00:22:18 pve-hv QEMU[94657]: ESI=ff98a000 EDI=00000058 EBP=0000000c ESP=8be03cd8
Feb 22 00:22:18 pve-hv QEMU[94657]: EIP=00008000 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=1 HLT=0
Feb 22 00:22:18 pve-hv QEMU[94657]: ES =0000 00000000 ffffffff 00809300
Feb 22 00:22:18 pve-hv QEMU[94657]: CS =be00 7ffbe000 ffffffff 00809300
Feb 22 00:22:18 pve-hv QEMU[94657]: SS =0000 00000000 ffffffff 00809300
Feb 22 00:22:18 pve-hv QEMU[94657]: DS =0000 00000000 ffffffff 00809300
Feb 22 00:22:18 pve-hv QEMU[94657]: FS =0000 00000000 ffffffff 00809300
Feb 22 00:22:18 pve-hv QEMU[94657]: GS =0000 00000000 ffffffff 00809300
Feb 22 00:22:18 pve-hv QEMU[94657]: LDT=0000 00000000 00000000 00000000
Feb 22 00:22:18 pve-hv QEMU[94657]: TR =0040 00003000 00004087 00008b00
Feb 22 00:22:18 pve-hv QEMU[94657]: GDT=     00001000 0000007f
Feb 22 00:22:18 pve-hv QEMU[94657]: IDT=     00000000 00000000
Feb 22 00:22:18 pve-hv QEMU[94657]: CR0=00050032 CR2=31801000 CR3=001a2000 CR4=00000000
Feb 22 00:22:18 pve-hv QEMU[94657]: DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
Feb 22 00:22:18 pve-hv QEMU[94657]: DR6=00000000ffff0ff0 DR7=0000000000000400
Feb 22 00:22:18 pve-hv QEMU[94657]: EFER=0000000000000000
Feb 22 00:22:18 pve-hv QEMU[94657]: Code=kvm: ../hw/core/cpu-sysemu.c:77: cpu_asidx_from_attrs: Assertion `ret < cpu->num_ases && ret >= 0' failed.
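
For completeness, this is how I've been flipping the test VM back and forth between the two firmware options while experimenting (100 is just my test VMID):
Code:
# switch the test VM to SeaBIOS (boots fine) and back to OVMF (crashes as above)
qm set 100 --bios seabios
qm set 100 --bios ovmf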

Searching for the first and last of those log lines is how I found the forum posts mentioned at the top. What else would be useful? Here's the output of lscpu:
Code:
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   48 bits physical, 48 bits virtual
CPU(s):                          8
On-line CPU(s) list:             0-7
Thread(s) per core:              2
Core(s) per socket:              4
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       AuthenticAMD
CPU family:                      25
Model:                           80
Model name:                      AMD Ryzen 9 5900HX with Radeon Graphics
Stepping:                        0
CPU MHz:                         3293.645
BogoMIPS:                        6587.29
Virtualization:                  AMD-V
Hypervisor vendor:               Microsoft
Virtualization type:             full
L1d cache:                       128 KiB
L1i cache:                       128 KiB
L2 cache:                        2 MiB
L3 cache:                        16 MiB
NUMA node0 CPU(s):               0-7
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Full AMD retpoline, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall
                                 nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl tsc_reliable nonstop_tsc cpuid extd_apicid
                                 aperfmperf pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor
                                 lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext ssbd ibrs ibpb stibp
                                 vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec
                                 xgetbv1 xsaves clzero xsaveerptr rdpru arat npt nrip_save tsc_scale vmcb_clean flushbyasid decodeassists
                                 pausefilter pfthreshold v_vmsave_vmload umip vaes vpclmulqdq rdpid fsrm

I have configured /etc/modules to load the appropriate kernel modules for IOMMU, and added IOMMU support to the kernel command line. There are a couple of notable wrinkles, though. First, I didn't know that when you install with UEFI (which I am, under Hyper-V) and ZFS (one of the things I wanted to experiment with), Proxmox defaults to systemd-boot rather than GRUB, so my early attempts to rule this out made no difference because I was updating the wrong bootloader (the working setup is sketched below). Second, I don't think there is a way to pass devices through from the bare-metal machine via Hyper-V to Proxmox, so even after I updated the right bootloader there is still no mention of DMAR or IOMMU in the dmesg output.
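For anyone else who trips over the systemd-boot detail, this is roughly what I ended up with; the module and option names are the standard ones from the PVE PCI passthrough docs, so treat the exact values as a sketch:
Code:
# /etc/modules -- VFIO modules for passthrough
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

# with systemd-boot the kernel command line lives in /etc/kernel/cmdline (not /etc/default/grub);
# append e.g. amd_iommu=on iommu=pt to the existing line, then refresh the boot entries:
proxmox-boot-tool refresh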

Other things I tried while troubleshooting this:
  1. I opted in to the Proxmox 7.2 kernel by installing pve-kernel-5.15 (see the commands after this list). It made no difference for this problem compared to 5.13, despite my thinking it might: I had some hope because the Alpine 3.15 VM is running Linux kernel 5.15-16 and it doesn't crash QEMU/KVM.
  2. With Secure Boot enabled, the UEFI firmware provided by PVE refused to boot the attached ISO images (it would say access denied). With Secure Boot turned off I could boot the ISOs, but then QEMU would crash once the bootloader tried to load the Linux kernel. The PVE changelog shows the OVMF firmware is a Proxmox-specific build, so just to eliminate it as the source of the problem I also tried the Debian OVMF builds, specifically ovmf_2021.11-1_all.deb and ovmf_2022.02~rc1-1_all.deb, since both are newer than what ships in pve-edk2-firmware. I tried the Debian Unstable firmware with Secure Boot both on and off as well.
  3. I don't have a Proxmox Enterprise subscription, but I do have the apt repos set up and everything updated and upgraded. The installed PVE packages are all current as of the date of this post, apart from what comes in with the 5.15 kernel from PVE 7.2.
  4. I'm using the q35 machine type and the kvm64 CPU type, although I also tried host (switched as shown below). No extra CPU flags.
  5. I have tried Secure Boot both on and off in the UEFI Device Manager settings. If there are other UEFI values worth tweaking I can try them, but I haven't changed anything else myself.
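The commands behind items 1 and 4, for the record (the package and option names are the standard ones, but I'm going from memory on exactly what I typed):
Code:
# opt in to the 7.2 kernel series
apt update
apt install pve-kernel-5.15

# flip the test VM between the default kvm64 CPU type and host
qm set 100 --cpu kvm64
qm set 100 --cpu host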
What else would be useful for triaging and debugging this? I believe there may be other users hitting the same code paths, even if they're not doing the same silliness I am.

Thanks!

[1]: My production Proxmox cluster is on 7.0, not yet 7.1. The underlying bare-metal machines have Intel CPUs, not AMD, and I'm not using Hyper-V. All of my production containers and VMs use UEFI rather than SeaBIOS.
 