vGPU / host CPU setting / Win11 Auto repair loops

scyto

Active Member
Aug 8, 2023
Here are the top line facts:
  • I have i915 vGPU working on my host.
  • I install a new Win11 VM.
  • I get the Win11 VM working with vGPU.
  • At some point later, on a reboot, the Win11 machine goes into automatic repair (this seems to be time based: not due to Windows updates, not installing anything, not changing anything).
  • If I set the CPU type to x86-64-v2-AES then the system boots (the switch is shown just after this list)...
  • ...but this breaks vGPU with the dreaded Code 43.
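For anyone following along, the CPU-type switch I keep referring to is a one-liner in each direction (a sketch using VM ID 102, which matches the config posted further down):

Code:
# virtualized CPU type: boots reliably, but breaks vGPU with Code 43
qm set 102 --cpu x86-64-v2-AES
# host passthrough with hidden=1, as in my VM config: vGPU works, repair loops appear
qm set 102 --cpu host,hidden=1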
My gut says this is related to some of the issues seen with WSL on Windows 11 when the host CPU type is set - I assume because both use VT-d between host and guest.

I have found other QEMU posts (not Proxmox) where this host passthrough on Win11 seems to cause issues. Nothing I have tried so far has solved it.

I need a way to keep host CPU enabled and avoid the reboot loops,
OR
I need a way, with a virtualized CPU type, to stop the Windows Intel drivers from failing.

I realize not everyone has this issue, so I wonder if it is something to do with the specifics of my CPU?

Code:
    root@pve1:/etc/pve/qemu-server# cat /proc/cpuinfo
    processor       : 0
    vendor_id       : GenuineIntel
    cpu family      : 6
    model           : 186
    model name      : 13th Gen Intel(R) Core(TM) i7-1360P
    stepping        : 2
    microcode       : 0x410e
    cpu MHz         : 691.790
    cache size      : 18432 KB
    physical id     : 0
    siblings        : 16
    core id         : 0
    cpu cores       : 12
    apicid          : 0
    initial apicid  : 0
    fpu             : yes
    fpu_exception   : yes
    cpuid level     : 32
    wp              : yes
    flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb intel_pt sha_ni xsaveopt xsavec xgetbv1 xsaves split_lock_detect avx_vnni dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp hwp_pkg_req hfi vnmi umip pku ospke waitpkg gfni vaes vpclmulqdq rdpid movdiri movdir64b fsrm md_clear serialize arch_lbr ibt flush_l1d arch_capabilities
    vmx flags       : vnmi preemption_timer posted_intr invvpid ept_x_only ept_ad ept_1gb flexpriority apicv tsc_offset vtpr mtf vapic ept vpid unrestricted_guest vapic_reg vid ple shadow_vmcs ept_mode_based_exec tsc_scaling usr_wait_pause
    bugs            : spectre_v1 spectre_v2 spec_store_bypass swapgs eibrs_pbrsb
    bogomips        : 5222.40
    clflush size    : 64
    cache_alignment : 64
    address sizes   : 39 bits physical, 48 bits virtual
    power management:
 
This is my VM config:
Code:
agent: 1
bios: ovmf
boot: order=scsi0;ide2
cores: 4
cpu: host,hidden=1
efidisk0: vDisks:vm-102-disk-0,efitype=4m,pre-enrolled-keys=1,size=1M
hostpci0: 0000:00:02.1,x-vga=1
hotplug: disk,network,usb
ide2: ISOs-Templates:iso/virtio-win-0.1.240.iso,media=cdrom,size=612812K
machine: pc-q35-8.1
memory: 4096
meta: creation-qemu=8.1.5,ctime=1712446324
net0: virtio=BC:24:11:6F:38:6F,bridge=vmbr0,firewall=1
numa: 0
ostype: win11
scsi0: vDisks:vm-102-disk-1,cache=writeback,iothread=1,size=64G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=4a38e490-22ea-4c79-954f-514de484091e
sockets: 1
tpmstate0: vDisks:vm-102-disk-2,size=4M,version=v2.0
vga: virtio
vmgenid: 61c12c7c-2778-4db7-a46f-1046278510a3
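hostpci0 above points at function 1 of the iGPU, i.e. the vGPU virtual function rather than the physical device. If you want to sanity-check the same thing on your host, something like this shows the VF and which kernel driver has claimed it (a sketch):

Code:
# list the passed-through VF with vendor/device IDs and the bound driver
lspci -nnk -s 00:02.1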
 
When I have the CPU set to x86-64-v2-AES, I believe this is the root issue stopping vGPU from working (see the attached screenshot).
 
I have recreated the VM from scratch.
  • host CPU
  • installed Windows, bypassing joining AAD or using an MS account (my thesis: Windows Hello enabling VBS is causing an issue on QEMU-based systems, similar to those having issues with WSL2)
  • DNS was up long enough to download the Intel drivers
  • then disabled DNS in the guest, then enabled RDP
  • removed the default video and added the vGPU PCIe device
  • rebooted
  • RDPed in and installed the drivers
  • installed no VirtIO drivers whatsoever and did not install the guest tools
  • ran bcdedit /set hypervisorlaunchtype off for later - I am hoping this prevents things I don't want from installing...
The system is working with vGPU.
The only difference I can see between this working system and the system that went into automatic repair is that VM security and nested virtualization are all turned off on this working system.
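A host-side sanity check that may help at this stage is to look at what the kernel logged for vfio and i915 while the guest came up (a sketch; exact messages vary by kernel version):

Code:
# recent vfio / i915 kernel messages on the host
dmesg | grep -iE 'vfio|i915' | tail -n 20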

Now for a few reboots to see if that triggers the failure that causes the automatic repair loops.
 

Attachments

  • Screenshot 2024-04-07 141522 (Large).png
After two reboots:
  1. worked on first start after rebooting the pve node
  2. Code 43 appeared on the Intel device that had been working, after 1 reboot
  3. then after another VM reboot the Intel device was working fine
  4. then after another 4 VM reboots, still fine

Code:
$ dmesg | grep -i kvm

[  215.914651] x86/split lock detection: #AC: CPU 2/KVM/4941 took a split_lock trap at address: 0x7ef1d050
[  215.914651] x86/split lock detection: #AC: CPU 3/KVM/4942 took a split_lock trap at address: 0x7ef1d050
[  215.914653] x86/split lock detection: #AC: CPU 1/KVM/4940 took a split_lock trap at address: 0x7ef1d050
[  221.196100] x86/split lock detection: #AC: CPU 1/KVM/5096 took a split_lock trap at address: 0x7ef3d050
[  233.697853] kvm: kvm [5026]: ignored rdmsr: 0xc0011029 data 0x0
[  235.680886] kvm: kvm [5026]: ignored rdmsr: 0x309 data 0x0
[  235.680917] kvm: kvm [5026]: ignored rdmsr: 0x30a data 0x0
[  235.680928] kvm: kvm [5026]: ignored rdmsr: 0x30b data 0x0
[  235.680938] kvm: kvm [5026]: ignored rdmsr: 0x38d data 0x0
[  235.680948] kvm: kvm [5026]: ignored rdmsr: 0x38e data 0x0
[  235.680959] kvm: kvm [5026]: ignored rdmsr: 0x38f data 0x0
[  235.680969] kvm: kvm [5026]: ignored rdmsr: 0x390 data 0x0
[  235.680981] kvm: kvm [5026]: ignored rdmsr: 0xc3 data 0x0
[  235.680991] kvm: kvm [5026]: ignored rdmsr: 0xc4 data 0x0
[  240.804382] x86/split lock detection: #AC: CPU 0/KVM/4939 took a split_lock trap at address: 0xfffff80534c316bd

I get the split_lock traps when the VM and vGPU are working OK.
I saw the ignored rdmsr messages on the boot where the vGPU had the Code 43.
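If the split_lock traps turn out to matter (an assumption on my part, not a confirmed cause), the host kernel can be told to stop trapping them via the split_lock_detect=off parameter. On a GRUB-booted Proxmox host that looks roughly like this:

Code:
# append split_lock_detect=off to the kernel command line (GRUB-booted hosts)
sed -i 's/^GRUB_CMDLINE_LINUX_DEFAULT="\(.*\)"/GRUB_CMDLINE_LINUX_DEFAULT="\1 split_lock_detect=off"/' /etc/default/grub
update-grub
# systemd-boot (ZFS) hosts: edit /etc/kernel/cmdline instead, then run proxmox-boot-tool refresh
reboot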

As a point of note, I also updated the Intel microcode just in case...

Code:
root@pve1:~# dmesg | grep -i microcode
[    0.000000] microcode: updated early: 0x410e -> 0x411c, date = 2023-08-30
[    1.345209] microcode: Microcode Update Driver: v2.2.
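For anyone wanting to do the same, the usual route on Proxmox 8 / Debian 12 is the intel-microcode package (a sketch, assuming the non-free-firmware component is enabled in your APT sources):

Code:
apt update
apt install intel-microcode
reboot
# then confirm it loaded, as above:
dmesg | grep -i microcode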
 
  • enabled DNS and applied all Windows updates, rebooting when prompted, then did 4 more reboots - no adverse issues
  • installed the VirtIO guest tools and then did 4 more reboots - no adverse issues
  • bound my Microsoft account - 2 reboots later, automatic repair
tl;dr -
  1. anything that enables virtualization protection (WSL, Windows Hello, etc.) will cause the repair boot loop issue when the CPU type is set to host
  2. yes, this can be mitigated with cpu args (see the sketch at the end of this post), however this breaks vGPU passthrough for me
I hope that finally answers, for many, the mystery of what is causing the issue: don't use the host CPU type, and don't enroll the guest OS in AAD, use a Microsoft account, or use anything that relies on Windows Hello / Hyper-V / WSL.

If you want to use nested virtualization or Windows Hello, accept that you are not using vGPU.
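For completeness, the "cpu args" mitigation from point 2 means replacing the cpu: line with a raw args: line in /etc/pve/qemu-server/<vmid>.conf. Which feature flag to mask varies between reports; -waitpkg is the one most often cited for this class of boot loop, so treat the exact flag as an assumption rather than a verified fix:

Code:
# in /etc/pve/qemu-server/102.conf -- raw QEMU CPU flags instead of the cpu: line
# (masking waitpkg is based on other reports, not verified here)
args: -cpu host,-waitpkg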
 
