Nested Virtualization Error or GPU Code 43

Mar 29, 2020
Hello,

I am running into issues passing through an Nvidia Quadro T1000 GPU, as presented in the following post on your support forum:

https://forum.proxmox.com/threads/question-about-cpu-type.125844/#post-549814

The host is a Ryzen 7950X with 128 GB RAM and two GPUs (RTX 3090 Ti and Quadro T1000). I want to pass the Quadro T1000 through to the Windows 10 VM.

Here are more details:

1/ GRUB Cmd line:

Code:
GRUB_CMDLINE_LINUX_DEFAULT="quiet video=vesafb:off video=efifb:off video=simplefb"

2/ /etc/modprobe.d/pve-blacklist.conf:

Code:
blacklist nvidiafb
blacklist amdgpu
blacklist snd_hda_intel
blacklist nouveau
blacklist nvidia
blacklist radeon

3/ The VM Setup:

Code:
agent: 1
balloon: 0
bios: ovmf
boot: order=scsi0;ide0
cores: 8
cpu: qemu64
efidisk0: zfs_tank:vm-101-disk-0,efitype=4m,pre-enrolled-keys=1,size=1M
hostpci0: 0000:0b:00,pcie=1,rombar=0
hostpci2: 0000:0a:00,pcie=1,x-vga=1
ide0: local:iso/virtio-win-0.1.229.iso,media=cdrom,size=522284K
machine: pc-q35-7.2
memory: 16384
meta: creation-qemu=7.2.0,ctime=1680520922
name: Windows10VM
net0: e1000=4E:5F:A4:6A:A7:45,bridge=vmbr0,firewall=1
numa: 0
onboot: 1
ostype: win10
scsi0: zfs_tank:vm-101-disk-1,aio=threads,cache=writeback,discard=on,iothread=1,size=500G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=6cb36ac2-ad1e-4c03-aff7-48308db4eeeb
sockets: 1
tablet: 0
usb0: host=256f:c652
usb1: host=046d:c328
usb2: host=046d:c07e
vmgenid: c1221ab6-4739-45f2-8ab1-a1b94af742c0

4/ Our Windows setup uses nested virtualization (Hyper-V stuff).

  • If I use 'host' as the CPU type, the Hyper-V stuff works, but the GPU gets a code 43.
  • If I use the CPU type 'qemu64', the GPU works, but the Hyper-V stuff fails (two drivers get code 37).

Any help you can provide would be much appreciated - thanks!
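
Since the two outcomes above differ only in the cpu line, it can help to pull the relevant keys out of the VM config before comparing runs. What follows is a minimal Python sketch, assuming the config text uses the plain `key: value` layout shown in the post and that any line starting with `[` opens a snapshot section that should be ignored. The names parse_vm_config and passthrough_devices are made up for this illustration and are not part of any Proxmox tool.

```python
# Excerpt of the VM config quoted in the post above.
SAMPLE = """\
cpu: qemu64
hostpci0: 0000:0b:00,pcie=1,rombar=0
hostpci2: 0000:0a:00,pcie=1,x-vga=1
machine: pc-q35-7.2
"""


def parse_vm_config(text):
    """Return the top-level `key: value` pairs of a VM config as a dict."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("["):
            # Assumption: a bracketed line starts a snapshot section; stop there.
            break
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so PCI addresses stay intact.
        key, sep, value = line.partition(":")
        if sep:
            config[key.strip()] = value.strip()
    return config


def passthrough_devices(config):
    """Split every hostpciN entry into its PCI address and its options."""
    devices = {}
    for key, value in config.items():
        if not key.startswith("hostpci"):
            continue
        address, *options = value.split(",")
        devices[key] = {
            "address": address,
            "options": dict(opt.partition("=")[::2] for opt in options),
        }
    return devices


config = parse_vm_config(SAMPLE)
print("cpu:", config["cpu"])
for name, device in passthrough_devices(config).items():
    print(name, device["address"], device["options"])
```

Running this once per CPU type and comparing the printed lines makes it easy to confirm that nothing but the cpu entry changed between the code 43 run and the code 37 run.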
 
Maybe this will not help you at all, but...
GRUB_CMDLINE_LINUX_DEFAULT="quiet video=vesafb:off video=efifb:off video=simplefb"
Please note that video=vesafb:off video=efifb:off video=simplefb have not worked on Proxmox for some time. If you want to pass through the boot GPU, you need to use this work-around. Also double-check with cat /proc/cmdline, because not all Proxmox installations use GRUB.
 
I thought simplefb:off was brand new as of 7.2?

Code:
cat /proc/cmdline:

BOOT_IMAGE=/boot/vmlinuz-5.15.102-1-pve root=/dev/mapper/pve-root ro quiet video=vesafb:off video=efifb:off video=simplefb

I knew it used grub because I set it up that way ;-)
 
Why aren't you using the IOMMU flags on the kernel command line? For example:

amd_iommu=on iommu=pt vfio_iommu_type1.allow_unsafe_interrupts=1 vfio-pci.ids=XXXX:XXXX,XXXX:XXXX

If you're using a 7950X, that processor has integrated graphics ... so you don't need to blacklist the amdgpu module. I'm trying to do the same thing, a Win10 VM with Hyper-V enabled, but I'm stuck at error 43. Without enabling Hyper-V the passthrough works, but I need all the iommu and vfio stuff.
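
To see at a glance which of the suggested flags are absent from a running kernel's command line, the string from cat /proc/cmdline can be checked programmatically. This is a small Python sketch rather than an official tool; parse_cmdline and missing_params are hypothetical helper names. It relies on the kernel treating dashes and underscores in parameter names as equivalent, so names are normalized to underscores.

```python
def parse_cmdline(cmdline):
    """Map each kernel parameter name to the list of values it was given."""
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        # Dashes and underscores are interchangeable in parameter names.
        name = name.replace("-", "_")
        params.setdefault(name, []).append(value if sep else "")
    return params


def missing_params(cmdline, wanted):
    """Return the entries of `wanted` that do not appear on the command line."""
    present = parse_cmdline(cmdline)
    return [name for name in wanted if name.replace("-", "_") not in present]


# Command line quoted earlier in this thread.
SAMPLE_CMDLINE = (
    "BOOT_IMAGE=/boot/vmlinuz-5.15.102-1-pve root=/dev/mapper/pve-root ro quiet "
    "video=vesafb:off video=efifb:off video=simplefb"
)

# Flags named in the reply above (the vfio-pci.ids value is setup specific).
WANTED = ["amd_iommu", "iommu", "vfio-pci.ids"]

print(missing_params(SAMPLE_CMDLINE, WANTED))
```

On a live host the same functions can be fed the contents of /proc/cmdline instead of the sample string.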
 
I have exactly the same issue running Proxmox 7.4-3.

The minimal necessary setup for an AMD processor (I have a 5600G myself) and an NVIDIA card (a GTX 1060 in my case) should be:

UEFI/BIOS:
- IOMMU and SVM explicitly set to enabled (not auto) in the UEFI/BIOS

Modules (/etc/modules):
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

Kernel flag (edit /etc/kernel/cmdline or the GRUB equivalent):
- iommu=pt as the only kernel flag. amd_iommu is already enabled by default, amdgpu on the Ryzen can drive the console on the hypervisor, and Nvidia cards can usually reset themselves without further parameters here (for Pascal 1000-series GPUs, see Update2).

Drivers and blacklisting via modprobe:
- /etc/modprobe.d/vfio.conf (or a similar name) explicitly lists the IDs of the devices for passthrough; disable_vga is a precaution against VGA arbitration and use of the card by the hypervisor. The IDs have to be adapted to everyone's own setup.
options vfio-pci ids=10de:1c03,10de:10f1 disable_vga=1
- /etc/modprobe.d/blacklist-nvidia.conf
blacklist nouveau
blacklist nvidia
- /etc/modprobe.d/
This helps to avoid certain crashes of the Windows guest by ignoring debugging functions and the errors arising from them.
options kvm ignore_msrs=1
options kvm report_ignored_msrs=0

Everything else should be unnecessary for the given case (AMD APU with Nvidia Card)
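
Because the IDs in the vfio.conf line above have to be adapted per machine, here is a rough Python sketch of how they could be collected from `lspci -nn` output and turned into that options line. The sample lspci lines and the slot 01:00 are illustrative placeholders, not taken from a real host in this thread, and device_ids and vfio_options_line are made-up helper names.

```python
import re

# Matches the trailing [vendor:device] pair that `lspci -nn` prints per device.
ID_PATTERN = re.compile(r"\[([0-9a-f]{4}:[0-9a-f]{4})\]")

# Illustrative output only; the slot numbers are assumptions.
SAMPLE_LSPCI = """\
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP106 [GeForce GTX 1060 6GB] [10de:1c03] (rev a1)
01:00.1 Audio device [0403]: NVIDIA Corporation GP106 High Definition Audio Controller [10de:10f1] (rev a1)
"""


def device_ids(lspci_output, slot):
    """Collect the vendor:device IDs of all functions in the given PCI slot."""
    ids = []
    for line in lspci_output.splitlines():
        if not line.startswith(slot):
            continue
        matches = ID_PATTERN.findall(line)
        # The ID pair is the last bracketed group on the line.
        if matches and matches[-1] not in ids:
            ids.append(matches[-1])
    return ids


def vfio_options_line(ids, disable_vga=True):
    """Build the modprobe options line for vfio-pci from a list of IDs."""
    line = "options vfio-pci ids=" + ",".join(ids)
    if disable_vga:
        line += " disable_vga=1"
    return line


print(vfio_options_line(device_ids(SAMPLE_LSPCI, "01:00")))
```

Both functions of the card (VGA and audio) share the slot prefix, which is why the sketch filters by slot rather than by a single device address.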
---
@smorvan What I could get working is the vGPU setup with Nvidia. If you install Nvidia's enterprise driver, create vGPU profiles, and attach them to a VM with CPU type host, it should work; it worked out for me.

(Searching for vgpu-unlock / vgpu-unlock-rs together with the keyword proxmox will give you plenty of results to work with. The unlock itself is specifically for consumer cards, but the rest applies to you equally.)

---
I was interested in making direct GPU passthrough work because I want to use the monitor output of my GTX 1060 for a virtualized but on-hand workstation. That helped a lot with debugging, as I can tell from the direct output whether the card is working. Even when I get code 43 in that scenario, the Nvidia GPU still resets correctly every single time, the VM usually boots (with a slight delay), and Windows initializes an output every time, although it is limited to 1280x800 and I can't change the resolution or refresh rate to higher values.

That might be useful information, because with a Quadro without output this kind of behavior is harder to observe.

I also tried emulating other processor architectures, adding arguments, and trying kvm=off, hidden=1, hv_vendor_id, etc.

I settled for the following args (but that is for nested virtualization and hyper-v only):
args: -cpu host,+svm,-hypervisor,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,hv_vpindex,hv_runtime,hv_crash,hv_time,hv_synic,hv_stimer,hv_tlbflush,hv_ipi,hv_reset,hv_frequencies,hv_reenlightenment,hv_stimer_direct,hv-no-nonarch-coresharing=auto

Most of them are probably not necessary, though. I hope this additional information gives someone a better idea.
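
One way to find out which of those flags actually matter is to test variants of the args line that each leave out a single flag. The Python sketch below only generates the candidate lines from the list quoted above; cpu_args_line and drop_one_variants are hypothetical helper names. Note that some Hyper-V enlightenments depend on others, so a variant may well refuse to start rather than tell you anything about code 43.

```python
# Flags from the args line quoted above, in the same order.
FLAGS = [
    "+svm", "-hypervisor", "hv_relaxed", "hv_vapic", "hv_spinlocks=0x1fff",
    "hv_vpindex", "hv_runtime", "hv_crash", "hv_time", "hv_synic",
    "hv_stimer", "hv_tlbflush", "hv_ipi", "hv_reset", "hv_frequencies",
    "hv_reenlightenment", "hv_stimer_direct",
    "hv-no-nonarch-coresharing=auto",
]


def cpu_args_line(model, flags):
    """Join a CPU model and its flags into a VM config `args:` line."""
    return "args: -cpu " + ",".join([model, *flags])


def drop_one_variants(model, flags, keep=()):
    """Yield (dropped_flag, args_line) pairs, each omitting exactly one flag.

    Flags listed in `keep` are never dropped.
    """
    for flag in flags:
        if flag in keep:
            continue
        remaining = [f for f in flags if f != flag]
        yield flag, cpu_args_line(model, remaining)


# Keep the two flags that are not Hyper-V enlightenments in every variant.
for dropped, line in drop_one_variants("host", FLAGS, keep=("+svm", "-hypervisor")):
    print("without " + dropped + ":")
    print("  " + line)
```

Each printed line can be pasted into the VM config in turn, rebooting the guest between attempts and noting whether the GPU and the Hyper-V drivers come up.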

Update: I got it working two or three times after swapping some hardware around, but it would usually not survive a reboot. It could just be a driver issue, though, because with an Ubuntu Desktop guest it works out of the box with full acceleration, multi-monitor output, high refresh rate, and nested virtualization.

---
Update2:
In my case, with a Pascal 1000-series card (GTX 1060), it is necessary to patch the VGA BIOS of the card to unlock it for use within the VM. For further information, follow the link in the Proxmox Wiki article "PCI Passthrough", section "GPU passthrough", in the note at the top, and look for the keyword vbios patcher. (This applies to Pascal 1000-series GPUs ONLY.)

To make sure the graphics card is not initialized incorrectly, the kernel parameter initcall_blacklist=sysfb_init is recommended, although it was not necessary for me personally.
 