GPU passthrough issue

btoloni

Member
Jun 20, 2020
Hi all,

I am posting because I have been trying to get this to work for several days and cannot find any discussion of anyone having this specific issue. I have followed several guides and I was able to get my Nvidia GTX 750 Ti to show up under Device Manager in Windows 10 Pro (in addition to Microsoft Remote Display Adapter), and I was also able to install the latest Nvidia drivers. However, when I try to run any software or benchmark that requires Direct3D, it crashes. Steam games for example crash with "Failed to create D3D device", and Userbenchmark fails to run the GPU benchmark with the error "WARN: skipping NVIDIA GeForce GTX 750 Ti - unable to locate attached display." The benchmark detects 2 GPUs including the Microsoft Remote Display Adapter and gives the same error on this one. I also have an HDMI cable plugged into the GPU but it is not connected to a display.

In addition to Direct3D not working, the VM display is very sluggish over a wired 1 Gb connection (slower than without GPU passthrough). Surprisingly, the Cinebench R15 GPU (OpenGL) benchmark runs and scores reasonably.

Below is my configuration. I would greatly appreciate any help, thank you very much!

/etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt pcie_acs_override=downstream,multifunction nofb nomodeset video=vesafb:off,efifb:off"

/etc/modules
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

/etc/modprobe.d/blacklist.conf
blacklist nvidiafb
blacklist nvidia
blacklist radeon
blacklist nouveau

/etc/modprobe.d/iommu_unsafe_interrupts.conf
options vfio_iommu_type1 allow_unsafe_interrupts=1

root@pve:~# lspci -n -s 01:00
01:00.0 0300: 10de:1380 (rev a2)
01:00.1 0403: 10de:0fbc (rev a1)

/etc/modprobe.d/vfio.conf
options vfio-pci ids=10de:1380,10de:0fbc disable_vga=1

/etc/modprobe.d/kvm.conf
options kvm ignore_msrs=1

/etc/pve/qemu-server/101.conf
agent: 1
args: -cpu 'host,+kvm_pv_unhalt,+kvm_pv_eoi,hv_vendor_id=NV43FIX,kvm=off'
balloon: 2048
bios: ovmf
bootdisk: scsi0
cores: 4
cpu: host,hidden=1,flags=+pcid
efidisk0: local-lvm:vm-101-disk-1,size=4M
hostpci0: 01:00,pcie=1,x-vga=on,romfile=GTX750Ti.rom
ide0: local:iso/virtio-win-0.1.173.iso,media=cdrom,size=384670K
ide2: local:iso/Windows10-Official.iso,media=cdrom
machine: q35
memory: 8096
name: Windows10test
net0: virtio=6A:E5:71:71:AB:24,bridge=vmbr0,firewall=1
numa: 0
ostype: win10
scsi0: local-lvm:vm-101-disk-0,cache=writeback,size=60G
scsihw: virtio-scsi-pci
smbios1: uuid=ccfc7b76-99ed-4c9d-a770-71784d079b8c
sockets: 1
vga: none
vmgenid: d374e8c3-62ed-4d08-8341-31c1cf54c6f9

(I downloaded the ROM file from techpowerup but this did not help either)

Host System:
Asrock Z97e-ITX
Intel i5 4590S
Nvidia GTX 750 Ti
16GB DDR3
 
Nothing really sticks out aside from the two lines highlighted in bold. Not really sure what else to try at this point aside from trying an older version of Proxmox.


root@pve:~# dmesg -T | grep -e BAR -e Intel -e iommu -e IOMMU -e passthrough -e DMAR -e bug -e Bug

[Sat Jun 20 00:01:24 2020] Command line: BOOT_IMAGE=/boot/vmlinuz-5.4.34-1-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on
[Sat Jun 20 00:01:24 2020] Intel GenuineIntel
[Sat Jun 20 00:01:24 2020] ACPI: DMAR 0x00000000BD5FECA0 0000B8 (v01 INTEL BDW 00000001 INTL 00000001)
[Sat Jun 20 00:01:24 2020] ACPI: SSDT 0x00000000BD5FED58 000579 (v01 Intel_ IsctTabl 00001000 INTL 20120711)
[Sat Jun 20 00:01:24 2020] Reserving Intel graphics memory at [mem 0xbf200000-0xcf1fffff]
[Sat Jun 20 00:01:24 2020] [Firmware Bug]: TSC_DEADLINE disabled due to Errata; please update microcode to version: 0x22 (or later)
[Sat Jun 20 00:01:24 2020] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.4.34-1-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on
[Sat Jun 20 00:01:24 2020] DMAR: IOMMU enabled
[Sat Jun 20 00:01:24 2020] DMAR: Host address width 39
[Sat Jun 20 00:01:24 2020] DMAR: DRHD base: 0x000000fed90000 flags: 0x0
[Sat Jun 20 00:01:24 2020] DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap c0000020660462 ecap f0101a
[Sat Jun 20 00:01:24 2020] DMAR: DRHD base: 0x000000fed91000 flags: 0x1
[Sat Jun 20 00:01:24 2020] DMAR: dmar1: reg_base_addr fed91000 ver 1:0 cap d2008c20660462 ecap f010da
[Sat Jun 20 00:01:24 2020] DMAR: RMRR base: 0x000000bdea8000 end: 0x000000bdeb6fff
[Sat Jun 20 00:01:24 2020] DMAR: RMRR base: 0x000000bf000000 end: 0x000000cf1fffff
[Sat Jun 20 00:01:24 2020] DMAR-IR: IOAPIC id 8 under DRHD base 0xfed91000 IOMMU 1
[Sat Jun 20 00:01:24 2020] DMAR-IR: HPET id 0 under DRHD base 0xfed91000
[Sat Jun 20 00:01:24 2020] DMAR-IR: x2apic is disabled because BIOS sets x2apic opt out bit.
[Sat Jun 20 00:01:24 2020] DMAR-IR: Use 'intremap=no_x2apic_optout' to override the BIOS setting.

[Sat Jun 20 00:01:24 2020] DMAR-IR: Enabled IRQ remapping in xapic mode
[Sat Jun 20 00:01:24 2020] smpboot: CPU0: Intel(R) Core(TM) i5-4590S CPU @ 3.00GHz (family: 0x6, model: 0x3c, stepping: 0x3)
[Sat Jun 20 00:01:24 2020] Performance Events: PEBS fmt2+, Haswell events, 16-deep LBR, full-width counters, Intel PMU driver.
[Sat Jun 20 00:01:24 2020] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[Sat Jun 20 00:01:24 2020] pci 0000:00:02.0: BAR 2: assigned to efifb
[Sat Jun 20 00:01:24 2020] pci 0000:00:1c.0: Intel PCH root port ACS workaround enabled
[Sat Jun 20 00:01:24 2020] pci 0000:00:1c.3: Intel PCH root port ACS workaround enabled
[Sat Jun 20 00:01:24 2020] iommu: Default domain type: Translated
[Sat Jun 20 00:01:24 2020] DMAR: No ATSR found
[Sat Jun 20 00:01:24 2020] DMAR: dmar0: Using Queued invalidation
[Sat Jun 20 00:01:24 2020] DMAR: dmar1: Using Queued invalidation
[Sat Jun 20 00:01:24 2020] pci 0000:00:00.0: Adding to iommu group 0
[Sat Jun 20 00:01:24 2020] pci 0000:00:01.0: Adding to iommu group 1
[Sat Jun 20 00:01:24 2020] pci 0000:00:02.0: Adding to iommu group 2
[Sat Jun 20 00:01:24 2020] pci 0000:00:03.0: Adding to iommu group 3
[Sat Jun 20 00:01:24 2020] pci 0000:00:14.0: Adding to iommu group 4
[Sat Jun 20 00:01:24 2020] pci 0000:00:16.0: Adding to iommu group 5
[Sat Jun 20 00:01:24 2020] pci 0000:00:19.0: Adding to iommu group 6
[Sat Jun 20 00:01:24 2020] pci 0000:00:1a.0: Adding to iommu group 7
[Sat Jun 20 00:01:24 2020] pci 0000:00:1b.0: Adding to iommu group 8
[Sat Jun 20 00:01:24 2020] pci 0000:00:1c.0: Adding to iommu group 9
[Sat Jun 20 00:01:24 2020] pci 0000:00:1c.3: Adding to iommu group 10
[Sat Jun 20 00:01:24 2020] pci 0000:00:1d.0: Adding to iommu group 11
[Sat Jun 20 00:01:24 2020] pci 0000:00:1f.0: Adding to iommu group 12
[Sat Jun 20 00:01:24 2020] pci 0000:00:1f.2: Adding to iommu group 12
[Sat Jun 20 00:01:24 2020] pci 0000:00:1f.3: Adding to iommu group 12
[Sat Jun 20 00:01:24 2020] pci 0000:01:00.0: Adding to iommu group 1
[Sat Jun 20 00:01:24 2020] pci 0000:01:00.1: Adding to iommu group 1
[Sat Jun 20 00:01:24 2020] pci 0000:03:00.0: Adding to iommu group 13
[Sat Jun 20 00:01:24 2020] DMAR: Intel(R) Virtualization Technology for Directed I/O
[Sat Jun 20 00:01:24 2020] ehci-pci 0000:00:1a.0: debug port 2
[Sat Jun 20 00:01:24 2020] ehci-pci 0000:00:1d.0: debug port 2
[Sat Jun 20 00:01:25 2020] intel_pstate: Intel P-state driver initializing
[Sat Jun 20 00:01:25 2020] e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
[Sat Jun 20 00:01:25 2020] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
[Sat Jun 20 00:01:25 2020] e1000e 0000:00:19.0 eth0: Intel(R) PRO/1000 Network Connection
[Sat Jun 20 00:01:27 2020] Disabling lock debugging due to kernel taint
[Sat Jun 20 00:01:28 2020] i915 0000:00:02.0: DMAR active, disabling use of stolen memory
[Sat Jun 20 00:01:28 2020] input: HDA Intel PCH Front Mic as /devices/pci0000:00/0000:00:1b.0/sound/card1/input4
[Sat Jun 20 00:01:28 2020] input: HDA Intel PCH Rear Mic as /devices/pci0000:00/0000:00:1b.0/sound/card1/input5
[Sat Jun 20 00:01:28 2020] input: HDA Intel PCH Line as /devices/pci0000:00/0000:00:1b.0/sound/card1/input6
[Sat Jun 20 00:01:28 2020] input: HDA Intel PCH Line Out Front as /devices/pci0000:00/0000:00:1b.0/sound/card1/input7
[Sat Jun 20 00:01:28 2020] input: HDA Intel PCH Line Out Surround as /devices/pci0000:00/0000:00:1b.0/sound/card1/input8
[Sat Jun 20 00:01:28 2020] input: HDA Intel PCH Line Out CLFE as /devices/pci0000:00/0000:00:1b.0/sound/card1/input9
[Sat Jun 20 00:01:28 2020] input: HDA Intel PCH Front Headphone as /devices/pci0000:00/0000:00:1b.0/sound/card1/input10
[Sat Jun 20 00:01:28 2020] input: HDA Intel HDMI HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:03.0/sound/card0/input12
[Sat Jun 20 00:01:28 2020] input: HDA Intel HDMI HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:03.0/sound/card0/input13
[Sat Jun 20 00:01:28 2020] input: HDA Intel HDMI HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:03.0/sound/card0/input14
[Sat Jun 20 00:01:28 2020] input: HDA Intel HDMI HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:03.0/sound/card0/input15
[Sat Jun 20 00:01:28 2020] input: HDA Intel HDMI HDMI/DP,pcm=10 as /devices/pci0000:00/0000:00:03.0/sound/card0/input16
 
Looking at 'dxdiag', it looks like my primary display is Microsoft Remote Display Adapter. This is despite selecting the Nvidia GTX 750 Ti as the primary GPU. If I remove Microsoft Remote Display Adapter from device manager, it just comes back on the next reboot.

[Attachment: dxdiag2.PNG]
 
Not much to work with here; indeed, everything looks good.

I am not sure whether this line should read Passthrough instead; I know it shows Passthrough on my setup:
[Sat Jun 20 00:01:24 2020] iommu: Default domain type: Translated
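To check this value without reading the whole log, the reported type can be pulled out of saved dmesg output. The following is a minimal Python sketch, assuming the kernel prints the line in the format quoted above. The helper name default_domain_type is made up for illustration.

```python
import re

# Matches the log line quoted above and captures the word after the colon.
DOMAIN_RE = re.compile(r"iommu: Default domain type: (\w+)")


def default_domain_type(dmesg_text):
    """Return the default IOMMU domain type found in dmesg text, or None."""
    match = DOMAIN_RE.search(dmesg_text)
    return match.group(1) if match else None


# Line copied from the log posted in this thread.
sample = "[Sat Jun 20 00:01:24 2020] iommu: Default domain type: Translated"
print(default_domain_type(sample))
```

On a host, the text could come from a saved copy of the dmesg output. A result of Translated rather than Passthrough would match what the original poster's log shows.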

In the past I noticed that reordering the kernel boot parameters can make a difference. It is just a guess, but worth a try.

Compare the output of cat /proc/cmdline with the value of GRUB_CMDLINE_LINUX_DEFAULT= in /etc/default/grub:
"quiet intel_iommu=on iommu=pt pcie_acs_override=downstream,multifunction nofb nomodeset video=vesafb:off,efifb:off"

Consider this order instead:
"iommu=pt intel_iommu=on pcie_acs_override=downstream,multifunction nofb nomodeset video=vesafb:off,efifb:off quiet"
 
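The comparison between /proc/cmdline and /etc/default/grub suggested in this thread can be sketched in Python as below. This is only an illustration under stated assumptions: the function names are hypothetical, the GRUB value is assumed to sit on a single quoted line, and the sample strings are taken from the original post (the running command line is the one shown in its dmesg output).

```python
def grub_default_params(grub_text):
    """Return the parameters listed in GRUB_CMDLINE_LINUX_DEFAULT="..."."""
    for line in grub_text.splitlines():
        line = line.strip()
        if line.startswith("GRUB_CMDLINE_LINUX_DEFAULT="):
            value = line.split("=", 1)[1]
            # Drop the surrounding quotes, then split on whitespace.
            return value.strip().strip("\"'").split()
    return []


def missing_params(grub_text, running_cmdline):
    """Return parameters set in the GRUB file but absent from the running kernel."""
    running = set(running_cmdline.split())
    return [p for p in grub_default_params(grub_text) if p not in running]


# Sample data copied from the original post.
grub_sample = (
    'GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt '
    'pcie_acs_override=downstream,multifunction nofb nomodeset '
    'video=vesafb:off,efifb:off"'
)
running_sample = (
    "BOOT_IMAGE=/boot/vmlinuz-5.4.34-1-pve root=/dev/mapper/pve-root "
    "ro quiet intel_iommu=on"
)

for param in missing_params(grub_sample, running_sample):
    print("not active:", param)
```

With the sample data, everything after intel_iommu=on is reported as not active, which is consistent with the "Kernel command line" entry in the posted log. On a real host, the two inputs would be the contents of /etc/default/grub and /proc/cmdline.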
I am having this same issue with a GTX 1650 Super on a Windows 10 Pro VM. Were you ever able to find a fix for this?
 
/etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt pcie_acs_override=downstream,multifunction nofb nomodeset video=vesafb:off,efifb:off"
video=vesafb:off,efifb:off is a common mistake. It should be video=vesafb:off video=efifb:off. Beware, though, that you may then have no console output for troubleshooting and no console to log in on.
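The rewrite described in this post can be sketched in Python as follows. It is a rough illustration only: the function names and the FB_DRIVERS list are assumptions made for the example, and a comma is treated as a split point only when the next piece starts with one of those driver names, so that comma-separated options for a single driver are left alone.

```python
# Assumption: an illustrative list of framebuffer driver names, not exhaustive.
FB_DRIVERS = ("vesafb", "efifb", "simplefb")


def split_video_param(param):
    """Split video=a:off,b:off into one video= entry per framebuffer driver."""
    if not param.startswith("video="):
        return [param]
    entries = []
    for chunk in param[len("video="):].split(","):
        if not entries or chunk.split(":", 1)[0] in FB_DRIVERS:
            # First chunk, or a chunk that names a driver: start a new entry.
            entries.append(chunk)
        else:
            # Otherwise it is another option for the current driver.
            entries[-1] += "," + chunk
    return ["video=" + entry for entry in entries]


def fix_cmdline(cmdline):
    """Apply split_video_param to every parameter of a kernel command line."""
    return " ".join(p for param in cmdline.split() for p in split_video_param(param))


print(fix_cmdline("quiet intel_iommu=on iommu=pt video=vesafb:off,efifb:off"))
```

The printed result is the corrected form given in the post, with video=vesafb:off and video=efifb:off as separate parameters.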
 
