Nvidia GPU Passthrough Assistance!

mattlach

Mar 23, 2016
Hi all,

I have been following the guide in the Proxmox Wiki on the topic, but I have run into issues.

First off, let me share my plan:
My Proxmox box will also serve as an HTPC for two different TVs, using two separate GeForce GT 720 cards passed through to VMs running Ubuntu and Kodi. Since it is the latest LTS, I have started my testing with Ubuntu 16.04 in my VM.

I know IOMMU is working properly and passthrough is functioning (I previously passed through a couple of LSI SAS controllers to a different guest), but this GPU is giving me trouble.

First off, it says that the OVMF method is recommended, but doesn't say why. Can anyone share any more information here on why this is the preferred method?

I did the ROM test, and I came up with negative results. (At least I think so: I got the first few lines, then it ended with an error about the end of the file. I forgot to write it down.)

While the GT 720 certainly is new enough to support UEFI, it seems like this particular model doesn't have it in the firmware. Either way, my host is an LGA1366 Xeon, pre-UEFI, so I don't think it would have worked anyway.

Because of this, I first tried the SeaBIOS PCIe method, followed by the PCI method, and neither is working for me.

nouveau is blacklisted properly and not running in either the VM or the host, and furthermore the nvidia GPU and audio are both added to the pci-stub in the host.
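
For anyone following along, the host-side binding can be double-checked through sysfs, where each PCI function has a `driver` symlink pointing at whatever kernel driver currently owns it. Below is a minimal Python sketch of that check, assuming the card sits at 06:00.0 and 06:00.1 on the host as in the config below. `bound_driver` is a helper name made up for illustration, not part of any Proxmox tool.

```python
import os


def bound_driver(pci_addr, sysfs_root="/sys/bus/pci/devices"):
    """Return the name of the kernel driver bound to a PCI function, or None.

    pci_addr is the full address including the domain, e.g. "0000:06:00.0".
    """
    link = os.path.join(sysfs_root, pci_addr, "driver")
    if not os.path.islink(link):
        # No driver is bound, or the device does not exist at this address.
        return None
    # The symlink target's last path component is the driver name.
    return os.path.basename(os.path.realpath(link))


if __name__ == "__main__":
    # Assumed addresses: the GPU and its HDMI audio function on this host.
    for addr in ("0000:06:00.0", "0000:06:00.1"):
        print(addr, bound_driver(addr))
```

If the stub setup took effect, both functions should report the stub driver rather than nouveau or snd_hda_intel.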

Here's my vmid.conf
Code:
args: -machine pc,max-ram-below-4g=1G
bootdisk: virtio0
cores: 2
cpu: host
cpuunits: 4096
hostpci0: 06:00.0,pcie=1,x-vga=on
hostpci1: 06:00.1,pcie=1
ide2: none,media=cdrom
machine: q35
memory: 2048
name: htpc1
net0: bridge=vmbr0,virtio=3A:63:62:33:32:36
net1: bridge=vmbr1,virtio=36:64:37:39:34:31
numa: 0
ostype: l26
smbios1: uuid=e0ccb955-9f1d-4810-9a6e-332f1fce5a94
sockets: 1
virtio0: local:151/vm-151-disk-1.qcow2,cache=writethrough,size=32G
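
Since the hostpci lines are the part that keeps changing between attempts, here is a rough Python sketch for pulling them out of a config so that two attempts can be compared side by side. It only understands the simple `key: value` layout shown above and is not how Proxmox itself parses the file. `parse_hostpci` is a hypothetical helper name.

```python
def parse_hostpci(conf_text):
    """Map each hostpciN key to a (pci_address, options_dict) tuple."""
    entries = {}
    for line in conf_text.splitlines():
        # Split on the first colon only; PCI addresses contain colons too.
        key, sep, value = line.partition(":")
        key = key.strip()
        if not sep or not key.startswith("hostpci"):
            continue
        address, options = None, {}
        for part in value.split(","):
            part = part.strip()
            if not part:
                continue
            name, eq, val = part.partition("=")
            if not eq:
                address = part          # bare address, e.g. "06:00.0"
            elif name == "host":
                address = val           # explicit form, e.g. "host=02:00.0"
            else:
                options[name] = val     # e.g. pcie=1, x-vga=on
        entries[key] = (address, options)
    return entries


if __name__ == "__main__":
    sample = "hostpci0: 06:00.0,pcie=1,x-vga=on\nhostpci1: 06:00.1,pcie=1\nmachine: q35\n"
    for key, (addr, opts) in sorted(parse_hostpci(sample).items()):
        print(key, addr, opts)
```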

Both the GPU and sound successfully pass through to the guest and are visible in lspci as follows:
Code:
$ lspci |grep -i nv
01:00.0 VGA compatible controller: NVIDIA Corporation GK208 [GeForce GT 720] (rev a1)
02:00.0 Audio device: NVIDIA Corporation GK208 HDMI/DP Audio Controller (rev a1)

I actually get console output from the Ubuntu 16.04 VM on the screen connected to the GPU, so it is clearly passed through and functioning (albeit as a VGA-only console right now).

Then when I install the NVIDIA binary driver blob (361.42, the latest in the Ubuntu repository), the installation appears to work, but afterwards the X server fails to start, and nvidia-smi complains as follows:

Code:
$ nvidia-smi
No devices were found. Please make sure /dev/nvidia* files are readable by current user.

The /dev/nvidia files are present, and permissions look right:
Code:
$ ls -l /dev/nv*
crw-rw-rw- 1 root root 195,   0 May  3 21:18 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 May  3 21:18 /dev/nvidiactl
crw-rw-rw- 1 root root 246,   0 May  3 21:18 /dev/nvidia-uvm

But a look at dmesg shows that there is some sort of problem:
Code:
[    8.891906] nvidia: module license 'NVIDIA' taints kernel.
[    8.963904] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[    9.081268] nvidia-nvlink: Nvlink Core is being initialized, major device number 247
[    9.083513] [drm] Initialized nvidia-drm 0.0.0 20150116 for 0000:01:00.0 on minor 0
[    9.083527] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  361.42  Tue Mar 22 18:10:58 PDT 2016
[    9.158369] input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:1c.1/0000:02:00.0/sound/card1/input5
[    9.158467] input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:1c.1/0000:02:00.0/sound/card1/input6
[    9.712317] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  361.42  Tue Mar 22 17:29:54 PDT 2016
[    9.743507] nvidia-uvm: Loaded the UVM driver in lite mode, major device number 246
[   10.459331] NVRM: RmInitAdapter failed! (0x25:0x40:1170)
[   10.459458] NVRM: rm_init_adapter failed for device bearing minor number 0
[   12.573279] NVRM: RmInitAdapter failed! (0x25:0x40:1170)
[   12.573503] NVRM: rm_init_adapter failed for device bearing minor number 0
[ 1079.558553] NVRM: RmInitAdapter failed! (0x25:0x40:1170)
[ 1079.558827] NVRM: rm_init_adapter failed for device bearing minor number 0
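
To compare these failures across different configs and driver versions, it helps to pull just the codes out of the log. The following is a small Python sketch that does so, assuming the message format stays exactly as shown above. I don't know what the three fields inside the parentheses mean, so the sketch just returns them as strings. `rm_init_failures` is a made-up helper name.

```python
import re

# Matches lines such as: NVRM: RmInitAdapter failed! (0x25:0x40:1170)
RM_INIT_RE = re.compile(
    r"NVRM: RmInitAdapter failed! \((0x[0-9a-fA-F]+):(0x[0-9a-fA-F]+):(\d+)\)"
)


def rm_init_failures(dmesg_text):
    """Return the distinct failure triples, in order of first appearance."""
    seen = []
    for match in RM_INIT_RE.finditer(dmesg_text):
        triple = match.groups()
        if triple not in seen:
            seen.append(triple)
    return seen


if __name__ == "__main__":
    import sys
    # Usage: dmesg | python3 this_script.py
    for triple in rm_init_failures(sys.stdin.read()):
        print(":".join(triple))
```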

At this point I am totally stuck, and would appreciate any suggestions anyone might have.

If I were able to get UEFI working (I could possibly get firmware from a different GT720 vendor) would that help? Since that is the recommended mode, is it more reliable? Would it even work on my older non-UEFI server?

I'll take any suggestions I can get at this point.

Much obliged,
Matt
 
Which version of Proxmox are you using?

# pveversion -v


First off, it says that the OVMF method is recommended, but doesn't say why. Can anyone share any more information here on why this is the preferred method?
Because with SeaBIOS, some tricks are needed to get VGA passthrough working, and whether it works or not depends on the model of the card.
 
Ahh, my bad, I should have posted that. I have everything updated to current with 4.2. This is a relatively fresh install:

Code:
root@proxmox:~# pveversion -v
proxmox-ve: 4.2-48 (running kernel: 4.4.6-1-pve)
pve-manager: 4.2-2 (running version: 4.2-2/725d76f0)
pve-kernel-4.4.6-1-pve: 4.4.6-48
lvm2: 2.02.116-pve2
corosync-pve: 2.3.5-2
libqb0: 1.0-1
pve-cluster: 4.0-39
qemu-server: 4.0-72
pve-firmware: 1.1-8
libpve-common-perl: 4.0-59
libpve-access-control: 4.0-16
libpve-storage-perl: 4.0-50
pve-libspice-server1: 0.12.5-2
vncterm: 1.2-1
pve-qemu-kvm: 2.5-14
pve-container: 1.0-62
pve-firewall: 2.0-25
pve-ha-manager: 1.0-28
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u1
lxc-pve: 1.1.5-7
lxcfs: 2.0.0-pve2
cgmanager: 0.39-pve1
criu: 1.6.0-1
zfsutils: 0.6.5-pve9~jessie





Ahh OK. If the SeaBIOS method winds up not working for me, can my non-UEFI host actually host a UEFI guest with KVM? (This presumes I am able to find a UEFI firmware for my GT 720 (GK208) from a different vendor.)

At least I am assuming that my video card currently lacks UEFI support based on the following rom-parser output:
Code:
root@proxmox:~/rom/rom-parser# ./rom-parser image.rom
Valid ROM signature found @0h, PCIR offset 190h
        PCIR: type 0 (x86 PC-AT), vendor: 10de, device: 1288, class: 030000
        PCIR: revision 0, vendor revision: 1
Error, ran off the end

Though maybe that error code means I can't trust it? I have no idea what "ran off the end" means.
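
As far as I understand it, a UEFI-capable ROM should contain an image whose PCIR type is 3 (EFI), in addition to the type 0 (x86 PC-AT) image listed above. Here is a minimal Python sketch of that check on rom-parser's text output. Treating a dump that ends in "ran off the end" as inconclusive rather than negative is my own assumption, not documented rom-parser behaviour, and the function names are made up for illustration.

```python
import re

# Matches "PCIR: type 0 (x86 PC-AT), ..." and captures the numeric type.
PCIR_TYPE_RE = re.compile(r"PCIR: type (\d+)")

EFI_CODE_TYPE = 3  # PCI expansion ROM code type for EFI images


def rom_image_types(rom_parser_output):
    """Return the PCIR code types of all images listed, in order."""
    return [int(m.group(1)) for m in PCIR_TYPE_RE.finditer(rom_parser_output)]


def uefi_verdict(rom_parser_output):
    """Return "yes", "no" or "inconclusive" for UEFI support in this dump."""
    if EFI_CODE_TYPE in rom_image_types(rom_parser_output):
        return "yes"
    if "ran off the end" in rom_parser_output:
        # Assumption: the parser stopped early, so later images may be missing.
        return "inconclusive"
    return "no"


if __name__ == "__main__":
    import sys
    # Usage: ./rom-parser image.rom | python3 this_script.py
    print(uefi_verdict(sys.stdin.read()))
```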
 
Ahh, never mind. According to the TechPowerUp site, it looks like it definitely lacks UEFI support. My version of the card has a different cooler with a fan, but the PCI IDs are identical.

There is only one other 1024 MB GT 720 on TechPowerUp's site (by MSI), and it also lacks UEFI. Asus's 2 GB model also lacks UEFI. So it looks like UEFI is not in the cards (pardon the bad pun).

I'd appreciate any help anyone can provide in getting this to work with SeaBIOS though! :)
 
Have you tried this?

before
Code:
hostpci0: 06:00.0,pcie=1,x-vga=on
hostpci1: 06:00.1,pcie=1
machine: q35

after
Code:
hostpci0: 06:00,x-vga=on

(Don't use pcie/q35, and pass the card through as a multifunction device.)
 
I definitely tried passing it through as a multifunction device, and without PCIe, but I don't think I ever tried removing the q35 machine setting (I'm not quite sure what this does).

I appreciate the recommendation and will try it this evening after work!

Thank you.
 
Thanks for the suggestion.

When I try this, the VM never becomes available on the network for me to ssh into.

I'm not quite sure what goes wrong, as I don't have any console (due to the x-vga=on configuration) so I can't see what is preventing it from working :(

There might be something in a log file somewhere, but I am not certain what I am looking for.
 
A few more notes:

I've tried the following without success:
  • Tried all NVIDIA driver versions in the 16.04 VM's repository. None work.
  • Tried installing Ubuntu 14.04 in the VM instead. Tried all driver versions, down as low as 340.96, still no success.
  • Tried disabling MSI/MSI-X by passing pci-nomsi to the VM kernel.
  • Tried just using "hostpci0: 06:00,x-vga=on" without q35 as you recommended above, but in 14.04 instead. 14.04 at least boots like this, but the audio device does not show up in lspci in the VM (only the GPU), and I still have the same "RmInitAdapter failed" problem.
  • Based on an older thread, tried disabling everything virtio, just in case the NVIDIA driver is seeing it. Still the same problem.

At this point I am stuck, and I'd appreciate any suggestions anyone might have!

Thanks,
Matt
 
You might want to try emailing the vendor of your video card to see if they have an updated BIOS available. I had an EVGA Nvidia 660 Ti which TechPowerUp listed as not having UEFI support. I emailed EVGA support and they were able to give me an updated UEFI BIOS for my card.

I was able to get passthrough working with that card with a Windows 2012 VM. Here is what my config looked like:
Code:
bios: ovmf
boot: cdn
bootdisk: scsi0
cores: 3
cpu: host
hostpci0: 01:00,x-vga=on,pcie=1 #video
hostpci1: 00:1b.0,pcie=1 #onboard sound
hostpci2: 00:1d.0,pcie=1 #usb controller
hostpci3: host=02:00.0,pcie=1 #raid controller
machine: q35
memory: 8000
name: Windows
net0: virtio=32:31:37:66:31:35,bridge=vmbr0
numa: 0
ostype: other
scsihw: virtio-scsi-pci
smbios1: uuid=c4583b06-fc2f-4b16-b488-d1b372b5cfa9
sockets: 1
tablet: 0
usb0: host=1532:010d #mouse
usb1: host=04d9:fc02 #keyboard
scsi1: storage:vm-100-disk-1,size=1024G

You could try passing the whole video card through:
Code:
hostpci0: 06:00,pcie=1,x-vga=on
machine: q35
That's what I did with mine.
I also saw somewhere that cpu should be set to host and ostype should be other. I can't seem to find the source for that statement though.
 
Good advice. I had an email conversation with PNY support today, and they informed me that "all of the GT 720 cards we have ever manufactured have a UEFI BIOS", so rom-parser and TechPowerUp must both be wrong, I guess.

I am going to try to reinstall Ubuntu in UEFI mode with OVMF and see if it works!
 
Hmm.

Well, I reinstalled my Ubuntu VM in OVMF/UEFI mode, and unfortunately I am still getting the "NVRM: RmInitAdapter failed! (0x25:0x28:1197)" message in dmesg.

Things I have tried in order to resolve it:
  • Change OS type to "other"
  • Switch to machine q35 and pcie=1 (this now results in the VM failing to boot)
  • Switch to a single device (passing through 06:00 instead of 06:00.0 and 06:00.1 separately). This results in the audio device not showing up in lspci, and the GPU still not activating.

I'd appreciate any other suggestions anyone might have!
 
Whoops. Moving back here, this was in the wrong thread:


Ahh, I did see that, but I thought they were just there as an example file, not as a required setting.


I will try that and see if it works. Thank you :)

So, I have addressed all the above, and gone over the wiki again with a fine-toothed comb, just to make sure I haven't made any other silly assumptions, like the one above. Still, I have nothing.

I am booting with OVMF/UEFI, I have made sure that my SCSI settings match those in the guide, and I have tried setting ostype to "other". Still, I have the same error:

Current Config:
Code:
args: -machine pc,max-ram-below-4g=1G
bios: ovmf
boot: c
bootdisk: scsi0
cores: 2
cpu: host
cpuunits: 4096
hostpci0: 06:00.0,x-vga=on
hostpci1: 06:00.1
ide2: none,media=cdrom
memory: 2048
name: htpc1
net0: bridge=vmbr0,virtio=32:35:66:39:37:65
net1: bridge=vmbr1,virtio=36:35:63:35:64:32
numa: 0
ostype: other
scsi0: container:vm-151-disk-1,cache=writethrough,size=32G
scsihw: virtio-scsi-pci
smbios1: uuid=60af2eae-e53b-4d1b-95d0-9a6faff5521f
sockets: 1

And still getting the same error in dmesg:
Code:
[    4.696458] nvidia: module license 'NVIDIA' taints kernel.
[    4.720145] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[    4.863854] [drm] Initialized nvidia-drm 0.0.0 20150116 for 0000:00:10.0 on minor 0
[    4.863861] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  340.96  Sun Nov  8 22:33:28 PST 2015
[    5.141946] nvidia_uvm: Loaded the UVM driver, major device number 247
[    5.379765] input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:11.0/sound/card0/input5
[    5.379899] input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:11.0/sound/card0/input6
[    5.854321] NVRM: RmInitAdapter failed! (0x25:0x28:1197)
[    5.854433] NVRM: rm_init_adapter failed for device bearing minor number 0
[    5.854920] NVRM: nvidia_frontend_open: minor 0, module->open() failed, error -5

It is frustrating as well, because I have found evidence that others have succeeded with their GT 720s in both SeaBIOS AND OVMF, but neither seems to work for me...

I just noticed that my Ethernet devices for the VM are virtio. Could these be triggering the NVIDIA driver to not load in order to block passthrough? Also, lspci from inside the VM is full of Red Hat/Virtio/QEMU references. Could these be a problem too?

Code:
00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02)
00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II]
00:01.2 USB controller: Intel Corporation 82371SB PIIX3 USB [Natoma/Triton II] (rev 01)
00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 03)
00:03.0 Unclassified device [00ff]: Red Hat, Inc Virtio memory balloon
00:05.0 SCSI storage controller: Red Hat, Inc Virtio SCSI
00:10.0 VGA compatible controller: NVIDIA Corporation GK208 [GeForce GT 720] (rev a1)
00:11.0 Audio device: NVIDIA Corporation GK208 HDMI/DP Audio Controller (rev a1)
00:12.0 Ethernet controller: Red Hat, Inc Virtio network device
00:13.0 Ethernet controller: Red Hat, Inc Virtio network device
00:1e.0 PCI bridge: Red Hat, Inc. QEMU PCI-PCI bridge
00:1f.0 PCI bridge: Red Hat, Inc. QEMU PCI-PCI bridge

Again, any suggestions anyone might have would be GREATLY appreciated!
 
I just noticed that my ethernet devices for the VM are virtio. Could these be triggering the Nvidia driver to not load in order to block passthrough?

Nope, that didn't do it. I switched them to E1000 and still get the same errors.

There are also still a lot of Red Hat/Virtio/QEMU references in lspci. Are these a problem?

Code:
00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02)
00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II]
00:01.2 USB controller: Intel Corporation 82371SB PIIX3 USB [Natoma/Triton II] (rev 01)
00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 03)
00:03.0 Unclassified device [00ff]: Red Hat, Inc Virtio memory balloon
00:05.0 SCSI storage controller: Red Hat, Inc Virtio SCSI
00:10.0 VGA compatible controller: NVIDIA Corporation GK208 [GeForce GT 720] (rev a1)
00:11.0 Audio device: NVIDIA Corporation GK208 HDMI/DP Audio Controller (rev a1)
00:12.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 03)
00:13.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 03)
00:1e.0 PCI bridge: Red Hat, Inc. QEMU PCI-PCI bridge
00:1f.0 PCI bridge: Red Hat, Inc. QEMU PCI-PCI bridge

I'm at my wits' end here, and I must get this to work!

I'd appreciate any and all suggestions!
 
Some random ideas:

Try setting motherboard BIOS to offboard VGA, instead of using onboard.

Try setting "Start at boot" to "Yes" and rebooting the Proxmox VE node.

Install Ubuntu 16.04 on a USB key and boot directly into that, and make sure it is able to use the GFX card on bare metal by running glmark2.
 
Try reducing the allocated memory to less than 2 GB.

Try running in Proxmox VE: update-initramfs -u

Try using 32-bit Ubuntu in the guest VM.
 
