[SOLVED] PCI-E Passthrough GTX 1080 Ti on Ryzen platform

szafran

Renowned Member
Aug 31, 2012
Hi,

I'm trying to resolve the infamous Code 43 error when installing NVIDIA drivers in a Windows 10 VM with a GTX 1080 Ti passed through to it. I've tried several different configs and nothing works. Earlier, while testing the setup with my little 10" HDMI test monitor, I managed to get the drivers working once (using a BIOS setup), but after that (while waiting for the main monitor to arrive) I started trying different configs, and now nothing seems to work.
I've reverted most of the changes I made and am ready to start over.
Would appreciate if someone could help me with this.

My HW:
MB: ASRock AB350M Pro4
CPU: AMD Ryzen 1700
MEM: 64GB
Storage: >30TB
GPU in slot 2: MSI Armor OC GTX 1080 Ti (for passthrough)
GPU in slot 3: Some old Radeon X300 (for PVE console)
Monitor: Acer Predator Z1 (2560x1080)

Now some config info from PVE etc.:
Code:
root@vmserver:/# cat /etc/default/grub | grep GRUB_CMDLINE_LINUX_DEFAULT
GRUB_CMDLINE_LINUX_DEFAULT="modprobe.blacklist=nouveau quiet amd_iommu=on amd_iommu=pt"
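For comparison, the cmdline commonly recommended for AMD passthrough looks like the sketch below. Note that the pass-through mode flag is `iommu=pt`, not `amd_iommu=pt` (which isn't a valid value for the `amd_iommu` parameter):

```shell
# /etc/default/grub -- commonly recommended cmdline for AMD passthrough
# (iommu=pt enables pass-through mode; amd_iommu=pt is not a valid value)
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt modprobe.blacklist=nouveau"

# Then regenerate the bootloader config and reboot:
#   update-grub && reboot
```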

Code:
root@vmserver:/# dmesg | grep AMD-Vi
[    0.861498] AMD-Vi: IOMMU performance counters supported
[    0.863480] AMD-Vi: Found IOMMU at 0000:00:00.2 cap 0x40
[    0.863480] AMD-Vi: Extended features (0xf77ef22294ada):
[    0.863483] AMD-Vi: Interrupt remapping enabled
[    0.863483] AMD-Vi: virtual APIC enabled
[    0.863725] AMD-Vi: Lazy IO/TLB flushing enabled

Code:
root@vmserver:/# lspci | grep "VGA\|IOMMU"
00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD] Device 1451
07:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV370 [Radeon X300]
09:00.0 VGA compatible controller: NVIDIA Corporation GP102 [GeForce GTX 1080 Ti] (rev a1)

Code:
root@vmserver:/# for a in /sys/kernel/iommu_groups/*; do find $a -type l; done
/sys/kernel/iommu_groups/0/devices/0000:00:01.0
/sys/kernel/iommu_groups/1/devices/0000:00:01.1
/sys/kernel/iommu_groups/10/devices/0000:00:18.6
/sys/kernel/iommu_groups/10/devices/0000:00:18.4
/sys/kernel/iommu_groups/10/devices/0000:00:18.2
/sys/kernel/iommu_groups/10/devices/0000:00:18.0
/sys/kernel/iommu_groups/10/devices/0000:00:18.7
/sys/kernel/iommu_groups/10/devices/0000:00:18.5
/sys/kernel/iommu_groups/10/devices/0000:00:18.3
/sys/kernel/iommu_groups/10/devices/0000:00:18.1
/sys/kernel/iommu_groups/11/devices/0000:01:00.0
/sys/kernel/iommu_groups/12/devices/0000:07:00.0
/sys/kernel/iommu_groups/12/devices/0000:03:00.1
/sys/kernel/iommu_groups/12/devices/0000:04:01.0
/sys/kernel/iommu_groups/12/devices/0000:06:00.0
/sys/kernel/iommu_groups/12/devices/0000:04:04.0
/sys/kernel/iommu_groups/12/devices/0000:05:00.0
/sys/kernel/iommu_groups/12/devices/0000:04:00.0
/sys/kernel/iommu_groups/12/devices/0000:07:00.1
/sys/kernel/iommu_groups/12/devices/0000:03:00.2
/sys/kernel/iommu_groups/12/devices/0000:03:00.0
/sys/kernel/iommu_groups/13/devices/0000:09:00.1
/sys/kernel/iommu_groups/13/devices/0000:09:00.0
/sys/kernel/iommu_groups/2/devices/0000:00:01.3
/sys/kernel/iommu_groups/3/devices/0000:00:02.0
/sys/kernel/iommu_groups/4/devices/0000:00:03.0
/sys/kernel/iommu_groups/5/devices/0000:00:03.1
/sys/kernel/iommu_groups/6/devices/0000:00:04.0
/sys/kernel/iommu_groups/7/devices/0000:11:00.2
/sys/kernel/iommu_groups/7/devices/0000:11:00.0
/sys/kernel/iommu_groups/7/devices/0000:00:07.1
/sys/kernel/iommu_groups/7/devices/0000:11:00.3
/sys/kernel/iommu_groups/7/devices/0000:00:07.0
/sys/kernel/iommu_groups/8/devices/0000:12:00.2
/sys/kernel/iommu_groups/8/devices/0000:00:08.1
/sys/kernel/iommu_groups/8/devices/0000:12:00.0
/sys/kernel/iommu_groups/8/devices/0000:00:08.0
/sys/kernel/iommu_groups/9/devices/0000:00:14.0
/sys/kernel/iommu_groups/9/devices/0000:00:14.3
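A more readable way to dump the same information is to join each group number with the `lspci` description of its devices. This is a small read-only sketch using only sysfs and `lspci`, so it's safe to run on any host:

```shell
#!/bin/sh
# Print each IOMMU group number together with the lspci description of
# every device in it. Read-only; prints nothing if IOMMU is disabled.
for dev in /sys/kernel/iommu_groups/*/devices/*; do
    [ -e "$dev" ] || continue
    group=${dev#/sys/kernel/iommu_groups/}   # strip the sysfs prefix
    group=${group%%/*}                       # keep only the group number
    printf 'IOMMU group %s: ' "$group"
    lspci -nns "${dev##*/}"                  # e.g. 0000:09:00.0
done
```

With the listing above, this would show at a glance that the 1080 Ti (09:00.0) and its audio function (09:00.1) sit alone in group 13, which is what passthrough needs.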

Code:
root@vmserver:/# pveversion -v
proxmox-ve: 5.0-19 (running kernel: 4.10.17-2-pve)
pve-manager: 5.0-30 (running version: 5.0-30/5ab26bc)
pve-kernel-4.10.17-2-pve: 4.10.17-19
pve-kernel-4.10.15-1-pve: 4.10.15-15
pve-kernel-4.10.17-1-pve: 4.10.17-18
libpve-http-server-perl: 2.0-5
lvm2: 2.02.168-pve3
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-12
qemu-server: 5.0-15
pve-firmware: 2.0-2
libpve-common-perl: 5.0-16
libpve-guest-common-perl: 2.0-11
libpve-access-control: 5.0-6
libpve-storage-perl: 5.0-14
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-2
pve-docs: 5.0-9
pve-qemu-kvm: 2.9.0-3
pve-container: 2.0-15
pve-firewall: 3.0-2
pve-ha-manager: 2.0-2
ksm-control-daemon: 1.2-2
glusterfs-client: 3.8.8-1
lxc-pve: 2.0.8-3
lxcfs: 2.0.7-pve4
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1
zfsutils-linux: 0.6.5.9-pve16~bpo90

I've removed all the config files I created since I started playing around with passthrough (except the IOMMU and blacklist entries in the GRUB config).
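For anyone following along, the files I had created were of the general shape below. Treat it as an illustration: the vendor:device IDs are what `lspci -nn -s 09:00` should report for a 1080 Ti and its HDMI audio function, so verify them on your own host before copying anything:

```shell
# /etc/modprobe.d/vfio.conf -- sketch of a typical vfio-pci binding
# (IDs assumed for the GTX 1080 Ti at 09:00.0 and its audio function;
#  check yours with: lspci -nn -s 09:00)
options vfio-pci ids=10de:1b06,10de:10ef

# /etc/modprobe.d/blacklist-nvidia.conf -- keep host drivers off the card
blacklist nouveau
blacklist nvidia

# After editing, rebuild the initramfs so the changes apply at boot:
#   update-initramfs -u && reboot
```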

I have two more days to test this setup before I have to return the card. I'd like to keep it and make it work, since the alternative would be an AMD card (maybe an RX 580), and my monitor has G-Sync, which won't work with AMD.

Does anyone have any ideas? If more info is needed, please tell me what to provide.

Best regards
Szafran
 
That would be great. But AMD means more $ for fewer FPS and a higher TDP. And on top of that I'd have to buy a new monitor with FreeSync (and I really like this one).

I managed to take PVE offline for a while yesterday and installed a clean Windows 10 as the main OS. Now I know my config works 100% at 2560x1080 at 100 Hz using my 10 m DP cable (I've got a 15 m cable on the way for testing, because a 10 m + 2 m extension didn't work). So I can go back to trying to make it work inside a VM.
 
I get it, it sucks. But you are using a feature NVIDIA does not want you to use, which means you will run into problems in the future as well. Maybe you solve this one, but next time they change something you'll have to start over.

That said, look at the unRAID forums for NVIDIA solutions.
 
Thanks for any tips guys.

I don't get it, now it just works :D
I cloned my standard Win10 template and just used a basic vfio setup with standard SeaBIOS. The weird thing is that I had to use a romfile before, and now it works without one.
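For anyone who lands here later, the relevant part of the working VM config looks roughly like the excerpt below. This is an illustrative sketch, not a verbatim dump: the vmid `101` is made up, and `cpu: host,hidden=1` is the usual way to hide the hypervisor from the NVIDIA driver on this PVE version:

```shell
# /etc/pve/qemu-server/101.conf -- illustrative excerpt (vmid 101 assumed)
bios: seabios
# hide the hypervisor from the NVIDIA driver to avoid Code 43
cpu: host,hidden=1
# pass the whole GPU (09:00.0 + 09:00.1 audio) as primary VGA
hostpci0: 09:00,x-vga=on
```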
 
... And the story continues... As @vooze suggested, I'm now in the process of setting up a Vega 64 (a Sapphire RX Vega 64 8 GB in my case) instead of the GTX... As I found out along the way, those are too troublesome. Many AMD cards suffer from the "AMD reset bug": once the card has been initialized by an OS, it is not properly reset afterwards, and you must reboot the whole server to make it work again. The RX Vegas suffer from that too.

It took me two days to force the PC to initialize the second card and leave the first one with a black screen. Now I'm trying to install the drivers in a Windows 10 VM. After a couple dozen tries I'm still getting a BSOD while the drivers are installing. This, as it turns out, is also a known problem, and like the reset bug it hasn't been fixed by AMD in a long time. It's more frustrating than the NVIDIA installation, because every time I start the VM and something goes wrong I have to reboot the whole Proxmox server so the card resets itself.
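For completeness, the usual soft-reset attempt people suggest is removing and rescanning the device via sysfs. This is a sketch (run as root, with the GPU's actual address); on cards affected by the reset bug it generally does not bring the card back, which is exactly why the full host reboot is needed:

```shell
# Soft-reinit attempt for the GPU at 09:00.0 (needs root).
# On Vega the reset bug usually means this does NOT recover the card.
echo 1 > /sys/bus/pci/devices/0000:09:00.0/remove
echo 1 > /sys/bus/pci/rescan
```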

I've tried different methods etc. Nothing helps with the driver install.

Maybe someone here has a solution?

EDIT: And the Vega goes back to the seller, as the only workaround I could find for the driver-install BSOD is to install Win7, and I'm not going to do that. So it's back to the GTX for me.
 
