[SOLVED] PCI-E Passthrough: RTX8000 works but cannot get RTX2080 to work

landei

New Member
Jan 7, 2020
Hi,

I have trouble passing RTX2080 GPUs to a VM.
First I passed all GPUs (4x RTX2080Ti, 1x RTX8000) to a VM and all of them were found with lspci -nnk | grep "VGA".

Trying nvidia-smi -L correctly identifies the RTX8000, but for all 2080Ti I get: Unable to determine the device handle for gpu 0000:01:00.0: Unknown Error.

I tried passing only one 2080Ti and then:
- tested different drivers (430 and 435)
- toggled the rombar setting between visible and invisible
- checked OVMF compatibility with rom-parser
- dumped the ROM from the GPU and referenced it in the VM config file (romfile=)

Nothing helped, the error stays the same and I have no more ideas. :-(

In the following you can see the current configuration:

This is my VM config:
Code:
bios: ovmf
bootdisk: scsi0
cores: 8
efidisk0: local-zfs:vm-201-disk-1,size=128K
hostpci0: 1b:00,pcie=1,romfile=bios1b.bin
ide2: local:iso/ubuntu-18.04.3-live-server-amd64.iso,media=cdrom
machine: q35
memory: 70000
name: b1234
net0: virtio=42:66:1A:A9:C4:FE,bridge=vmbr0,firewall=1
numa: 0
ostype: l26
scsi0: local-zfs:vm-201-disk-0,size=50G
scsihw: virtio-scsi-pci
smbios1: uuid=5c0209ba-5f96-4c86-9ca1-bf4b10efb2a9
sockets: 2
vmgenid: fbae32d8-b1b0-49b7-bacf-889059a6ccde

Bootline:
root=ZFS=rpool/ROOT/pve-1 boot=zfs intel_iommu=on iommu=pt pcie_acs_override=downstream,multifunction nofb nomodeset video=vesafb:off,efifb:off

/etc/modprobe.d/iommu_unsafe_interrupts.conf:
options vfio_iommu_type1 allow_unsafe_interrupts=1

/etc/modprobe.d/kvm.conf:
options kvm ignore_msrs=1 allow_unsafe_assigned_interrupts=1

/etc/modprobe.d/vfio.conf:
options vfio-pci ids=10de:1e04,10de:1ad7,10de:1e30,10de:1e04,10de:1e04,10de:10f7,10de:1ad6,10de:1ad7 disable_vga=1
Here I noticed that all GPUs share the same vendor:device IDs for audio (10de:10f7), serial (10de:1ad6) and usb (10de:1ad7).
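
Since the sub-functions share IDs, the ids= list above contains several repeats. As far as I know vfio-pci matches on the vendor:device pair, so listing each pair once should be enough. A minimal sketch for collapsing such a list (dedupe_ids is a hypothetical helper name, not part of any tool):

```shell
# Deduplicate a comma-separated list of vendor:device IDs,
# keeping the order in which each ID first appears.
dedupe_ids() {
    printf '%s\n' "$1" | tr ',' '\n' | awk 'NF && !seen[$0]++' | paste -sd, -
}

# The list from /etc/modprobe.d/vfio.conf above:
dedupe_ids "10de:1e04,10de:1ad7,10de:1e30,10de:1e04,10de:1e04,10de:10f7,10de:1ad6,10de:1ad7"
# prints: 10de:1e04,10de:1ad7,10de:1e30,10de:10f7,10de:1ad6
```

The duplicates are probably harmless, so this is only about keeping the file readable.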

/etc/modules:
Code:
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

/etc/modprobe.d/blacklist.conf:
Code:
blacklist radeon
blacklist nouveau
blacklist nvidia


What could be the problem and what could I try?
Thank you very much! :-)
 
Hi, I just rewrote the whole thread entry to make everything clearer and added updated information.
Thanks
 
Hi Landei, I think this part could be the problem

Code:
video=vesafb:off,efifb:off


You should try

Code:
video=vesafb:off video=efifb:off
 
and also add this to blacklist.conf:

Code:
echo "blacklist i2c-nvidia-gpu" >> /etc/modprobe.d/blacklist.conf
update-initramfs -u -k all
reboot
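
After the reboot it may be worth checking that the kernel really received each option as its own space-separated token. A rough sketch, assuming you feed it the contents of /proc/cmdline; cmdline_has is a hypothetical helper and the sample line is only an illustration, not the poster's actual boot line. Whether the video= syntax was the cause here was never confirmed in this thread.

```shell
# Return success if the kernel command line string ($1)
# contains the exact space-separated token ($2).
cmdline_has() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# On a running host use: line=$(cat /proc/cmdline)
line="root=ZFS=rpool/ROOT/pve-1 boot=zfs intel_iommu=on iommu=pt video=vesafb:off video=efifb:off"

for p in intel_iommu=on video=vesafb:off video=efifb:off; do
    if cmdline_has "$line" "$p"; then
        echo "ok: $p"
    else
        echo "missing: $p"
    fi
done
```

With the combined form video=vesafb:off,efifb:off the check for video=efifb:off reports "missing", because the option is then not a separate token.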
 
Hi,
thank you very much for the input!
In the meantime I gave up on the PCI-E passthrough approach and went with LXC containers instead.

I will stay with this.
Anyway, thanks again, and your information might be useful for someone else! :)
 
Hi,
benefits... well it works :)

You should read up on the benefits and drawbacks of LXC containers and VMs in general.
LXC has less overhead but only works with Linux and is not as isolated from the host as a VM is.

Regarding the GPUs, one issue with LXC is that you can only see the GPU utilization with nvidia-smi on the host.
An advantage is that even consumer GPUs can be accessed from several containers.

I basically followed these instructions:
[Attached image: 1585136623414.png]

I did not use the patch for the max. encoding sessions.
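
For anyone who cannot see the attachment: the usual idea is to install the NVIDIA driver on the host and then allow the device nodes in the container config. The following is only a generic sketch and not necessarily what the attached instructions do. lxc_gpu_lines is a hypothetical helper, and the cgroup key and major number are assumptions you need to check on your own host (ls -l /dev/nvidia*).

```shell
# Print LXC config lines that expose NVIDIA device nodes to a container.
# $1: cgroup key (assumption: lxc.cgroup.devices.allow on cgroup v1 hosts,
#     lxc.cgroup2.devices.allow on cgroup v2 hosts)
# $2: character-device major number of the nodes (check with ls -l)
# remaining arguments: device nodes to bind-mount into the container
lxc_gpu_lines() {
    key=$1
    major=$2
    shift 2
    printf '%s: c %s:* rwm\n' "$key" "$major"
    for dev in "$@"; do
        # the mount target is given relative to the container root
        printf 'lxc.mount.entry: %s %s none bind,optional,create=file\n' "$dev" "${dev#/}"
    done
}

lxc_gpu_lines lxc.cgroup.devices.allow 195 /dev/nvidia0 /dev/nvidiactl
```

The printed lines would go into /etc/pve/lxc/<CTID>.conf. /dev/nvidia-uvm normally has a different major number, so it needs its own call.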

I hope this helps.