GPU Nvidia 1660 passthrough

ezerez

Nov 13, 2020
Hello

I am having trouble passing through my Nvidia GPU. The stupid thing is that it was working perfectly before. I had some trouble with my NVMe drive and replaced it, so I did a fresh install of Proxmox, and now I can't get my GPU to work anymore. I have already followed dozens of guides and fixes, but nothing seems to work: the GPU won't work in the VM, although on the host everything seems fine. I will include some info from the configs I am using now, although I have already tried multiple versions. I am wondering whether a Proxmox update broke it, since I updated the distro after the install and the previous install was probably running a lower version. I also updated the BIOS because I thought that might be the cause of the trouble with my NVMe drive, but I have already reverted the BIOS to the version I was using before and it still isn't working.

What I already tried
- Installing Debian, Ubuntu 20.04, 18.04
- Only passing through VGA and HDMI audio
- NUMA on and off
- A lot of what I found around this forum, Reddit, Google, etc. :rolleyes:

Any help or tips on what I could try would be greatly appreciated.

Hardware
Code:
CPU: AMD Ryzen 2700
GPU: GeForce GTX 1660 Aorus, GV-N1660-OC-6GD
SSD: Samsung 960 Evo 500GB
Memory: Corsair Vengeance LPX 2x16GB 3200MHz
Mainboard: ASRock X570M Pro4


100.conf
Code:
agent: 1
balloon: 0
bios: ovmf
boot: order=ide2;scsi0;net0
cores: 4
cpu: host,hidden=1,flags=+pcid
efidisk0: local-lvm:vm-100-disk-1,size=4M
hostpci0: 08:00,pcie=1
ide2: none,media=cdrom
kvm: 1
machine: q35
memory: 4096
name: docker
net0: virtio=3A:46:6D:6E:AE:B2,bridge=vmbr0,firewall=1
numa: 0
ostype: l26
scsi0: local-lvm:vm-100-disk-0,size=128G
scsihw: virtio-scsi-pci
smbios1: uuid=53ccee07-e1f8-4f73-992b-f31cd6685a45
sockets: 1
vmgenid: f666db2f-fee6-4eef-8fd9-19ce771aab3c

/etc/default/grub
Code:
GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="Proxmox Virtual Environment"
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on"
GRUB_CMDLINE_LINUX=""

/etc/modprobe.d/blacklist.conf
Code:
blacklist radeon
blacklist nouveau
blacklist nvidia
blacklist i2c_nvidia_gpu
blacklist nvidiafb

/etc/modprobe.d/vfio.conf
Code:
options vfio-pci ids=10de:2184,10de:1aeb,10de:1aec,10de:1aed disable_vga=1

Code:
root@proxmox:~# dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
[    1.058146] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[    1.064301] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    1.064302] pci 0000:00:00.2: AMD-Vi: Extended features (0xf77ef22294ada):
[    1.064304] AMD-Vi: Interrupt remapping enabled
[    1.064304] AMD-Vi: Virtual APIC enabled
[    1.064440] AMD-Vi: Lazy IO/TLB flushing enabled
[    1.065646] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).


lspci -k
Code:
08:00.0 VGA compatible controller: NVIDIA Corporation TU116 [GeForce GTX 1660] (rev a1)
        Subsystem: Gigabyte Technology Co., Ltd TU116 [GeForce GTX 1660]
        Kernel driver in use: vfio-pci
        Kernel modules: nvidiafb, nouveau
08:00.1 Audio device: NVIDIA Corporation Device 1aeb (rev a1)
        Subsystem: Gigabyte Technology Co., Ltd Device 3fc8
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel
08:00.2 USB controller: NVIDIA Corporation Device 1aec (rev a1)
        Subsystem: Gigabyte Technology Co., Ltd Device 3fc8
        Kernel driver in use: vfio-pci
        Kernel modules: xhci_pci
08:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device 1aed (rev a1)
        Subsystem: Gigabyte Technology Co., Ltd Device 3fc8
        Kernel driver in use: vfio-pci
        Kernel modules: i2c_nvidia_gpu

ON VM

sudo dmesg | grep failed
Code:
[    1.539079] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[    6.035670] ucsi_ccg 0-0008: i2c_transfer failed -110
[    6.037026] ucsi_ccg 0-0008: ucsi_ccg_init failed - -110
[    6.037799] ucsi_ccg: probe of 0-0008 failed with error -110

sudo dmesg | grep nvidia
Code:
[    1.529907] nvidia: loading out-of-tree module taints kernel.
[    1.529945] nvidia: module license 'NVIDIA' taints kernel.
[    1.539079] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[    1.553514] nvidia-nvlink: Nvlink Core is being initialized, major device number 237
[    1.624849] nvidia 0000:06:10.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[    1.685407] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  450.80.02  Wed Sep 23 00:48:09 UTC 2020
[    1.688677] [drm] [nvidia-drm] [GPU ID 0x00000610] Loading driver
[    1.689383] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:06:10.0 on minor 1
[    4.892033] nvidia-uvm: Loaded the UVM driver, major device number 234.
[    6.035003] nvidia-gpu 0000:06:10.3: i2c timeout error e0000000
[    6.293915] audit: type=1400 audit(1605267181.728:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=662 comm="apparmor_parser"
[    6.293918] audit: type=1400 audit(1605267181.728:6): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" pid=662 comm="apparmor_parser"

lspci
Code:
06:10.0 VGA compatible controller: NVIDIA Corporation TU116 [GeForce GTX 1660] (rev a1)
06:10.1 Audio device: NVIDIA Corporation TU116 High Definition Audio Controller (rev a1)
06:10.2 USB controller: NVIDIA Corporation TU116 USB 3.1 Host Controller (rev a1)
06:10.3 Serial bus controller [0c80]: NVIDIA Corporation TU116 USB Type-C UCSI Controller (rev a1)
 
According to your logs, that looks like a successful pass through? The GPU is detected and initialized in the guest...

What exactly is the problem when you say "isn't working"?
 
When I ran the nvidia-smi command in the VM, it would say it didn't find the device. I finally got it working by adding video=vesafb:off video=efifb:off to the kernel command line in GRUB. I think that with an AMD processor that doesn't have an iGPU, the host always takes control of the video card for its own console, so you really need to specify that it shouldn't.
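
For anyone landing here with the same problem, the change would look roughly like the sketch below. It assumes the host boots via GRUB and takes the GRUB_CMDLINE_LINUX_DEFAULT line from the first post, with only the two video= options appended; everything else in /etc/default/grub stays as it was.

```shell
# /etc/default/grub (sketch: the line from the first post plus the two video= options)
# video=vesafb:off and video=efifb:off stop the host's VESA/EFI framebuffer drivers
# from claiming the GPU at boot, so it stays free for vfio-pci.
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on video=vesafb:off video=efifb:off"
```

After editing the file, run update-grub and reboot the host. Running cat /proc/cmdline afterwards should show both video= options if the change was picked up.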