Proxmox GPU Passthrough Issue/NVIDIA-SMI No device found

D-Mac

New Member
Jul 14, 2022
Hey guys,

I've been trying to pass through my GPU to a VM in Proxmox for the last two weeks but can't seem to get it quite right. I have been able to get my Ubuntu 22.04 server VM to see the GPU, but no matter what settings I change or configs I edit, the nvidia-smi command always returns "No devices were found".

I have followed these guides, scoured a bunch of forums, and gone through the Proxmox documentation trying to work out the issue, all to no avail:
https://natebent.com/2021/10/gpu-passthrough-in-proxmox/
https://www.reddit.com/r/homelab/comments/b5xpua/the_ultimate_beginners_guide_to_gpu_passthrough/
https://forum.proxmox.com/threads/gpu-passthrough-tutorial-reference.34303/
https://manjaro.site/tips-to-create-ubuntu-20-04-vm-on-proxmox-with-gpu-passthrough/
https://www.reddit.com/r/Proxmox/comments/mib3u6/a_guide_to_how_i_got_nvidia_gpu_passthrough_to_a/

I have tried with both an old and a fresh install of Ubuntu Server 22.04, the differences being SeaBIOS vs. OVMF (UEFI) for the firmware, Default vs. q35 for the machine type, None vs. Default for the Display, and the different options for the PCI Device (PCI-Express, Primary GPU).
I've double-checked all the configs on the Proxmox server and they all seem to be fine; however, I did notice that when I ran dmesg | grep -e DMAR -e IOMMU, I received no output.
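For reference, the host-side steps those guides describe look roughly like this; this is just a sketch assuming GRUB and an AMD CPU (on recent kernels amd_iommu is on by default, but the guides still set it explicitly):

Code:
# /etc/default/grub on the Proxmox host -- enable the IOMMU and passthrough mode
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt"

# /etc/modules on the host -- load the VFIO modules at boot
# (vfio_virqfd is built into vfio on newer kernels)
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

# apply and verify
update-grub
reboot
dmesg | grep -e DMAR -e IOMMU -e AMD-Vi   # should no longer be empty

If I understand the guides right, empty output from that grep means the IOMMU isn't actually active, so I've also been checking that SVM and IOMMU are enabled in the motherboard BIOS.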

Server Specs:
- Ryzen 9 3900X
- GeForce GTX 1650
- ASROCK X570 Phantom Gaming


In the VM:

lspci -v
Code:
01:00.0 VGA compatible controller: NVIDIA Corporation TU117 [GeForce GTX 1650] (rev a1) (prog-if 00 [VGA controller])
    Subsystem: Gigabyte Technology Co., Ltd TU117 [GeForce GTX 1650]
    Physical Slot: 0
    Flags: bus master, fast devsel, latency 0, IRQ 16
    Memory at c0000000 (32-bit, non-prefetchable) [size=16M]
    Memory at 800000000 (64-bit, prefetchable) [size=256M]
    Memory at 810000000 (64-bit, prefetchable) [size=32M]
    I/O ports at d000 [size=128]
    Expansion ROM at c1020000 [virtual] [disabled] [size=128K]
    Capabilities: [60] Power Management version 3
    Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
    Capabilities: [78] Express Legacy Endpoint, MSI 00
    Capabilities: [100] Virtual Channel
    Capabilities: [250] Latency Tolerance Reporting
    Capabilities: [128] Power Budgeting <?>
    Capabilities: [420] Advanced Error Reporting
    Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
    Kernel driver in use: nvidia
    Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia

01:00.1 Audio device: NVIDIA Corporation Device 10fa (rev a1)
    Subsystem: Gigabyte Technology Co., Ltd Device 3fca
    Physical Slot: 0
    Flags: bus master, fast devsel, latency 0, IRQ 17
    Memory at c1000000 (32-bit, non-prefetchable) [size=16K]
    Capabilities: [60] Power Management version 3
    Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
    Capabilities: [78] Express Endpoint, MSI 00
    Capabilities: [100] Advanced Error Reporting
    Kernel driver in use: snd_hda_intel
    Kernel modules: snd_hda_intel

nvidia-smi
Code:
No devices were found

sudo dmesg | grep NVRM
Code:
[    2.422946] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  515.48.07  Fri May 27 03:26:43 UTC 2022
[   37.366452] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x25:0x65:1417)
[   37.366661] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
[   91.435512] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x25:0x65:1417)
[   91.435687] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
[  125.889436] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x25:0x65:1417)
[  125.889625] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
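For completeness, with the options above the passthrough-related part of the VM config (/etc/pve/qemu-server/<vmid>.conf) ends up looking roughly like this; the values are just an example of one combination I've tried:

Code:
bios: ovmf
machine: q35
cpu: host
vga: none
hostpci0: 01:00,pcie=1,x-vga=1

(Passing 01:00 without a function number includes both the VGA and HDMI-audio functions of the card.)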

 
I have already passed through all of those options, still getting no device found from nvidia-smi. I'll look through your attached links after work and get back to you.
 
No need to do it for me; I know nothing about this and just thought that the software might assume something about the PCIe layout, which is slightly different between a VM and bare metal. I sometimes suggest that NVIDIA's own documentation/forums/support might help with their software, but people seem to think that's crazy.
 
Any update on this topic? I'm also getting "No devices were found" from nvidia-smi.


Bash:
[    6.071817] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  535.129.03  Thu Oct 19 18:56:32 UTC 2023
[   97.555356] NVRM: GPU 0000:06:10.0: RmInitAdapter failed! (0x11:0x45:2566)
[   97.557251] NVRM: GPU 0000:06:10.0: rm_init_adapter failed, device minor number 0
[  102.370722] NVRM: GPU 0000:06:10.0: RmInitAdapter failed! (0x11:0x45:2566)
[  102.372435] NVRM: GPU 0000:06:10.0: rm_init_adapter failed, device minor number 0
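In my case the card shows up at 06:10.0 inside the guest, so I'm also double-checking whether it's actually attached as a PCI Express device; a quick sketch of the checks I'm running (the VM ID 100 is just an example):

Bash:
# inside the guest: with q35 + pcie=1 the GPU normally sits under a PCIe root port
lspci -tv
# on the Proxmox host: confirm the passthrough line uses pcie=1
grep hostpci /etc/pve/qemu-server/100.conf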
 
Hi, did you find a solution? I am having the same issue. Strangely, it works with SeaBIOS but not OVMF (UEFI), which I need in order to handle larger volume sizes. The card I am using is an A6000.
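One thing several of the guides above mention for cards that misbehave under OVMF is passing the vBIOS explicitly with romfile; a sketch of what I understand that to look like (the device address and filename are just examples):

Code:
# on the host, with the card not in use, dump its ROM
cd /sys/bus/pci/devices/0000:01:00.0/
echo 1 > rom
cat rom > /usr/share/kvm/gpu-vbios.rom
echo 0 > rom

# then reference it from the VM config (romfile is looked up under /usr/share/kvm):
# hostpci0: 01:00,pcie=1,x-vga=1,romfile=gpu-vbios.rom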