GPU suddenly not found

Jan 19, 2019
I have been using an Nvidia GPU for months now with great success. Then I updated to 7.2 and now the VM says "No devices were found".
I do still see the card with lspci, and I can find it as a PCI Device (hostpci0) in the hardware section.
This is the conf file for the guest OS.

Code:
agent: 1
args: -cpu 'host,+kvm_pv_unhalt,+kvm_pv_eoi,hv_vendor_id=NV43FIX,kvm=off'
balloon: 8192
bios: ovmf
boot: order=scsi0;net0
cores: 8
cpu: host,hidden=1,flags=+pcid
efidisk0: vault:vm-108-disk-1,efitype=4m,size=1M
hostpci0: 42:00,pcie=1,x-vga=on
machine: q35
memory: 24576
meta: creation-qemu=6.1.0,ctime=1649526334
name: ubuntu-nvidia
net0: virtio=0E:33:50:84:DD:60,bridge=vmbr0
numa: 0
onboot: 1
ostype: l26
scsi0: vault:vm-108-disk-0,cache=writeback,size=128G,ssd=1
scsihw: virtio-scsi-pci
smbios1: uuid=25ec1907-15ae-4cd5-b1f8-176cf812cd6f
sockets: 2
vga: qxl
vmgenid: 29c97199-facf-44e8-b6e2-32a98c6dcb34

Does anyone have any suggestions on how to fix this?
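For anyone who lands here later: a first sanity check is whether the host still has the card bound to vfio-pci rather than to the nvidia/nouveau driver. A minimal sketch, assuming the `42:00.0` address from the config above (`driver_in_use` is just a hypothetical helper name, adjust the address to your device):

```shell
# Print the "Kernel driver in use" value for a given PCI device,
# reading `lspci -k`-style output on stdin.
driver_in_use() {
  awk -v dev="$1" '
    index($0, dev) == 1 { found = 1; next }
    found && /Kernel driver in use:/ { print $NF; exit }
  '
}

# On the host you would run:
#   lspci -k | driver_in_use "42:00.0"
# For working passthrough this should print "vfio-pci".
```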
 
I get "No devices were found" when I run nvidia-smi.

Task log from the host? How do I identify my VM in the tasks folder?
 

dcsapak

Proxmox Staff Member
When I run nvidia-smi.
On the host or the guest? (I guess the guest, since the host will not be able to access it anymore; in that case the task log is not really interesting.)

The output of 'dmesg' from both the host and the guest would be good.
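If it helps anyone reading along, here is a quick way to pull the interesting lines out of a saved dmesg dump (NVRM is the Nvidia kernel module's log prefix; `relevant_lines` and the file name are just placeholders for this sketch):

```shell
# Filter a saved dmesg capture for Nvidia driver (NVRM) and vfio messages.
relevant_lines() {
  grep -E 'NVRM|vfio' "$1"
}

# On the guest:
#   dmesg > guest-dmesg.txt && relevant_lines guest-dmesg.txt
```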
 
On the host or the guest? (I guess the guest, since the host will not be able to access it anymore; in that case the task log is not really interesting.)

The output of 'dmesg' from both the host and the guest would be good.
Here you go: https://cloud.grillgeek.se/s/X4Han8GgfypG49K

Found this in the guest dmesg:
[ 23.975002] NVRM: GPU 0000:01:00.0: RmInitAdapter failed!
I searched for that and found a thread at Nvidia: https://forums.developer.nvidia.com/t/nvrm-rminitadapter-failed-proxmox-gpu-passthrough/199720
He has the same issue, and the solution there was to add cpu: host,hidden=1.
The problem is I already have that; my config line is cpu: host,hidden=1,flags=+pcid.
 

dcsapak

Proxmox Staff Member
OK: because you override the '-cpu' flag with the 'args' line, the 'hidden=1' in your cpu line does not do anything.
Please try removing the whole 'args' part of your config and try again.
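For readers following along, the relevant part of /etc/pve/qemu-server/108.conf would then look like this (a sketch based on the config posted above, with the args: line deleted so that hidden=1 actually takes effect):

```
cpu: host,hidden=1,flags=+pcid
hostpci0: 42:00,pcie=1,x-vga=on
machine: q35
```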
 
Jan 19, 2019
108
6
23
52
OK: because you override the '-cpu' flag with the 'args' line, the 'hidden=1' in your cpu line does not do anything.
Please try removing the whole 'args' part of your config and try again.
OK, I have now removed the args line in my vmid.conf on the host. I am sorry to say that did not help in any way.
The thing is, I have had this args line in there for months.
 

Tutbjun

New Member
Aug 4, 2022
Did you eventually find a solution to this problem? I seem to have the exact same problem trying to pass through an Nvidia K80.

In my case, I noticed that when running "cat /proc/iomem" on the host machine, I just see my GPU's PCIe addresses listed, followed by other stuff on the next lines. But according to a post by Lefuneste, the GPU's PCIe address lines should be followed by "vfio-pci" if you have prepared the passthrough correctly:
https://forum.proxmox.com/threads/problem-with-gpu-passthrough.55918/post-471013
Does this mean that the GPU isn't passed correctly to the vfio driver? The odd thing is that when I run "lspci -k", it says that the kernel driver in use for the GPU is "vfio-pci".

Any clues would be appreciated, since I'm a bit lost... And sorry if I have misunderstood anything; I'm a bit new to this whole GPU passthrough ordeal.
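In case it helps: /proc/iomem labels each memory range with the driver that claimed it, so you can check whether the range for a given PCI address is claimed by vfio-pci. A small sketch that scans iomem-style text (`iomem_claimed_by_vfio` is a made-up helper name; "0000:42:00.0" is the GPU address from earlier in this thread, replace it with yours):

```shell
# Check whether the iomem range for a PCI device is claimed by vfio-pci.
# Reads /proc/iomem-style text on stdin.
iomem_claimed_by_vfio() {
  awk -v dev="$1" '
    $0 ~ dev { seen = 1; next }
    seen { if ($0 ~ /vfio-pci/) { print "vfio-pci"; exit }
           seen = 0 }
  '
}

# On the host:
#   sudo cat /proc/iomem | iomem_claimed_by_vfio "0000:42:00.0"
# Empty output would match what is described above: the GPU ranges are
# listed, but nothing underneath them is claimed by vfio-pci.
```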
 

Tutbjun

New Member
If BOOTFB is in cat /proc/iomem then you are running into a common kernel 5.15+ issue that can be worked around like this.
I have seen recent posts about nvidia-smi but I have no clue about that.
Well, I seem to (thankfully) have avoided the BOOTFB issue, but the nvidia-smi thing is still a mystery...

I've searched a bit more, and it seems that many people have a very similar issue, with the same RmInitAdapter failure in dmesg, on plain Ubuntu systems (no Proxmox host or anything) when Above 4G Decoding ("memory-mapped I/O for a 64-bit PCIe device") is disabled in the BIOS. I also found this post fixing the problem in ESXi:
https://forum.proxmox.com/threads/gpu-passthrough-nvidia.100029/post-451669
There, the fix was to enable 64-bit MMIO in the VM config.

My thinking, then, is that Proxmox might not use Above 4G Decoding in the VMs? The odd thing is that I cannot find a setting for this in Proxmox like the one in ESXi. Again, I am quite new to this, and I know even less about ESXi. But could you maybe tell me if I'm on the right track? To an inexperienced user like me, it does sound plausible that there is a similar Above 4G Decoding setting in the "virtual BIOS" that could be off, but I have no idea if that is even close to correct...
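For what it's worth: Proxmox has no GUI knob for this, but OVMF's 64-bit PCI MMIO window can reportedly be enlarged through a QEMU fw_cfg option. A sketch of the vmid.conf line people use for large-BAR cards like the K80 (unverified on this hardware; the 65536 value, i.e. 64 GiB, is just an example, and unlike the earlier '-cpu' override this args line does not clobber the cpu: settings):

```
args: -fw_cfg opt/ovmf/X-PciMmio64Mb,string=65536
```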
 
