Nvidia A6000 vGPU 14.1 Proxmox 7.2.7 - NVRM: Aborting probe for VF 0000:01:00.4 since PF is not bound to nvidia driver

I didn't think the A5000 supports vGPU?
according to their documentation (and my local tests) it does: https://docs.nvidia.com/grid/latest/grid-vgpu-release-notes-generic-linux-kvm/index.html

Did you use the Linux KVM Nvidia Grid bundle (with included windows display driver), or the Ubuntu version?
i used the Linux KVM bundle

Dmesg now showing "[ 82.772848] [nvidia-vgpu-vfio] 00000000-0000-0000-0000-000000000100: vGPU migration disabled" for each BSOD'ing VM
that is also shown here on every vm. i don't think it has anything to do with the bsod though
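
If several VMs are running, it can help to sort these messages by mdev UUID to see which vGPU instance logs what. A minimal Python sketch, assuming the dmesg line format is exactly as quoted above; the function name and regex are mine, not anything shipped by NVIDIA or Proxmox:

```python
import re
from collections import defaultdict

# Matches lines such as:
# [ 82.772848] [nvidia-vgpu-vfio] 00000000-0000-0000-0000-000000000100: vGPU migration disabled
LINE_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+\[nvidia-vgpu-vfio\]\s+"
    r"(?P<uuid>[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}):\s*(?P<msg>.*)$"
)


def group_vgpu_messages(dmesg_text: str) -> dict:
    """Map each mdev UUID to the list of (timestamp, message) pairs seen."""
    grouped = defaultdict(list)
    for line in dmesg_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            grouped[match["uuid"]].append((float(match["ts"]), match["msg"]))
    return dict(grouped)


if __name__ == "__main__":
    # Sample taken from the message quoted in this thread.
    sample = (
        "[ 82.772848] [nvidia-vgpu-vfio] "
        "00000000-0000-0000-0000-000000000100: vGPU migration disabled"
    )
    for uuid, entries in group_vgpu_messages(sample).items():
        print(uuid, entries)
```

To use it on a real host, you would feed it the text output of dmesg instead of the sample string.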

FYI. After installing the Linux KVM grid bundle on the host (after purging the Ubuntu install) and then the matching display driver on the Win10 VM, same deal: BSOD and the below dmesg
mhmm... which version of windows 10 are you trying to use? (i used 21H2 here)

edit: you could also try ubuntu 22.04, just to rule out that windows is the problem here (or something else unrelated)
 
Yes, Windows 10 21H2 also. Funnily enough, I've actually already got an Ubuntu 22.04 VM up and running as we're looking to move that into production at some point in the future. I tried installing the .deb driver but it fails as it cannot see the vGPU (although nvidia-smi on the host now sees both the Win10 and Ubuntu vGPU instances), so I ran the .run with dkms and rebooted, and nvidia-smi now moans that it can't communicate with the driver.

Just looking at the link to the Nvidia docs you sent, I noticed that compression adjustment only supports an RTX6000-12Q instance. I was using 6Q, swapped it to 12Q in the GUI, booted up fine, installed the nvidia .exe on the Win10 VM and same deal, BSOD.

After setting the RTX6000-12Q on the ubuntu VM, it now fails to boot:
"kvm: -device vfio-pci,sysfsdev=/sys/bus/pci/devices/0000:01:00.7/00000000-0000-0000-0000-000000000104,id=hostpci0,bus=ich9-pcie-port-1,addr=0x0: warning: vfio 00000000-0000-0000-0000-000000000104: Could not enable error recovery for the device"
 
Actually, scrub that for the Ubuntu instance! I checked the Proxmox GUI and Hardware > Display was set to default; after setting it to VirtIO GPU it now boots and I get the below in nvidia-smi, so more progress. Thanks for sticking with me!

Screenshot 2022-07-27 at 11.45.03.png
 
ok if you have problems with ubuntu too, it's not only windows that's the problem....
i am just guessing here now, but did you enable the following things in the bios:

* pci AER
* pci ASPM
* Above 4G Decoding
* Resizable BAR (not really sure if necessary)
* SR-IOV

?
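
If you can't get at the BIOS directly, part of this can be sanity-checked from the host. A rough Python sketch, assuming the physical function sits at 0000:01:00.0 (a guess based on the VF addresses in this thread; check lspci) and that sysfs is mounted at /sys. The helper names are made up for illustration. It reports which kernel driver the function is bound to and the standard sriov_totalvfs / sriov_numvfs sysfs attributes; it cannot tell you anything about AER, ASPM or Above 4G Decoding.

```python
from pathlib import Path
from typing import Optional


def read_text(path: Path) -> Optional[str]:
    """Return the stripped file contents, or None if the file cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def pf_status(pci_addr: str, sysfs_root: str = "/sys") -> dict:
    """Summarise driver binding and SR-IOV state for one PCI function."""
    dev = Path(sysfs_root) / "bus" / "pci" / "devices" / pci_addr
    driver_link = dev / "driver"
    # 'driver' is a symlink to the bound driver's directory; it is missing
    # when no driver is bound to the function.
    driver = driver_link.resolve().name if driver_link.exists() else None
    return {
        "exists": dev.is_dir(),
        "driver": driver,
        # These attributes only exist on SR-IOV capable physical functions.
        "sriov_totalvfs": read_text(dev / "sriov_totalvfs"),
        "sriov_numvfs": read_text(dev / "sriov_numvfs"),
    }


if __name__ == "__main__":
    # 0000:01:00.0 is an assumed PF address; adjust it to your lspci output.
    print(pf_status("0000:01:00.0"))
```

If driver comes back as None, or as something other than nvidia for the PF, that would be consistent with the "PF is not bound to nvidia driver" message in the thread title.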
 
Ah no sorry, Ubuntu is looking good..
As for BIOS, I believe these are all good. The chassis is with the vendor and I don't have direct access. I'll ask them now to be sure but I think as per the image below, Ubuntu nvidia-settings is correctly showing the vGPU. It seems the 12Q mdev type did the trick? I'll start a fresh Windows install, check the BIOS again and come back to you.

Screenshot 2022-07-27 at 13.43.27.png
 
The vendor has confirmed the BIOS settings:
PCI AER is enabled.
PCI ASPM isn’t available, so shouldn’t be any power saving going on.
Above 4G Decoding is enabled.
Re-sizable BAR isn’t available either – consumer focused setting.

PCIe ARI has also been set from Auto to Enabled.

Windows is getting a bit further than before. I can now see the A6000-12Q in Device Manager and can install the driver. However, it's very inconsistent: a reboot spins it back out into an automatic repair loop. After trying a restore and uninstalling features and updates I can get it back, but it's very hit and miss. A system restore to the point where I installed the updates, virtio drivers and the QEMU agent still works, so not all is lost. I still have to reboot the entire chassis as I can't shut down machines on the command line or GUI.

Feels so close!

Could you post the qm.conf of a working Win10 install, so I can compare?

Screenshot 2022-07-28 at 08.11.15.png
 
the one i use here is:
Code:
agent: 1
bios: ovmf
boot: order=scsi0;ide2;net0
cores: 4
cpu: max,flags=+md-clear;+pcid;+spec-ctrl;+hv-tlbflush;+aes
efidisk0: local-lvm:vm-7001-disk-0,efitype=4m,pre-enrolled-keys=1,size=4M
hostpci0: 0000:01:00.4,mdev=nvidia-662
ide2: none,media=cdrom
machine: pc-i440fx-6.2
memory: 4096
net0: virtio=<some-mac>,bridge=vmbr1,firewall=1
numa: 0
ostype: win10
scsi0: local-lvm:vm-7001-disk-1,discard=on,size=50G,ssd=1
scsihw: virtio-scsi-pci
smbios1: uuid=<some-uuid>
sockets: 1
tablet: 1
vmgenid: <some-uuid>

so no major difference, aside from me using i440fx instead of q35 (but i tested with q35 as well)

edit: i use the default vga ('std') instead of virtio though
edit2: also i have patched my qemu-server locally to add the uuid automatically (sent that patch already on the pve-devel list) so i don't need to use the 'args -uuid ...' workaround
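
For comparing two VM configs key by key rather than by eye, here is a minimal Python sketch. It assumes the plain "key: value" layout shown above and stops at the first "[...]" section header so that only the current config is compared. The function names, the ignore list and the second sample config are illustrative assumptions, not anything Proxmox ships.

```python
def parse_qm_conf(text: str) -> dict:
    """Parse 'key: value' lines; skip blanks and comments, stop at a section."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("["):
            break  # section header: keep only the current config above it
        key, sep, value = line.partition(":")
        if sep:
            conf[key.strip()] = value.strip()
    return conf


def diff_conf(a: dict, b: dict,
              ignore=("smbios1", "vmgenid", "net0", "meta", "name")) -> dict:
    """Return {key: (value_in_a, value_in_b)} for keys whose values differ."""
    keys = (set(a) | set(b)) - set(ignore)
    return {k: (a.get(k), b.get(k)) for k in sorted(keys) if a.get(k) != b.get(k)}


if __name__ == "__main__":
    # A few lines from the config posted above.
    reference = """\
cpu: max,flags=+md-clear;+pcid;+spec-ctrl;+hv-tlbflush;+aes
hostpci0: 0000:01:00.4,mdev=nvidia-662
machine: pc-i440fx-6.2
ostype: win10
"""
    # Hypothetical second config, for illustration only.
    other = """\
cpu: kvm64
hostpci0: 0000:01:00.7,mdev=nvidia-530
machine: pc-q35-6.2
ostype: win10
vga: virtio
"""
    changes = diff_conf(parse_qm_conf(reference), parse_qm_conf(other))
    for key, (left, right) in changes.items():
        print(f"{key}: {left!r} -> {right!r}")
```

Per-VM values such as UUIDs and MAC addresses are ignored by default, since they always differ and only add noise.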
 
did you manage to get it working? or does it still bluescreen/not work?
 
Hey! Apologies for the late reply (been on holiday). Yes, we did get it working! Working conf below (setting the CPU to "host" is what got it over the line). We really appreciate your help with this. We're going ahead and getting a unit on site for some real-world tests, so I'll no doubt be in touch again soon :)


root@pve:/etc/pve/local/qemu-server# cat 110.conf
agent: 1
args: -uuid 00000000-0000-0000-0000-000000000107
bios: ovmf
boot: order=ide0;ide2;net0
cores: 8
cpu: host
efidisk0: local-lvm:vm-110-disk-0,efitype=4m,pre-enrolled-keys=1,size=4M
hostpci0: 0000:01:00.7,mdev=nvidia-530
ide0: local-lvm:vm-110-disk-1,size=150G
ide2: local:iso/Win10_21H2_EnglishInternational_x64.iso,media=cdrom,size=5748118K
machine: pc-q35-6.2
memory: 32768
meta: creation-qemu=6.2.0,ctime=1659547025
name: CONF01
net0: e1000=5A:36:FB:B3:EB:66,bridge=vmbr0
numa: 0
ostype: win10
scsihw: virtio-scsi-pci
smbios1: uuid=<some UUID>
sockets: 1
tablet: 1
vga: virtio
vmgenid: <some UUID>
 
