Enabling vGPU on Proxmox 8.4 (can't see mdev devices after installing the driver)

emancuso
New Member · Jun 24, 2025
Hi all,

I'm not new to Proxmox, and I'm now installing PVE 8.4 on a Dell R750 for a customer who is evaluating a move away from Broadcom ESXi.

The goal is to enable the vGPU features of an NVIDIA A30 Tensor Core GPU so it can be shared between two Windows VMs.

I followed this documentation: https://pve.proxmox.com/wiki/NVIDIA_vGPU_on_Proxmox_VE

- I installed Proxmox 8.4 from scratch
- apt update && apt dist-upgrade -y
- enabled all the features required in the documentation:
  • VT-d for Intel, or AMD-V for AMD (sometimes named IOMMU)
  • SR-IOV (this may not be necessary for older pre-Ampere GPU generations)
  • Above 4G decoding
  • Alternative Routing ID Interpretation (ARI) (not necessary for pre-Ampere GPUs)
- apt install pve-nvidia-vgpu-helper
- pve-nvidia-vgpu-helper setup
- reboot
- installed the NVIDIA driver NVIDIA-Linux-x86_64-570.148.06-vgpu-kvm.run
- reboot
- systemctl enable --now pve-nvidia-sriov@ALL.service
- reboot

Everything seems to be installed correctly: the service runs without problems, and lspci shows all the PCI devices (before the installation I could see just a single entry for the NVIDIA card).

[Attached screenshots: Immagine 2025-06-24 135146.png, lspci.png]
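
For reference, here is a quick way to double-check that the virtual functions were actually created (a rough sketch, assuming the card sits at 0000:17:00.0 like mine):

# list all NVIDIA PCI functions (vendor ID 10de); the physical function plus its VFs should appear
lspci -d 10de:
# number of virtual functions currently enabled on the physical function
cat /sys/bus/pci/devices/0000:17:00.0/sriov_numvfs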

However, when it's time to create the resource mapping from the GUI, if I check the mediated devices box, no device appears.

I tried different kernel versions and different NVIDIA driver versions (always the KVM builds with vGPU support), but I still can't resolve this problem.

The problem is related to mdev: I can't find any NVIDIA profile loaded in the system, some paths are missing, and in others some files are missing.

1) /sys/class/mdev_bus is an empty directory.
2) mdevctl types shows no output.
3) mdevctl list shows no output.
4) vgpuConfig.xml is MISSING (cp /usr/share/nvidia/vgpu/vgpuConfig.xml /var/lib/nvidia-vgpu-mgr/vgpuConfig.xml doesn't solve the problem).

So I'm wondering if someone can help me with this. I followed every step of the procedure, but I can't understand why I'm not able to map the GPU resources.
Maybe my configuration is not fully supported? Maybe it's a bug?

As a long-time Proxmox user, I'd really like to resolve this issue and demonstrate the full potential of the platform to the client. However, the vGPU setup on ESXi was much faster, and this might discourage the client from choosing Proxmox. Please, I need your support to get this working and to properly showcase Proxmox's capabilities to the customer!

I can provide any logs or information you need.
 
Can you please provide the output of pveversion -v?
Does /sys/bus/pci/devices/0000:17:00.0/nvidia exist? Can you post a directory listing (ls -lah <path>) of that directory?

Looking at the list of supported GPUs, the A30 is not listed - so you might need the Enterprise AI driver.
In this case, please note the last two paragraphs of the "Software Versions" section of our documentation: https://pve.proxmox.com/wiki/NVIDIA_vGPU_on_Proxmox_VE#Software_Versions

NVIDIA vGPU uses a custom low-level driver interface since kernel 6.8, so the devices won't show up under mdevctl - but they are treated as mediated devices in PVE for simplicity.
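
Roughly, that means checking the per-device sysfs directory instead of mdevctl (a sketch; the creatable_vgpu_types file name comes from NVIDIA's documentation of the vendor-specific VFIO interface and may differ between driver versions):

# with the new interface the vGPU information lives under the PCI device itself
ls -lah /sys/bus/pci/devices/0000:17:00.0/nvidia/
# list the vGPU types that can currently be created on this function
cat /sys/bus/pci/devices/0000:17:00.0/nvidia/creatable_vgpu_types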
 

Hi, thanks for the quick reply.

proxmox-ve: 8.4.0 (running kernel: 6.8.12-11-pve)
pve-manager: 8.4.1 (running version: 8.4.1/2a5fa54a8503f96d)
proxmox-kernel-helper: 8.1.1
proxmox-kernel-6.8.12-11-pve-signed: 6.8.12-11
proxmox-kernel-6.8: 6.8.12-11
proxmox-kernel-6.8.12-9-pve-signed: 6.8.12-9
ceph-fuse: 17.2.8-pve2
corosync: 3.1.9-pve1
criu: 3.17.1-2+deb12u1
frr-pythontools: 10.2.2-1+pve1
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx11
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-5
libknet1: 1.30-pve2
libproxmox-acme-perl: 1.6.0
libproxmox-backup-qemu0: 1.5.1
libproxmox-rs-perl: 0.3.5
libpve-access-control: 8.2.2
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.1.1
libpve-cluster-perl: 8.1.1
libpve-common-perl: 8.3.1
libpve-guest-common-perl: 5.2.2
libpve-http-server-perl: 5.2.2
libpve-network-perl: 0.11.2
libpve-rs-perl: 0.9.4
libpve-storage-perl: 8.3.6
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.6.0-2
proxmox-backup-client: 3.4.2-1
proxmox-backup-file-restore: 3.4.2-1
proxmox-firewall: 0.7.1
proxmox-kernel-helper: 8.1.1
proxmox-mail-forward: 0.3.3
proxmox-mini-journalreader: 1.5
proxmox-offline-mirror-helper: 0.6.7
proxmox-widget-toolkit: 4.3.11
pve-cluster: 8.1.1
pve-container: 5.2.6
pve-docs: 8.4.0
pve-edk2-firmware: 4.2025.02-3
pve-esxi-import-tools: 0.7.4
pve-firewall: 5.1.1
pve-firmware: 3.15-4
pve-ha-manager: 4.0.7
pve-i18n: 3.4.5
pve-qemu-kvm: 9.2.0-5
pve-xtermjs: 5.5.0-2
qemu-server: 8.3.13
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.7-pve2

The path /sys/bus/pci/devices/0000:17:00.0 exists, but there is no nvidia folder inside it.

We were not able to find any specific information regarding A30 compatibility - or rather, in some cases we found that it should be compatible, and in other cases we simply didn't find the information. So we assumed the A30 would be supported, just like the A40, A16, and A10.

Regarding the "Enterprise AI driver", that was exactly my concern. NVIDIA specifies that you need an enterprise-grade driver, but they don't specify where to find it!
I downloaded all the drivers from here: https://www.nvidia.com/en-us/drivers/ ----> in the bottom right corner you find "NVIDIA Virtual GPU Customers"; after creating an account you have 90 days to get the drivers.

Anyway, maybe those "Enterprise AI drivers" are downloaded from another source. Do I really need them?
And just so I understand: how is it possible that I found people trying to share consumer-level GPUs (like the RTX 2080)? If vGPU only applies to enterprise-level GPUs/software, it shouldn't be possible that so many people are using vGPU in their setups.

Again, thanks for your time. Much appreciated.

 
The path /sys/bus/pci/devices/0000:17:00.0 exists, but there is no nvidia folder inside it.
That's where all the vGPU information is read from in PVE - so it's no surprise that no vGPUs are shown in the GUI.
You can also check the system log using journalctl -b and/or the kernel log specifically using dmesg -H for any errors.

You can check the output of nvidia-smi vgpu --creatable --verbose - which will confirm whether vGPU is actually supported.
You could also try the opt-in 6.14 kernel, for example - but, as written below, this is not the correct driver for this card, so that probably won't help much.
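
For example (a rough sketch; the grep pattern is just a convenience for narrowing the logs down, not an official diagnostic):

# scan the current boot's logs for driver/vGPU related messages
journalctl -b | grep -Ei 'nvidia|vgpu'
dmesg | grep -i vgpu
# ask the driver directly which vGPU types can be created
nvidia-smi vgpu --creatable --verbose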

We were not able to find any specific information regarding A30 compatibility - or rather, in some cases we found that it should be compatible, and in other cases we simply didn't find the information. So we assumed the A30 would be supported, just like the A40, A16, and A10.
It's listed in the Enterprise AI driver documentation: https://docs.nvidia.com/ai-enterprise/release-6/latest/appendix/vgpu.html#sd-tab-item-29
So yeah, seems like you need that driver.

I'm not really familiar with the Enterprise AI offering from NVIDIA (since we only support the "standard" vGPU offering), but I'd guess from reading the documentation that you will need a valid license from NVIDIA for that too.
And with such a license you will also have access to the appropriate driver downloads - at least that's the case with the vGPU offering.

And just so I understand: how is it possible that I found people trying to share consumer-level GPUs (like the RTX 2080)?
That works via gross hacks using modified firmware and similar, AFAIK - which is neither supported nor recommended, for very obvious reasons.
 

Hi,

From the output of the command you provided, I think there is support for the A30, even though there is an error in the journal logs:

"GPU not supported by vGPU at PCI Id: 0:17:0:0 DevID: 0x10de / 0x20b7 / 0x10de / 0x0000" - I think this error appeared before pve-nvidia-sriov@ALL.service started (maybe). Based on the other output:

GPU 00000000:17:00.0
No vGPUs found on this device

Jun 24 11:54:37 nvidia-vgpud[2781]: Homogeneous vGPUs: 1
Jun 24 11:54:37 nvidia-vgpud[2781]: vGPU types: 492
Jun 24 11:54:37 nvidia-vgpud[2781]:
Jun 24 11:54:37 nvidia-vgpud[2781]: pciId of gpu [0]: 0:17:0:0
Jun 24 11:54:37 nvidia-vgpud[2781]: GPU not supported by vGPU at PCI Id: 0:17:0:0 DevID: 0x10de / 0x20b7 / 0x10de / 0x0000
Jun 24 11:54:37 systemd[1]: nvidia-vgpud.service: Deactivated successfully.
Jun 24 11:54:37 systemd[1]: Finished nvidia-vgpud.service - NVIDIA vGPU Daemon.
Jun 24 11:54:37 systemd[1]: Starting nvidia-vgpu-mgr.service - NVIDIA vGPU Manager Daemon...
Jun 24 11:54:37 systemd[1]: Started nvidia-vgpu-mgr.service - NVIDIA vGPU Manager Daemon.
Jun 24 11:54:37 kernel: [nvidia-vgpu-vfio] No vGPU types present for GPU 0x1700
Jun 24 11:54:37 kernel: [nvidia-vgpu-vfio] vGPU device data not available. No vGPU devices found for GPU 0000:17:00.0

[ +0.089012] NVRM: GPU 0000:17:00.0: UnbindLock acquired
[ +0.896704] [nvidia-vgpu-vfio] No vGPU types present for GPU 0x1700
[ +0.001952] [nvidia-vgpu-vfio] vGPU device data not available. No vGPU devices found for GPU 0000:17:00.0

So I think the GPU is supported, but you got the point with the driver: I probably need the Enterprise AI driver.
Am I right in assuming the GPU is compatible even though we get the error in journalctl -b?

Thanks.
Ettore
 
So I think the GPU is supported, but you got the point with the driver: I probably need the Enterprise AI driver.
Am I right in assuming the GPU is compatible even though we get the error in journalctl -b?
It depends on what you mean by "supported" and "compatible".
For vGPU - no, at least not with the normal vGPU host drivers.
The above command outputs clearly confirm that.
Jun 24 11:54:37 nvidia-vgpud[2781]: GPU not supported by vGPU at PCI Id: 0:17:0:0 DevID: 0x10de / 0x20b7 / 0x10de / 0x0000

With the Enterprise AI drivers - maybe compatible, but not officially supported, as stated in our documentation.
 
Hi Christoph.

Thank you for clarifying my doubts regarding compatibility.

I kindly ask you to keep the post open; I will try to find the correct drivers and attempt a new installation. I'll update the post for everyone's benefit, whether I manage to solve the issue or not.

In any case, thank you for your support and the promptness of your replies.

Regards.
Ettore
 
The NVIDIA vGPU drivers, if you need them, come with your subscription, which you can log in with at nvid.nvidia.com. An Enterprise AI license is included in the purchase of certain card models, but it can also be purchased separately. For vGPU, you need an appropriate license; your reseller should send you a code that you can associate with your account.

The A30 is a slightly older model; it probably doesn't come with an Enterprise AI subscription, but it is supported by the regular enterprise drivers. It doesn't require/have vGPU but rather MIG for partitioning, which works at the process level, not the VM level.
https://developer.nvidia.com/blog/dividing-nvidia-a30-gpus-and-conquering-multiple-workloads/
https://docs.nvidia.com/datacenter/tesla/drivers/index.html

- The above link has the 'datacenter' driver, which is for any 'datacenter' model card to be used natively (whether or not you use MIG).
- MIG partitions your card for use by separate processes (user processes or containers), but you cannot pass a MIG partition to a VM. MIG partitions are typically found on 'compute' cards (see the sketch after this list).
- vGPU has a separate driver available from nvid.nvidia.com with a subscription to the vGPU licenses. vGPU is what you need to give a VM a portion of a GPU, but only specific cards optimized for graphics rather than compute support it (https://docs.nvidia.com/vgpu/gpus-supported-by-vgpu.html).
- Enterprise AI is not a driver, but provides support and downloadable containers, AI models, etc., as well as credits for their cloud offering.
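
To illustrate the MIG workflow mentioned above, a rough sketch using the standard nvidia-smi MIG commands (the 2g.12gb profile name is just an example for the 24 GB A30; check nvidia-smi mig -lgip for what your driver actually offers):

# enable MIG mode on GPU 0 (takes effect after a GPU reset or reboot)
nvidia-smi -i 0 -mig 1
# list the GPU instance profiles the card supports
nvidia-smi mig -lgip
# create two GPU instances, each with a default compute instance (-C)
nvidia-smi mig -cgi 2g.12gb,2g.12gb -C
# verify the resulting MIG devices
nvidia-smi -L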
 

Thank you for your reply.
You've clarified several doubts I had regarding the driver in question.
Indeed, the A30 only supports MIG and not vGPU, so there's no way to get it working on Proxmox.
Thank you very much for your contribution.

Regards.
Ettore
 
Just to clarify: there is an AI Enterprise driver that supports the A30 with "C-Series" vGPUs.

This was previously integrated into the normal vGPU driver, but was separated out into the AI Enterprise software, for licensing reasons I guess.
See also:
https://docs.nvidia.com/ai-enterpri...rix.html#supported-nvidia-gpus-and-networking
https://docs.nvidia.com/ai-enterprise/release-6/latest/appendix/vgpu.html
 