[SOLVED] Proxmox vGPU configuration troubleshooting

cgirlamo

New Member
Oct 14, 2025
Hello,

I have been trying to get vGPU set up on a single Proxmox node with 3 RTX 6000 Ada GPUs. I have enabled IOMMU and SR-IOV and installed the host driver from the NVIDIA enterprise portal. I am running Proxmox 9.1.7 with kernel 6.17.13-2-pve, and have tried NVIDIA driver versions 20.0, 19.4, and 18.4 without success.

After installing the drivers and rebooting, nvidia-smi shows my 3 GPUs, and running nvidia-smi vgpu -s lists the possible vGPU types for each GPU. However, when I check /sys/class/mdev, that directory is empty, and I cannot see the vGPU types in the web GUI. I have also set the display mode to physical_display_disabled and reloaded the modules needed for vGPU.

Any help would be appreciated,

Chris

Code:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.126.08             Driver Version: 580.126.08     CUDA Version: N/A      |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA RTX 6000 Ada Gene...    On  |   00000000:5A:00.0 Off |                    0 |
| 30%   35C    P8             27W /  300W |       0MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA RTX 6000 Ada Gene...    On  |   00000000:63:00.0 Off |                    0 |
| 30%   34C    P8             22W /  300W |       0MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA RTX 6000 Ada Gene...    On  |   00000000:D6:00.0 Off |                    0 |
| 30%   34C    P8             33W /  300W |       0MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

 nvidia-smi vgpu -s
GPU 00000000:5A:00.0
    NVIDIA RTX6000-Ada-1B
    NVIDIA RTX6000-Ada-2B
    NVIDIA RTX6000-Ada-1Q
    NVIDIA RTX6000-Ada-2Q
    NVIDIA RTX6000-Ada-3Q
    NVIDIA RTX6000-Ada-4Q
    NVIDIA RTX6000-Ada-6Q
    NVIDIA RTX6000-Ada-8Q
    NVIDIA RTX6000-Ada-12Q
    NVIDIA RTX6000-Ada-16Q
    NVIDIA RTX6000-Ada-24Q
    NVIDIA RTX6000-Ada-48Q
    NVIDIA RTX6000-Ada-1A
    NVIDIA RTX6000-Ada-2A
    NVIDIA RTX6000-Ada-3A
    NVIDIA RTX6000-Ada-4A
    NVIDIA RTX6000-Ada-6A
    NVIDIA RTX6000-Ada-8A
    NVIDIA RTX6000-Ada-12A
    NVIDIA RTX6000-Ada-16A
    NVIDIA RTX6000-Ada-24A
    NVIDIA RTX6000-Ada-48A
    NVIDIA RTX6000-Ada-3B

GPU 00000000:63:00.0
    NVIDIA RTX6000-Ada-1B
    NVIDIA RTX6000-Ada-2B
    NVIDIA RTX6000-Ada-1Q
    NVIDIA RTX6000-Ada-2Q
    NVIDIA RTX6000-Ada-3Q
    NVIDIA RTX6000-Ada-4Q
    NVIDIA RTX6000-Ada-6Q
    NVIDIA RTX6000-Ada-8Q
    NVIDIA RTX6000-Ada-12Q
    NVIDIA RTX6000-Ada-16Q
    NVIDIA RTX6000-Ada-24Q
    NVIDIA RTX6000-Ada-48Q
    NVIDIA RTX6000-Ada-1A
    NVIDIA RTX6000-Ada-2A
    NVIDIA RTX6000-Ada-3A
    NVIDIA RTX6000-Ada-4A
    NVIDIA RTX6000-Ada-6A
    NVIDIA RTX6000-Ada-8A
    NVIDIA RTX6000-Ada-12A
    NVIDIA RTX6000-Ada-16A
    NVIDIA RTX6000-Ada-24A
    NVIDIA RTX6000-Ada-48A
    NVIDIA RTX6000-Ada-3B

GPU 00000000:D6:00.0
    NVIDIA RTX6000-Ada-1B
    NVIDIA RTX6000-Ada-2B
    NVIDIA RTX6000-Ada-1Q
    NVIDIA RTX6000-Ada-2Q
    NVIDIA RTX6000-Ada-3Q
    NVIDIA RTX6000-Ada-4Q
    NVIDIA RTX6000-Ada-6Q
    NVIDIA RTX6000-Ada-8Q
    NVIDIA RTX6000-Ada-12Q
    NVIDIA RTX6000-Ada-16Q
    NVIDIA RTX6000-Ada-24Q
    NVIDIA RTX6000-Ada-48Q
    NVIDIA RTX6000-Ada-1A
    NVIDIA RTX6000-Ada-2A
    NVIDIA RTX6000-Ada-3A
    NVIDIA RTX6000-Ada-4A
    NVIDIA RTX6000-Ada-6A
    NVIDIA RTX6000-Ada-8A
    NVIDIA RTX6000-Ada-12A
    NVIDIA RTX6000-Ada-16A
    NVIDIA RTX6000-Ada-24A
    NVIDIA RTX6000-Ada-48A
    NVIDIA RTX6000-Ada-3B
 
Hello, the fix for this was to enable the SR-IOV service separately for each of the 3 GPUs.
So, instead of

Code:
sudo systemctl enable --now pve-nvidia-sriov@ALL.service

run
Code:
systemctl enable --now 'pve-nvidia-sriov@0000:5a:00.0'
systemctl enable --now 'pve-nvidia-sriov@0000:63:00.0'
systemctl enable --now 'pve-nvidia-sriov@0000:d6:00.0'
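To avoid typos when there are several cards, the per-GPU commands can be generated from a list of PCI addresses. This is only a minimal sketch: `sriov_enable_cmds` is a hypothetical helper name, the unit name pattern is the one used above, and the function prints the commands rather than running them so they can be reviewed first.

```shell
#!/bin/sh
# Print (do not execute) one "systemctl enable" command per PCI address.
# sriov_enable_cmds is a made-up helper name; the unit pattern
# pve-nvidia-sriov@<address> is taken from the commands in this post.
sriov_enable_cmds() {
    for addr in "$@"; do
        printf "systemctl enable --now 'pve-nvidia-sriov@%s'\n" "$addr"
    done
}

# The three addresses reported by nvidia-smi in this thread (lower case).
sriov_enable_cmds 0000:5a:00.0 0000:63:00.0 0000:d6:00.0
```

Once the printed lines look right, they can be pasted into a root shell.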
 
Interesting, did you do more than that? There should be no difference between activating the 'ALL' service and activating all 3 individually: with 'ALL', NVIDIA's script basically first finds all NVIDIA cards and then does for each card the same thing it would do if activated for a single PCI ID.

However, when I check /sys/class/mdev, that directory is empty,
This is normal with these cards, as they no longer use the 'standard' mdev mechanism (beginning with Ampere and kernel 6.8). We just still call it that to make things easier for our code.
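Since /sys/class/mdev stays empty on these cards, one thing that can be checked instead is whether SR-IOV virtual functions were created for each GPU. The sketch below reads the standard Linux PCI sysfs attribute `sriov_numvfs`. That this is the right indicator for a particular driver version is an assumption, and `vf_count` and `SYSFS_PCI` are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Report how many SR-IOV virtual functions the kernel lists for a PCI device.
# SYSFS_PCI is a hypothetical override so the function can be tried against
# a scratch directory; by default it points at the real sysfs location.
vf_count() {
    f="${SYSFS_PCI:-/sys/bus/pci/devices}/$1/sriov_numvfs"
    if [ -r "$f" ]; then
        cat "$f"
    else
        # Attribute missing: device absent or not SR-IOV capable.
        echo "unknown"
    fi
}

# The three GPU addresses from this thread.
for addr in 0000:5a:00.0 0000:63:00.0 0000:d6:00.0; do
    echo "$addr: $(vf_count "$addr") virtual functions"
done
```

A count of 0 for a GPU would suggest the SR-IOV service did not take effect for that card.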