I want to use vGPU on an NVIDIA RTX 5000 Ada on the latest version of Proxmox
Yesterday I downloaded the current version of Proxmox, registered it with the subscription key, then updated everything to the latest state and rebooted.
That left me at this version:
Code:
root@pve01:~# pveversion
pve-manager/8.3.4/65224a0f9cd294a3 (running kernel: 6.8.12-8-pve)
Then I followed the instructions given here:
https://pve.proxmox.com/wiki/NVIDIA_vGPU_on_Proxmox_VE
I used the latest driver that seemed suitable to me: NVIDIA-GRID-Linux-KVM-570.124.03-570.124.06-572.60
This is the latest driver for version 18.
Everything seems to work, but when I run this command:
Code:
lspci -d 10de:
81:00.0 VGA compatible controller: NVIDIA Corporation AD102GL [RTX 5000 Ada Generation] (rev a1)
81:00.1 Audio device: NVIDIA Corporation AD102 High Definition Audio Controller (rev a1)
-> I do not get any virtual functions.
Code:
root@pve01:~# nvidia-smi
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.124.03             Driver Version: 570.124.03     CUDA Version: N/A      |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA RTX 5000 Ada Gene...    On  |   00000000:81:00.0 Off |                    0 |
| 30%   27C    P8             25W /  250W |       0MiB /  30712MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
This looks as expected to me.
Code:
root@pve01:~# nvidia-smi vgpu
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 570.124.03             Driver Version: 570.124.03                |
|---------------------------------+------------------------------+------------+
| GPU  Name                       | Bus-Id                       | GPU-Util   |
|      vGPU ID     Name           | VM ID     VM Name            | vGPU-Util  |
|=================================+==============================+============|
|   0  NVIDIA RTX 5000 Ada Ge...  | 00000000:81:00.0             |     0%     |
+---------------------------------+------------------------------+------------+
This looks as expected to me.
Code:
root@pve01:~# nvidia-smi vgpu -c
GPU 00000000:81:00.0
NVIDIA RTX5000-Ada-1B
NVIDIA RTX5000-Ada-2B
NVIDIA RTX5000-Ada-1Q
NVIDIA RTX5000-Ada-2Q
NVIDIA RTX5000-Ada-4Q
NVIDIA RTX5000-Ada-8Q
NVIDIA RTX5000-Ada-16Q
NVIDIA RTX5000-Ada-32Q
NVIDIA RTX5000-Ada-1A
NVIDIA RTX5000-Ada-2A
NVIDIA RTX5000-Ada-4A
NVIDIA RTX5000-Ada-8A
NVIDIA RTX5000-Ada-16A
NVIDIA RTX5000-Ada-32A
This looks as expected to me.
Code:
root@pve01:~# nvidia-smi vgpu -q
GPU 00000000:81:00.0
Active vGPUs : 0
This looks as expected to me.
According to the Proxmox documentation, I should now get this:
Code:
# lspci -d 10de:
01:00.0 3D controller: NVIDIA Corporation GA102GL [RTX A5000] (rev a1)
01:00.4 3D controller: NVIDIA Corporation GA102GL [RTX A5000] (rev a1)
01:00.5 3D controller: NVIDIA Corporation GA102GL [RTX A5000] (rev a1)
This is what I get:
Code:
root@pve01:~# lspci -d 10de:
81:00.0 VGA compatible controller: NVIDIA Corporation AD102GL [RTX 5000 Ada Generation] (rev a1)
81:00.1 Audio device: NVIDIA Corporation AD102 High Definition Audio Controller (rev a1)
And here I'm stuck.
It seems to me that a step is missing from the Proxmox documentation, at least for an RTX 5000 Ada GPU.
Do I need to create the virtual functions on an RTX 5000 Ada GPU somehow?
This seems to be the case, according to the NVIDIA documentation for version 18, which you can find here:
https://docs.nvidia.com/vgpu/latest...ndex.html#creating-vgpu-device-red-hat-el-kvm
If I follow along there, I get stuck here:
Code:
root@pve01:~# ls /sys/bus/pci/devices/0000\:81\:00.0/
aer_dev_correctable broken_parity_status current_link_speed driver i2c-6 iommu_group max_link_speed numa_node reset resource1 resource3_wc subsystem_device
aer_dev_fatal class current_link_width driver_override i2c-7 irq max_link_width power reset_method resource1_resize resource5 subsystem_vendor
aer_dev_nonfatal config d3cold_allowed enable i2c-8 link modalias power_state resource resource1_wc revision uevent
ari_enabled consistent_dma_mask_bits device i2c-10 i2c-9 local_cpulist msi_bus remove resource0 resource3 rom vendor
boot_vga consumer:pci:0000:81:00.1 dma_mask_bits i2c-11 iommu local_cpus msi_irqs rescan resource0_resize resource3_resize subsystem
According to the NVIDIA documentation, the directory mentioned should contain entries called virtfnNN, like this:
Code:
# ls -l /sys/bus/pci/devices/0000:41:00.0/ | grep virtfn
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn0 -> ../0000:41:00.4
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn1 -> ../0000:41:00.5
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn10 -> ../0000:41:01.6
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn11 -> ../0000:41:01.7
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn12 -> ../0000:41:02.0
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn13 -> ../0000:41:02.1
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn14 -> ../0000:41:02.2
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn15 -> ../0000:41:02.3
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn16 -> ../0000:41:02.4
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn17 -> ../0000:41:02.5
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn18 -> ../0000:41:02.6
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn19 -> ../0000:41:02.7
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn2 -> ../0000:41:00.6
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn20 -> ../0000:41:03.0
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn21 -> ../0000:41:03.1
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn22 -> ../0000:41:03.2
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn23 -> ../0000:41:03.3
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn24 -> ../0000:41:03.4
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn25 -> ../0000:41:03.5
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn26 -> ../0000:41:03.6
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn27 -> ../0000:41:03.7
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn28 -> ../0000:41:04.0
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn29 -> ../0000:41:04.1
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn3 -> ../0000:41:00.7
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn30 -> ../0000:41:04.2
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn31 -> ../0000:41:04.3
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn4 -> ../0000:41:01.0
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn5 -> ../0000:41:01.1
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn6 -> ../0000:41:01.2
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn7 -> ../0000:41:01.3
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn8 -> ../0000:41:01.4
lrwxrwxrwx. 1 root root 0 Jul 16 04:42 virtfn9 -> ../0000:41:01.5
But I do not get any virtfnNN entries here...
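To make the check above repeatable, here is a minimal sketch of a script that reports what the kernel exposes for the card. It relies on the standard Linux PCI sysfs attributes sriov_totalvfs and sriov_numvfs, which normally appear only when a device advertises the SR-IOV capability; neither shows up in my directory listing above. The function names count_virtfns and sriov_status are my own and purely illustrative, and the default PCI address is simply the one from my system.

```shell
#!/bin/sh
# Diagnostic sketch: report SR-IOV state for one PCI device via sysfs.
# Function names are hypothetical helpers, not part of any NVIDIA or Proxmox tool.

# Print the number of virtfnN entries in a PCI device's sysfs directory.
count_virtfns() {
    dev_dir=$1
    count=0
    for entry in "$dev_dir"/virtfn*; do
        # An unmatched glob stays literal, so skip anything that is
        # neither an existing path nor a (possibly dangling) symlink.
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        count=$((count + 1))
    done
    echo "$count"
}

# Print whether the device exposes the SR-IOV attributes, and how many
# virtual functions are currently present.
sriov_status() {
    dev_dir=$1
    if [ -r "$dev_dir/sriov_totalvfs" ]; then
        total=$(cat "$dev_dir/sriov_totalvfs")
        enabled=$(cat "$dev_dir/sriov_numvfs" 2>/dev/null)
        echo "SR-IOV capable: totalvfs=$total numvfs=$enabled"
    else
        echo "no SR-IOV capability exposed"
    fi
    echo "virtfn entries: $(count_virtfns "$dev_dir")"
}

# Default to the address of my GPU; pass another sysfs directory as $1.
sriov_status "${1:-/sys/bus/pci/devices/0000:81:00.0}"
```

If the script reports that no SR-IOV capability is exposed, the virtual functions presumably cannot be created from the host at all until the card or firmware advertises them, which would narrow down where the missing step is.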
