Error when trying to install Nvidia vGPU driver

Sujith Arangan

I am trying to install the Nvidia vGPU driver on Proxmox VE 8.2.2. When installing vGPU version 14.2, I get the error below. Please help me resolve it.

ERROR: Failed to run `/usr/sbin/dkms build -m nvidia -v 450.80 -k 6.8.4-3-pve`:
Sign command: /lib/modules/6.8.4-3-pve/build/scripts/sign-file
Signing key: /var/lib/dkms/mok.key
Public certificate (MOK): /var/lib/dkms/mok.pub

Building module:
Cleaning build area...
'make' -j32 NV_EXCLUDE_BUILD_MODULES='' KERNEL_UNAME=6.8.4-3-pve IGNORE_CC_MISMATCH='' modules...(bad exit status: 2)
Error! Bad return status for module build on kernel: 6.8.4-3-pve (x86_64)
Consult /var/lib/dkms/nvidia/450.80/build/make.log for more information.
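The DKMS message only reports that the build failed; the actual compiler error is in the make.log it points to. A minimal sketch for pulling the first error lines out of that log follows. The helper name show_build_errors is hypothetical, and the default path is simply the one printed in the error above.

```shell
# Print the first error lines from a DKMS build log.
# show_build_errors is a hypothetical helper name, not part of dkms.
show_build_errors() {
    # Default path taken from the DKMS error message in this thread.
    log="${1:-/var/lib/dkms/nvidia/450.80/build/make.log}"
    if [ ! -r "$log" ]; then
        echo "log not readable: $log" >&2
        return 1
    fi
    # Match compiler errors ("error:") and make failures ("Error 2"),
    # prefixing each hit with its line number; stop after 20 matches.
    grep -n -m 20 -E 'error:|Error [0-9]+' "$log"
}

# Usage on the host (not executed here):
#   show_build_errors
#   show_build_errors /path/to/other/make.log
```

Posting those lines usually tells more than the DKMS summary, since they show which kernel API the driver source failed to compile against.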
 
hi

first, the vgpu 14 version is not supported anymore by nvidia (see https://docs.nvidia.com/grid/)

also currently, there is no official nvidia vgpu driver that works for kernel 6.8 so you'd have to downgrade to kernel 6.5 and then use the v16 or v17 grid driver
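A quick way to check whether the running kernel is in the series mentioned above is sketched below. The function names are hypothetical, and the single supported series (6.5) is only what this reply states for the v16/v17 drivers at the time of writing; adjust it to whatever the NVIDIA release notes list for your driver. The install and pin commands in the comments are what I believe applies on PVE 8, so verify them against the Proxmox documentation before running them.

```shell
# Extract the major.minor series from a kernel release string,
# e.g. "6.8.4-3-pve" -> "6.8". Hypothetical helper name.
kernel_series() {
    echo "${1:-$(uname -r)}" | cut -d. -f1,2
}

# Succeed only if the kernel belongs to the 6.5 series, which is the
# series this thread says works with the v16/v17 drivers (assumption
# taken from the reply above, not from NVIDIA's support matrix).
is_supported_series() {
    case "$(kernel_series "$1")" in
        6.5) return 0 ;;
        *)   return 1 ;;
    esac
}

# Usage on the host (not executed here):
#   is_supported_series || echo "running $(uname -r): downgrade needed"
#
# Downgrading on PVE 8, as far as I know (check before running):
#   apt install proxmox-kernel-6.5
#   proxmox-boot-tool kernel list
#   proxmox-boot-tool kernel pin <version-from-the-list>
#   reboot
```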
 
Hi. The vGPU driver installed successfully after downgrading the kernel to 6.5, but vGPU is not unlocked. Please find the logs attached.
 

Attachment: vGPU-Logs.txt (2.5 KB)
I am using the v11.1 driver. The graphics card is detected, but mdevctl types produces no output.
i doubt the GRID v11 driver works with a modern 6.5 kernel.

can you post the output of nvidia-smi and mdevctl?
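When mdevctl types prints nothing, it can help to look at the sysfs directory it reads from, since an empty or missing mdev_supported_types directory means the host driver has not registered any vGPU types for the card. A minimal sketch follows; the function name list_mdev_types is hypothetical, and the device path in the usage comment assumes the PCI address 0a:00.0 reported elsewhere in this thread.

```shell
# List the mediated-device types a PCI device exposes through sysfs.
# list_mdev_types is a hypothetical helper name.
list_mdev_types() {
    dev_dir="$1"   # e.g. /sys/bus/pci/devices/0000:0a:00.0
    types_dir="$dev_dir/mdev_supported_types"
    if [ ! -d "$types_dir" ]; then
        echo "no mdev_supported_types under $dev_dir"
        return 1
    fi
    for t in "$types_dir"/*; do
        [ -d "$t" ] || continue
        # "name" and "available_instances" are standard mdev attributes.
        name=$(cat "$t/name" 2>/dev/null)
        avail=$(cat "$t/available_instances" 2>/dev/null)
        echo "$(basename "$t") name=$name available=$avail"
    done
}

# Usage on the host (not executed here):
#   list_mdev_types /sys/bus/pci/devices/0000:0a:00.0
```

If the directory is missing entirely, the problem is on the driver side rather than in mdevctl.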
 
I have installed it on Proxmox 6.4. I had deliberately downgraded due to the vGPU requirement.

root@lc0-proxmox-gpu-host-01:~# nvidia-smi
Fri May 17 02:29:10 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80       Driver Version: 450.80       CUDA Version: N/A      |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  A100-PCIE-40GB      Off  | 00000000:0A:00.0 Off |                    0 |
| N/A   32C    P0    44W / 250W |      0MiB / 40537MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
root@lc0-proxmox-gpu-host-01:~# nvidia-smi vgpu
Fri May 17 02:31:18 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80                 Driver Version: 450.80                    |
|---------------------------------+------------------------------+------------+
| GPU  Name                       | Bus-Id                       | GPU-Util   |
|      vGPU ID     Name           | VM ID     VM Name            | vGPU-Util  |
|=================================+==============================+============|
|   0  A100-PCIE-40GB             | 00000000:0A:00.0             |   0%       |
+---------------------------------+------------------------------+------------+

root@lc0-proxmox-gpu-host-01:~#
root@lc0-proxmox-gpu-host-01:~# mdevctl types
root@lc0-proxmox-gpu-host-01:~# mdevctl types
 
sorry, but PVE 6.4 is not supported anymore. what is the output of lspci?
 
The card is detected without any problem.

08:10.0 PCI bridge: PLX Technology, Inc. PEX 8747 48-Lane, 5-Port PCI Express Gen 3 (8.0 GT/s) Switch (rev ca)
0a:00.0 3D controller: NVIDIA Corporation Device 20f1 (rev a1)
0c:00.0 PCI bridge: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge (rev 03)
0d:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 30)
7f:08.0 System peripheral: Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 QPI Link 0 (rev 02)
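Plain lspci output shows that the card is present on the bus but not which kernel driver is bound to it. A small sketch for extracting that from lspci -nnk output follows; driver_in_use is a hypothetical helper name, and the address 0a:00.0 is the one from the listing above.

```shell
# Read `lspci -k` style output on stdin and print the bound driver,
# if any. driver_in_use is a hypothetical helper name.
driver_in_use() {
    sed -n 's/^[[:space:]]*Kernel driver in use: //p'
}

# Usage on the host (not executed here):
#   lspci -nnk -s 0a:00.0 | driver_in_use
# No output means no driver is currently bound to the device.
```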
 
