Getting Nvidia drivers to install on host for use in LXC with CUDA

pundip

Member
Aug 13, 2021
Proxmox version: 7.4-18
Kernel version: pve-manager/7.4-18/b1f94095 (running kernel: 5.15.158-1-pve)

I have been trying to get my Nvidia RTX 670 working inside an LXC container so that I can run CUDA. My understanding, based on this blog post:
https://yomis.blog/nvidia-gpu-in-proxmox-lxc/
is that I first need to install the driver on the host with DKMS, i.e.:
./NVIDIA-Linux-x86_64-304.137.run --dkms

And then install it in the container without the kernel module:
./NVIDIA-Linux-x86_64-304.137.run --no-kernel-module
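The two-step install described above could be sketched as a small wrapper, assuming the same .run file is copied to both the host and the container. The function name installer_flag and the role names "host" and "container" are hypothetical; only the --dkms and --no-kernel-module flags come from the commands above.

```shell
#!/bin/sh
# Hypothetical helper: print the installer flag for a given role.
#   host      -> build and register the kernel module via DKMS
#   container -> install only the user-space libraries
installer_flag() {
    case "$1" in
        host)      echo "--dkms" ;;
        container) echo "--no-kernel-module" ;;
        *)         echo "unknown role: $1" >&2; return 1 ;;
    esac
}

# Example (not executed here): run the installer with the flag for this role.
#   ./NVIDIA-Linux-x86_64-304.137.run "$(installer_flag host)"
```

Keeping the flag choice in one place is only meant to make it harder to run the DKMS variant inside the container by mistake.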

I think I have installed the kernel headers correctly, as the command apt list --installed | grep headers gives me:


Code:
pve-headers-5.15.158-1-pve/stable,now 5.15.158-1 amd64 [installed,automatic]
pve-headers-5.15.30-2-pve/stable,now 5.15.30-3 amd64 [installed]
pve-headers-5.15/stable,now 7.4-14 all [installed,automatic]
pve-headers/stable,now 7.4-1 all [installed]
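A minimal sketch of how the same check could be tied to the running kernel, assuming the Proxmox 7.x package naming seen in the list above (pve-headers-<kernel release>). The function name has_headers_for is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the package list read from stdin contains
# a pve-headers package for the kernel release given as $1.
has_headers_for() {
    # -F matches the dots in the kernel release literally; -q prints nothing.
    grep -qF "pve-headers-$1/"
}

# Example on the host (not executed here):
#   apt list --installed 2>/dev/null | has_headers_for "$(uname -r)" \
#       && echo "headers for running kernel are installed"
# DKMS builds against /lib/modules/<release>/build, so that path can be
# checked as well:
#   [ -d "/lib/modules/$(uname -r)/build" ] && echo "build directory present"
```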

DKMS seems to be installed, as apt install dkms now reports "dkms is already the newest version (2.8.4-3)".

So when I try to run:


./NVIDIA-Linux-x86_64-304.137.run --dkms

I get asked:
“Would you like to register the kernel module sources with DKMS? This will allow DKMS to automatically build a new module, if you install a different kernel later.” to which I answer “Yes”.

When installing I get the following error:

Code:
ERROR: Failed to run `/usr/sbin/dkms build -m nvidia -v 304.137 -k 5.15.158-1-pve`:
         Kernel preparation unnecessary for this kernel.  Skipping...

         Building module:
         cleaning build area...
         make -j32 KERNELRELEASE=5.15.158-1-pve module SYSSRC=/lib/modules/5.15.158-1-pve/build...(bad exit status: 2)
         Error! Bad return status for module build on kernel: 5.15.158-1-pve (x86_64)
         Consult /var/lib/dkms/nvidia/304.137/build/make.log for more information.


/var/lib/dkms/nvidia/304.137/build/make.log contains the following:

Code:
*** Unable to determine the target kernel version. ***
make: *** [makefile:53: select_makefile] Error 1
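For longer build logs, the failing lines can be pulled out by filtering on the "***" marker that make uses for errors. This is only a sketch; the function name make_errors is hypothetical, and the path in the usage comment is the one DKMS reported above.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of a make log that contain the "***"
# error marker, each prefixed with its line number.
make_errors() {
    # -F treats "***" as a literal string rather than a regular expression.
    grep -nF '***' "$1"
}

# Example (not executed here):
#   make_errors /var/lib/dkms/nvidia/304.137/build/make.log
```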


The /var/log/nvidia-installer.log contents are as follows:

Code:
nvidia-installer log file '/var/log/nvidia-installer.log'
creation time: Mon Jul  1 15:48:47 2024
installer version: 304.137


PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin


nvidia-installer command line:
    ./nvidia-installer
    --dkms


Using: nvidia-installer ncurses v6 user interface
-> License accepted.
-> Installing NVIDIA driver version 304.137.
-> There appears to already be a driver installed on your system (version: 304.137).  As part of installing this driver (version: 304.137), the existing driver wil>
-> Would you like to register the kernel module sources with DKMS? This will allow DKMS to automatically build a new module, if you install a different kernel late>
-> Installing both new and classic TLS OpenGL libraries.
-> Installing classic TLS 32bit OpenGL libraries.
-> Install NVIDIA's 32-bit compatibility OpenGL libraries? (Answer: Yes)
-> Uninstalling the previous installation with /usr/bin/nvidia-uninstall.
-> Searching for conflicting X files:
-> done.
-> Searching for conflicting OpenGL files:
-> done.
-> Installing 'NVIDIA Accelerated Graphics Driver for Linux-x86_64' (304.137):
   executing: '/usr/sbin/ldconfig'...
   executing: '/usr/sbin/depmod -aq'...
   depmod: WARNING: Ignored deprecated option -q
-> done.
-> Driver file installation is complete.
-> Installing DKMS kernel module:
ERROR: Failed to run `/usr/sbin/dkms build -m nvidia -v 304.137 -k 5.15.158-1-pve`:
Kernel preparation unnecessary for this kernel.  Skipping...


Building module:
cleaning build area...
make -j32 KERNELRELEASE=5.15.158-1-pve module SYSSRC=/lib/modules/5.15.158-1-pve/build...(bad exit status: 2)
Error! Bad return status for module build on kernel: 5.15.158-1-pve (x86_64)
Consult /var/lib/dkms/nvidia/304.137/build/make.log for more information.
-> error.
ERROR: Failed to install the kernel module through DKMS. No kernel module was installed; please try installing again without DKMS, or check the DKMS logs for more >
ERROR: Installation has failed.  Please see the file '/var/log/nvidia-installer.log' for details.  You may find suggestions on fixing installation problems in the >


Any advice on how I can get the drivers for the RTX 670 to work?

The server is a PowerEdge 720XD and lspci shows the card.
 
