I'm having trouble getting the NVIDIA drivers to load on my Proxmox 6.2 server. I get the following error when I run nvidia-smi:
root@sparta:~# nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
See my outputs below:
root@sparta:~# lspci | grep -i nvidia
81:00.0 VGA compatible controller: NVIDIA Corporation TU116 [GeForce GTX 1660] (rev a1)
81:00.1 Audio device: NVIDIA Corporation Device 1aeb (rev a1)
81:00.2 USB controller: NVIDIA Corporation Device 1aec (rev a1)
81:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device 1aed (rev a1)
root@sparta:~# nvidia-detect
Detected NVIDIA GPUs:
81:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU116 [GeForce GTX 1660] [10de:2184] (rev a1)
Checking card: NVIDIA Corporation TU116 [GeForce GTX 1660] (rev a1)
Your card is supported by the default drivers.
It is recommended to install the
nvidia-driver
package.
root@sparta:~# apt install nvidia-driver
Reading package lists... Done
Building dependency tree
Reading state information... Done
nvidia-driver is already the newest version (450.80.02-1~bpo10+1).
The following packages were automatically installed and are no longer required:
libnvidia-fatbinaryloader libnvidia-ptxjitcompiler1
Use 'apt autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 3 not upgraded.
Contents of /etc/modprobe.d/pve-blacklist.conf
Code:
# This file contains a list of modules which are not supported by Proxmox VE
# nidiafb see bugreport https://bugzilla.proxmox.com/show_bug.cgi?id=701
#blacklist nvidiafb
I even added the following per another post I found on this topic:
Code:
echo 'nvidia' >> /etc/modules
echo 'nvidia_uvm' >> /etc/modules
vi /etc/udev/rules.d/70-nvidia.rules
# Add the following two lines.
KERNEL=="nvidia", RUN+="/bin/bash -c '/usr/bin/nvidia-smi -L && /bin/chmod 666 /dev/nvidia*'"
KERNEL=="nvidia_uvm", RUN+="/bin/bash -c '/usr/bin/nvidia-modprobe -c0 -u && /bin/chmod 0666 /dev/nvidia-uvm*'"
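For anyone trying to reproduce this, here is a rough sketch of how one might check whether the modules added to /etc/modules were actually loaded after a reboot. The helper name module_loaded is hypothetical (it is not part of any package), and the script only assumes that /proc/modules lists one module per line with the module name in the first field. It is a diagnostic aid under those assumptions, not a fix.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a module name appears as the first
# field of a /proc/modules-style listing.
#   $1 = module name
#   $2 = listing file (optional, defaults to /proc/modules)
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Report the state of the two modules named in /etc/modules above.
for mod in nvidia nvidia_uvm; do
    if module_loaded "$mod"; then
        echo "$mod: loaded"
    else
        echo "$mod: NOT loaded"
    fi
done

# Running kernel version, to compare against whatever the driver module
# was built for.
uname -r

# If dkms is installed, list the modules it knows about and their build state.
if command -v dkms >/dev/null 2>&1; then
    dkms status
fi

# Any kernel messages mentioning the driver (may need root; empty output is fine).
dmesg 2>/dev/null | grep -i -E 'nvidia|nouveau' || true
```

If both modules show as NOT loaded, comparing the kernel version printed by uname -r with the dkms status output may indicate whether a module was ever built for the running kernel.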