Guide: Proxmox + Older NVIDIA GPUs (GTX 1060 / P4000 / V100) LXC Passthrough

itamarbudin

Aug 13, 2024
After spending way too much time fighting NVIDIA driver issues, DKMS errors, and Proxmox kernel issues, I finally got a clean, reliable setup working for older NVIDIA GPUs with LXC passthrough.

This setup works well for:
  • GTX 1060 / 1070 / 1080
  • Quadro P4000 / P5000
  • Tesla V100
  • Most Pascal / Volta cards
Use cases:
  • Ollama
  • CUDA
  • Docker
  • Plex transcoding
  • Local LLMs
  • AI inference
  • GPU acceleration inside LXC containers

IMPORTANT NOTES

You do NOT need:
  • NVIDIA vGPU
  • enterprise licensing
  • pve-nvidia-vgpu-helper
  • patched kernels
  • weird passthrough scripts
Also:
  • Avoid Linux kernel 7.x for now with older Pascal/Volta GPUs
  • Many users are hitting DKMS compile failures on kernel 7.x
  • Kernel 6.x is much more stable at the moment
Recommended:
  • Proxmox kernel 6.8 or 6.14
  • NVIDIA 550 or 575 drivers
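
Before installing anything, it can help to confirm which kernel the host is actually running. The snippet below is a minimal sketch of such a pre-flight check; the function names kernel_major and check_kernel are made up for this example, and the 7.x cutoff simply mirrors the recommendation above.

```shell
#!/bin/sh
# Pre-flight sketch: warn when the running kernel is 7.x or newer.

# Print the major version of a kernel release string, e.g. "6.8.12-4-pve" -> "6".
kernel_major() {
    printf '%s\n' "${1%%.*}"
}

# Print OK or WARN for a release string (defaults to the running kernel).
check_kernel() {
    release="${1:-$(uname -r)}"
    major="$(kernel_major "$release")"
    if [ "$major" -ge 7 ] 2>/dev/null; then
        echo "WARN: kernel $release is 7.x or newer; DKMS builds for Pascal/Volta may fail"
    else
        echo "OK: kernel $release"
    fi
}

check_kernel
```

Run it on the Proxmox host; pass a release string as the first argument to check a kernel you are only planning to boot.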

HOST INSTALLATION

Install required packages:

Code:
apt update
apt install -y build-essential dkms pve-headers-$(uname -r)

Download NVIDIA driver (550 stable branch):

Code:
wget https://us.download.nvidia.com/XFree86/Linux-x86_64/550.163.01/NVIDIA-Linux-x86_64-550.163.01.run

Alternative newer driver (575 branch):

Code:
wget https://us.download.nvidia.com/tesla/575.57.08/NVIDIA-Linux-x86_64-575.57.08.run

Make executable:

Code:
chmod +x NVIDIA-Linux-x86_64-550.163.01.run

OR for 575:

Code:
chmod +x NVIDIA-Linux-x86_64-575.57.08.run

Install driver:

Code:
./NVIDIA-Linux-x86_64-550.163.01.run

OR:

Code:
./NVIDIA-Linux-x86_64-575.57.08.run

Installer recommendations:

  • YES to DKMS
  • NO to OpenGL libraries if headless
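
If you would rather not answer the installer prompts by hand, the same choices can be passed as flags. This is only a sketch: driver_url and print_host_install are helper names invented here, and the script prints the commands for review instead of running them. The --dkms and --no-opengl-files options belong to the NVIDIA .run installer and correspond to the two answers above.

```shell
#!/bin/sh
# Build the download URL for a driver version.
# $1 = driver version, $2 = directory on us.download.nvidia.com
driver_url() {
    printf 'https://us.download.nvidia.com/%s/%s/NVIDIA-Linux-x86_64-%s.run\n' "$2" "$1" "$1"
}

# Print (not run) the host install steps so they can be reviewed first.
print_host_install() {
    file="NVIDIA-Linux-x86_64-$1.run"
    echo "wget $(driver_url "$1" "$2")"
    echo "chmod +x $file"
    echo "./$file --dkms --no-opengl-files"
}

# 550 branch; for 575 use: print_host_install 575.57.08 tesla
print_host_install 550.163.01 XFree86/Linux-x86_64
```

Keeping the version in one place also makes it harder to mix up the 550 and 575 file names later.
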
Reboot:

Code:
reboot

Verify:

Code:
nvidia-smi

You should now see the GPU on the Proxmox host.


LXC GPU PASSTHROUGH

Edit your container config (replace <CTID> with the numeric ID of your container):

Code:
nano /etc/pve/lxc/<CTID>.conf

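Basic passthrough config. Major 195 covers the main NVIDIA device nodes; the other majors (509 and 511 here) are assigned dynamically and can differ between hosts, so confirm them with ls -l /dev/nvidia* before copying: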

Code:
lxc.cgroup2.devices.allow: c 195:* rwm
lxc.cgroup2.devices.allow: c 509:* rwm
lxc.cgroup2.devices.allow: c 511:* rwm

lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-modeset dev/nvidia-modeset none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file
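
Because the device major numbers are not the same on every host, it may be safer to generate the allow lines than to copy them. A rough sketch, assuming the NVIDIA entries in /proc/devices all have names starting with "nvidia"; nvidia_majors and cgroup_allow_lines are hypothetical helper names.

```shell
#!/bin/sh
# Print the unique character-device majors registered under an "nvidia*" name.
# $1 = path to a /proc/devices-style file (defaults to /proc/devices).
nvidia_majors() {
    awk '$2 ~ /^nvidia/ { print $1 }' "${1:-/proc/devices}" | sort -un
}

# Turn those majors into cgroup2 allow lines for the container config.
cgroup_allow_lines() {
    nvidia_majors "$1" | while read -r major; do
        echo "lxc.cgroup2.devices.allow: c ${major}:* rwm"
    done
}

# Run on the Proxmox host after the driver is loaded.
if [ -r /proc/devices ]; then
    cgroup_allow_lines /proc/devices
fi
```

Compare the output with the three allow lines above and adjust the config if the numbers differ.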

Restart the container so the new config is applied (replace <CTID> with the numeric ID of your container):

Code:
pct reboot <CTID>


MULTIPLE GPU SETUPS

If your server has multiple GPUs, things work a little differently.

Example:

  • GPU0 = GTX 1060
  • GPU1 = Tesla V100
  • GPU2 = Quadro P4000
Each GPU gets its own device node:

Code:
/dev/nvidia0
/dev/nvidia1
/dev/nvidia2

You can verify GPU numbering with:

Code:
nvidia-smi -L

or:

Code:
ls -l /dev/nvidia*
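
One caveat: as far as I can tell, the index printed by nvidia-smi is not guaranteed to equal the N in /dev/nvidiaN on every system. The sketch below reads the "Device Minor" field the driver exposes under /proc/driver/nvidia/gpus to map each device node to its PCI address and model. The helper name gpu_minor_map is invented, and the field layout is assumed from hosts I have seen.

```shell
#!/bin/sh
# List "device node <TAB> PCI address <TAB> model" for each GPU known to the driver.
# $1 = base directory (defaults to /proc/driver/nvidia/gpus).
gpu_minor_map() {
    base="${1:-/proc/driver/nvidia/gpus}"
    for info in "$base"/*/information; do
        # Skip silently when the driver is not loaded or the glob did not match.
        [ -r "$info" ] || continue
        awk -F ':[ \t]+' '
            /^Model:/        { model = $2 }
            /^Bus Location:/ { bus = $2 }
            /^Device Minor:/ { minor = $2 }
            END { printf "/dev/nvidia%s\t%s\t%s\n", minor, bus, model }
        ' "$info"
    done
}

gpu_minor_map
```

Match the PCI address against the Bus-Id column of nvidia-smi to be sure which node belongs to which card.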


Passing Through ONLY One Specific GPU

Example: pass ONLY GPU1 (Tesla V100) into container.

Use:

Code:
lxc.mount.entry: /dev/nvidia1 dev/nvidia1 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-modeset dev/nvidia-modeset none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file

This isolates the container to ONLY that GPU.


Passing Through Multiple GPUs

Example: expose GPU0 and GPU2 to the same container:

Code:
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidia2 dev/nvidia2 none bind,optional,create=file

Keep the shared NVIDIA devices:

Code:
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-modeset dev/nvidia-modeset none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file
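
Since the per-GPU and shared entries always follow the same pattern, they can be generated for any set of GPU numbers. A small sketch; gpu_mount_entries is a made-up helper, and the output is meant to be pasted into the container config by hand.

```shell
#!/bin/sh
# Print lxc.mount.entry lines for the given GPU numbers plus the shared nodes.
gpu_mount_entries() {
    # One entry per requested GPU device node.
    for n in "$@"; do
        echo "lxc.mount.entry: /dev/nvidia${n} dev/nvidia${n} none bind,optional,create=file"
    done
    # Control and UVM nodes that every container needs regardless of GPU choice.
    for node in nvidiactl nvidia-modeset nvidia-uvm nvidia-uvm-tools; do
        echo "lxc.mount.entry: /dev/${node} dev/${node} none bind,optional,create=file"
    done
}

# Same result as the GPU0 + GPU2 example above.
gpu_mount_entries 0 2
```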


Restricting CUDA to Specific GPUs

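Inside the container, you can limit which GPUs CUDA applications see. The indexes count only the GPUs passed into the container, so they typically start at 0 regardless of the host numbering: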

Code:
export CUDA_VISIBLE_DEVICES=0

or multiple GPUs:

Code:
export CUDA_VISIBLE_DEVICES=0,1

VERY useful for:

  • Ollama
  • Docker
  • AI workloads
  • Multi-user servers
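
export changes the variable for the whole shell session. On a shared box it is often handier to set it for a single command only. A minimal sketch, where run_on_gpus is a wrapper name invented for this example:

```shell
#!/bin/sh
# Run one command with CUDA limited to the given GPU list.
# The calling shell's environment is left untouched.
run_on_gpus() {
    gpus="$1"
    shift
    env CUDA_VISIBLE_DEVICES="$gpus" "$@"
}

# Demo: the child process sees the value, the parent shell does not keep it.
run_on_gpus 1 sh -c 'echo "visible: $CUDA_VISIBLE_DEVICES"'
```

The demo line prints "visible: 1". Replace the sh -c part with your real workload, for example a Python script or a server process.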

Docker + NVIDIA Toolkit Inside LXC

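If running Docker inside the container, install the NVIDIA Container Toolkit (on many distributions this requires adding NVIDIA's package repository first):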

Code:
apt install -y nvidia-container-toolkit

Test with:

Code:
docker run --rm --gpus all nvidia/cuda:12.3.2-base-ubuntu22.04 nvidia-smi
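
If the test fails with an unknown runtime or cgroup error, Docker probably has not been told about the NVIDIA runtime yet. The sketch below prints the usual follow-up commands and only executes them when DRY_RUN=0 is set. The run and configure_docker_gpu helpers are invented for this example, and whether the no-cgroups setting is needed is an assumption that depends on your container (it is commonly reported for unprivileged LXC).

```shell
#!/bin/sh
# Echo commands by default; set DRY_RUN=0 to actually execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

configure_docker_gpu() {
    # Register the NVIDIA runtime in Docker's daemon configuration.
    run nvidia-ctk runtime configure --runtime=docker
    # Assumption: some unprivileged LXC setups need cgroup handling disabled.
    run nvidia-ctk config --set nvidia-container-cli.no-cgroups --in-place
    # Restart Docker so it picks up the new runtime.
    run systemctl restart docker
}

configure_docker_gpu
```

Review the printed commands first, then rerun with DRY_RUN=0 inside the container as root.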


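INSIDE THE CONTAINER

Download the same NVIDIA driver version used on the host. The userspace libraries must match the host kernel module exactly, otherwise nvidia-smi fails with a driver/library version mismatch.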

550 branch:

Code:
wget https://us.download.nvidia.com/XFree86/Linux-x86_64/550.163.01/NVIDIA-Linux-x86_64-550.163.01.run

575 branch:

Code:
wget https://us.download.nvidia.com/tesla/575.57.08/NVIDIA-Linux-x86_64-575.57.08.run

Install userspace libraries ONLY:

550 example:

Code:
chmod +x NVIDIA-Linux-x86_64-550.163.01.run

./NVIDIA-Linux-x86_64-550.163.01.run --no-kernel-module

575 example:

Code:
chmod +x NVIDIA-Linux-x86_64-575.57.08.run

./NVIDIA-Linux-x86_64-575.57.08.run --no-kernel-module

Test GPU access:

Code:
nvidia-smi

If everything is correct, the container should now see the GPU.
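
If nvidia-smi complains about a version mismatch instead, comparing the kernel module version with the userspace version you installed usually shows the problem quickly. A sketch, assuming /proc/driver/nvidia/version is readable inside the container; module_version and check_versions are hypothetical helper names.

```shell
#!/bin/sh
# Extract the kernel module version from a /proc/driver/nvidia/version-style file.
# $1 = file to read (defaults to /proc/driver/nvidia/version).
module_version() {
    file="${1:-/proc/driver/nvidia/version}"
    [ -r "$file" ] || return 0
    grep '^NVRM version' "$file" | grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1
}

# Compare the module version with the userspace version installed in the container.
# $1 = userspace driver version, $2 = optional version file
check_versions() {
    kernel="$(module_version "$2")"
    if [ "$kernel" = "$1" ]; then
        echo "match: $1"
    else
        echo "mismatch: userspace $1, kernel module ${kernel:-unknown}"
    fi
}

# Pass the version of the .run file you installed in the container.
check_versions 550.163.01
```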


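OPTIONAL: CUDA INSTALLATION INSIDE LXC

Download a CUDA installer that matches the NVIDIA driver branch. The example below pairs with the 575 driver; on the 550 branch, pick a CUDA release that 550 supports instead (CUDA 12.4 shipped with that branch).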

CUDA 12.9.1 + NVIDIA 575.57.08:

Code:
wget https://developer.download.nvidia.com/compute/cuda/12.9.1/local_installers/cuda_12.9.1_575.57.08_linux.run

Make executable:

Code:
chmod +x cuda_12.9.1_575.57.08_linux.run

Run installer:

Code:
./cuda_12.9.1_575.57.08_linux.run

IMPORTANT:
  • Skip driver installation during CUDA setup
  • The NVIDIA driver is already installed on the Proxmox host
  • Inside LXC you only need CUDA userspace libraries/tools
Recommended CUDA components:
  • CUDA Toolkit
  • CUDA Runtime
  • nvcc
  • CUDA libraries
You can usually skip:
  • Driver
  • DKMS
  • Kernel modules
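
The CUDA runfile can also be run without the menu. As a sketch, the helper below prints a toolkit-only install command; --silent and --toolkit are runfile options that skip the bundled driver. The names cuda_runfile and print_cuda_install are invented for this example.

```shell
#!/bin/sh
# Build the runfile name from a CUDA version and its bundled driver version.
cuda_runfile() {
    printf 'cuda_%s_%s_linux.run\n' "$1" "$2"
}

# Print (not run) a toolkit-only, non-interactive install command.
print_cuda_install() {
    echo "sh $(cuda_runfile "$1" "$2") --silent --toolkit"
}

print_cuda_install 12.9.1 575.57.08
```

This prints "sh cuda_12.9.1_575.57.08_linux.run --silent --toolkit", which installs the toolkit and leaves the driver alone.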

VERIFY CUDA

Check GPU visibility:

Code:
nvidia-smi

Check CUDA compiler:

Code:
nvcc --version

You should now have:

  • CUDA working inside LXC
  • GPU passthrough working
  • Docker/Ollama/AI workloads accelerated
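
For a quick repeatable check, both tools can be tested in one go. A tiny sketch with a made-up helper name, check_cmd, that only reports whether each command is on the PATH:

```shell
#!/bin/sh
# Report whether a command is available on the PATH.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "OK: $1"
    else
        echo "MISSING: $1"
    fi
}

for tool in nvidia-smi nvcc; do
    check_cmd "$tool"
done
```

A "MISSING: nvcc" line with a working nvidia-smi usually just means /usr/local/cuda/bin is not on the PATH yet.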

EXAMPLE NVIDIA-SMI OUTPUT

Example nvidia-smi output after successful passthrough:

Code:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 575.57.08              Driver Version: 575.57.08      CUDA Version: 12.9     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Tesla V100-SXM2-16GB           Off |   00000000:65:00.0 Off |                    0 |
| N/A   36C    P0             45W /  300W |       0MiB /  16384MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  Quadro P4000                   Off |   00000000:B3:00.0 Off |                  N/A |
| 34%   42C    P8             12W /  105W |     256MiB /   8192MiB |      3%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  GeForce GTX 1060 6GB           Off |   00000000:C4:00.0 Off |                  N/A |
| 27%   39C    P8              8W /  120W |     128MiB /   6144MiB |      1%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    1   N/A  N/A            2481      C   ollama                                  220MiB |
|    2   N/A  N/A            3154      C   python                                  110MiB |
+-----------------------------------------------------------------------------------------+

Useful commands:

Code:
nvidia-smi

Live monitoring:

Code:
watch -n 1 nvidia-smi

Show GPU list only:

Code:
nvidia-smi -L

Check CUDA compiler:

Code:
nvcc --version
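
For scripts and logs, the CSV query mode of nvidia-smi is easier to parse than the table. The sketch below turns it into one summary line per GPU; mem_summary is a helper name invented here, and the nvidia-smi call is skipped when the tool is not installed.

```shell
#!/bin/sh
# Summarise "index, name, used MiB, total MiB" CSV rows read from stdin.
mem_summary() {
    awk -F', *' '$4 > 0 {
        printf "GPU %s (%s): %d%% of %d MiB used\n", $1, $2, ($3 * 100) / $4, $4
    }'
}

if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index,name,memory.used,memory.total \
        --format=csv,noheader,nounits | mem_summary
fi
```

Fed the Quadro P4000 numbers from the example output above (256 MiB of 8192 MiB), it prints "GPU 1 (Quadro P4000): 3% of 8192 MiB used".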


OPTIONAL: ADD CUDA TO PATH

Add to ~/.bashrc:

Code:
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}

Reload shell:

Code:
source ~/.bashrc
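
Sourcing ~/.bashrc more than once keeps stacking the same directories onto both variables. If that bothers you, a guard like the following sketch adds each directory only once; prepend_once is a made-up helper, not something CUDA ships.

```shell
#!/bin/sh
# Prepend a directory to a colon-separated list only if it is not already present.
# $1 = current list (may be empty), $2 = directory to add
prepend_once() {
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;
        *)        printf '%s\n' "$2${1:+:$1}" ;;
    esac
}

PATH="$(prepend_once "$PATH" /usr/local/cuda/bin)"
LD_LIBRARY_PATH="$(prepend_once "${LD_LIBRARY_PATH:-}" /usr/local/cuda/lib64)"
export PATH LD_LIBRARY_PATH
```

It also avoids a stray trailing colon when LD_LIBRARY_PATH starts out empty.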


FINAL THOUGHTS

This setup ended up being WAY simpler than all the vGPU guides floating around online.

For older NVIDIA cards, standard LXC passthrough works perfectly fine for:

  • Ollama
  • CUDA
  • Docker
  • AI workloads
  • Plex
without needing enterprise NVIDIA features.

Hopefully this saves someone else a few hours of troubleshooting.
 