Tesla P4 | Cannot get drivers installed at all!!!

Thanks for that, now I have the kernel error again haha.. What a mess, maybe I should blow it all away and start with unraid or something.

ERROR: An error occurred while performing the step: "Building kernel modules".
 
You’re welcome. Which kernel are you running? (Type uname -r if you’re unsure.)
 
That should work fine with 535.230 or similar, I believe. Do you have kernel headers installed? If they are missing, that will cause the build failure; you can install them with apt install pve-headers.
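A minimal sketch of the headers check, in case it helps. It only builds the package name for a given kernel release; the assumption here (not confirmed anywhere in this thread) is that the headers package is named "prefix-release", with the prefix being pve-headers on older Proxmox releases and proxmox-headers on newer ones. The helper name headers_package is made up for illustration.

```shell
#!/bin/sh
# Sketch: build the kernel-headers package name for a given kernel release.
# Assumption: the package is named "<prefix>-<kernel release>".
headers_package() {
    prefix="$1"      # e.g. pve-headers, as mentioned in the post above
    release="$2"     # e.g. the output of: uname -r
    printf '%s-%s\n' "$prefix" "$release"
}

# Usage on the host (not run automatically):
#   pkg=$(headers_package pve-headers "$(uname -r)")
#   dpkg -s "$pkg" >/dev/null 2>&1 || apt install "$pkg"
```

If dpkg -s reports the package as not installed, the "Building kernel modules" step has nothing to compile against, which would match the error you posted.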

Which driver version specifically are you installing?
 
Well, now I'm trying to install the following v16 driver:
./NVIDIA-Linux-x86_64-535.54.06-vgpu-kvm.run --dkms -m=kernel
 
Ah, okay. I think that’s the 16.5 one, which requires a compatibility patch for 6.8 because of file changes in the 6.8 kernel. That explains why it won’t build for you.

You can either use a newer 16.x version (16.9, which is 535.230.02, is best) or go to the polloloco page below, follow the instructions there, and apply the patch. (I recommend applying the patch I linked to on the first page, plus greendam’s new patch that is also linked in that same thread. Then you can use 17.5 with driver 553 in guests, and even update to the 6.14 kernel if you’re going with the KVM drivers.)

But remember: anything vgpu-kvm means vGPU, not LXC. It does not support CUDA etc. on the host or in an LXC, and it will not work with LXC. (In my experience, though, using a Tesla P4 with LXC hasn’t actually worked on any driver, no matter what I did with it.)

https://gitlab.com/polloloco/vgpu-proxmox
 
ok, :( So Tesla P4 = Burden
Don’t get me wrong, it’s a great vGPU card and it works very well for a lot of things. It has a lot of performance and potential for a 75 W card, and I love mine. But I haven’t been able to get it working with LXCs at all. I hope to one day, for Jellyfin, Immich and a few other things, but I have yet to find a way. LXC doesn’t seem to be designed well for GPU access to begin with, so it’s kind of a mess of problems.

It would probably be easier to either forget about the GPU in an LXC, or just make it a VM and use a vGPU profile for Plex.

(It’s also getting harder and more complicated since NVIDIA dropped support for it in newer drivers. That was a fairly recent development, since the card is getting older.)
 
So, I downloaded the EXACT driver and patched it as per the guide. So far it installs well.

root@R730Node01:~/newNvidia# nvidia-smi vgpu
Tue Apr 15 12:07:26 2025
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.05 Driver Version: 535.161.05 |
|---------------------------------+------------------------------+------------+
| GPU Name | Bus-Id | GPU-Util |
| vGPU ID Name | VM ID VM Name | vGPU-Util |
|=================================+==============================+============|
| 0 Tesla P4 | 00000000:04:00.0 | 0% |
+---------------------------------+------------------------------+------------+
root@R730Node01:~/newNvidia#
root@R730Node01:~/newNvidia#
root@R730Node01:~/newNvidia# nvidia-smi
Tue Apr 15 12:07:35 2025
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.05 Driver Version: 535.161.05 CUDA Version: N/A |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Tesla P4 Off | 00000000:04:00.0 Off | 0 |
| N/A 61C P0 24W / 75W | 31MiB / 7680MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
root@R730Node01:~/newNvidia#
 
If mdevctl types lists your mdev profiles, then you should be good to go for vGPU. The guide can help you with most related things.

You might want to go to Datacenter → Resource Mappings and create a new mapping for your P4 to make it easier to add to VMs.

There’s more here too, of course:
https://pve.proxmox.com/wiki/NVIDIA_vGPU_on_Proxmox_VE

And as the guide shows, you can use profile overrides.

I usually set them per VM unless I need all 8 GB. Without overriding, you can only use the same profile in every VM: no 2 GB + 4 GB etc., only 4 GB + 4 GB or 2 GB + 2 GB. With overrides you can set any VM to anything you have the free VRAM for. That helps a lot, since some VMs really only need 512 MB to 1 GB.
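The "anything you have the free VRAM for" part is just addition, roughly sketched below. This is only illustrative arithmetic, not something from the guide: fits_vram is a made-up helper, the 8192 figure is the full framebuffer of the 8 GB profiles, and it ignores any memory the driver may reserve for itself.

```shell
#!/bin/sh
# Sketch: check whether a mix of per-VM framebuffer sizes fits on the card.
# Usage: fits_vram TOTAL_MB SIZE_MB [SIZE_MB ...]
# Assumption: no driver-side reservation is taken into account.
fits_vram() {
    total="$1"; shift
    used=0
    for size in "$@"; do
        used=$((used + size))      # sum the requested framebuffers
    done
    if [ "$used" -le "$total" ]; then
        echo "ok: ${used}M of ${total}M allocated"
    else
        echo "over: ${used}M requested, only ${total}M available"
        return 1
    fi
}

# Usage (not run automatically):
#   fits_vram 8192 2048 4096 1024    # a mixed 2 GB + 4 GB + 1 GB layout
```

So a mixed 2 GB + 4 GB + 1 GB layout fits with overrides, while 4 GB + 4 GB + 512 MB does not.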
 
mdevctl doesn't show anything, but nvidia-smi vgpu does.

So from the HOST:
root@R730Node01:~# ls -l /dev/nv*
crw-rw-rw- 1 root root 195, 0 Apr 15 12:04 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Apr 15 12:04 /dev/nvidiactl
crw------- 1 root root 10, 122 Apr 15 11:19 /dev/nvme-fabrics
crw------- 1 root root 10, 144 Apr 15 11:19 /dev/nvram

/dev/nvidia-caps:
total 0
cr-------- 1 root root 235, 1 Apr 15 12:04 nvidia-cap1
cr--r--r-- 1 root root 235, 2 Apr 15 12:04 nvidia-cap2
root@R730Node01:~# nvidia-smi
Tue Apr 15 12:41:04 2025
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.05 Driver Version: 535.161.05 CUDA Version: N/A |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Tesla P4 On | 00000000:04:00.0 Off | 0 |
| N/A 59C P8 11W / 75W | 31MiB / 7680MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
root@R730Node01:~# nvidia-smi-vgpu
-bash: nvidia-smi-vgpu: command not found
root@R730Node01:~# nvidia-smi vgpu
Tue Apr 15 12:41:24 2025
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.05 Driver Version: 535.161.05 |
|---------------------------------+------------------------------+------------+
| GPU Name | Bus-Id | GPU-Util |
| vGPU ID Name | VM ID VM Name | vGPU-Util |
|=================================+==============================+============|
| 0 Tesla P4 | 00000000:04:00.0 | 0% |
+---------------------------------+------------------------------+------------+
root@R730Node01:~#



FROM THE LXC:
root@r730plex:~# ls -l /dev/nv*
crw-rw-rw- 1 root root 195, 0 Apr 15 12:04 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Apr 15 12:04 /dev/nvidiactl
---------- 1 root root 0 Apr 15 12:25 /dev/nvidia-modeset
---------- 1 root root 0 Apr 15 12:25 /dev/nvidia-uvm
---------- 1 root root 0 Apr 15 12:25 /dev/nvidia-uvm-tools
crw------- 1 root root 10, 144 Apr 15 11:19 /dev/nvram

/dev/nvidia-caps:
total 0
cr-------- 1 root root 235, 1 Apr 15 12:25 nvidia-cap1
cr--r--r-- 1 root root 235, 2 Apr 15 12:25 nvidia-cap2
root@r730plex:~# nvidia-smi
Tue Apr 15 12:42:17 2025
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.05 Driver Version: 535.161.05 CUDA Version: N/A |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Tesla P4 On | 00000000:04:00.0 Off | 0 |
| N/A 59C P8 11W / 75W | 31MiB / 7680MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
root@r730plex:~# nvidia-smi vgpu
Tue Apr 15 12:42:26 2025
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.05 Driver Version: 535.161.05 |
|---------------------------------+------------------------------+------------+
| GPU Name | Bus-Id | GPU-Util |
| vGPU ID Name | VM ID VM Name | vGPU-Util |
|=================================+==============================+============|
| 0 Tesla P4 | 00000000:04:00.0 | 0% |
+---------------------------------+------------------------------+------------+
root@r730plex:~#


And my LXC conf file is as follows:

cores: 4
features: nesting=1
hostname: r730plex
memory: 8192
nameserver: 192.168.1.1
net0: name=eth0,bridge=vmbr0,gw=192.168.1.1,hwaddr=BC:24:11:B0:4D:C5,ip=192.168.1.105/24,type=veth
onboot: 1
ostype: ubuntu
rootfs: VMstorage:subvol-105-disk-0,size=8G
swap: 512
tags: community-script;media
lxc.cgroup2.devices.allow: a
lxc.cap.drop:
lxc.cgroup2.devices.allow: c 188:* rwm
lxc.cgroup2.devices.allow: c 189:* rwm
lxc.mount.entry: /dev/serial/by-id dev/serial/by-id none bind,optional,create=dir
lxc.mount.entry: /dev/ttyUSB0 dev/ttyUSB0 none bind,optional,create=file
lxc.mount.entry: /dev/ttyUSB1 dev/ttyUSB1 none bind,optional,create=file
lxc.mount.entry: /dev/ttyACM0 dev/ttyACM0 none bind,optional,create=file
lxc.mount.entry: /dev/ttyACM1 dev/ttyACM1 none bind,optional,create=file
lxc.cgroup2.devices.allow: c 195:0 rw
lxc.cgroup2.devices.allow: c 195:255 rw
lxc.cgroup2.devices.allow: c 195:254 rw
lxc.cgroup2.devices.allow: c 509:0 rw
lxc.cgroup2.devices.allow: c 509:1 rw
lxc.cgroup2.devices.allow: c 10:144 rw
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-modeset dev/nvidia-modeset none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file
lxc.mount.entry: /dev/nvram dev/nvram none bind,optional,create=file


BUT, Plex still cannot see the GPU.
 
You can try systemctl restart nvidia-vgpu-mgr.service to get the mdevctl types command and vgpu working.

But as I said above, the KVM driver will not work for LXCs. You have to use the standard non-KVM driver to try to get your LXC to work.
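One thing visible in your listing from the LXC: /dev/nvidia-modeset, /dev/nvidia-uvm and /dev/nvidia-uvm-tools show up there as empty entries with no permissions rather than as "c" devices, and they do not exist on the host at all. That is what I would expect when a bind mount with optional,create=file has no source node to bind. A small sketch to check this from inside the container (dev_status is a made-up helper name):

```shell
#!/bin/sh
# Sketch: report whether a path is a real character device,
# a placeholder of some other type, or missing entirely.
dev_status() {
    if [ -c "$1" ]; then
        echo "$1: character device"
    elif [ -e "$1" ]; then
        echo "$1: exists but is not a character device"
    else
        echo "$1: missing"
    fi
}

# Usage inside the container (not run automatically):
#   for d in /dev/nvidia0 /dev/nvidiactl /dev/nvidia-uvm /dev/nvidia-uvm-tools; do
#       dev_status "$d"
#   done
```

Anything reported as "exists but is not a character device" is only a placeholder file, so an application that tries to open it gets nothing useful.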
 
Ok, I will find the non kvm driver for the LXC. But last time I tried this, I got driver mismatch issues.
 
With a driver mismatch, you need to uninstall the GRID driver from the LXC and install exactly the same standard (non-KVM, non-GRID) driver as the host, with --no-kernel-modules.

This also often occurs with versions above 535 when using the P4. I’ve had it happen that with 17.x / 550, any driver in an LXC or in Linux VMs with vGPU reports a mismatch. (People using the KVM driver 17.5 / 553 need the patch linked on the first page.)
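To rule out a plain version difference first, you could compare the two version strings directly. A rough sketch, where check_driver_match is a made-up helper and the query in the usage comment is nvidia-smi's standard driver_version query:

```shell
#!/bin/sh
# Sketch: compare the driver version reported on the host with the one
# reported inside the container. Both are passed in as plain strings.
check_driver_match() {
    host="$1"
    guest="$2"
    if [ "$host" = "$guest" ]; then
        echo "match: $host"
    else
        echo "mismatch: host $host, container $guest"
        return 1
    fi
}

# Usage (run the query once on the host and once in the LXC, not automatic):
#   nvidia-smi --query-gpu=driver_version --format=csv,noheader
#   check_driver_match 535.161.05 535.161.05
```

The versions have to be identical down to the last digit; 535.161.05 on the host and 535.230.02 in the container still counts as a mismatch.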
 
So I keep getting errors when I try to install any other drivers on the host. At this stage, I'm looking at moving to UNRAID. For 2025, this is just mental.
 
From within the PLEX LXC


root@R730Node01:~# mdevctl types
0000:04:00.0
nvidia-157
Available instances: 4
Device API: vfio-pci
Name: GRID P4-2B
Description: num_heads=4, frl_config=45, framebuffer=2048M, max_resolution=5120x2880, max_instance=4
nvidia-214
Available instances: 4
Device API: vfio-pci
Name: GRID P4-2B4
Description: num_heads=4, frl_config=45, framebuffer=2048M, max_resolution=5120x2880, max_instance=4
nvidia-243
Available instances: 8
Device API: vfio-pci
Name: GRID P4-1B4
Description: num_heads=4, frl_config=45, framebuffer=1024M, max_resolution=5120x2880, max_instance=8
nvidia-63
Available instances: 8
Device API: vfio-pci
Name: GRID P4-1Q
Description: num_heads=4, frl_config=60, framebuffer=1024M, max_resolution=5120x2880, max_instance=8
nvidia-64
Available instances: 4
Device API: vfio-pci
Name: GRID P4-2Q
Description: num_heads=4, frl_config=60, framebuffer=2048M, max_resolution=7680x4320, max_instance=4
nvidia-65
Available instances: 2
Device API: vfio-pci
Name: GRID P4-4Q
Description: num_heads=4, frl_config=60, framebuffer=4096M, max_resolution=7680x4320, max_instance=2
nvidia-66
Available instances: 1
Device API: vfio-pci
Name: GRID P4-8Q
Description: num_heads=4, frl_config=60, framebuffer=8192M, max_resolution=7680x4320, max_instance=1
nvidia-67
Available instances: 8
Device API: vfio-pci
Name: GRID P4-1A
Description: num_heads=1, frl_config=60, framebuffer=1024M, max_resolution=1280x1024, max_instance=8
nvidia-68
Available instances: 4
Device API: vfio-pci
Name: GRID P4-2A
Description: num_heads=1, frl_config=60, framebuffer=2048M, max_resolution=1280x1024, max_instance=4
nvidia-69
Available instances: 2
Device API: vfio-pci
Name: GRID P4-4A
Description: num_heads=1, frl_config=60, framebuffer=4096M, max_resolution=1280x1024, max_instance=2
nvidia-70
Available instances: 1
Device API: vfio-pci
Name: GRID P4-8A
Description: num_heads=1, frl_config=60, framebuffer=8192M, max_resolution=1280x1024, max_instance=1
nvidia-71
Available instances: 8
Device API: vfio-pci
Name: GRID P4-1B
Description: num_heads=4, frl_config=45, framebuffer=1024M, max_resolution=5120x2880, max_instance=8

root@R730Node01:~#
 
It looks like you’re still using the KVM/GRID driver. It will NOT work for LXCs. I, and many others, really wish it would.

This is not the fault of Proxmox, and you would have the same problem on Unraid if you tried to use the wrong driver for a function that the driver does not support. The KVM driver has NO CUDA, so it literally cannot function in the standard way. The component is missing, and Plex requires CUDA or it will not properly see the card.
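You can see this in the nvidia-smi output you posted: the header says "CUDA Version: N/A". A small sketch that pulls that field out so you can check it after each driver change. cuda_version_field is a made-up helper, and the assumption is that a standard driver prints a version number in that field instead of N/A.

```shell
#!/bin/sh
# Sketch: read nvidia-smi output on stdin and print the value of the
# "CUDA Version" field from its header line.
cuda_version_field() {
    sed -n 's/.*CUDA Version: *\([^ |]*\).*/\1/p' | head -n 1
}

# Usage (not run automatically):
#   nvidia-smi | cuda_version_field
```

As long as that prints N/A, the KVM/vGPU host driver is the one loaded and Plex in an LXC has no CUDA to work with.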


You have to either:
  1. get standard drivers from the repo
  2. download a standard 535 driver from NVIDIA
  3. give up on the LXC working
Those are the only options here.

What errors are you running into with the other drivers, and which driver specifically are you trying to install?

The ONLY way to continue using the KVM driver is to set up Plex as a VM and use vGPU.
 
Where can I find the normal 535 driver? Because every time I change from the GRID drivers, I get a mismatch or an error. Can I run the GRID drivers on the host, just not in the LXC?

btw, I tried using a full Ubuntu VM, and it still fails lol