NVIDIA vGPU for LXC containers, stuck at renderD128 w/no major/minor

canuckcam

Member
Jun 12, 2021
Looking to get vGPU support working in Plex and Frigate LXC containers. I'm stuck at the point where I should see a major/minor pair next to renderD128, but it shows up owned by root with no numbers. I chmod'd it to 666 and tried assigning it to the render group, but I have no idea how to fix the major/minor, where I'd expect to see 226 and a minor number, so I can pass it through to the LXC.

Code:
root@pve:~# ls -l /dev/dri
total 0
drwxr-xr-x 2 root root       60 Aug 29 01:17 by-path
crw-rw---- 1 root video  226, 0 Aug 29 01:17 card0
-rw-rw-rw- 1 root render      0 Aug 29 01:18 renderD128
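Here's roughly how I've been checking the node; the removal/mknod steps are a sketch of what I think should happen, not a verified fix:

```shell
# A working render node is a character device ('c' in the first column) with
# major 226 -- the listing above shows renderD128 as a regular file ('-'),
# so something created a plain file at that path before the driver could.
stat -c '%F major:%t minor:%T' /dev/dri/renderD128 2>/dev/null || echo "renderD128 missing"

# Possible fix (assumption, not verified on this box):
# rm /dev/dri/renderD128                  # remove the bogus regular file
# mknod /dev/dri/renderD128 c 226 128     # recreate only if the kernel actually exposes it
# chgrp render /dev/dri/renderD128 && chmod 660 /dev/dri/renderD128
```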

HP Z4 G4, Xeon W2135
Proxmox 8.2.4, 6.8.12-1-pve kernel
NVIDIA NVS 310 for local console
Tesla P4 for vGPU passthrough to Plex and Frigate LXCs
PVE has NVIDIA 535.161.05-vgpu-kvm drivers, patched with PolloLoco

I've followed the PolloLoco instructions and can successfully pass a vGPU through to a Windows VM, but I've been working on the LXC issue for weeks. I think I've read pretty much every Reddit, GitHub, and Proxmox forum thread on it, to the point that I'm getting muddled from information overload. It's also hard to tell which advice is current: there's lots of information on passthrough and VM usage but very little for LXCs. I've tried to install the v17 550.54.10 drivers, but I can't get them to compile on the 6.8 kernel, and I should probably stick with v16 anyway given the older Tesla P4. Would going with an Intel Arc GPU make things easier?
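For context, the container config I've been trying is along these lines (VMID 100 and the exact device lines are examples pieced together from the guides I followed, not a known-good config):

```
# /etc/pve/lxc/100.conf (excerpt; VMID 100 is an example)
lxc.cgroup2.devices.allow: c 226:* rwm
lxc.cgroup2.devices.allow: c 195:* rwm
lxc.mount.entry: /dev/dri dev/dri none bind,optional,create=dir
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
```

Major 226 is the DRM devices and 195 the NVIDIA character devices, which is why the missing major/minor on renderD128 blocks the cgroup allow rule from matching.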


nvidia-smi
Code:
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.05             Driver Version: 535.161.05   CUDA Version: N/A      |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla P4                       Off | 00000000:15:00.0 Off |                    0 |
| N/A   52C    P0              24W /  75W |     31MiB /  7680MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                       
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
 
Sorry, I tried playing with it, but for now I've decided not to use the NVIDIA card :rolleyes:
 
This one is a big issue: vGPU requires the KVM host driver, and only the consumer host driver includes the UVM kernel module and CUDA support.

The closest I have gotten is detailed in this thread:
Merge patch for nvidia drivers to get LXC containers gpu access working

I used the linked GitHub repo to create a custom merged driver combining the KVM and consumer versions, but this created a few other issues. It worked for the most part; the biggest issue was that it didn't seem to give fully proper access to the host (probably a misconfiguration somewhere, since this isn't a common setup), and it no longer allowed 100% VRAM allocation to a vGPU profile (the 8 GB card was limited to 7 GB max in vGPU) because the host now consumes VRAM for the CUDA and UVM features that are normally missing.

(From my understanding, the LXC basically gets direct access to the host and its features, so the GPU has to provide those features to the host itself rather than through a vGPU profile.)

With this setup I was using NVIDIA features in Docker on an LXC and in Jellyfin on an LXC, but transcoding seemed buggy and I'm not sure everything was allocating correctly; as I said, probably a configuration issue.
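To check whether a given host driver build actually provides what the container needs, I was running something like this (a diagnostic sketch; the output depends entirely on which driver is loaded):

```shell
# CUDA inside an LXC needs nvidia_uvm loaded on the host; the stock
# vgpu-kvm driver doesn't ship it, which is what the merge patch adds.
lsmod | grep -E '^nvidia(_uvm|_drm|_modeset)?' || echo "no nvidia modules loaded"

# The container bind-mounts these nodes, so they must exist on the host:
ls -l /dev/nvidia0 /dev/nvidiactl /dev/nvidia-uvm 2>/dev/null || echo "nvidia device nodes missing"
```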
 
Maybe the only way is to create a virtual environment, like Ubuntu or another system, and pass the vGPU to it. But the reason I don't like this is that it allocates persistent resources to the virtual environment. o_O
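For the VM route, the vGPU assignment ends up as a single line in the VM config (VMID 101 and the mdev profile name below are examples; the actual profile names can be listed under the card's `mdev_supported_types` in sysfs):

```
# /etc/pve/qemu-server/101.conf (excerpt; VMID and mdev profile are examples)
hostpci0: 0000:15:00.0,mdev=nvidia-63
```

The bus ID matches the Tesla P4 shown in nvidia-smi above; the downside, as noted, is that the VM holds those vGPU resources persistently, unlike an LXC sharing the host driver.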