NVIDIA vGPU for LXC containers, stuck at renderD128 w/no major/minor

canuckcam

Member
Jun 12, 2021
Looking to get vGPU support working in Plex and Frigate LXC containers. I'm stuck at the point where I should see a major/minor pair next to renderD128, but it shows up owned by root with no numbers. I chmod'd it to 666 and tried assigning it to the render group, but I have no idea how to fix the major/minor, where I'd expect to see 226 and a minor number, so I can pass it through to the LXC.

Code:
root@pve:~# ls -l /dev/dri
total 0
drwxr-xr-x 2 root root       60 Aug 29 01:17 by-path
crw-rw---- 1 root video  226, 0 Aug 29 01:17 card0
-rw-rw-rw- 1 root render      0 Aug 29 01:18 renderD128
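Here's roughly how I've been checking the node; the removal/mknod steps are a sketch of what I think should happen, not a verified fix:

```shell
# A working render node is a character device ('c' in the first column) with
# major 226 -- the listing above shows renderD128 as a regular file ('-'),
# so something created a plain file at that path before the driver could.
stat -c '%F major:%t minor:%T' /dev/dri/renderD128 2>/dev/null || echo "renderD128 missing"

# Possible fix (assumption, not verified on this box):
# rm /dev/dri/renderD128                  # remove the bogus regular file
# mknod /dev/dri/renderD128 c 226 128     # recreate only if the kernel actually exposes it
# chgrp render /dev/dri/renderD128 && chmod 660 /dev/dri/renderD128
```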

HP Z4 G4, Xeon W2135
Proxmox 8.2.4, 6.8.12-1-pve kernel
NVIDIA NVS 310 for local console
Tesla P4 for vGPU passthrough to Plex and Frigate LXCs
PVE has NVIDIA 535.161.05-vgpu-kvm drivers, patched with PolloLoco

I've followed the PolloLoco instructions and can successfully pass a vGPU through to a Windows VM, but I've been working on the LXC issue for weeks. I think I've read pretty much every Reddit, GitHub, and Proxmox forum thread on it, to the point that I'm getting muddled from information overload. It's also hard to tell which advice is current: there's lots of information on passthrough and VM usage but very little for LXCs. I've tried to install the v17 550.54.10 drivers, but I can't get them to compile on the 6.8 kernel, and I should probably stick with v16 anyway given the older Tesla P4. Would going with an Intel Arc GPU make things easier?
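For context, the container config I've been trying is along these lines (VMID 100 and the exact device lines are examples pieced together from the guides I followed, not a known-good config):

```
# /etc/pve/lxc/100.conf (excerpt; VMID 100 is an example)
lxc.cgroup2.devices.allow: c 226:* rwm
lxc.cgroup2.devices.allow: c 195:* rwm
lxc.mount.entry: /dev/dri dev/dri none bind,optional,create=dir
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
```

Major 226 is the DRM devices and 195 the NVIDIA character devices, which is why the missing major/minor on renderD128 blocks the cgroup allow rule from matching.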


nvidia-smi
Code:
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.05             Driver Version: 535.161.05   CUDA Version: N/A      |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla P4                       Off | 00000000:15:00.0 Off |                    0 |
| N/A   52C    P0              24W /  75W |     31MiB /  7680MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                       
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
 
Sorry, I tried playing with it, but for now I've decided not to use the NVIDIA card :rolleyes:
 
This one is a big issue: vGPU requires the KVM host driver, and only the consumer host driver includes the UVM kernel module and CUDA support.

The closest I have gotten is detailed in this thread:
Merge patch for nvidia drivers to get LXC containers gpu access working

I used the linked GitHub repo to create a custom merged driver combining the KVM and consumer versions, but this created a few other issues. It worked for the most part; the biggest issue was that it didn't seem to give fully proper access to the host (probably a misconfiguration somewhere, since this isn't a common setup), and it no longer allowed 100% VRAM allocation to a vGPU profile (the 8 GB card was limited to 7 GB max in vGPU) because the host now consumes VRAM for the CUDA and UVM features that are normally missing.

(From my understanding, the LXC basically gets direct access to the host and its features, so the GPU has to provide those features to the host itself rather than through a vGPU profile.)

With this setup I was using NVIDIA features in Docker on an LXC and in Jellyfin on an LXC, but transcoding seemed buggy and I'm not sure everything was allocating correctly; as I said, probably a configuration issue.
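To check whether a given host driver build actually provides what the container needs, I was running something like this (a diagnostic sketch; the output depends entirely on which driver is loaded):

```shell
# CUDA inside an LXC needs nvidia_uvm loaded on the host; the stock
# vgpu-kvm driver doesn't ship it, which is what the merge patch adds.
lsmod | grep -E '^nvidia(_uvm|_drm|_modeset)?' || echo "no nvidia modules loaded"

# The container bind-mounts these nodes, so they must exist on the host:
ls -l /dev/nvidia0 /dev/nvidiactl /dev/nvidia-uvm 2>/dev/null || echo "nvidia device nodes missing"
```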
 
Maybe the only way is to create a virtual environment, like Ubuntu or another system, and pass the vGPU to it. But the reason I don't like this is that it allocates persistent resources to the virtual environment. o_O
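For the VM route, the vGPU assignment ends up as a single line in the VM config (VMID 101 and the mdev profile name below are examples; the actual profile names can be listed under the card's `mdev_supported_types` in sysfs):

```
# /etc/pve/qemu-server/101.conf (excerpt; VMID and mdev profile are examples)
hostpci0: 0000:15:00.0,mdev=nvidia-63
```

The bus ID matches the Tesla P4 shown in nvidia-smi above; the downside, as noted, is that the VM holds those vGPU resources persistently, unlike an LXC sharing the host driver.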