GPU Passthrough to LXC Container

I've been having a GPU passthrough issue on a Dell R720, passing the GPU to an Ubuntu 22.04 container. The Proxmox host looks fine and I'm able to see the /dev/nvidia device files in the Ubuntu container, but no CUDA-capable device is detected in the container. I tried this on Proxmox VE 7.2-7.


On the Proxmox host:
---------------------------------
# cat /etc/default/grub

GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"

# cat /etc/pve/lxc/101.conf
lxc.cgroup.devices.allow: c 195:* rwm
lxc.cgroup.devices.allow: c 509:* rwm
lxc.cgroup.devices.allow: c 226:* rwm
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-modeset dev/nvidia-modeset none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file
lxc.mount.entry: /dev/dri dev/dri none bind,optional,create=dir
lxc.mount.entry: /dev/fb0 dev/fb0 none bind,optional,create=file
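
For reference, the device major numbers used in the allow rules above (195, 226 and the 509 for nvidia-uvm) have to match what the host kernel actually allocated; a quick cross-check on the host, assuming a standard driver setup, is:

Code:
# on the Proxmox host: the number after the group is the major, which the allow rules must match
ls -l /dev/nvidia* /dev/dri/* /dev/fb0
# majors as registered by the kernel
grep -E 'nvidia|drm' /proc/devices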

# nvidia-smi
(screenshot of nvidia-smi output on the host)

On Ubuntu 22.04 LXC Container:
--------------------------------------------------
1. Verified the GPU is available through the device files:
# ll /dev/nvidia*
crw-rw-rw- 1 root root 195, 254 Aug 24 15:22 /dev/nvidia-modeset
crw-rw-rw- 1 root root 509, 0 Aug 24 15:22 /dev/nvidia-uvm
crw-rw-rw- 1 root root 509, 1 Aug 24 15:22 /dev/nvidia-uvm-tools
crw-rw-rw- 1 root root 195, 0 Aug 24 15:22 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Aug 24 15:22 /dev/nvidiactl
# ll /dev/dri/*
crw-rw---- 1 root video 226, 0 Aug 24 15:22 /dev/dri/card0
crw-rw---- 1 root video 226, 1 Aug 24 15:22 /dev/dri/card1
crw-rw---- 1 root syslog 226, 128 Aug 24 15:22 /dev/dri/renderD128
# ll /dev/fb*
crw-rw---- 1 root video 29, 0 Aug 24 15:22 /dev/fb0

2. Installed CUDA (same version as on the host) following this guide: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#ubuntu-installation

3. Installed the CUDA Samples from here: https://github.com/nvidia/cuda-samples

4. Tested that the GPU is available from the container:
# /cudaSamples/cuda-samples/Samples/1_Utilities/deviceQuery/deviceQuery
./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 100
-> no CUDA-capable device is detected
Result = FAIL
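
For anyone reproducing steps 3-4, a minimal build sketch (assuming the Makefile-based layout the cuda-samples repo used at the time; paths may differ) would be:

Code:
# inside the container
git clone https://github.com/nvidia/cuda-samples
cd cuda-samples/Samples/1_Utilities/deviceQuery
make          # builds against the CUDA toolkit installed in step 2
./deviceQuery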


Why is the GPU not available inside the Ubuntu container?

Any help appreciated.
 
Thank you for your response and for sharing a useful link. That is indeed my exact problem: the GPU is not visible inside the LXC container, although CUDA is recognized and installed and the NVIDIA devices are mounted in both the LXC container and the host.

But the resolution must be different, because Proxmox VE 7.2 on the host derives its 5.15-based kernel from Ubuntu 22.04, which is what runs in the LXC container. Also, all the guides that show how to set this up show the NVIDIA devices (ls /dev/nvidia*) owned by root in the LXC container.

What could I be doing wrong?
 
As far as I understood from the link, the files should belong to nobody/nogroup instead of root
(at least according to the original instructions from the superuser post: https://sqream.com/blog/cuda-in-lxc-containers/).

Also, the kernel is based on Ubuntu 22.04, but the rest of the system is Debian 11; maybe you could just try it once and see if that fixes the problem?
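
To illustrate the difference (a sketch based on the listing above; dates and minor numbers will vary), the same device node is expected to show the unmapped owner rather than root when the container's idmapping applies, i.e. in an unprivileged container:

Code:
# inside an unprivileged container, host root (uid 0) is unmapped, so:
ls -l /dev/nvidia0
crw-rw-rw- 1 nobody nogroup 195, 0 Aug 24 15:22 /dev/nvidia0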
 
You are absolutely correct! That fixed it.

The device files (/dev/nvidia*) are correctly set to nobody/nogroup if the "Unprivileged container" flag is checked on the container at creation time. I made the mistake of leaving "Unprivileged container" unchecked, which caused the device files to be owned by root, and that in turn caused the CUDA problem.
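
For completeness, whether a container is unprivileged can also be checked in its config file (sketch using the container ID from this thread):

Code:
# /etc/pve/lxc/101.conf of an unprivileged container contains:
unprivileged: 1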

@dcsapak : Thank you very much for your help. You are a genius and awesome!
 
I'm glad you made this post, as it answered many of my questions. However, I was wondering: *if* I needed to run CUDA on a privileged container, could I just chown -R nobody:nogroup /dev/nvidia, or add the LXC user to the LXC root group?
 
Hey guys.
These two lines that the OP mentioned as well change their number (509 sometimes becomes 508, 511, etc.) after a reboot, which makes the LXC not recognize the GPU (I'm running DeepStack on an Ubuntu LXC).
Code:
crw-rw-rw- 1 root root 509, 0 Aug 24 15:22 /dev/nvidia-uvm
crw-rw-rw- 1 root root 509, 1 Aug 24 15:22 /dev/nvidia-uvm-tools
And then I have to manually update the /etc/pve/lxc/101.conf file accordingly and reboot the container.
What am I missing? Thanks.
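
(Side note on the mechanics, in case it helps: the nvidia-uvm major number is allocated dynamically when the module loads, which is why it wanders between 508/509/511. The value the allow rule has to match can be read on the host after each boot, e.g.:)

Code:
# on the Proxmox host, after boot
grep nvidia-uvm /proc/devices
# e.g. "509 nvidia-uvm" -> this major must match the lxc.cgroup.devices.allow rule in 101.conf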
 
Hello, do you have the setup from the first post as an ordered list of steps to replicate? I'm trying to get a GPU to work in an LXC like you were able to do.

Thank you,
 
Isn't that already an ordered list? Just work through it top to bottom and, once you reach the bottom (yes, DO everything), it should work.
 
