I've been having a GPU passthrough issue on a Dell R720, passing the GPU through to an Ubuntu 22.04 container. The Proxmox host looks fine and I can see the /dev/nvidia device files inside the Ubuntu container, but no CUDA-capable device is detected in the container. I tried this on Proxmox VE 7.2-7.
On the Proxmox host:
---------------------------------
# cat /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
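(For completeness: after editing this file the change has to be applied and the host rebooted; whether IOMMU actually came up can then be checked. Commands assume the stock Proxmox GRUB setup.)
# update-grub
# dmesg | grep -e DMAR -e IOMMU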
# cat /etc/pve/lxc/101.conf
lxc.cgroup.devices.allow: c 195:* rwm
lxc.cgroup.devices.allow: c 509:* rwm
lxc.cgroup.devices.allow: c 226:* rwm
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-modeset dev/nvidia-modeset none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file
lxc.mount.entry: /dev/dri dev/dri none bind,optional,create=dir
lxc.mount.entry: /dev/fb0 dev/fb0 none bind,optional,create=file
# nvidia-smi   (output omitted; the GPU shows up as expected on the host)
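(For reference, the majors in the cgroup allow lines above (195, 509, 226) are supposed to match the host's device nodes for /dev/nvidia*, nvidia-uvm and /dev/dri respectively; the nvidia-uvm major is assigned dynamically and can change between reboots. A quick cross-check on the host, assuming the same device names as in the container listing below:)
# lsmod | grep nvidia
# ls -l /dev/nvidia* /dev/dri/*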
On Ubuntu 22.04 LXC Container:
--------------------------------------------------
1. Verified the GPU is available through its device files:
# ll /dev/nvidia*
crw-rw-rw- 1 root root 195, 254 Aug 24 15:22 /dev/nvidia-modeset
crw-rw-rw- 1 root root 509, 0 Aug 24 15:22 /dev/nvidia-uvm
crw-rw-rw- 1 root root 509, 1 Aug 24 15:22 /dev/nvidia-uvm-tools
crw-rw-rw- 1 root root 195, 0 Aug 24 15:22 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Aug 24 15:22 /dev/nvidiactl
# ll /dev/dri/*
crw-rw---- 1 root video 226, 0 Aug 24 15:22 /dev/dri/card0
crw-rw---- 1 root video 226, 1 Aug 24 15:22 /dev/dri/card1
crw-rw---- 1 root syslog 226, 128 Aug 24 15:22 /dev/dri/renderD128
# ll /dev/fb*
crw-rw---- 1 root video 29, 0 Aug 24 15:22 /dev/fb0
2. Installed CUDA (same version as on the host) following: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#ubuntu-installation
3. Installed the CUDA Samples from: https://github.com/nvidia/cuda-samples
4. Tested GPU availability from the container (extra sanity checks are listed after the output below):
# /cudaSamples/cuda-samples/Samples/1_Utilities/deviceQuery/deviceQuery
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
cudaGetDeviceCount returned 100
-> no CUDA-capable device is detected
Result = FAIL
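Some extra sanity checks that can be run inside the container (this assumes the NVIDIA user-space tools and libraries are installed in the container as well; the kernel driver itself comes from the host, since LXC containers share the host kernel):
# nvidia-smi
# cat /proc/driver/nvidia/version
The second command should print the host driver version if the driver interface is reachable from the container.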
Why is the GPU not available inside the Ubuntu Container?
Any help appreciated.