Nvidia GPU (P2000) passthrough to Ubuntu CT

Sir-robin10

Member
Apr 10, 2020
Hello

I've been struggling for ages now to get my GPU (Nvidia Quadro P2000) usable in a container (Ubuntu 20.04)...


I've tried every tutorial I could find on the internet, but none of them works...

I had it working on a Windows VM for a brief time, but after some changes it stopped working and I got 'error 43'...

However, that does tell me it is possible to have the GPU in passthrough mode.


Inside the container, when I do
Code:
lspci -k
I get the following:

Code:
26:00.0 VGA compatible controller: NVIDIA Corporation GP106GL [Quadro P2000] (rev a1)
        Subsystem: NVIDIA Corporation GP106GL [Quadro P2000]
        Kernel driver in use: nvidia
26:00.1 Audio device: NVIDIA Corporation GP106 High Definition Audio Controller (rev a1)
        Subsystem: NVIDIA Corporation GP106 High Definition Audio Controller
        Kernel driver in use: snd_hda_intel

I get exactly the same on the host (the node itself).



What I've got on the node is the following:

/etc/default/grub
Code:
# If you change this file, run 'update-grub' afterwards to update
# /boot/grub/grub.cfg.
# For full documentation of the options in this file, see:
#   info -f grub -n 'Simple configuration'

GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="Proxmox Virtual Environment"
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on"
GRUB_CMDLINE_LINUX=""

# Disable os-prober, it might add menu entries for each guest
GRUB_DISABLE_OS_PROBER=true

# Uncomment to enable BadRAM filtering, modify to suit your needs
# This works with Linux (no patch required) and with any kernel that obtains
# the memory map information from GRUB (GNU Mach, kernel of FreeBSD ...)
#GRUB_BADRAM="0x01234567,0xfefefefe,0x89abcdef,0xefefefef"

# Uncomment to disable graphical terminal (grub-pc only)
#GRUB_TERMINAL=console

# The resolution used on graphical terminal
# note that you can use only modes which your graphic card supports via VBE
# you can see them in real GRUB with the command `vbeinfo'
#GRUB_GFXMODE=640x480

# Uncomment if you don't want GRUB to pass "root=UUID=xxx" parameter to Linux
#GRUB_DISABLE_LINUX_UUID=true

# Disable generation of recovery mode menu entries
GRUB_DISABLE_RECOVERY="true"

# Uncomment to get a beep at grub start
#GRUB_INIT_TUNE="480 440 1"

So GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on" has been changed to include amd_iommu=on.
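
Just to verify that part, the commands I would use to check whether the IOMMU is actually active on the host (I haven't pasted their output here) are:

Code:
# should print "AMD-Vi: ..." lines once the IOMMU has been initialised
dmesg | grep -i -e AMD-Vi -e iommu

# lists the devices per IOMMU group; no output at all means no IOMMU is active
find /sys/kernel/iommu_groups/ -type l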


/etc/modules gives me the following:

Code:
# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.
nvidia
nvidia_uvm
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

/etc/modprobe.d/blacklist.conf has the following contents:

Code:
blacklist radeon
blacklist nouveau
blacklist nvidia


I ran update-grub and update-initramfs -u -k all, but nothing seems to work at all...
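
As a sanity check after rebooting (again just the command, I haven't pasted the output), the parameter should at least show up on the running kernel's command line:

Code:
# should contain "amd_iommu=on" if update-grub picked up the change
cat /proc/cmdline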

I tried to install the NVIDIA drivers; that doesn't work either...

When I run nvidia-smi in the container I get: "NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running."
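
One more thing I could check (just a diagnostic idea, not from any of the tutorials): since the container shares the host's kernel, nvidia-smi inside the container can only work if the NVIDIA device nodes are visible in there at all:

Code:
# run inside the container; nvidia-smi talks to these character devices
ls -l /dev/nvidia*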

I do however (like just now) get an output on the host node:

[screenshot: nvidia-smi output on the host node]


What the hell is going on here? :(

The config of my container is the following:

Code:
arch: amd64
cores: 8
cpu: host
hostname: plex-server
hostpci0: 26:00,pcie=1
machine: q35
memory: 10240
mp0: movies-drive:100/vm-100-disk-0.raw,mp=/mnt/movies-drive,size=430G
net0: name=eth0,bridge=vmbr0,firewall=1,gw=192.168.207.1,hwaddr=3A:85:20:46:26:D3,ip=192.168.207.100/24,ip6=dhcp,type=veth
numa: 1
onboot: 1
ostype: ubuntu
rootfs: local-lvm:vm-100-disk-0,size=10G
swap: 1024
unprivileged: 1

Anyone who could help me out?
 
PCIe passthrough is *only* possible with VMs. Containers share the host's kernel, and thus also its devices and drivers, so mapping a full PCIe device into a container cannot work.
 
So if I made a VM instead of a container, it would work perfectly? I wonder why it wouldn't be possible to map the PCIe device to one or more containers (I'd only use the device in one container). It seems strange to me, though.
 
PCIe assignment works by directly forwarding DMA mappings and device interrupts of all kinds into a VM's vCPU and RAM assignment. This is a feature of hardware virtualization, which is neither used nor required for containers. That's why it cannot work.

What would theoretically be possible is to forward device nodes into a container. I know for a fact this works for USB devices, and I've heard of some folks who got it to work with DRI capabilities, to use GPUs for video encoding/decoding in containers. But this depends heavily on the kind of GPU and driver used and is a whole other topic from VT-x/VT-d/SVM/IOMMU virtualization; it certainly won't give you the "full" GPU experience in the container (at least not easily, and not in a supported configuration).
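
As a rough illustration only (not a tested config; the cgroup key and the device major numbers depend on the Proxmox version and driver, so take them from ls -l /dev/nvidia* on the host), that device-node approach usually ends up looking something like this in the CT's config under /etc/pve/lxc/:

Code:
# allow the CT to access the NVIDIA character devices (major 195 for nvidia0/nvidiactl)
# on cgroup v1 hosts the key is lxc.cgroup.devices.allow, on cgroup v2 it is lxc.cgroup2.devices.allow
lxc.cgroup.devices.allow: c 195:* rwm
# nvidia-uvm gets a dynamically assigned major number - replace 510 with what the host shows
lxc.cgroup.devices.allow: c 510:* rwm
# bind the host's device nodes into the container
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file

On top of that, the container needs the matching NVIDIA userspace driver (same version as the host, installed without its kernel module) before nvidia-smi can work in there.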
 
If your "issue" is that you want PCIe passthrough to work, then yes, using a VM with hostpci assigned would solve the issue.
 
