Nvidia GPU (P2000) passthrough to Ubuntu CT

Sir-robin10

Member
Apr 10, 2020
Hello

I've been struggling for ages now to get my GPU (Nvidia Quadro P2000) usable in a container (Ubuntu 20.04)...


I've tried every tutorial I could find on the internet, but none of them works...

I had it working on a Windows VM for a brief time, but after some changes it stopped working and I got 'error 43'...

However, that does tell me it is possible to have the GPU in passthrough mode.


Inside the container, when I do
Code:
lspci -k
I get the following:

Code:
26:00.0 VGA compatible controller: NVIDIA Corporation GP106GL [Quadro P2000] (rev a1)
        Subsystem: NVIDIA Corporation GP106GL [Quadro P2000]
        Kernel driver in use: nvidia
26:00.1 Audio device: NVIDIA Corporation GP106 High Definition Audio Controller (rev a1)
        Subsystem: NVIDIA Corporation GP106 High Definition Audio Controller
        Kernel driver in use: snd_hda_intel

I get exactly the same on the host (the node itself).



What I've got on the node is the following:

/etc/default/grub
Code:
# If you change this file, run 'update-grub' afterwards to update
# /boot/grub/grub.cfg.
# For full documentation of the options in this file, see:
#   info -f grub -n 'Simple configuration'

GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="Proxmox Virtual Environment"
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on"
GRUB_CMDLINE_LINUX=""

# Disable os-prober, it might add menu entries for each guest
GRUB_DISABLE_OS_PROBER=true

# Uncomment to enable BadRAM filtering, modify to suit your needs
# This works with Linux (no patch required) and with any kernel that obtains
# the memory map information from GRUB (GNU Mach, kernel of FreeBSD ...)
#GRUB_BADRAM="0x01234567,0xfefefefe,0x89abcdef,0xefefefef"

# Uncomment to disable graphical terminal (grub-pc only)
#GRUB_TERMINAL=console

# The resolution used on graphical terminal
# note that you can use only modes which your graphic card supports via VBE
# you can see them in real GRUB with the command `vbeinfo'
#GRUB_GFXMODE=640x480

# Uncomment if you don't want GRUB to pass "root=UUID=xxx" parameter to Linux
#GRUB_DISABLE_LINUX_UUID=true

# Disable generation of recovery mode menu entries
GRUB_DISABLE_RECOVERY="true"

# Uncomment to get a beep at grub start
#GRUB_INIT_TUNE="480 440 1"

So GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on" has been changed to include amd_iommu=on.
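
Just to verify that part, the commands I would use to check whether the IOMMU is actually active on the host (I haven't pasted their output here) are:

Code:
# should print "AMD-Vi: ..." lines once the IOMMU has been initialised
dmesg | grep -i -e AMD-Vi -e iommu

# lists the devices per IOMMU group; no output at all means no IOMMU is active
find /sys/kernel/iommu_groups/ -type l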


/etc/modules gives me the following:

Code:
# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.
nvidia
nvidia_uvm
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

/etc/modprobe.d/blacklist.conf has the following contents:

Code:
blacklist radeon
blacklist nouveau
blacklist nvidia


I ran update-grub and update-initramfs -u -k all, but nothing seems to work at all...
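
As a sanity check after rebooting (again just the command, I haven't pasted the output), the parameter should at least show up on the running kernel's command line:

Code:
# should contain "amd_iommu=on" if update-grub picked up the change
cat /proc/cmdline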

I tried to install the NVIDIA drivers; that doesn't work either...

When I run nvidia-smi in the container I get: "NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running."
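
One more thing I could check (just a diagnostic idea, not from any of the tutorials): since the container shares the host's kernel, nvidia-smi inside the container can only work if the NVIDIA device nodes are visible in there at all:

Code:
# run inside the container; nvidia-smi talks to these character devices
ls -l /dev/nvidia*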

I do however (like just now) get an output on the host node:

[screenshot: nvidia-smi output on the host node]


What the hell is going on here? :(

The config of my container is the following:

Code:
arch: amd64
cores: 8
cpu: host
hostname: plex-server
hostpci0: 26:00,pcie=1
machine: q35
memory: 10240
mp0: movies-drive:100/vm-100-disk-0.raw,mp=/mnt/movies-drive,size=430G
net0: name=eth0,bridge=vmbr0,firewall=1,gw=192.168.207.1,hwaddr=3A:85:20:46:26:D3,ip=192.168.207.100/24,ip6=dhcp,type=veth
numa: 1
onboot: 1
ostype: ubuntu
rootfs: local-lvm:vm-100-disk-0,size=10G
swap: 1024
unprivileged: 1

Anyone who could help me out?
 
PCIe passthrough is *only* possible with VMs. Containers share the host's kernel, and thus also its devices and drivers, so mapping a full PCIe device into a container cannot work.
 
So if I made a VM instead of a container, it would work perfectly? I wonder why it wouldn't be possible to map the PCIe device to one or more containers (I'd only use the device in one container). It seems strange to me, though.
 
PCIe assignment works by directly forwarding DMA mappings and device interrupts of all kinds into a VM's vCPU and RAM assignment. This is a feature of hardware virtualization, which is neither used nor required for containers. That's why it cannot work.

What would theoretically be possible is to forward device nodes into a container. I know for a fact this works for USB devices, and I've heard of some folks who got it to work with DRI capabilities, to use GPUs for video encoding/decoding in containers. But this depends heavily on the kind of GPU and driver used and is a whole other topic from VT-x/VT-d/SVM/IOMMU virtualization; it certainly won't give you the "full" GPU experience in the container (at least not easily, and not in a supported configuration).
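
As a rough illustration only (not a tested config; the cgroup key and the device major numbers depend on the Proxmox version and driver, so take them from ls -l /dev/nvidia* on the host), that device-node approach usually ends up looking something like this in the CT's config under /etc/pve/lxc/:

Code:
# allow the CT to access the NVIDIA character devices (major 195 for nvidia0/nvidiactl)
# on cgroup v1 hosts the key is lxc.cgroup.devices.allow, on cgroup v2 it is lxc.cgroup2.devices.allow
lxc.cgroup.devices.allow: c 195:* rwm
# nvidia-uvm gets a dynamically assigned major number - replace 510 with what the host shows
lxc.cgroup.devices.allow: c 510:* rwm
# bind the host's device nodes into the container
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file

On top of that, the container needs the matching NVIDIA userspace driver (same version as the host, installed without its kernel module) before nvidia-smi can work in there.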
 
If your "issue" is that you want PCIe passthrough to work, then yes, using a VM with hostpci assigned would solve the issue.
 
