GPU Passthrough not working

cakemasher

New Member
Apr 25, 2015
Hey community,

For a while now I've been trying to pass through an NVIDIA Quadro M2000 to a VM, so far without any luck.

To start, I read through the PCI Passthrough wiki and followed its steps, but when I try to start the VM I get the following error:
Code:
kvm: -device vfio-pci,host=07:00.0,id=hostpci0.0,bus=ich9-pcie-port-1,addr=0x0.0,multifunction=on: vfio 0000:07:00.0: failed to setup container for group 23: Failed to set iommu for container: Operation not permitted
TASK ERROR: start failed: command '/usr/bin/kvm -id 100 -name TestGPU -chardev 'socket,id=qmp,path=/var/run/qemu-server/100.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' -mon 'chardev=qmp-event,mode=control' -pidfile /var/run/qemu-server/100.pid -daemonize -smbios 'type=1,uuid=aff0ecc7-a3b7-449c-b3d1-a03d0de80a98' -drive 'if=pflash,unit=0,format=raw,readonly,file=/usr/share/pve-edk2-firmware//OVMF_CODE.fd' -drive 'if=pflash,unit=1,format=raw,id=drive-efidisk0,file=/dev/pve/vm-100-disk-1' -smp '8,sockets=2,cores=4,maxcpus=8' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' -vnc unix:/var/run/qemu-server/100.vnc,password -cpu 'host,+kvm_pv_unhalt,+kvm_pv_eoi,kvm=off' -m 8192 -device 'vmgenid,guid=88ffbde5-31e5-4b51-b8c2-9ab2b098ae8a' -readconfig /usr/share/qemu-server/pve-q35-4.0.cfg -device 'usb-tablet,id=tablet,bus=ehci.0,port=1' -device 'vfio-pci,host=07:00.0,id=hostpci0.0,bus=ich9-pcie-port-1,addr=0x0.0,multifunction=on' -device 'vfio-pci,host=07:00.1,id=hostpci0.1,bus=ich9-pcie-port-1,addr=0x0.1' -device 'VGA,id=vga,bus=pcie.0,addr=0x1' -chardev 'socket,path=/var/run/qemu-server/100.qga,server,nowait,id=qga0' -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:1aa51d1148c6' -drive 'file=/var/lib/vz/template/iso/debian-10.1.0-amd64-netinst.iso,if=none,id=drive-ide2,media=cdrom,aio=threads' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200' -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' -drive 'file=/dev/pve/vm-100-disk-0,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on' -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=100' -netdev 
'type=tap,id=net0,ifname=tap100i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 'virtio-net-pci,mac=C2:39:C3:88:7C:21,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' -machine 'type=q35'' failed: exit code 1


Looking at dmesg, I see the following:
Code:
[ 2098.141649] vfio-pci 0000:07:00.1: Device is ineligible for IOMMU domain attach due to platform RMRR requirement.  Contact your platform vendor.
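
To see every device the kernel rejected for this reason, the log can be filtered for that exact message. This is only a minimal sketch: the function name `rmrr_blocked_devices` is made up for illustration, and it assumes the message wording matches the dmesg line quoted above.

```shell
#!/bin/sh
# Hypothetical helper: reads kernel log text on stdin and prints the PCI
# addresses that were refused because of a platform RMRR, one per line.
rmrr_blocked_devices() {
    grep 'ineligible for IOMMU domain attach due to platform RMRR' \
        | grep -oE '[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]' \
        | sort -u
}

# Usage on the host (needs permission to read the kernel log):
#   dmesg | rmrr_blocked_devices
```

An empty result after a reboot would suggest the RMRR restriction no longer applies to the device, though the VM start is the real test.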


I've been searching for a solution but haven't found one so far. Can someone point me toward one? My configuration files are attached as spoilers at the bottom of this message.



# If you change this file, run 'update-grub' afterwards to update
# /boot/grub/grub.cfg.
# For full documentation of the options in this file, see:
# info -f grub -n 'Simple configuration'

GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="Proxmox Virtual Environment"
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
GRUB_CMDLINE_LINUX=""

# Disable os-prober, it might add menu entries for each guest
GRUB_DISABLE_OS_PROBER=true

# Uncomment to enable BadRAM filtering, modify to suit your needs
# This works with Linux (no patch required) and with any kernel that obtains
# the memory map information from GRUB (GNU Mach, kernel of FreeBSD ...)
#GRUB_BADRAM="0x01234567,0xfefefefe,0x89abcdef,0xefefefef"

# Uncomment to disable graphical terminal (grub-pc only)
#GRUB_TERMINAL=console

# The resolution used on graphical terminal
# note that you can use only modes which your graphic card supports via VBE
# you can see them in real GRUB with the command `vbeinfo'
#GRUB_GFXMODE=640x480

# Uncomment if you don't want GRUB to pass "root=UUID=xxx" parameter to Linux
#GRUB_DISABLE_LINUX_UUID=true

# Disable generation of recovery mode menu entries
GRUB_DISABLE_RECOVERY="true"

# Uncomment to get a beep at grub start
#GRUB_INIT_TUNE="480 440 1"

# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

/sys/kernel/iommu_groups/55/devices/0000:3f:0f.4
/sys/kernel/iommu_groups/55/devices/0000:3f:0f.2
/sys/kernel/iommu_groups/55/devices/0000:3f:0f.0
/sys/kernel/iommu_groups/55/devices/0000:3f:0f.5
/sys/kernel/iommu_groups/55/devices/0000:3f:0f.3
/sys/kernel/iommu_groups/55/devices/0000:3f:0f.1
... (rest cut to save some space) ...
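
The error above refers to IOMMU group 23, so it helps to see which devices share a group with the GPU. A rough sketch of a listing helper follows; the function name `list_iommu_groups` and its optional root argument are illustrative assumptions, not part of any Proxmox tooling.

```shell
#!/bin/sh
# Hypothetical helper: prints "group <N>: <PCI address>" for every device,
# sorted by group number. The optional argument overrides the sysfs root.
list_iommu_groups() {
    root=${1:-/sys/kernel/iommu_groups}
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue      # glob matched nothing
        group=${dev#"$root"/}          # strip the root prefix
        group=${group%%/*}             # keep only the group number
        printf 'group %s: %s\n' "$group" "${dev##*/}"
    done | sort -k2,2n -k3,3
}

# Usage on the host:
#   list_iommu_groups | grep 'group 23:'
```

Ideally the GPU and its audio function are the only entries in their group.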
07:00.0 VGA compatible controller: NVIDIA Corporation GM206GL [Quadro M2000] (rev a1)
07:00.1 Audio device: NVIDIA Corporation GM206 High Definition Audio Controller (rev a1)

agent: 1
bios: ovmf
bootdisk: scsi0
cores: 4
cpu: host,hidden=1
efidisk0: local-lvm:vm-100-disk-1,size=128K
hostpci0: 07:00,pcie=1,x-vga=1
ide2: local:iso/debian-10.1.0-amd64-netinst.iso,media=cdrom
machine: q35
memory: 8192
name: TestGPU
net0: virtio=C2:39:C3:88:7C:21,bridge=vmbr0,firewall=1
numa: 0
ostype: l26
scsi0: local-lvm:vm-100-disk-0,size=50G
scsihw: virtio-scsi-pci
smbios1: uuid=aff0ecc7-a3b7-449c-b3d1-a03d0de80a98
sockets: 2
vmgenid: 88ffbde5-31e5-4b51-b8c2-9ab2b098ae8a

07:00.0 0300: 10de:1430 (rev a1)
07:00.1 0403: 10de:0fba (rev a1)

options vfio-pci ids=10de:1430,10de:0fba disable_vga=1

blacklist nouveau
blacklist nvidia
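
The `ids=` list in the vfio-pci options line has to match the vendor:device pairs that `lspci -n` reports. As a small sketch of how the two relate, the hypothetical helper below (the name `vfio_ids_line` is an assumption) builds the line from `lspci -n` output. It deliberately leaves out `disable_vga=1`, which would still be added by hand.

```shell
#!/bin/sh
# Hypothetical helper: reads `lspci -n` output on stdin and prints a matching
# "options vfio-pci ids=..." line. Field 3 is the vendor:device pair.
vfio_ids_line() {
    awk '{ ids = ids (ids ? "," : "") $3 }
         END { if (ids) print "options vfio-pci ids=" ids }'
}

# Usage on the host:
#   lspci -n -s 07:00 | vfio_ids_line
```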
 
Device is ineligible for IOMMU domain attach due to platform RMRR requirement. Contact your platform vendor.

This is a motherboard issue. I guess it is an HP machine? It might be fixable with a BIOS upgrade, and there was also a thread (I cannot find it right now) that described a possible fix.

But in general this is a limitation out of reach for Proxmox.
 
It took me quite a while to hunt down all the details. ESXi works great on these servers; they were built to run it, and passthrough and SR-IOV are built into the OS. Other OSes require a little more effort. After some time I found this:
https://www.jimmdenton.com/proliant-intel-dpdk/
This article gives detailed instructions for excluding the PCIe slot of your choice from the BIOS RMRR. The next issue is getting the HPE Scripting Toolkit to work on Proxmox/Debian. Use these HPE links:
https://downloads.linux.hpe.com/SDR/keys.html & https://downloads.linux.hpe.com/SDR/project/stk/

The actual commands are:
curl https://downloads.linux.hpe.com/SDR/hpPublicKey2048.pub | apt-key add -
curl https://downloads.linux.hpe.com/SDR/hpPublicKey2048_key1.pub | apt-key add -
curl https://downloads.linux.hpe.com/SDR/hpePublicKey2048_key1.pub | apt-key add -

Then add a file called hpe.list to /etc/apt/sources.list.d:

cat <<EOF >/etc/apt/sources.list.d/hpe.list
# HPE Scripting Tool Kit
deb http://downloads.linux.hpe.com/SDR/repo/stk buster/current non-free
EOF

Then:
apt update
apt install hp-scripting-tools

This will install the scripts you need to run ‘conrep’ and modify the BIOS to exclude a PCIe slot.

I have had issues passing through HBA cards using this method, but I have not tried it with a GPU.

Cheers,
Mike


cat <<EOF > exclude.dat
<Conrep> <Section name="RMRDS_Slot4" helptext=".">Endpoints_Excluded</Section> </Conrep>
EOF
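
The file above targets slot 4. For a card in a different slot, the same file could be generated by a small helper such as the sketch below. The function name `write_rmrr_exclude` is hypothetical, and the `RMRDS_Slot<N>` naming for other slots is an assumption extrapolated from the Slot4 example, so check it against the linked article for your server model. Loading the file with conrep is described in that article and is not shown here.

```shell
#!/bin/sh
# Hypothetical helper: writes a conrep data file that marks one PCIe slot as
# excluded. $1 = slot number, $2 = output file (default: exclude.dat).
# Assumption: section names follow the RMRDS_Slot<N> pattern seen above.
write_rmrr_exclude() {
    slot=$1
    out=${2:-exclude.dat}
    case $slot in
        ''|*[!0-9]*)
            echo "usage: write_rmrr_exclude SLOT [FILE]" >&2
            return 1 ;;
    esac
    cat > "$out" <<EOF
<Conrep>
  <Section name="RMRDS_Slot${slot}" helptext=".">Endpoints_Excluded</Section>
</Conrep>
EOF
}

# Usage:
#   write_rmrr_exclude 4 exclude.dat
```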
 