PCI passthrough not working, at all.

Sir-robin10

Member
Apr 10, 2020
So, I've been struggling with this for way too long now... I currently have a server running Proxmox with two GPUs in it: one just for video out, installed in slot 1, and one P2000, installed in slot 2 (PCIe x16 slots).

Both GPUs DO work; I've tested them on another machine (no Proxmox) and there they just work.

What I did was follow this guide: https://pve.proxmox.com/wiki/Pci_passthrough
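
For reference, what that boils down to on my AMD system (a rough sketch of the wiki's steps, not my exact history):

Code:
# /etc/default/grub - enable the IOMMU on the kernel command line
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on"

# /etc/modules - load the VFIO modules at boot
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

# then apply and reboot:
update-grub
update-initramfs -u -k all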

That just doesn't work at all in my case....

When I assign the GPU to my Windows Server VM, the whole Proxmox host dies and spits out the following:

(Attached image 20200721_145457.jpg: photo of the error output on the console.)


To restore the system, I have to physically remove the PCI device (the P2000 GPU), reboot, remove the PCI device in the GUI, shut down, re-insert the GPU, and reboot the system...


Here are the specs I'm using:

- AMD Ryzen 3700X
- Nvidia P2000 (secondary GPU)
- GeForce 8400 GS (primary GPU, for display only)
- 32 GB DDR4 RAM
- ASRock B450-A Pro Max motherboard

The Proxmox specs, then:

- proxmox-ve: 6.2-1 (running kernel: 5.4.44-2-pve)
- pve-manager: 6.2-10 (running version: 6.2-10/a20769ed)
- pve-kernel-5.4: 6.2-4
- pve-kernel-helper: 6.2-4
- pve-kernel-5.3: 6.1-6
- pve-kernel-5.4.44-2-pve: 5.4.44-2
- pve-kernel-5.4.44-1-pve: 5.4.44-1
- pve-kernel-5.4.41-1-pve: 5.4.41-1
- pve-kernel-5.3.18-3-pve: 5.3.18-3
- pve-kernel-5.3.18-2-pve: 5.3.18-2
- ceph-fuse: 12.2.11+dfsg1-2.1+b1
- corosync: 3.0.4-pve1
- criu: 3.11-3
- glusterfs-client: 5.5-3
- ifupdown: 0.8.35+pve1
- ksm-control-daemon: 1.3-1
- libjs-extjs: 6.0.1-10
- libknet1: 1.16-pve1
- libproxmox-acme-perl: 1.0.4
- libpve-access-control: 6.1-2
- libpve-apiclient-perl: 3.0-3
- libpve-common-perl: 6.1-5
- libpve-guest-common-perl: 3.1-1
- libpve-http-server-perl: 3.0-6
- libpve-storage-perl: 6.2-5
- libqb0: 1.0.5-1
- libspice-server1: 0.14.2-4~pve6+1
- lvm2: 2.03.02-pve4
- lxc-pve: 4.0.2-1
- lxcfs: 4.0.3-pve3
- novnc-pve: 1.1.0-1
- proxmox-mini-journalreader: 1.1-1
- proxmox-widget-toolkit: 2.2-9
- pve-cluster: 6.1-8
- pve-container: 3.1-11
- pve-docs: 6.2-5
- pve-edk2-firmware: 2.20200531-1
- pve-firewall: 4.1-2
- pve-firmware: 3.1-1
- pve-ha-manager: 3.0-9
- pve-i18n: 2.1-3
- pve-qemu-kvm: 5.0.0-11
- pve-xtermjs: 4.3.0-1
- qemu-server: 6.2-10
- smartmontools: 7.1-pve2
- spiceterm: 3.1-1
- vncterm: 1.6-1
- zfsutils-linux: 0.8.4-pve1


I have tried everything and nothing works... I want the GPU to be usable inside a CTX. That SHOULD be possible as far as I know. If not, then I'll turn the CTX into a VM...

Any help please. I don't know what to do :(
 
I have tried everything and nothing works... I want the GPU to be usable inside a CTX. That SHOULD be possible as far as I know. If not, then I'll turn the CTX into a VM...
If by CTX you mean container, then no, that's not possible. Only VMs support GPU passthrough.

For your issue: What do your IOMMU groups look like (the wiki article you linked contains a section about that; your device needs to be isolated)? What does the kernel log say about DMAR/IOMMU (dmesg | grep -e DMAR -e IOMMU)? A bit more information about your system (e.g. /etc/pve/qemu-server/<vmid>.conf as well as a full lspci -nnk) would help.
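
If it helps, this quick loop prints every group together with its devices (a small sketch over the standard sysfs layout):

Code:
#!/bin/bash
# list each IOMMU group and the devices it contains
for dev in /sys/kernel/iommu_groups/*/devices/*; do
    group=${dev#/sys/kernel/iommu_groups/}
    group=${group%%/*}
    printf 'IOMMU group %s: ' "$group"
    lspci -nns "${dev##*/}"
done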
 
Hi,

Okay, so I didn't know if I gave enough information or what would be useful, but the dmesg output is the following:

Code:
root@node02:~# dmesg | grep -e DMAR -e IOMMU
[    0.585421] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[    0.589438] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.590426] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).



lspci gives the following output (trimmed a bit, due to the character limit):

Code:
root@node02:~# lspci -nnk
00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 4 [1022:1444]
00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 5 [1022:1445]
00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 6 [1022:1446]
00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 7 [1022:1447]
01:00.0 Non-Volatile memory controller [0108]: Micron/Crucial Technology Device [c0a9:2263] (rev 03)
        Subsystem: Micron/Crucial Technology Device [c0a9:2263]
        Kernel driver in use: nvme
03:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset USB 3.1 XHCI Controller [1022:43d5] (rev 01)
        Subsystem: ASMedia Technology Inc. 400 Series Chipset USB 3.1 XHCI Controller [1b21:1142]
        Kernel driver in use: xhci_hcd
        Kernel modules: xhci_pci
03:00.1 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset SATA Controller [1022:43c8] (rev 01)
        Subsystem: ASMedia Technology Inc. 400 Series Chipset SATA Controller [1b21:1062]
        Kernel driver in use: ahci
        Kernel modules: ahci
03:00.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Bridge [1022:43c6] (rev 01)
        Kernel driver in use: pcieport
20:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
        Kernel driver in use: pcieport
20:01.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
        Kernel driver in use: pcieport
20:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
        Kernel driver in use: pcieport
20:05.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
        Kernel driver in use: pcieport
20:06.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
        Kernel driver in use: pcieport
20:07.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
        Kernel driver in use: pcieport
21:00.0 SATA controller [0106]: Marvell Technology Group Ltd. Device [1b4b:9215] (rev 11)
        Subsystem: Marvell Technology Group Ltd. Device [1b4b:9215]
        Kernel driver in use: ahci
        Kernel modules: ahci
22:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 15)
        Subsystem: Micro-Star International Co., Ltd. [MSI] RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [1462:7b86]
        Kernel driver in use: r8169
        Kernel modules: r8169
25:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP106GL [10de:1c30] (rev a1)
        Subsystem: NVIDIA Corporation GP106GL [Quadro P2000] [10de:11b3]
        Kernel driver in use: vfio-pci
        Kernel modules: nvidiafb, nouveau
25:00.1 Audio device [0403]: NVIDIA Corporation GP106 High Definition Audio Controller [10de:10f1] (rev a1)
        Subsystem: NVIDIA Corporation GP106 High Definition Audio Controller [10de:11b3]
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel
28:00.0 SATA controller [0106]: Marvell Technology Group Ltd. Device [1b4b:9215] (rev 11)
        Subsystem: Marvell Technology Group Ltd. Device [1b4b:9215]
        Kernel driver in use: ahci
        Kernel modules: ahci
29:00.0 VGA compatible controller [0300]: NVIDIA Corporation G98 [GeForce 8400 GS Rev. 2] [10de:06e4] (rev a1)
        Subsystem: Micro-Star International Co., Ltd. [MSI] G98 [GeForce 8400 GS Rev. 2] [1462:1163]
        Kernel modules: nvidiafb, nouveau
2a:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Function [1022:148a]
        Subsystem: Micro-Star International Co., Ltd. [MSI] Starship/Matisse PCIe Dummy Function [1462:7b86]
2b:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Reserved SPP [1022:1485]
        Subsystem: Micro-Star International Co., Ltd. [MSI] Starship/Matisse Reserved SPP [1462:7b86]
2b:00.1 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Cryptographic Coprocessor PSPCPP [1022:1486]
        Subsystem: Micro-Star International Co., Ltd. [MSI] Starship/Matisse Cryptographic Coprocessor PSPCPP [1462:7b86]
        Kernel driver in use: ccp
        Kernel modules: ccp
2b:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller [1022:149c]
        Subsystem: Micro-Star International Co., Ltd. [MSI] Matisse USB 3.0 Host Controller [1462:7b86]
        Kernel driver in use: xhci_hcd
        Kernel modules: xhci_pci
2b:00.4 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse HD Audio Controller [1022:1487]
        Subsystem: Micro-Star International Co., Ltd. [MSI] Starship/Matisse HD Audio Controller [1462:9b86]
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel
30:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
        Subsystem: Micro-Star International Co., Ltd. [MSI] FCH SATA Controller [AHCI mode] [1462:7b86]
        Kernel driver in use: ahci
        Kernel modules: ahci
31:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
        Subsystem: Micro-Star International Co., Ltd. [MSI] FCH SATA Controller [AHCI mode] [1462:7b86]
        Kernel driver in use: ahci
        Kernel modules: ahci

So there you can see that I need to pass through the 25:00.0 PCI device.

Then the VM config (not yet modified for passthrough, since it crashes if I do so) is the following:

Code:
agent: 1
args: -cpu 'host,+kvm_pv_unhalt,hv_vendor_id=NV43FIX,kvm=off'
balloon: 0
bootdisk: ide0
cores: 4
cpu: host,hidden=1,flags=+pcid
ide0: vm_storage:vm-108-disk-0,size=100G,ssd=1
ide1: zfs1:vm-108-disk-0,backup=0,size=1000G,ssd=1
memory: 10240
name: windows-2019
net0: e1000=96:C4:25:49:DA:D9,bridge=vmbr0,firewall=1
numa: 0
onboot: 1
ostype: win10
scsihw: virtio-scsi-pci
smbios1: uuid=11ee9edc-2d44-4783-b984-3ca140872535
sockets: 2
startup: order=1
usb0: host=2516:0051,usb3=1
vmgenid: c9afd214-228b-4ddc-8ee6-1442b89e39c3

As you can see, I have a USB passthrough that works just fine. The args and CPU flags give me the (I think) desired outcome of having the VM "think" there is no VM here, but that it runs on bare metal. I get the CPU specs as intended.

(Attached image GhM4302.png: screenshot of the CPU specs as reported inside the VM.)
 
I just went through what you're dealing with: similar setup, same issues, and I resolved all of them. Proxmox just uses QEMU, so mostly this involves making the right QEMU settings in your VM .conf and, critically, in your UEFI.

This blog helped tremendously: https://heiko-sieger.info/running-windows-10-on-linux-using-kvm-with-vga-passthrough/#comment-125
Most of the stuff mentioned in the article should be irrelevant for PVE, as we do it automatically anyway.

Hi,

Okay, so I didn't know if I gave enough information or what would be useful, but the dmesg output is the following:
What I meant was the IOMMU grouping, i.e. the output of find /sys/kernel/iommu_groups/ -type l. Your GPU device should be in its own group, but the log you posted first makes it seem to me as if it shares a group with your SATA controller, thus breaking your disk access every time you try to pass it through.
 
Most of the stuff mentioned in the article should be irrelevant for PVE, as we do it automatically anyway.

Right, most of that article should be ignored for GPU passthrough in Proxmox, but it provides additional info about QEMU, troubleshooting, and Win10 tuning beyond the Proxmox article.

OP, I noticed that your VM .conf did not show machine: q35. I believe you will be better off with this, but maybe Stefan_R can comment. Also, which BIOS are you using for the guest? Ideally, OVMF should be used.
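
For example, setting both from the CLI might look like this (a sketch; 108 appears to be your VMID from the config you posted, so adjust as needed):

Code:
qm set 108 --machine q35 --bios ovmf
# OVMF also needs an EFI vars disk, e.g.:
qm set 108 --efidisk0 local-lvm:1
# and the passthrough entry itself:
qm set 108 --hostpci0 25:00,pcie=1,x-vga=1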

For your host UEFI, check these:
  • Upgrade your UEFI to the latest, but google first to make sure the latest doesn't introduce any SVM, ACS, or IOMMU bugs.
  • Enable ACS in UEFI.
  • Enable PCIe ARI (I also enabled ARI Enumeration, but couldn't find info on what it does).
  • Don't rely on the "Auto" setting in UEFI; set these to Enabled.
If these don't separate your VGAs into their own IOMMU groups, try moving the GPU that shares the SATA group to a different physical PCIe slot on your motherboard. That may give it its own IOMMU group.

Can you post the full vm .conf that you're using for GPU passthrough? (I think the one you posted removed some stuff?)
 
Most of the stuff mentioned in the article should be irrelevant for PVE, as we do it automatically anyway.


What I meant was the IOMMU grouping, i.e. the output of find /sys/kernel/iommu_groups/ -type l. Your GPU device should be in its own group, but the log you posted first makes it seem to me as if it shares a group with your SATA controller, thus breaking your disk access every time you try to pass it through.

Well,

I have 12 disks: two PCIe HBA cards run 10 of these, and 2 more run from the motherboard itself. Maybe that is why these are in the same group?


The full VM config was given above; the only thing I changed (which made it crash) was to add the GPU and boot the VM.

The same happens when I use the following settings (they are marked as pending):

(Attached image R292qva.png: screenshot of the pending passthrough settings in the GUI.)
 
I have 12 disks: two PCIe HBA cards run 10 of these, and 2 more run from the motherboard itself. Maybe that is why these are in the same group?

The GPU has to be in its own IOMMU group for passthrough to work, OR you have to pass through all of the devices in that group. If your motherboard can't support the isolation, or you can't pass through the SATA device as well, you'll need to try the ACS override, but that brings risks. Otherwise, you'll just need to buy a motherboard that provides better isolation. The Gigabyte Aorus boards are good for this. My Aorus Master has something like 37 IOMMU groups.
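
For reference, the ACS override is just a kernel command line flag (it papers over missing isolation, so use it at your own risk; the flag below is the one shipped with the PVE kernel's ACS patch):

Code:
# /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on pcie_acs_override=downstream,multifunction"
# then: update-grub and reboot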
 
Ugh, well, I had it working one day with the GPU in the first slot, so I've put the GPU in the first slot and this is what PVE gives me now:

Code:
kvm: -device vfio-pci,host=0000:29:00.0,id=hostpci0.0,bus=ich9-pcie-port-1,addr=0x0.0,multifunction=on: Failed to mmap 0000:29:00.0 BAR 3. Performance may be slow
kvm: -device ide-hd,bus=ide.0,unit=1,drive=drive-ide1,id=ide1,rotation_rate=1: Can't create IDE unit 1, bus supports only 1 units
TASK ERROR: start failed: QEMU exited with code 1


These are my settings:

Code:
agent: 1
args: -cpu 'host,+kvm_pv_unhalt,hv_vendor_id=NV43FIX,kvm=off'
balloon: 0
bios: ovmf
bootdisk: ide0
cores: 4
cpu: host,hidden=1,flags=+pcid
efidisk0: local-lvm:vm-108-disk-1,size=4M
hostpci0: 29:00,pcie=1,x-vga=1
ide0: vm_storage:vm-108-disk-0,size=100G,ssd=1
ide1: zfs1:vm-108-disk-0,backup=0,size=1000G,ssd=1
machine: q35
memory: 10240
name: windows-2019
net0: e1000=96:C4:25:49:DA:D9,bridge=vmbr0,firewall=1
numa: 0
onboot: 1
ostype: win10
scsihw: virtio-scsi-pci
smbios1: uuid=11ee9edc-2d44-4783-b984-3ca140872535
sockets: 2
startup: order=1
usb0: host=2516:0051,usb3=1
vmgenid: c9afd214-228b-4ddc-8ee6-1442b89e39c3
 
The GPUs each have to be in their own IOMMU group. If your motherboard can't give you that, then it's time for an X570 board with good isolation support, full stop. You have a lot going on with all the HBAs and GPUs, and a motherboard that is probably not capable of all this virtualization/isolation.

Also, use the process of elimination: start with the simplest hardware configuration and add devices one by one to determine what exactly is causing the problem. But, again, I do believe your root problem is a motherboard that does not provide adequate isolation, as Stefan said.
 
Okay, but I currently have the passthrough working; the only issue I now have is that I get (in Windows) error 43.
 
That's awesome! I'm curious, and I'm sure future readers will be as well, about the config that got it working. Would you mind posting the output from these commands? This info could also help diagnose your error 43 issue.
  1. # cat /etc/modules
  2. # qm showcmd [YOUR VM ID] --pretty
  3. # cat /etc/pve/qemu-server/[YOUR VM ID].conf
  4. # cat /etc/modprobe.d/vfio.conf *repeat this for all the .conf files in this directory
  5. If using GRUB: # cat /etc/default/grub ; if using systemd-boot: # cat /etc/kernel/cmdline
  6. # lspci -nnv
  7. Enter qm monitor while VM is running: # qm monitor [YOUR VM ID]
    1. qm> info qtree
    2. qm> info pci
 
Okay, so these are the command outputs:

Code:
# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.

vfio
vfio_pci
vfio_virqfd
vfio_iommu_type1

qm showcmd:

Code:
/usr/bin/kvm \
  -id 107 \
  -name moe-test \
  -chardev 'socket,id=qmp,path=/var/run/qemu-server/107.qmp,server,nowait' \
  -mon 'chardev=qmp,mode=control' \
  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
  -mon 'chardev=qmp-event,mode=control' \
  -pidfile /var/run/qemu-server/107.pid \
  -daemonize \
  -smbios 'type=1,uuid=bc73fbd5-1b9d-4501-836d-86e50836add3' \
  -smp '8,sockets=1,cores=8,maxcpus=8' \
  -nodefaults \
  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
  -vnc unix:/var/run/qemu-server/107.vnc,password \
  -no-hpet \
  -cpu 'host,hv_ipi,hv_relaxed,hv_reset,hv_runtime,hv_spinlocks=0x1fff,hv_stimer,hv_synic,hv_time,hv_vapic,hv_vpindex,kvm=off,+kvm_pv_eoi,+kvm_pv_unhalt,+pcid' \
  -m 4098 \
  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
  -device 'vmgenid,guid=b02ce5d3-8936-4f36-9508-e0ea3b2c92be' \
  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
  -device 'vfio-pci,host=0000:29:00.0,id=hostpci0.0,bus=pci.0,addr=0x10.0,multifunction=on' \
  -device 'vfio-pci,host=0000:29:00.1,id=hostpci0.1,bus=pci.0,addr=0x10.1' \
  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:3ef0f7521b4' \
  -device 'ahci,id=ahci0,multifunction=on,bus=pci.0,addr=0x7' \
  -drive 'file=/dev/pve/vm-107-disk-0,if=none,id=drive-sata0,cache=writeback,format=raw,aio=threads,detect-zeroes=on' \
  -device 'ide-hd,bus=ahci0.0,drive=drive-sata0,id=sata0,rotation_rate=1,bootindex=100' \
  -netdev 'type=tap,id=net0,ifname=tap107i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown' \
  -device 'e1000,mac=CE:AC:F0:D5:26:55,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' \
  -rtc 'driftfix=slew,base=localtime' \
  -machine 'type=pc+pve0' \
  -global 'kvm-pit.lost_tick_policy=discard' \
  -cpu 'host,+kvm_pv_unhalt,hv_vendor_id=NV43FIX,kvm=off'

Code:
cat /etc/pve/qemu-server/107.conf
args: -cpu 'host,+kvm_pv_unhalt,hv_vendor_id=NV43FIX,kvm=off'
balloon: 0
bootdisk: sata0
cores: 8
cpu: host,hidden=1,flags=+pcid
hostpci0: 29:00
memory: 4098
name: moe-test
net0: e1000=CE:AC:F0:D5:26:55,bridge=vmbr0,firewall=1
numa: 0
ostype: win10
sata0: local-lvm:vm-107-disk-0,cache=writeback,size=32G,ssd=1
scsihw: virtio-scsi-pci
smbios1: uuid=bc73fbd5-1b9d-4501-836d-86e50836add3
sockets: 1
vmgenid: b02ce5d3-8936-4f36-9508-e0ea3b2c92be

cat /etc/modprobe.d/vfio.conf << Don't have this?
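
(If I were to create it, I assume it would look roughly like this, using the P2000's IDs from my earlier lspci output, as the wiki describes:)

Code:
# /etc/modprobe.d/vfio.conf - bind the P2000 and its audio function to vfio-pci
options vfio-pci ids=10de:1c30,10de:10f1 disable_vga=1
# after editing: update-initramfs -u -k all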



GRUB has only one line changed: GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on"
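
(For reference, the wiki also mentions an optional iommu=pt flag, and any change here has to be applied with update-grub before rebooting. A sketch:)

Code:
# /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt"
# apply and reboot:
update-grub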


lspci gives the following:

(Attached image InSnHyL.png: screenshot of the lspci output for the GPU.)

(I don't think you need to see all the devices; that is the only GPU installed.)


Here I have the QM commands outputs;
https://pastebin.com/xmLSM3ea
https://pastebin.com/2p3ieAxh
 