More GPU/PCIe passthrough experiments

jamesd

New Member
Apr 2, 2023
Hi,

Running on a fresh install of PVE 7.2. I have 2 Nvidia GPUs and I'm trying to make both usable by VMs via PCIe passthrough. Both cards have separate monitors connected to them.

I can get passthrough working as long as the card I'm working with is not the one the BIOS is set to use as the default (boot) display.

So if I boot with card 1 set as the default, I can make passthrough work with card 2, and vice versa. If I try to pass through the card the machine booted with, the VM starts up OK and I see the monitor connected to that card switch mode, but the screen stays black.
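
To confirm which card the firmware actually initialised on a given boot, the kernel exposes a `boot_vga` attribute in sysfs for VGA-class devices. The following is a minimal sketch that reads it; the function name `find_boot_vga` and the `sysfs_root` parameter are made up for illustration (the parameter exists only so the function can be pointed at a test directory).

```python
from pathlib import Path


def find_boot_vga(sysfs_root="/sys/bus/pci/devices"):
    """Return {pci_address: is_boot_vga} for every device exposing boot_vga.

    Hypothetical helper: only VGA-class devices have a boot_vga file,
    so other PCI devices are skipped.
    """
    root = Path(sysfs_root)
    if not root.is_dir():
        # Not a Linux host, or sysfs is not mounted.
        return {}
    result = {}
    for dev in sorted(root.iterdir()):
        flag = dev / "boot_vga"
        if flag.is_file():
            # The file holds "1" for the adapter the firmware used at boot.
            result[dev.name] = flag.read_text().strip() == "1"
    return result
```

Calling `find_boot_vga()` on the host should return one entry per GPU, keyed by full PCI address (for example `0000:07:00.0`), with `True` for the boot card.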

I have implemented passthrough by including the following lines in /etc/modules:

vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
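
A quick way to check that the modules listed in /etc/modules really ended up loaded is to compare that file against /proc/modules. This is only a sketch: `missing_modules` is a hypothetical helper, and it cannot see modules that are built into the kernel, because those never appear in /proc/modules.

```python
def missing_modules(etc_modules_text, proc_modules_text):
    """Return modules requested in /etc/modules that are absent from /proc/modules."""
    wanted = []
    for line in etc_modules_text.splitlines():
        # Drop comments and blank lines.
        line = line.split("#", 1)[0].strip()
        if line:
            # First field is the module name; '-' and '_' are interchangeable.
            wanted.append(line.split()[0].replace("-", "_"))
    # First field of each /proc/modules line is the loaded module's name.
    loaded = {l.split()[0] for l in proc_modules_text.splitlines() if l.strip()}
    return [m for m in wanted if m not in loaded]
```

On the host this would be fed with `open("/etc/modules").read()` and `open("/proc/modules").read()`; an empty list means every requested module shows up as loaded.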

I have specified the VFIO device IDs and applied various settings in GRUB_CMDLINE_LINUX_DEFAULT. Admittedly I have been adding bits as I went, so no doubt there's redundant stuff in there, but it works as long as we're not passing through the boot card:

GRUB_CMDLINE_LINUX_DEFAULT="quiet pcie_acs_override=downstream,multifunction video=simplefb:off initcall_blacklist=sysfb_init vfio-pci.ids=10de:2486,10de:228b,10de:1004,10de:0e1a modprobe.blacklist=radeon,nouveau,nvidia,nvidiafb,nvidia-gpu"
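
Since the command line has grown by trial and error, it can help to check what the kernel actually received. The sketch below splits a command line (for example the contents of /proc/cmdline) into parameters, flags repeated keys, and pulls out the `vfio-pci.ids` list. The function names are hypothetical, and the splitting is deliberately naive: it does not handle quoted values containing spaces.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into (params, duplicates).

    params maps each key to its last value (None for bare flags such as
    'quiet'); duplicates lists keys that appear more than once.
    """
    params, duplicates = {}, []
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key in params and key not in duplicates:
            duplicates.append(key)
        params[key] = value if sep else None
    return params, duplicates


def vfio_ids(params):
    """Return the vendor:device IDs handed to vfio-pci, if any."""
    ids = params.get("vfio-pci.ids") or ""
    return [i for i in ids.split(",") if i]
```

Running `parse_cmdline(open("/proc/cmdline").read())` on the host shows the parameters in effect for the current boot, which may differ from /etc/default/grub if the bootloader configuration was not regenerated.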

video=simplefb:off

Added in response to the 7.2 release notes

initcall_blacklist=sysfb_init

Passthrough of the non-boot card worked fine without this. When trying to pass through the boot card, I saw messages in dmesg like this:

BAR 1: can't reserve [mem 0xd0000000-0xdfffffff 64bit pref]

After setting this parameter, I no longer see those messages, but the symptoms are the same: the VM boots 'OK', but the screen stays black.
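
If the "can't reserve" message comes back, one way to see what is holding the range is to look for overlapping entries in /proc/iomem (read as root, otherwise the addresses are shown as zeros). The following sketch parses the dmesg message and lists the overlapping owners. The helper names are hypothetical, and the /proc/iomem excerpt mentioned in the usage note is an illustrative sample, not output from this machine.

```python
import re

BAR_RE = re.compile(r"BAR (\d+): can't reserve \[mem (0x[0-9a-f]+)-(0x[0-9a-f]+)")
IOMEM_RE = re.compile(r"^\s*([0-9a-f]+)-([0-9a-f]+) : (.+)$")


def parse_bar_failure(line):
    """Return (bar_number, start, end) from a "can't reserve" dmesg line."""
    m = BAR_RE.search(line)
    if not m:
        return None
    return int(m.group(1)), int(m.group(2), 16), int(m.group(3), 16)


def overlapping_owners(iomem_text, start, end):
    """List /proc/iomem entry names whose range overlaps [start, end]."""
    owners = []
    for line in iomem_text.splitlines():
        m = IOMEM_RE.match(line)
        if not m:
            continue
        lo, hi = int(m.group(1), 16), int(m.group(2), 16)
        # Two inclusive ranges overlap when each starts before the other ends.
        if lo <= end and hi >= start:
            owners.append(m.group(3).strip())
    return owners
```

For the message above, `parse_bar_failure` gives BAR 1 covering 0xd0000000 to 0xdfffffff, which is 256 MiB. An entry nested under the GPU's own range that is not the GPU itself (a boot framebuffer, for instance) would be the likely holder.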

Here are my IOMMU groups, followed by the dmesg from the latest attempt:

IOMMU group 0 00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 10 00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454]
IOMMU group 11 00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 59)
IOMMU group 11 00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
IOMMU group 12 00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 0 [1022:1460]
IOMMU group 12 00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 1 [1022:1461]
IOMMU group 12 00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 2 [1022:1462]
IOMMU group 12 00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 3 [1022:1463]
IOMMU group 12 00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 4 [1022:1464]
IOMMU group 12 00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 5 [1022:1465]
IOMMU group 12 00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 6 [1022:1466]
IOMMU group 12 00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 7 [1022:1467]
IOMMU group 13 01:00.0 Non-Volatile memory controller [0108]: SK hynix PC300 NVMe Solid State Drive 512GB [1c5c:1284]
IOMMU group 14 02:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset USB 3.1 xHCI Controller [1022:43bb] (rev 02)
IOMMU group 15 02:00.1 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset SATA Controller [1022:43b7] (rev 02)
IOMMU group 16 02:00.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b2] (rev 02)
IOMMU group 17 03:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port [1022:43b4] (rev 02)
IOMMU group 18 03:01.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port [1022:43b4] (rev 02)
IOMMU group 19 03:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port [1022:43b4] (rev 02)
IOMMU group 1 00:01.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
IOMMU group 20 04:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 0c)
IOMMU group 21 06:00.0 VGA compatible controller [0300]: NVIDIA Corporation GK110 [GeForce GTX 780] [10de:1004] (rev a1)
IOMMU group 22 06:00.1 Audio device [0403]: NVIDIA Corporation GK110 High Definition Audio Controller [10de:0e1a] (rev a1)
IOMMU group 23 07:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA104 [GeForce RTX 3060 Ti] [10de:2486] (rev a1)
IOMMU group 24 07:00.1 Audio device [0403]: NVIDIA Corporation GA104 High Definition Audio Controller [10de:228b] (rev a1)
IOMMU group 25 08:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Raven/Raven2 PCIe Dummy Function [1022:145a]
IOMMU group 26 08:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Platform Security Processor [1022:1456]
IOMMU group 27 08:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) USB 3.0 Host Controller [1022:145c]
IOMMU group 28 09:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Renoir PCIe Dummy Function [1022:1455]
IOMMU group 29 09:00.2 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
IOMMU group 2 00:01.3 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
IOMMU group 30 09:00.3 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) HD Audio Controller [1022:1457]
IOMMU group 3 00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 4 00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 5 00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
IOMMU group 6 00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 7 00:07.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 8 00:07.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454]
IOMMU group 9 00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
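
The listing above is sorted as text, so group 1 lands after group 19. A small sketch for parsing it into numerically ordered groups and checking which devices share a group with the GPUs might look like this. The function names are hypothetical, and the parser assumes the exact "IOMMU group N bb:dd.f description" layout shown above. Bear in mind that with pcie_acs_override active, the reported grouping may be finer than the isolation the hardware really provides.

```python
import re
from collections import defaultdict

LINE_RE = re.compile(r"^IOMMU group (\d+) ([0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) (.*)$")


def parse_groups(listing):
    """Return {group_number: [(address, description), ...]} sorted by group."""
    groups = defaultdict(list)
    for line in listing.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            groups[int(m.group(1))].append((m.group(2), m.group(3)))
    return dict(sorted(groups.items()))


def isolation_report(groups, targets):
    """Map each target address to the other addresses sharing its group."""
    report = {}
    for members in groups.values():
        addrs = [addr for addr, _ in members]
        for target in targets:
            if target in addrs:
                report[target] = [a for a in addrs if a != target]
    return report
```

An empty list for a target means it sits alone in its group, as both GPU functions do in the listing above.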

Dmesg attached
 

I have also just tried setting the vfio-pci device IDs in /etc/modprobe.d/vfio.conf:

options vfio-pci ids=10de:2486,10de:228b,10de:1004,10de:0e1a

in an attempt to get the vfio driver loaded early, but same result.

Also, I verified that the vfio-pci driver is in use for both cards:

06:00.0 VGA compatible controller [0300]: NVIDIA Corporation GK110 [GeForce GTX 780] [10de:1004] (rev a1)
    Subsystem: ASUSTeK Computer Inc. GK110 [GeForce GTX 780] [1043:8469]
    Kernel driver in use: vfio-pci
    Kernel modules: nvidiafb, nouveau
06:00.1 Audio device [0403]: NVIDIA Corporation GK110 High Definition Audio Controller [10de:0e1a] (rev a1)
    Subsystem: ASUSTeK Computer Inc. GK110 High Definition Audio Controller [1043:8469]
    Kernel driver in use: vfio-pci
    Kernel modules: snd_hda_intel
07:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA104 [GeForce RTX 3060 Ti] [10de:2486] (rev a1)
    Subsystem: NVIDIA Corporation GA104 [GeForce RTX 3060 Ti] [10de:147a]
    Kernel driver in use: vfio-pci
    Kernel modules: nvidiafb, nouveau
07:00.1 Audio device [0403]: NVIDIA Corporation GA104 High Definition Audio Controller [10de:228b] (rev a1)
    Subsystem: NVIDIA Corporation Device [10de:147a]
    Kernel driver in use: vfio-pci
    Kernel modules: snd_hda_intel
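
The same check can be scripted so it is easy to repeat after each change. This sketch parses `lspci -nnk` style output like the above and reports any device not bound to the expected driver. The function names are hypothetical, and it assumes short addresses (`06:00.0`) rather than the domain-prefixed form that `lspci -D` prints.

```python
import re

DEV_RE = re.compile(r"^([0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) ")


def drivers_in_use(lspci_text):
    """Return {address: driver or None} from `lspci -nnk` style output."""
    drivers = {}
    current = None
    for raw in lspci_text.splitlines():
        m = DEV_RE.match(raw)
        if m:
            # A new device block starts; no driver known yet.
            current = m.group(1)
            drivers[current] = None
            continue
        line = raw.strip()
        if current and line.startswith("Kernel driver in use:"):
            drivers[current] = line.split(":", 1)[1].strip()
    return drivers


def not_bound_to(drivers, expected="vfio-pci"):
    """List addresses whose bound driver differs from `expected`."""
    return [addr for addr, drv in drivers.items() if drv != expected]
```

An empty result from `not_bound_to(drivers_in_use(output))` matches what is shown above: all four functions are claimed by vfio-pci.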

Looking a bit closer at the working and non-working dmesg outputs, I see a difference that looks significant, but I don't know how to follow it up. If anyone has a clue, please let me know. Thanks!


Working:

# dmesg | grep vfio
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.15.102-1-pve root=/dev/mapper/pve-root ro quiet pcie_acs_override=downstream,multifunction video=simplefb:off initcall_blacklist=sysfb_init vfio-pci.ids=10de:2486,10de:228b,10de:1004,10de:0e1a modprobe.blacklist=radeon,nouveau,nvidia,nvidiafb,nvidia-gpu
[ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.15.102-1-pve root=/dev/mapper/pve-root ro quiet pcie_acs_override=downstream,multifunction video=simplefb:off initcall_blacklist=sysfb_init vfio-pci.ids=10de:2486,10de:228b,10de:1004,10de:0e1a modprobe.blacklist=radeon,nouveau,nvidia,nvidiafb,nvidia-gpu
[ 6.328544] vfio-pci 0000:07:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[ 6.352129] vfio_pci: add [10de:2486[ffffffff:ffffffff]] class 0x000000/00000000
[ 6.376170] vfio_pci: add [10de:228b[ffffffff:ffffffff]] class 0x000000/00000000
[ 6.376215] vfio-pci 0000:06:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none
[ 6.400117] vfio_pci: add [10de:1004[ffffffff:ffffffff]] class 0x000000/00000000
[ 6.420194] vfio_pci: add [10de:0e1a[ffffffff:ffffffff]] class 0x000000/00000000
[ 44.847145] vfio-pci 0000:06:00.0: enabling device (0000 -> 0003)
[ 44.847627] vfio-pci 0000:06:00.0: vfio_ecap_init: hiding ecap 0x19@0x900

Non-working:

# dmesg | grep vfio
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.15.102-1-pve root=/dev/mapper/pve-root ro quiet pcie_acs_override=downstream,multifunction video=simplefb:off initcall_blacklist=sysfb_init vfio-pci.ids=10de:2486,10de:228b,10de:1004,10de:0e1a modprobe.blacklist=radeon,nouveau,nvidia,nvidiafb,nvidia-gpu
[ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.15.102-1-pve root=/dev/mapper/pve-root ro quiet pcie_acs_override=downstream,multifunction video=simplefb:off initcall_blacklist=sysfb_init vfio-pci.ids=10de:2486,10de:228b,10de:1004,10de:0e1a modprobe.blacklist=radeon,nouveau,nvidia,nvidiafb,nvidia-gpu
[ 6.324391] vfio-pci 0000:07:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[ 6.348143] vfio_pci: add [10de:2486[ffffffff:ffffffff]] class 0x000000/00000000
[ 6.372199] vfio_pci: add [10de:228b[ffffffff:ffffffff]] class 0x000000/00000000
[ 6.372259] vfio-pci 0000:06:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none
[ 6.392210] vfio_pci: add [10de:1004[ffffffff:ffffffff]] class 0x000000/00000000
[ 6.412173] vfio_pci: add [10de:0e1a[ffffffff:ffffffff]] class 0x000000/00000000
[ 81.073257] vfio-pci 0000:07:00.0: vfio_ecap_init: hiding ecap 0x1e@0x258
[ 81.073283] vfio-pci 0000:07:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[ 81.073292] vfio-pci 0000:07:00.0: vfio_ecap_init: hiding ecap 0x26@0xc1c
[ 81.073294] vfio-pci 0000:07:00.0: vfio_ecap_init: hiding ecap 0x27@0xd00
[ 81.073296] vfio-pci 0000:07:00.0: vfio_ecap_init: hiding ecap 0x25@0xe00
[ 81.074569] vfio-pci 0000:07:00.0: No more image in the PCI ROM
[ 81.093269] vfio-pci 0000:07:00.1: vfio_ecap_init: hiding ecap 0x25@0x160
[ 83.671038] vfio-pci 0000:07:00.0: No more image in the PCI ROM
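
To compare two captures like these without the timestamps getting in the way, the lines can be normalised and diffed as sets. The sketch below does that; `strip_timestamp` and `only_in_each` are hypothetical helper names. Note that the two captures above involve different devices (06:00.0 in the working run, 07:00.0 in the non-working one), so some of the differences are expected rather than symptomatic.

```python
import re

TS_RE = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def strip_timestamp(line):
    """Remove a leading "[ 12.345678]" dmesg timestamp, if present."""
    return TS_RE.sub("", line.strip())


def only_in_each(working, failing):
    """Return (lines only in working, lines only in failing), de-duplicated."""
    a = [strip_timestamp(l) for l in working.splitlines() if l.strip()]
    b = [strip_timestamp(l) for l in failing.splitlines() if l.strip()]
    # dict.fromkeys keeps first-seen order while dropping repeats.
    only_a = list(dict.fromkeys(l for l in a if l not in b))
    only_b = list(dict.fromkeys(l for l in b if l not in a))
    return only_a, only_b
```

Fed with the two outputs above, the second list contains the "No more image in the PCI ROM" line once, along with the `vfio_ecap_init` lines for 07:00.0 and 07:00.1.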
 
A word of warning: while it was possible to pass through a USB controller, I don't recommend it. After restarting the guest, time after time, the node gets trashed. It boots into emergency mode and, to make matters worse, running fully headless makes it hard to work out what's going on.

Something gets messed up in LVM land, no clue what. The USB controller was in its own IOMMU group, but there were still quite severe issues.

I've gone back to an old favourite: usbip. Flawless latency, amazingly easy setup. My one gripe with this tool used to be that after an ungraceful exit by the client or server, it was hard to get it working again without rebooting. This time around, however, I've worked out how to unload and reload the kernel modules and clear out any residual filesystem state from broken connections. So now, if it stops working, I just call my reset_usb.sh script.
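
The reset_usb.sh script itself isn't shown here, so the following is only a hypothetical Python sketch of the module unload/reload part. It assumes the standard in-kernel module names (`vhci_hcd` on the client, `usbip_host` on the server, `usbip_core` on both). The cleanup of leftover state from broken connections is not covered, because the post doesn't say which files are involved.

```python
import subprocess

# Assumed module sets per role, listed in unload order (dependents first).
MODULES = {
    "client": ["vhci_hcd", "usbip_core"],
    "server": ["usbip_host", "usbip_core"],
}


def reset_commands(role):
    """Return the modprobe commands to unload, then reload, usbip modules."""
    mods = MODULES[role]
    unload = [["modprobe", "-r", m] for m in mods]
    # Reload in reverse order so usbip_core is present before its users.
    load = [["modprobe", m] for m in reversed(mods)]
    return unload + load


def reset(role, dry_run=True):
    """Print the commands (dry run) or execute them; executing needs root."""
    for cmd in reset_commands(role):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

`reset("client")` prints the four commands without running them; passing `dry_run=False` executes them and stops at the first failure, for example when a module is still in use.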

Usbip is an underused tool IMHO, especially once you work around its slight brittleness. It's great for passing non-gamepad peripherals through when using Sunshine/Moonlight, btw. With it I've been able to use an RPi 4 as a thin Moonlight/usbip client and run race sims with the full wheel setup.
 