Problem with PCI Passthrough

iT0mT0m

Member
Dec 30, 2020
Hi,

I have a problem with PCI passthrough on an AMD Ryzen 5 2600 with a Gigabyte B450M DS3H motherboard. I bought an Intel X520-DA2 network card for a MikroTik CHR VM. I followed the instructions at https://pve.proxmox.com/wiki/Pci_passthrough: I added amd_iommu=on to the GRUB_CMDLINE_LINUX_DEFAULT line and then updated GRUB. I also enabled IOMMU in the BIOS. The command "dmesg | grep -e DMAR -e IOMMU" shows:
Code:
root@PROX:~# dmesg | grep -e DMAR -e IOMMU
[    1.095455] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[    1.097648] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    1.098723] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
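
To confirm that the GRUB change actually reached the running kernel, one option is to look for the parameter in /proc/cmdline. A minimal sketch, assuming a POSIX shell; the helper name cmdline_has_param is made up for illustration:

```shell
# Succeeds if the exact parameter (e.g. amd_iommu=on) appears as a
# whole word in the given kernel command line string.
# $1: parameter to look for
# $2: command line text, typically the contents of /proc/cmdline
cmdline_has_param() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On the host it could be called as: cmdline_has_param amd_iommu=on "$(cat /proc/cmdline)" && echo "parameter is active"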

I added the required modules to /etc/modules and checked interrupt remapping:
Code:
root@PROX:~# dmesg | grep 'remapping'
[    1.097652] AMD-Vi: Interrupt remapping enabled

Next I checked IOMMU isolation:
Code:
root@PROX:~# find /sys/kernel/iommu_groups/ -type l
.
.
/sys/kernel/iommu_groups/0/devices/0000:05:00.1 <- this is my second port of x520 card
.
.
/sys/kernel/iommu_groups/0/devices/0000:05:00.0 <- this is my first port of x520 card

When I add the PCI device through the GUI and start the CHR VM, Proxmox freezes. I can't do anything, and restarting the machine does not help. I have to power off the machine, remove the X520 card from the PCIe slot, boot the machine, remove the PCI device entry from the VM, and plug the card back in. I tried updating the BIOS from F42 to F50, and it didn't help.

My Proxmox version: 6.3-3

Where should I look for the problem?
 
Your multi-function device is in IOMMU group 0. Please make sure that you pass all sub-devices to the same VM.
Also, what other devices are in IOMMU group 0? Please show us the whole group using ls /sys/kernel/iommu_groups/0/devices/.
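
As a rough illustration of passing all sub-devices together: Proxmox accepts a PCI address without the function number (05:00 instead of 05:00.0) to mean every function of that device, which matches the "All Functions" checkbox in the GUI. The sketch below only prints the command instead of running it; the VM ID 100 and both helper names are placeholders, not values from this thread:

```shell
# Strip the function number from a PCI address: 05:00.1 -> 05:00
pci_slot() {
    echo "${1%.*}"
}

# Print (do not run) a qm command that passes every function of the device.
# $1: VM ID (hypothetical placeholder)
# $2: address of any function of the device, e.g. 05:00.1
print_passthrough_cmd() {
    echo "qm set $1 -hostpci0 $(pci_slot "$2")"
}
```

For example, print_passthrough_cmd 100 05:00.1 prints "qm set 100 -hostpci0 05:00". Review the printed command before running it yourself.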
 
The command "ls /sys/kernel/iommu_groups/0/devices/" shows:
Code:
root@PROX:~# ls /sys/kernel/iommu_groups/0/devices/
0000:00:01.0 0000:00:01.3 0000:01:00.0 0000:01:00.1 0000:01:00.2 0000:02:00.0 0000:02:01.0 0000:02:04.0 0000:04:00.0 0000:05:00.0 0000:05:00.1
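
The same list can also be reached from the device side, because each PCI device directory in sysfs carries an iommu_group symlink when the IOMMU is active. A small sketch under that assumption; the function name is hypothetical, and the optional second argument exists only so the lookup can be pointed at a different directory tree:

```shell
# Print the PCI addresses that share an IOMMU group with the given device.
# $1: full PCI address, e.g. 0000:05:00.0
# $2: sysfs PCI device root (optional, defaults to /sys/bus/pci/devices)
iommu_group_members() {
    dev=$1
    root=${2:-/sys/bus/pci/devices}
    # Without an active IOMMU the iommu_group symlink does not exist.
    [ -e "$root/$dev/iommu_group/devices" ] || return 1
    ls "$root/$dev/iommu_group/devices"
}
```

Called as iommu_group_members 0000:05:00.0, it should list every device that has to move together with that port.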
 
It appears that IOMMU group 0 contains more devices. You need to either stop using those devices on the host (bind them to vfio-pci) or pass them all through to the same VM. PCI bridge devices can be ignored. What are those other devices? To see the entire grouping of all your PCI devices, use:
Code:
for d in /sys/kernel/iommu_groups/*/devices/*; do n=${d#*/iommu_groups/*}; n=${n%%/*}; printf 'IOMMU group %s ' "$n"; lspci -nns "${d##*/}"; done

Maybe one of them is essential to your Proxmox host, and you will lose it when you pass through your X520 card, because everything in a single IOMMU group needs to stay together.

PS: a B450 motherboard often does not separate PCI slots and devices as well as a B550, X470, or X570 (in order of increasing suitability for PCIe passthrough).
 
Grouping
Code:
IOMMU group 0 00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 0 00:01.3 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
IOMMU group 0 01:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset USB 3.1 XHCI Controller [1022:43d5] (rev 01)
IOMMU group 0 01:00.1 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset SATA Controller [1022:43c8] (rev 01)
IOMMU group 0 01:00.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Bridge [1022:43c6] (rev 01)
IOMMU group 0 02:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
IOMMU group 0 02:01.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
IOMMU group 0 02:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
IOMMU group 0 04:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 0c)
IOMMU group 0 05:00.0 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01)
IOMMU group 0 05:00.1 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01)
IOMMU group 1 00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 2 00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 2 00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
IOMMU group 2 07:00.0 VGA compatible controller [0300]: NVIDIA Corporation GF108 [GeForce GT 430] [10de:0de1] (rev a1)
IOMMU group 2 07:00.1 Audio device [0403]: NVIDIA Corporation GF108 High Definition Audio Controller [10de:0bea] (rev a1)
IOMMU group 3 00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 4 00:07.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 4 00:07.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454]
IOMMU group 4 08:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Raven/Raven2 PCIe Dummy Function [1022:145a]
IOMMU group 4 08:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Platform Security Processor [1022:1456]
IOMMU group 4 08:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] USB 3.0 Host controller [1022:145f]
IOMMU group 5 00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 5 00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454]
IOMMU group 5 09:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Renoir PCIe Dummy Function [1022:1455]
IOMMU group 5 09:00.2 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
IOMMU group 5 09:00.3 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) HD Audio Controller [1022:1457]
IOMMU group 6 00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 59)
IOMMU group 6 00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
IOMMU group 7 00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 0 [1022:1460]
IOMMU group 7 00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 1 [1022:1461]
IOMMU group 7 00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 2 [1022:1462]
IOMMU group 7 00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 3 [1022:1463]
IOMMU group 7 00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 4 [1022:1464]
IOMMU group 7 00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 5 [1022:1465]
IOMMU group 7 00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 6 [1022:1466]
IOMMU group 7 00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 7 [1022:1467]

In /etc/modprobe.d/vfio.conf I have only:
Code:
options vfio-pci ids=8086:10fb
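
The ids= option asks vfio-pci to claim devices with that vendor:device ID, and both X520 ports report 8086:10fb in the listing above. One way to see whether the binding took effect is to read the driver symlink of each port in sysfs. A sketch only; pci_driver is an illustrative name, and the optional second argument is there so the check can run against a different directory:

```shell
# Print the kernel driver currently bound to a PCI device, or "none".
# $1: full PCI address, e.g. 0000:05:00.0
# $2: sysfs PCI device root (optional, defaults to /sys/bus/pci/devices)
pci_driver() {
    link="${2:-/sys/bus/pci/devices}/$1/driver"
    if [ -L "$link" ]; then
        # The symlink target ends in the driver name, e.g. .../drivers/vfio-pci
        basename "$(readlink "$link")"
    else
        echo none
    fi
}
```

If the option is in effect, pci_driver 0000:05:00.0 and pci_driver 0000:05:00.1 would be expected to print vfio-pci. "lspci -nnk -s 05:00" shows the same information under "Kernel driver in use".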
 
I put the X520 in place of the GPU, and now Proxmox keeps working after I add the PCI device to the CHR VM. When I check System -> Resources -> PCI in CHR, I see 01:00.0 82599E 10-Gigabit Network Connection and 02:00.0 82599E 10-Gigabit Network Connection, and I see 2 new interfaces. Does this mean it is already working?
 
IOMMU group 0 contains the following (real) devices:
01:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset USB 3.1 XHCI Controller [1022:43d5] (rev 01)
01:00.1 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset SATA Controller [1022:43c8] (rev 01)
04:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 0c)
05:00.0 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01)
05:00.1 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01)

That means that when you pass the X520 to a VM, the host can no longer access your other network card (04:00.0), your disk drive controller (01:00.1), or a USB controller (01:00.0). This will probably cause a panic in the Proxmox host system. Try a different PCIe slot or maybe a newer BIOS version.
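
A list of "real" devices like the one above can be produced by filtering the output of the grouping loop posted earlier. This is just one possible filter, assuming the "IOMMU group N <address> <class> [...]" line format shown in this thread; the function name is made up:

```shell
# Read grouping lines on stdin and print only the devices of one group
# that are neither host bridges nor PCI bridges.
# $1: IOMMU group number
non_bridge_devices() {
    awk -v g="$1" '
        $1 == "IOMMU" && $2 == "group" && $3 == g &&
        !/ (Host|PCI) bridge \[/ {
            # Drop the "IOMMU group N " prefix, keep address and description.
            sub(/^IOMMU group [0-9]+ /, "")
            print
        }'
}
```

Piping the grouping loop into non_bridge_devices 0 should leave only the devices that matter for the passthrough decision. If that list contains anything the host depends on, the group is a poor candidate.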

PS: Yes, if you see the two network devices appear inside the VM (and your Proxmox host is still working), then it is working.
PPS: Your GPU was in IOMMU group 2, which does not contain any other (non-bridge) devices. This is a good group/PCIe slot for passthrough.
 