Only one GPU being assigned an IOMMU Group

Yostie

New Member
May 22, 2023
Hi all. I have two RTX 3060s in my Proxmox server. When I create a VM, both GPUs are present in the PCI Device drop-down, but only the first GPU is being assigned an IOMMU group. The second GPU is not being assigned a group. In the picture below, the first GPU is at 09:00.0/09:00.1 and the second GPU is at 42:00.0/42:00.1. Any help would be greatly appreciated.

IMMOU Group.PNG
 
This might be a GUI issue. What is the (full) output of cat /proc/cmdline; for d in /sys/kernel/iommu_groups/*/devices/*; do n=${d#*/iommu_groups/*}; n=${n%%/*}; printf 'IOMMU group %s ' "$n"; lspci -nns "${d##*/}"; done?
 
Hi leesteken. Thanks for the quick reply. The full output is below.

IOMMU group 0 00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 10 00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 59)
IOMMU group 10 00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
IOMMU group 11 00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 0 [1022:1460]
IOMMU group 11 00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 1 [1022:1461]
IOMMU group 11 00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 2 [1022:1462]
IOMMU group 11 00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 3 [1022:1463]
IOMMU group 11 00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 4 [1022:1464]
IOMMU group 11 00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 5 [1022:1465]
IOMMU group 11 00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 6 [1022:1466]
IOMMU group 11 00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 7 [1022:1467]
IOMMU group 12 00:19.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 0 [1022:1460]
IOMMU group 12 00:19.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 1 [1022:1461]
IOMMU group 12 00:19.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 2 [1022:1462]
IOMMU group 12 00:19.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 3 [1022:1463]
IOMMU group 12 00:19.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 4 [1022:1464]
IOMMU group 12 00:19.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 5 [1022:1465]
IOMMU group 12 00:19.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 6 [1022:1466]
IOMMU group 12 00:19.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 7 [1022:1467]
IOMMU group 13 01:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] X399 Series Chipset USB 3.1 xHCI Controller [1022:43ba] (rev 02)
IOMMU group 13 01:00.1 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] X399 Series Chipset SATA Controller [1022:43b6] (rev 02)
IOMMU group 13 01:00.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] X399 Series Chipset PCIe Bridge [1022:43b1] (rev 02)
IOMMU group 13 02:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port [1022:43b4] (rev 02)
IOMMU group 13 02:01.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port [1022:43b4] (rev 02)
IOMMU group 13 02:02.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port [1022:43b4] (rev 02)
IOMMU group 13 02:03.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port [1022:43b4] (rev 02)
IOMMU group 13 02:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port [1022:43b4] (rev 02)
IOMMU group 13 02:09.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port [1022:43b4] (rev 02)
IOMMU group 13 03:00.0 Network controller [0280]: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter [168c:003e] (rev 32)
IOMMU group 13 04:00.0 Network controller [0280]: Wilocity Ltd. Wil6200 802.11ad Wireless Network Adapter [1ae9:0310] (rev 02)
IOMMU group 13 05:00.0 Ethernet controller [0200]: Intel Corporation I211 Gigabit Network Connection [8086:1539] (rev 03)
IOMMU group 13 08:00.0 USB controller [0c03]: ASMedia Technology Inc. ASM2142 USB 3.1 Host Controller [1b21:2142]
IOMMU group 14 09:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:2504] (rev a1)
IOMMU group 14 09:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:228e] (rev a1)
IOMMU group 15 0a:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Raven/Raven2 PCIe Dummy Function [1022:145a]
IOMMU group 16 0a:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Platform Security Processor [1022:1456]
IOMMU group 17 0a:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) USB 3.0 Host Controller [1022:145c]
IOMMU group 18 0b:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Renoir PCIe Dummy Function [1022:1455]
IOMMU group 19 0b:00.2 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
IOMMU group 1 00:01.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
IOMMU group 20 0b:00.3 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) HD Audio Controller [1022:1457]
IOMMU group 2 00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 3 00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 4 00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
IOMMU group 5 00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 6 00:07.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 7 00:07.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454]
IOMMU group 8 00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 9 00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454]
 
Looks like it is indeed not in any IOMMU group. Maybe it is behind a PCIe multiplexer that does not support the necessary IOMMU features? Is there anything in journalctl about inconsistent IOMMU features? Maybe other users of this Threadripper motherboard (which one exactly?) have the same issue? I don't think this can be changed with software but maybe a BIOS update can help?
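A quick way to double-check this without reading the whole listing is to look for PCI devices that have no iommu_group link in sysfs. The sketch below is only that, a sketch: the function name list_ungrouped_pci and its optional directory argument are made up for illustration, and it assumes the usual /sys/bus/pci/devices layout.

```shell
#!/bin/sh
# Print the address of every PCI device that is not in any IOMMU group.
# The optional first argument is a hypothetical override of the sysfs
# directory (handy for testing); it defaults to the real location.
list_ungrouped_pci() {
    root="${1:-/sys/bus/pci/devices}"
    for dev in "$root"/*; do
        # Skip the unexpanded glob when the directory is empty or missing.
        [ -e "$dev" ] || continue
        # A device that belongs to a group has an iommu_group entry.
        if [ ! -e "$dev/iommu_group" ] && [ ! -L "$dev/iommu_group" ]; then
            printf '%s\n' "${dev##*/}"
        fi
    done
}
```

Calling list_ungrouped_pci with no argument on the host prints one address per line, and each address can then be looked up with lspci -nns to see which device it is. An empty result means every device has a group.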
 
Thanks again for the reply. The output from journalctl is below. The board is an ASUS ROG Zenith Extreme X399 and the CPU is a Threadripper 1950X. The board BIOS has been updated to the latest version. I will have a search around and see if I can find anyone else with similar issues. Thanks again for your help.

May 13 17:38:13 maggee kernel: Linux version 5.15.102-1-pve (build@proxmox) (gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP PVE>
May 13 17:38:13 maggee kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-5.15.102-1-pve root=/dev/mapper/pve-root ro quiet
May 13 17:38:13 maggee kernel: KERNEL supported cpus:
May 13 17:38:13 maggee kernel: Intel GenuineIntel
May 13 17:38:13 maggee kernel: AMD AuthenticAMD
May 13 17:38:13 maggee kernel: Hygon HygonGenuine
May 13 17:38:13 maggee kernel: Centaur CentaurHauls
May 13 17:38:13 maggee kernel: zhaoxin Shanghai
May 13 17:38:13 maggee kernel: x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
May 13 17:38:13 maggee kernel: x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
May 13 17:38:13 maggee kernel: x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
May 13 17:38:13 maggee kernel: x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
May 13 17:38:13 maggee kernel: x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'compacted' format.
May 13 17:38:13 maggee kernel: signal: max sigframe size: 1776
May 13 17:38:13 maggee kernel: BIOS-provided physical RAM map:
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x0000000000000000-0x000000000009ffff] usable
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x00000000000a0000-0x00000000000fffff] reserved
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x0000000000100000-0x0000000009deffff] usable
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x0000000009df0000-0x0000000009ffffff] reserved
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x000000000a000000-0x000000000affffff] usable
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x000000000b000000-0x000000000b01ffff] reserved
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x000000000b020000-0x00000000767cffff] usable
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x00000000767d0000-0x0000000077d16fff] reserved
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x0000000077d17000-0x0000000077d40fff] ACPI data
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x0000000077d41000-0x0000000078206fff] ACPI NVS
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x0000000078207000-0x0000000079535fff] reserved
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x0000000079536000-0x00000000795ebfff] type 20
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x00000000795ec000-0x000000007bffffff] usable
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x000000007c000000-0x000000007fffffff] reserved
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x00000000b7a00000-0x00000000b7a00fff] reserved
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x00000000b7b00000-0x00000000b7b7ffff] reserved
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x00000000ef800000-0x00000000ef8fffff] reserved
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x00000000efc00000-0x00000000efc00fff] reserved
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x00000000efd00000-0x00000000efd7ffff] reserved
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x00000000fea00000-0x00000000feafffff] reserved
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x00000000fec10000-0x00000000fec10fff] reserved
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x00000000fec30000-0x00000000fec30fff] reserved
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x00000000fed00000-0x00000000fed00fff] reserved
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x00000000fed40000-0x00000000fed44fff] reserved
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x00000000fed80000-0x00000000fed8ffff] reserved
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x00000000fedc0000-0x00000000fedc0fff] reserved
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x00000000fedc2000-0x00000000fedcffff] reserved
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x00000000fedd4000-0x00000000fedd5fff] reserved
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x00000000fee00000-0x00000000feefffff] reserved
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x0000000100000000-0x000000107f2fffff] usable
May 13 17:38:13 maggee kernel: BIOS-e820: [mem 0x000000107f300000-0x000000107fffffff] reserved
May 13 17:38:13 maggee kernel: NX (Execute Disable) protection: active
 
So even though I have all the drivers blacklisted in /etc/modprobe.d/blacklist.conf as shown below, when I plug a monitor into the GPU that is not being passed through, it shows the Proxmox command line. Could this be the issue?

blacklist radeon
blacklist nouveau
blacklist nvidia
 
This is only a (very) small part of the logs. You'll need to scroll through it and see if you can find messages related to IOMMU, which might explain why some PCI(e) devices are not assigned to groups. Have you tried moving the GPU to another PCIe slot or checked the motherboard manual about some PCIe slots being different from others?
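Rather than scrolling, the kernel log can be filtered for the relevant lines. This is a minimal sketch, not a complete list of patterns: the function name filter_iommu_messages is made up for illustration, and the keywords (iommu, AMD-Vi, DMAR, IVRS) are my assumption about what is worth matching on AMD and Intel systems.

```shell
#!/bin/sh
# Keep only log lines that mention the IOMMU or the firmware tables that
# describe it (IVRS on AMD, DMAR on Intel). Matching is case-insensitive.
filter_iommu_messages() {
    grep -i -E 'iommu|amd-vi|dmar|ivrs'
}

# Usage on the host (kernel messages from the current boot only):
#   journalctl -k -b | filter_iommu_messages
```

The -k option limits journalctl to kernel messages and -b to the current boot, which keeps the output short enough to paste in full.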
So even though I have all the drivers blacklisted in /etc/modprobe.d/blacklist.conf as shown below, when I plug a monitor into the GPU that is not being passed through, it shows the Proxmox command line. Could this be the issue?

blacklist radeon
blacklist nouveau
blacklist nvidia
No. You probably have not installed nvidia drivers on the Proxmox host (I hope), so blacklisting them does nothing. You have no old AMD GPU, so blacklisting radeon does nothing. And you are probably seeing output via simplefb instead of nouveau, which might not even support your GPUs yet, so blacklisting nouveau probably does nothing either. Either way, blacklisting drivers has no effect on IOMMU groups whatsoever. IOMMU groups are determined by the motherboard BIOS and the physical PCIe layout and multiplexers.
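If you want to see which kernel driver (if any) is actually bound to a GPU, sysfs shows it as a driver symlink under the device. A small sketch follows; the function name pci_driver and its optional second argument are made up for illustration. As far as I know, a firmware framebuffer such as simplefb is not a PCI driver, so a GPU that only shows the console that way would be reported as "none" here.

```shell
#!/bin/sh
# Print the name of the kernel driver bound to a PCI device, or "none".
# $1 is the full PCI address (for example 0000:42:00.0); $2 is a
# hypothetical override of the sysfs directory, mainly for testing.
pci_driver() {
    dev="${2:-/sys/bus/pci/devices}/$1"
    if [ -L "$dev/driver" ]; then
        # The symlink points at the driver directory; its last path
        # component is the driver name.
        basename "$(readlink "$dev/driver")"
    else
        echo "none"
    fi
}
```

For example, pci_driver 0000:09:00.0 and pci_driver 0000:42:00.0 would report the driver for each of the two GPUs in this thread. The same information is shown by lspci -k as "Kernel driver in use".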
 
Ah, I did indeed miss the rest of the log. I have trawled through the log and the output below is the only mention of IOMMU. I did try shuffling the GPU across the four PCIe x16 slots with no success. The motherboard documentation does not mention which slots are directly connected to the CPU and which run through the chipset, but I did try all slots. I have also confirmed that none of the slots share lanes with the current NVMe config.

May 23 18:24:35 maggee kernel: Warning: PCIe ACS overrides enabled; This may allow non-IOMMU protected peer-to-peer DMA

May 23 18:24:35 maggee kernel: iommu: Default domain type: Translated
May 23 18:24:35 maggee kernel: iommu: DMA domain TLB invalidation policy: lazy mode


May 23 18:24:35 maggee kernel: pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
May 23 18:24:35 maggee kernel: pci 0000:00:01.0: Adding to iommu group 0
May 23 18:24:35 maggee kernel: pci 0000:00:01.1: Adding to iommu group 1
May 23 18:24:35 maggee kernel: Trying to unpack rootfs image as initramfs...
May 23 18:24:35 maggee kernel: pci 0000:00:02.0: Adding to iommu group 2
May 23 18:24:35 maggee kernel: pci 0000:00:03.0: Adding to iommu group 3
May 23 18:24:35 maggee kernel: pci 0000:00:03.1: Adding to iommu group 4
May 23 18:24:35 maggee kernel: pci 0000:00:04.0: Adding to iommu group 5
May 23 18:24:35 maggee kernel: pci 0000:00:07.0: Adding to iommu group 6
May 23 18:24:35 maggee kernel: pci 0000:00:07.1: Adding to iommu group 7
May 23 18:24:35 maggee kernel: pci 0000:00:08.0: Adding to iommu group 8
May 23 18:24:35 maggee kernel: pci 0000:00:08.1: Adding to iommu group 9
May 23 18:24:35 maggee kernel: pci 0000:00:14.0: Adding to iommu group 10
May 23 18:24:35 maggee kernel: pci 0000:00:14.3: Adding to iommu group 10
May 23 18:24:35 maggee kernel: pci 0000:00:18.0: Adding to iommu group 11
May 23 18:24:35 maggee kernel: pci 0000:00:18.1: Adding to iommu group 11
May 23 18:24:35 maggee kernel: pci 0000:00:18.2: Adding to iommu group 11
May 23 18:24:35 maggee kernel: pci 0000:00:18.3: Adding to iommu group 11
May 23 18:24:35 maggee kernel: pci 0000:00:18.4: Adding to iommu group 11
May 23 18:24:35 maggee kernel: pci 0000:00:18.5: Adding to iommu group 11
May 23 18:24:35 maggee kernel: pci 0000:00:18.6: Adding to iommu group 11
May 23 18:24:35 maggee kernel: pci 0000:00:18.7: Adding to iommu group 11
May 23 18:24:35 maggee kernel: pci 0000:00:19.0: Adding to iommu group 12
May 23 18:24:35 maggee kernel: pci 0000:00:19.1: Adding to iommu group 12
May 23 18:24:35 maggee kernel: pci 0000:00:19.2: Adding to iommu group 12
May 23 18:24:35 maggee kernel: pci 0000:00:19.3: Adding to iommu group 12
May 23 18:24:35 maggee kernel: pci 0000:00:19.4: Adding to iommu group 12
May 23 18:24:35 maggee kernel: pci 0000:00:19.5: Adding to iommu group 12
May 23 18:24:35 maggee kernel: pci 0000:00:19.6: Adding to iommu group 12
May 23 18:24:35 maggee kernel: pci 0000:00:19.7: Adding to iommu group 12
May 23 18:24:35 maggee kernel: pci 0000:01:00.0: Adding to iommu group 13
May 23 18:24:35 maggee kernel: pci 0000:01:00.1: Adding to iommu group 13
May 23 18:24:35 maggee kernel: pci 0000:01:00.2: Adding to iommu group 13
May 23 18:24:35 maggee kernel: pci 0000:02:00.0: Adding to iommu group 13
May 23 18:24:35 maggee kernel: pci 0000:02:01.0: Adding to iommu group 13
May 23 18:24:35 maggee kernel: pci 0000:02:02.0: Adding to iommu group 13
May 23 18:24:35 maggee kernel: pci 0000:02:03.0: Adding to iommu group 13
May 23 18:24:35 maggee kernel: pci 0000:02:04.0: Adding to iommu group 13
May 23 18:24:35 maggee kernel: pci 0000:02:09.0: Adding to iommu group 13
May 23 18:24:35 maggee kernel: pci 0000:03:00.0: Adding to iommu group 13
May 23 18:24:35 maggee kernel: pci 0000:04:00.0: Adding to iommu group 13
May 23 18:24:35 maggee kernel: pci 0000:05:00.0: Adding to iommu group 13
May 23 18:24:35 maggee kernel: pci 0000:08:00.0: Adding to iommu group 13
May 23 18:24:35 maggee kernel: pci 0000:09:00.0: Adding to iommu group 14
May 23 18:24:35 maggee kernel: pci 0000:09:00.1: Adding to iommu group 14
May 23 18:24:35 maggee kernel: pci 0000:0a:00.0: Adding to iommu group 15
May 23 18:24:35 maggee kernel: pci 0000:0a:00.2: Adding to iommu group 16
May 23 18:24:35 maggee kernel: pci 0000:0a:00.3: Adding to iommu group 17
May 23 18:24:35 maggee kernel: pci 0000:0b:00.0: Adding to iommu group 18
May 23 18:24:35 maggee kernel: pci 0000:0b:00.2: Adding to iommu group 19
May 23 18:24:35 maggee kernel: pci 0000:0b:00.3: Adding to iommu group 20
May 23 18:24:35 maggee kernel: pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
May 23 18:24:35 maggee kernel: AMD-Vi: Extended features (0x400f77ef22294ada): PPR NX GT IA GA PC GA_vAPIC
May 23 18:24:35 maggee kernel: PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
May 23 18:24:35 maggee kernel: software IO TLB: mapped [mem 0x000000006fd14000-0x0000000073d14000] (64MB)
May 23 18:24:35 maggee kernel: perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
 
leesteken, just wanted to say a big thank you for your help. It turns out there were additional settings buried deep within the BIOS that I had missed, in particular the setting below.

Advanced -> AMD PBS -> Enumerate all IOMMU in IVRS: Enabled

The next thing to work through, though, is why I get just a black screen when I have two GPUs passed through to the same VM. I will do some research and see if I can figure it out.
 