[SOLVED] One IOMMU group, two GPUs

paoloc68

Member
Oct 29, 2019
I have a Proxmox server and I want the two GPUs to be usable by separate VMs, but I can't, because they happen to fall under the same IOMMU group:

Code:
IOMMU Group 0:
    00:00.0 Host bridge [0600]: Intel Corporation 8th/9th Gen Core 8-core Desktop Processor Host Bridge/DRAM Registers [Coffee Lake S] [8086:3e30] (rev 0d)
IOMMU Group 1:
    00:01.0 PCI bridge [0604]: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) [8086:1901] (rev 0d)
    00:01.1 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x8) [8086:1905] (rev 0d)
    01:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Upstream Port of PCI Express Switch [1002:1478] (rev c3)

    02:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch [1002:1479]
    03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] [1002:73bf] (rev c3)
    03:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:ab28]
    04:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] [1002:67df] (rev e7)
    04:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590] [1002:aaf0]
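For reference, a grouping like the one above can be listed with a small shell loop over sysfs (a generic sketch, not Proxmox-specific; it prints nothing if the IOMMU is disabled, since the sysfs directory only exists when it is active):

```shell
#!/bin/sh
# Print each PCI device together with its IOMMU group number.
# /sys/kernel/iommu_groups/ is only populated when the IOMMU is enabled,
# so on a machine without one the loop simply prints nothing.
list_iommu_groups() {
    for dev in /sys/kernel/iommu_groups/*/devices/*; do
        [ -e "$dev" ] || continue              # glob matched nothing
        grp=${dev#/sys/kernel/iommu_groups/}   # strip the sysfs prefix ...
        grp=${grp%%/*}                         # ... and the /devices/... tail
        printf 'IOMMU group %s: %s\n' "$grp" "$(basename "$dev")"
    done | sort -n -k3
}
list_iommu_groups
```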

this is my grub command line:

Code:
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt pcie_acs_override=downstream,multifunction initcall_blacklist=sysfb_init"

and the ACS override setting is apparently ignored, because the kernel seemingly no longer supports it:

Code:
root@proxmox:~# cat /proc/cmdline


initrd=\EFI\proxmox\5.15.152-1-pve\initrd.img-5.15.152-1-pve BOOT_IMAGE=/boot/vmlinuz-5.15.152-1-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on iommu=pt initcall_blacklist=sysfb_init
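A quick way to confirm whether the parameter actually reached the running kernel is to grep the live command line (a generic check; "not applied" here simply means the kernel never saw the option):

```shell
#!/bin/sh
# Grep the live kernel command line for the override; fall back to a
# marker string when the parameter is absent.
acs=$(grep -o 'pcie_acs_override=[^ ]*' /proc/cmdline || echo "not applied")
echo "pcie_acs_override: $acs"
```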

this is my driver config:

Code:
root@proxmox:~# cat /etc/modprobe.d/vfio.conf


options vfio-pci ids=1002:73bf,1002:67df
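After a reboot with this vfio.conf in place, it is worth checking which driver actually claimed each GPU function (a sketch using the two device IDs from the config above; `lspci` comes from the `pciutils` package):

```shell
#!/bin/sh
# "Kernel driver in use: vfio-pci" means the passthrough binding worked;
# "amdgpu" means the host is still holding on to the card.
if command -v lspci >/dev/null 2>&1; then
    drv_out=$(lspci -nnk -d 1002:73bf; lspci -nnk -d 1002:67df)
else
    drv_out="lspci not available on this machine"
fi
echo "$drv_out"
```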

I don't know how to split them into two groups, please help!
 
Hi,

Quote:
two GPUs to be used on separate VMs but I can't because they happen to fall under the same IOMMU group: I don't know how to make the two groups, please help!
IOMMU groups are determined by your motherboard, so there is nothing you can really do about that.

You can try plugging them into different physical PCIe slots on the motherboard and see if that improves it.
Or updating the firmware/UEFI of the motherboard, that could change them potentially too.

Usually server hardware puts nearly everything in its own IOMMU group, but consumer hardware/firmware is more often than not pretty shoddy in that regard.
 
I was previously able to do that by using pcie_acs_override, but now it seems it is no longer compiled into the kernel. It would be interesting to know the official position on that, and whether there is a "poor man's" solution, like recompiling the kernel with the pcie_acs_override patches.

Thank you!
 
Quote:
I was previously able to do that by using pcie_acs_override, but now it seems it is no longer compiled into the kernel.
That's surprising! What is the output of cat /proc/cmdline ?
Please note that using the override allows VMs to read/write all of each other's memory (and the Proxmox host if one of the devices in the group is still assigned to the host).
 
root@proxmox:~# cat /proc/cmdline
Code:
initrd=\EFI\proxmox\5.15.152-1-pve\initrd.img-5.15.152-1-pve BOOT_IMAGE=/boot/vmlinuz-5.15.152-1-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on iommu=pt initcall_blacklist=sysfb_init
root@proxmox:~# cat /etc/default/grub
Code:
# If you change this file, run 'update-grub' afterwards to update
# /boot/grub/grub.cfg.
# For full documentation of the options in this file, see:
#   info -f grub -n 'Simple configuration'


GRUB_DEFAULT=0
GRUB_TIMEOUT=0
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt pcie_acs_override=downstream,multifunction initcall_blacklist=sysfb_init"
GRUB_CMDLINE_LINUX=""


# Uncomment to enable BadRAM filtering, modify to suit your needs
# This works with Linux (no patch required) and with any kernel that obtains
# the memory map information from GRUB (GNU Mach, kernel of FreeBSD ...)
#GRUB_BADRAM="0x01234567,0xfefefefe,0x89abcdef,0xefefefef"


# Uncomment to disable graphical terminal (grub-pc only)
#GRUB_TERMINAL=console


# The resolution used on graphical terminal
# note that you can use only modes which your graphic card supports via VBE
# you can see them in real GRUB with the command `vbeinfo'
#GRUB_GFXMODE=640x480


# Uncomment if you don't want GRUB to pass "root=UUID=xxx" parameter to Linux
#GRUB_DISABLE_LINUX_UUID=true


# Uncomment to disable generation of recovery mode menu entries
#GRUB_DISABLE_RECOVERY="true"


# Uncomment to get a beep at grub


as you can see, my pcie_acs_override settings are ignored
 
Quote:
root@proxmox:~# cat /proc/cmdline
initrd=\EFI\proxmox\5.15.152-1-pve\initrd.img-5.15.152-1-pve BOOT_IMAGE=/boot/vmlinuz-5.15.152-1-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on iommu=pt initcall_blacklist=sysfb_init
The old 5.15 kernel did include the override patch (and Proxmox did not announce any change there). Please consider upgrading your Proxmox, as 7.4 is almost out of support...

However, it looks like you did not add the kernel parameter correctly, or did not do the other step(s) after that: https://pve.proxmox.com/pve-docs/pve-admin-guide.html#sysboot_edit_kernel_cmdline . Or you did not reboot, or the partition from which your system boots is not updated by proxmox-boot-tool.
It's impossible to tell from this information what might be wrong with the way your system boots or keeps the boot files up to date (maybe your system was installed long ago, before proxmox-boot-tool existed?), but you might want to investigate that (instead of blaming the kernel).
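To see which boot loader (and therefore which config file) is actually in charge, `proxmox-boot-tool status` is the quickest check. A small sketch, guarded so it also runs on machines without the tool:

```shell
#!/bin/sh
# On a proxmox-boot-tool managed host, /etc/kernel/cmdline is the file
# that feeds the synced ESP boot entries; /etc/default/grub only matters
# when GRUB itself is the boot loader.
if command -v proxmox-boot-tool >/dev/null 2>&1; then
    boot_info=$(proxmox-boot-tool status 2>&1)
else
    boot_info="proxmox-boot-tool not present; GRUB config likely applies"
fi
echo "$boot_info"
if [ -f /etc/kernel/cmdline ]; then
    echo "/etc/kernel/cmdline exists and takes precedence on ESP boots"
fi
```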
 
You may be right. I am trying to see if there is something wrong in my GRUB configuration; the only way I could get the pcie_acs_override option applied was by editing /etc/kernel/cmdline directly (nano /etc/kernel/cmdline):

Code:
root=/dev/mapper/pve-root ro quiet intel_iommu=on iommu=pt initcall_blacklist=sysfb_init pcie_acs_override=downstream

the file /boot/grub/grub.cfg looks OK, but something is not working as expected:

This is what I got before:
Code:
root@proxmox:~# proxmox-boot-tool refresh
Running hook script 'proxmox-auto-removal'..
Running hook script 'zz-proxmox-boot'..
Re-executing '/etc/kernel/postinst.d/zz-proxmox-boot' in new private mount namespace..
No /etc/kernel/cmdline found - falling back to /proc/cmdline
Copying and configuring kernels on /dev/disk/by-uuid/E354-C804
    Copying kernel and creating boot-entry for 5.13.19-6-pve
    Copying kernel and creating boot-entry for 5.15.143-1-pve
    Copying kernel and creating boot-entry for 5.15.152-1-pve
    /var/tmp/espmounts/E354-C804/EFI/proxmox/grubx64.efi is not a directory - skipping
root@proxmox:~# cat /proc/cmdline
initrd=\EFI\proxmox\5.15.152-1-pve\initrd.img-5.15.152-1-pve BOOT_IMAGE=/boot/vmlinuz-5.15.152-1-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on iommu=pt initcall_blacklist=sysfb_init

and this is what I get now:

Code:
root@proxmox:~# proxmox-boot-tool refresh
Running hook script 'proxmox-auto-removal'..
Running hook script 'zz-proxmox-boot'..
Re-executing '/etc/kernel/postinst.d/zz-proxmox-boot' in new private mount namespace..
Copying and configuring kernels on /dev/disk/by-uuid/E354-C804
    Copying kernel and creating boot-entry for 5.13.19-6-pve
    Copying kernel and creating boot-entry for 5.15.143-1-pve
    Copying kernel and creating boot-entry for 5.15.152-1-pve
    /var/tmp/espmounts/E354-C804/EFI/proxmox/grubx64.efi is not a directory - skipping
root@proxmox:~# cat /proc/cmdline
initrd=\EFI\proxmox\5.15.152-1-pve\initrd.img-5.15.152-1-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on iommu=pt initcall_blacklist=sysfb_init pcie_acs_override=downstream
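With the override now on the live command line, each GPU should land in its own group, which can be spot-checked per device. A sketch: the `0000:` domain prefix and the `03:00.0`/`04:00.0` slot addresses are taken from the listing in the first post and are assumptions about this particular machine:

```shell
#!/bin/sh
# Resolve the iommu_group symlink for each GPU slot address; devices
# absent on the current machine are skipped silently.
gpu_groups=$(
    for addr in 0000:03:00.0 0000:04:00.0; do
        link="/sys/bus/pci/devices/$addr/iommu_group"
        [ -e "$link" ] || continue
        printf '%s -> IOMMU group %s\n' "$addr" "$(basename "$(readlink "$link")")"
    done
)
echo "${gpu_groups:-devices not present on this machine}"
```

If the two lines report different group numbers, each card can be passed to its own VM.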
 
