Passthrough of 2 devices doesn't work.

catha1201

Hello anyone,

I have gotten PCI passthrough working on my server, but not with two devices passed through to two different VMs.

My setup:
Host:
E3-1225 v5
64GB ECC RAM

Guests / VMs
1. TrueNAS Scale with an M1015 HBA card
2. Ubuntu with a Quadro P400 graphics card

Both VMs work fine with PCI passthrough on their own, but I cannot run them at the same time :-(

The error is:

kvm: -device vfio-pci,host=0000:01:00.0,id=hostpci0,bus=pci.0,addr=0x10,rombar=0: vfio 0000:01:00.0: failed to open /dev/vfio/1: Device or resource busy
Use of uninitialized value $tpmpid in concatenation (.) or string at /usr/share/perl5/PVE/QemuServer.pm line 5465.
stopping swtpm instance (pid) due to QEMU startup error
TASK ERROR: start failed: QEMU exited with code 1

Does anyone know why I cannot run them at the same time?
 
Are you sure you are using the correct PCI ID? I think you get this error when one of the devices in the IOMMU group is already passed through to another VM. Can you show us information about your IOMMU groups using:
for d in /sys/kernel/iommu_groups/*/devices/*; do n=${d#*/iommu_groups/*}; n=${n%%/*}; printf 'IOMMU group %s ' "$n"; lspci -nns "${d##*/}"; done
Maybe you can also share the brand and model of the motherboard, as that determines most of the groups, and show us the VM configuration files from the /etc/pve/qemu-server/ directory as well?
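For readability, the same one-liner can also be written as a small script (just a sketch; it walks the same sysfs layout as the command above):

#!/bin/bash
# List every PCI device together with the IOMMU group it belongs to
for d in /sys/kernel/iommu_groups/*/devices/*; do
    group=${d#*/iommu_groups/}   # strip the path up to the group number
    group=${group%%/*}           # keep only the group number itself
    printf 'IOMMU group %s ' "$group"
    lspci -nns "${d##*/}"        # describe the device at that PCI address
done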
 
Hi avw,

I have copied & pasted it below - and it is long :cool:

My mainboard is an HP ML10 Gen9.

root@aello:~# for d in /sys/kernel/iommu_groups/*/devices/*; do n=${d#*/iommu_groups/*}; n=${n%%/*}; printf 'IOMMU group %s ' "$n"; lspci -nns "${d##*/}"; done
IOMMU group 0 00:00.0 Host bridge [0600]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Host Bridge/DRAM Registers [8086:1918] (rev 07)
IOMMU group 1 00:01.0 PCI bridge [0604]: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) [8086:1901] (rev 07)
IOMMU group 1 00:01.1 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x8) [8086:1905] (rev 07)
IOMMU group 1 01:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] [1000:0072] (rev 03)
IOMMU group 1 02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP107GL [Quadro P400] [10de:1cb3] (rev a1)
IOMMU group 1 02:00.1 Audio device [0403]: NVIDIA Corporation GP107GL High Definition Audio Controller [10de:0fb9] (rev a1)
IOMMU group 2 00:14.0 USB controller [0c03]: Intel Corporation 100 Series/C230 Series Chipset Family USB 3.0 xHCI Controller [8086:a12f] (rev 31)
IOMMU group 2 00:14.2 Signal processing controller [1180]: Intel Corporation 100 Series/C230 Series Chipset Family Thermal Subsystem [8086:a131] (rev 31)
IOMMU group 3 00:16.0 Communication controller [0780]: Intel Corporation 100 Series/C230 Series Chipset Family MEI Controller #1 [8086:a13a] (rev 31)
IOMMU group 3 00:16.3 Serial controller [0700]: Intel Corporation 100 Series/C230 Series Chipset Family KT Redirection [8086:a13d] (rev 31)
IOMMU group 4 00:17.0 SATA controller [0106]: Intel Corporation Q170/Q150/B150/H170/H110/Z170/CM236 Chipset SATA Controller [AHCI Mode] [8086:a102] (rev 31)
IOMMU group 5 00:1c.0 PCI bridge [0604]: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #7 [8086:a116] (rev f1)
IOMMU group 6 00:1f.0 ISA bridge [0601]: Intel Corporation C236 Chipset LPC/eSPI Controller [8086:a149] (rev 31)
IOMMU group 6 00:1f.2 Memory controller [0580]: Intel Corporation 100 Series/C230 Series Chipset Family Power Management Controller [8086:a121] (rev 31)
IOMMU group 6 00:1f.4 SMBus [0c05]: Intel Corporation 100 Series/C230 Series Chipset Family SMBus [8086:a123] (rev 31)
IOMMU group 6 00:1f.7 Non-Essential Instrumentation [1300]: Intel Corporation 100 Series/C230 Series Chipset Family Trace Hub [8086:a126] (rev 31)
IOMMU group 7 00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection (2) I219-LM [8086:15b7] (rev 31)
IOMMU group 8 03:00.0 Ethernet controller [0200]: Mellanox Technologies MT27500 Family [ConnectX-3] [15b3:1003]

First VM with HBA controller:

agent: 1
balloon: 0
boot: order=scsi0;ide2;net0
cores: 2
hostpci0: 0000:01:00,rombar=0
ide2: none,media=cdrom
memory: 32768
name: triton
net0: virtio=D2:87:3F:5D:99:72,bridge=vmbr0
numa: 0
ostype: l26
scsi0: local-vmpool:vm-1000-disk-0,backup=0,size=64G
scsi1: local-vmpool:vm-1000-disk-1,backup=0,size=64G
scsihw: virtio-scsi-pci
smbios1: uuid=59a7f4d3-0405-4e87-9117-5c9afb0b4c18
sockets: 1
vmgenid: f29e1491-5d06-4792-a918-65fe54cf317b


Second VM with the Nvidia card:

agent: 1,fstrim_cloned_disks=1
boot: order=scsi0;ide2;net0
cores: 2
hostpci0: 0000:02:00.0
hostpci1: 0000:02:00.1
ide2: none,media=cdrom
machine: q35
memory: 8192
name: plex
net0: virtio=52:DD:9B:71:07:8C,bridge=vmbr0
numa: 0
ostype: l26
scsi0: local-vmpool:vm-3000-disk-0,size=32G
scsihw: virtio-scsi-pci
smbios1: uuid=03f63232-6a29-4619-bf9a-328e669cc5c9
sockets: 1
tablet: 0
vmgenid: cdf537fd-cbb1-437c-931a-953a7a10d2be

File: vfio.conf:

options vfio-pci ids=10de:1cb3,10de:0fb9 disable_vga=1
options vfio-pci ids=01:00:0107:1000:0072
 
Both devices are in IOMMU group 1:
IOMMU group 1 01:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] [1000:0072] (rev 03)
IOMMU group 1 02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP107GL [Quadro P400] [10de:1cb3] (rev a1)
IOMMU group 1 02:00.1 Audio device [0403]: NVIDIA Corporation GP107GL High Definition Audio Controller [10de:0fb9] (rev a1)
Devices in the same IOMMU group are not securely isolated from each other, so they cannot be split between VMs or shared between a VM and the host, which explains the (not very clear) error you get.

Try putting one of the devices in another PCI(e) slot until they are no longer in the same group.
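If you want to double-check a single device after moving the card, its group can also be read straight from sysfs (a quick sketch, using the PCI addresses from this thread as examples):

# Print the IOMMU group each device currently sits in
for dev in 0000:01:00.0 0000:02:00.0 0000:02:00.1; do
    printf '%s -> group %s\n' "$dev" "$(basename "$(readlink /sys/bus/pci/devices/$dev/iommu_group)")"
done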
 

Good catch :cool:

I can't, since I only have one x16 slot :-(

I read somewhere about "pcie_acs_override=downstream,multifunction" but how do you use that? And is it safe?
 
Proxmox already contains the ACS override patches, so you can just add the pcie_acs_override=downstream or, if necessary, pcie_acs_override=downstream,multifunction kernel parameter to GRUB or systemd-boot (please check the PVE manual to find out which one is used in your case). It usually "breaks up" all the IOMMU groups and thereby allows you to ignore the security isolation between them. PCI(e) devices can do DMA (direct memory access, as in reading and writing any part of the VM memory) and talk to other PCI(e) devices in the same group. This is a security issue and you should not use it if you run untrusted software or allow untrusted users to access your VMs. Also, there are no guarantees that it will work; it just allows you to ignore ACS at your own risk.
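After rebooting you can check whether the parameter is active and whether the groups actually split up (a sketch; the listing is the same command used earlier in this thread):

# Confirm the running kernel was booted with the ACS override parameter
grep -o 'pcie_acs_override=[^ ]*' /proc/cmdline
# Re-list the groups and check that 01:00.0 and 02:00.0/02:00.1
# no longer share IOMMU group 1
for d in /sys/kernel/iommu_groups/*/devices/*; do n=${d#*/iommu_groups/*}; n=${n%%/*}; printf 'IOMMU group %s ' "$n"; lspci -nns "${d##*/}"; done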
 

Do I set it in GRUB like this .... or?

GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on, pcie_acs_override=downstream"
 
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on pcie_acs_override=downstream" (no comma), then run update-grub after making changes to /etc/default/grub and reboot.
Or, if you just want to try it once, press e in the GRUB boot menu when the system starts and change it there without making it permanent. You may have to look for the word quiet to find the right place to add it when you do it that way.
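As a rough sketch of the permanent change (the systemd-boot path only applies if your host boots that way, e.g. ZFS-on-root with UEFI; check the PVE manual, and the exact tool name depends on your PVE version):

# GRUB: edit /etc/default/grub so the line reads
#   GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on pcie_acs_override=downstream"
# then apply and reboot
update-grub
reboot

# systemd-boot: append the same parameters to the single line in /etc/kernel/cmdline,
# then refresh the boot entries and reboot
proxmox-boot-tool refresh    # pve-efiboot-tool refresh on older PVE 6.x installs
reboot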
 
