YAPQ - HBA passthrough not working

zebrahost

New Member
Jan 13, 2020
I am attempting to pass through my HBA card, which sits in IOMMU group 1, to a VM in Proxmox. I am limited to a single PCIe x16 slot, which is normally intended for graphics cards. When I assign the PCI device to the VM, the VM fails to start (and at one point it caused the system to crash). When I checked the IOMMU groups, the card appears to share group 1 with the Xeon processor's PCI bridge. I am using the latest stable version of Proxmox.
Here are my setup options:

Code:
root@pve10:~# find /sys/kernel/iommu_groups/ -type l
/sys/kernel/iommu_groups/7/devices/0000:00:1a.0
/sys/kernel/iommu_groups/5/devices/0000:00:16.0
/sys/kernel/iommu_groups/13/devices/0000:03:00.0
/sys/kernel/iommu_groups/3/devices/0000:00:03.0
/sys/kernel/iommu_groups/11/devices/0000:00:1d.0
/sys/kernel/iommu_groups/1/devices/0000:00:01.0
/sys/kernel/iommu_groups/1/devices/0000:01:00.0
/sys/kernel/iommu_groups/8/devices/0000:00:1b.0
/sys/kernel/iommu_groups/6/devices/0000:00:19.0
/sys/kernel/iommu_groups/4/devices/0000:00:14.0
/sys/kernel/iommu_groups/12/devices/0000:00:1f.2
/sys/kernel/iommu_groups/12/devices/0000:00:1f.0
/sys/kernel/iommu_groups/12/devices/0000:00:1f.3
/sys/kernel/iommu_groups/2/devices/0000:00:02.0
/sys/kernel/iommu_groups/10/devices/0000:00:1c.3
/sys/kernel/iommu_groups/0/devices/0000:00:00.0
/sys/kernel/iommu_groups/9/devices/0000:00:1c.0
root@pve10:~#

Code:
root@pve10:~# for d in /sys/kernel/iommu_groups/*/devices/*; do n=${d#*/iommu_groups/*}; n=${n%%/*}; printf 'IOMMU group %s ' "$n"; lspci -nns "${d##*/}"; done;
IOMMU group 0 00:00.0 Host bridge [0600]: Intel Corporation 4th Gen Core Processor DRAM Controller [8086:0c00] (rev 06)
IOMMU group 10 00:1c.3 PCI bridge [0604]: Intel Corporation 9 Series Chipset Family PCI Express Root Port 4 [8086:8c96] (rev d0)
IOMMU group 11 00:1d.0 USB controller [0c03]: Intel Corporation 9 Series Chipset Family USB EHCI Controller #1 [8086:8ca6]
IOMMU group 12 00:1f.0 ISA bridge [0601]: Intel Corporation 9 Series Chipset Family Z97 LPC Controller [8086:8cc4]
IOMMU group 12 00:1f.2 SATA controller [0106]: Intel Corporation 9 Series Chipset Family SATA Controller [AHCI Mode] [8086:8c82]
IOMMU group 12 00:1f.3 SMBus [0c05]: Intel Corporation 9 Series Chipset Family SMBus Controller [8086:8ca2]
IOMMU group 13 03:00.0 Ethernet controller [0200]: Qualcomm Atheros AR8161 Gigabit Ethernet [1969:1091] (rev 10)
IOMMU group 1 00:01.0 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor PCI Express x16 Controller [8086:0c01] (rev 06)
IOMMU group 1 01:00.0 Serial Attached SCSI controller [0107]: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] [1000:0072] (rev 03)
IOMMU group 2 00:02.0 VGA compatible controller [0300]: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller [8086:0412] (rev 06)
IOMMU group 3 00:03.0 Audio device [0403]: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor HD Audio Controller [8086:0c0c] (rev 06)
IOMMU group 4 00:14.0 USB controller [0c03]: Intel Corporation 9 Series Chipset Family USB xHCI Controller [8086:8cb1]
IOMMU group 5 00:16.0 Communication controller [0780]: Intel Corporation 9 Series Chipset Family ME Interface #1 [8086:8cba]
IOMMU group 6 00:19.0 Ethernet controller [0200]: Intel Corporation Ethernet Connection I217-V [8086:153b]
IOMMU group 7 00:1a.0 USB controller [0c03]: Intel Corporation 9 Series Chipset Family USB EHCI Controller #2 [8086:8cad]
IOMMU group 8 00:1b.0 Audio device [0403]: Intel Corporation 9 Series Chipset Family HD Audio Controller [8086:8ca0]
IOMMU group 9 00:1c.0 PCI bridge [0604]: Intel Corporation 9 Series Chipset Family PCI Express Root Port 1 [8086:8c90] (rev d0)
root@pve10:~#

Code:
root@pve10:~# cat /etc/modprobe.d/vfio.conf
options vfio-pci ids=1000:0072,1969:1091
root@pve10:~#

Code:
root@pve10:~# dmesg |grep -e DMAR -e IOMMU -e AMD-Vi
[    0.008357] ACPI: DMAR 0x00000000D65F94D0 0000B8 (v01 INTEL  BDW      00000001 INTL 00000001)
[    0.033659] DMAR: IOMMU enabled
[    0.080723] DMAR: Host address width 39
[    0.080724] DMAR: DRHD base: 0x000000fed90000 flags: 0x0
[    0.080727] DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap c0000020660462 ecap f0101a
[    0.080728] DMAR: DRHD base: 0x000000fed91000 flags: 0x1
[    0.080730] DMAR: dmar1: reg_base_addr fed91000 ver 1:0 cap d2008c20660462 ecap f010da
[    0.080731] DMAR: RMRR base: 0x000000d9e7f000 end: 0x000000d9e8dfff
[    0.080731] DMAR: RMRR base: 0x000000db000000 end: 0x000000df1fffff
[    0.080733] DMAR-IR: IOAPIC id 8 under DRHD base  0xfed91000 IOMMU 1
[    0.080733] DMAR-IR: HPET id 0 under DRHD base 0xfed91000
[    0.080734] DMAR-IR: x2apic is disabled because BIOS sets x2apic opt out bit.
[    0.080734] DMAR-IR: Use 'intremap=no_x2apic_optout' to override the BIOS setting.
[    0.081207] DMAR-IR: Enabled IRQ remapping in xapic mode
[    0.763726] DMAR: No ATSR found
[    0.763754] DMAR: dmar0: Using Queued invalidation
[    0.763758] DMAR: dmar1: Using Queued invalidation
[    0.799391] DMAR: Intel(R) Virtualization Technology for Directed I/O
[    0.994680] ehci-pci 0000:00:1a.0: DMAR: 32bit DMA uses non-identity mapping
[    1.017273] ehci-pci 0000:00:1d.0: DMAR: 32bit DMA uses non-identity mapping
[    3.831207] i915 0000:00:02.0: DMAR active, disabling use of stolen memory
root@pve10:~#

Code:
root@pve10:~# lsmod | grep vfio
vfio_pci               53248  0
vfio_virqfd            16384  1 vfio_pci
irqbypass              16384  2 vfio_pci,kvm
vfio_iommu_type1       32768  0
vfio                   32768  2 vfio_iommu_type1,vfio_pci

Code:
root@pve10:~# cat /etc/default/grub | grep "LINUX_DEFAULT"
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on pcie_acs_overide=downstream,multifunction vfio_iommu_type1.allow_unsafe_interrupts=1 iommu=pt"
root@pve10:~#
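One thing that may be worth checking in the line above: the ACS override option is normally spelled `pcie_acs_override`, and the kernel does not act on a parameter name it does not recognise, so a misspelled name would have no effect. The following is only a minimal sketch for comparing a command line against a list of expected parameter names. The helper name `missing_params` is hypothetical, and which parameters are "expected" is an assumption based on the configuration posted here.

```shell
# Print every expected kernel parameter that is absent from a command line.
# Usage: missing_params "<cmdline string>" name1 name2 ...
# (missing_params is a hypothetical helper, not part of Proxmox.)
missing_params() {
    # Pad with spaces so every parameter is delimited on both sides.
    cmdline=" $1 "
    shift
    for name in "$@"; do
        case "$cmdline" in
            # Present either as "name=value" or as a bare "name".
            *" $name="* | *" $name "*) ;;
            *) printf '%s\n' "$name" ;;
        esac
    done
}

# On the host, check the command line the kernel actually booted with:
#   missing_params "$(cat /proc/cmdline)" intel_iommu pcie_acs_override iommu
```

Checking `/proc/cmdline` rather than `/etc/default/grub` also confirms that the boot configuration was regenerated after the file was edited.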

Code:
root@pve10:/etc/modprobe.d# cat pve-blacklist.conf
# This file contains a list of modules which are not supported by Proxmox VE

# nidiafb see bugreport https://bugzilla.proxmox.com/show_bug.cgi?id=701
blacklist nvidiafb
blacklist mpt3sas
blacklist alx


Code:
root@pve10:/etc/modprobe.d# cat /etc/modules
# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.
vfio
#vfio_iommu_type1
vfio_pci
vfio_virqfd

Is "IOMMU group 1 00:01.0 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor PCI Express x16 Controller [8086:0c01] (rev 06)" creating a conflict?

I also get this message in dmesg:
[ 0.033304] [Firmware Bug]: TSC_DEADLINE disabled due to Errata; please update microcode to version: 0x22 (or later)

What other options do I have besides giving up?
 
It shouldn't create a conflict; bridges are usually fine. Try updating your BIOS, which may get rid of the microcode error. Also enable x2apic in the BIOS setup if you can find the option; that's usually a good idea in general, regardless of passthrough.
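To check the "bridges are usually fine" point against a given group, a minimal sketch like the following lists whatever shares an IOMMU group with the device, ignoring PCI bridges. The function name `group_peers` and the optional third argument (an alternative sysfs directory, useful only for testing) are assumptions for illustration. It relies on the `class` file that sysfs exposes for each PCI device.

```shell
# List devices that share an IOMMU group with the target, skipping PCI bridges.
# $1 = IOMMU group number
# $2 = full PCI address of the device to pass through, e.g. 0000:01:00.0
# $3 = groups directory (defaults to the real sysfs location)
group_peers() {
    group=$1
    target=$2
    root=${3:-/sys/kernel/iommu_groups}
    for dev in "$root/$group"/devices/*; do
        # Skip anything without a class file (also covers an empty glob).
        [ -e "$dev/class" ] || continue
        addr=${dev##*/}
        [ "$addr" = "$target" ] && continue
        class=$(cat "$dev/class")
        # Class codes beginning with 0x0604 are PCI-to-PCI bridges.
        case "$class" in
            0x0604*) continue ;;
        esac
        printf '%s %s\n' "$addr" "$class"
    done
}

# Example for the HBA in this thread:
#   group_peers 1 0000:01:00.0
```

Empty output means the only other members of the group are bridges. Any address that is printed is a device that would have to be passed through together with the HBA.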

Without any more logs of what happens when you start the VMs (or when your system crashes) it's hard to say more. Please post task logs, 'journalctl' output and dmesg (if applicable).
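A rough sketch for gathering those logs in one place is shown here. The function name `collect_logs`, the VM ID `100`, and the output directory are all placeholders, so substitute your own values. Commands that are missing or that fail are skipped rather than aborting the run.

```shell
# Collect dmesg, journal and VM config output into one directory.
# $1 = VM ID (100 below is only an example), $2 = output directory.
collect_logs() {
    vmid=$1
    out=$2
    mkdir -p "$out" || return 1
    # Kernel ring buffer of the current boot.
    dmesg > "$out/dmesg.txt" 2>&1 || true
    if command -v journalctl >/dev/null 2>&1; then
        # Current boot, plus the previous boot in case the host crashed.
        journalctl -b 0 --no-pager > "$out/journal-current.txt" 2>&1 || true
        journalctl -b -1 --no-pager > "$out/journal-previous.txt" 2>&1 || true
    fi
    if command -v qm >/dev/null 2>&1; then
        # VM configuration, including the hostpci entry.
        qm config "$vmid" > "$out/vm-$vmid-config.txt" 2>&1 || true
    fi
    printf 'logs written to %s\n' "$out"
}

# Example: run right after a failed VM start.
#   collect_logs 100 /root/passthrough-logs
```

The previous-boot journal is only available if journald keeps a persistent journal on the host; otherwise that file will contain just an error message.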