[SOLVED] PCI Passthrough Intel X540 T2 error

ilya_buryy

Member
Jul 11, 2021
Hi!
I need help!
I am trying to pass an Intel X540-T2 through to a VM, but Proxmox shows an error when I start the VM. The VM does not boot and Proxmox freezes.

error.png

Server: ASRock E3C224 | E3-1286 v3 | Proxmox 6.4-13 | BIOS (VT-d enabled, IVT enabled)
VM: Windows Server 2019

vm.png

If I remove the PCI device, the VM boots successfully, but hard disks scsi 1 and scsi 2 are offline.



IOMMU is already enabled.
dmesg | grep -e DMAR -e IOMMU
Code:
root@files:~# dmesg | grep -e DMAR -e IOMMU
[    0.014007] ACPI: DMAR 0x00000000DD8F0638 000080 (v01 INTEL  BDW      00000001 INTL 00000001)
[    0.014024] ACPI: Reserving DMAR table memory at [mem 0xdd8f0638-0xdd8f06b7]
[    0.082566] DMAR: IOMMU enabled
[    0.177903] DMAR: Host address width 39
[    0.177904] DMAR: DRHD base: 0x000000fed90000 flags: 0x1
[    0.177907] DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap d2008c20660462 ecap f010da
[    0.177908] DMAR: RMRR base: 0x000000df6b6000 end: 0x000000df6c4fff
[    0.177909] DMAR-IR: IOAPIC id 8 under DRHD base  0xfed90000 IOMMU 0
[    0.177910] DMAR-IR: HPET id 0 under DRHD base 0xfed90000
[    0.177910] DMAR-IR: x2apic is disabled because BIOS sets x2apic opt out bit.
[    0.177911] DMAR-IR: Use 'intremap=no_x2apic_optout' to override the BIOS setting.
[    0.178121] DMAR-IR: Enabled IRQ remapping in xapic mode
[    0.751343] DMAR: No ATSR found
[    0.751374] DMAR: dmar0: Using Queued invalidation
[    0.751722] DMAR: Intel(R) Virtualization Technology for Directed I/O
[    0.939691] megaraid_sas 0000:01:00.0: DMAR: 32bit DMA uses non-identity mapping
[    0.950636] ehci-pci 0000:00:1a.0: DMAR: 32bit DMA uses non-identity mapping
[    0.975026] ehci-pci 0000:00:1d.0: DMAR: 32bit DMA uses non-identity mapping

/etc/modules
Code:
# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

/etc/default/grub
Code:
# If you change this file, run 'update-grub' afterwards to update
# /boot/grub/grub.cfg.
# For full documentation of the options in this file, see:
#   info -f grub -n 'Simple configuration'

GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="Proxmox Virtual Environment"
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
GRUB_CMDLINE_LINUX=""

# Disable os-prober, it might add menu entries for each guest
GRUB_DISABLE_OS_PROBER=true

# Uncomment to enable BadRAM filtering, modify to suit your needs
# This works with Linux (no patch required) and with any kernel that obtains
# the memory map information from GRUB (GNU Mach, kernel of FreeBSD ...)
#GRUB_BADRAM="0x01234567,0xfefefefe,0x89abcdef,0xefefefef"

# Uncomment to disable graphical terminal (grub-pc only)
#GRUB_TERMINAL=console

# The resolution used on graphical terminal
# note that you can use only modes which your graphic card supports via VBE
# you can see them in real GRUB with the command `vbeinfo'
#GRUB_GFXMODE=640x480

# Uncomment if you don't want GRUB to pass "root=UUID=xxx" parameter to Linux
#GRUB_DISABLE_LINUX_UUID=true

# Disable generation of recovery mode menu entries
GRUB_DISABLE_RECOVERY="true"

# Uncomment to get a beep at grub start
#GRUB_INIT_TUNE="480 440 1"

lspci -nn | grep Ethernet
Code:
02:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 [8086:1528] (rev 01)
02:00.1 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 [8086:1528] (rev 01)
05:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
06:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)

lspci -nnk
Code:
00:00.0 Host bridge [0600]: Intel Corporation Xeon E3-1200 v3 Processor DRAM Controller [8086:0c08] (rev 06)
        Subsystem: ASRock Incorporation Xeon E3-1200 v3 Processor DRAM Controller [1849:0c08]
        Kernel driver in use: ie31200_edac
        Kernel modules: ie31200_edac
00:01.0 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor PCI Express x16 Controller [8086:0c01] (rev 06)
        Kernel driver in use: pcieport
00:01.1 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor PCI Express x8 Controller [8086:0c05] (rev 06)
        Kernel driver in use: pcieport
00:14.0 USB controller [0c03]: Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI [8086:8c31] (rev 05)
        Subsystem: ASRock Incorporation 8 Series/C220 Series Chipset Family USB xHCI [1849:8c31]
        Kernel driver in use: xhci_hcd
        Kernel modules: xhci_pci
00:16.0 Communication controller [0780]: Intel Corporation 8 Series/C220 Series Chipset Family MEI Controller #1 [8086:8c3a] (rev 04)
        Subsystem: ASRock Incorporation 8 Series/C220 Series Chipset Family MEI Controller [1849:8c3a]
        Kernel modules: mei_me
00:16.1 Communication controller [0780]: Intel Corporation 8 Series/C220 Series Chipset Family MEI Controller #2 [8086:8c3b] (rev 04)
        Subsystem: ASRock Incorporation 8 Series/C220 Series Chipset Family MEI Controller [1849:8c3b]
00:1a.0 USB controller [0c03]: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #2 [8086:8c2d] (rev 05)
        Subsystem: ASRock Incorporation 8 Series/C220 Series Chipset Family USB EHCI [1849:8c2d]
        Kernel driver in use: ehci-pci
        Kernel modules: ehci_pci
00:1c.0 PCI bridge [0604]: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #1 [8086:8c10] (rev d5)
        Kernel driver in use: pcieport
00:1c.4 PCI bridge [0604]: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #5 [8086:8c18] (rev d5)
        Kernel driver in use: pcieport
00:1c.5 PCI bridge [0604]: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #6 [8086:8c1a] (rev d5)
        Kernel driver in use: pcieport
00:1c.7 PCI bridge [0604]: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #8 [8086:8c1e] (rev d5)
        Kernel driver in use: pcieport
00:1d.0 USB controller [0c03]: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #1 [8086:8c26] (rev 05)
        Subsystem: ASRock Incorporation 8 Series/C220 Series Chipset Family USB EHCI [1849:8c26]
        Kernel driver in use: ehci-pci
        Kernel modules: ehci_pci
00:1f.0 ISA bridge [0601]: Intel Corporation C224 Series Chipset Family Server Standard SKU LPC Controller [8086:8c54] (rev 05)
        Subsystem: ASRock Incorporation C224 Series Chipset Family Server Standard SKU LPC Controller [1849:8c54]
        Kernel driver in use: lpc_ich
        Kernel modules: lpc_ich
00:1f.2 SATA controller [0106]: Intel Corporation 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode] [8086:8c02] (rev 05)
        Subsystem: ASRock Incorporation 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode] [1849:8c02]
        Kernel driver in use: ahci
        Kernel modules: ahci
00:1f.3 SMBus [0c05]: Intel Corporation 8 Series/C220 Series Chipset Family SMBus Controller [8086:8c22] (rev 05)
        Subsystem: ASRock Incorporation 8 Series/C220 Series Chipset Family SMBus Controller [1849:8c22]
        Kernel driver in use: i801_smbus
        Kernel modules: i2c_i801
01:00.0 RAID bus controller [0104]: LSI Logic / Symbios Logic MegaRAID SAS 2208 [Thunderbolt] [1000:005b] (rev 03)
        Subsystem: LSI Logic / Symbios Logic MegaRAID SAS 9266-8i [1000:9266]
        Kernel driver in use: megaraid_sas
        Kernel modules: megaraid_sas
02:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 [8086:1528] (rev 01)
        Subsystem: Intel Corporation Ethernet Converged Network Adapter X540-T2 [8086:0001]
        Kernel driver in use: ixgbe
        Kernel modules: ixgbe
02:00.1 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 [8086:1528] (rev 01)
        Subsystem: Intel Corporation Ethernet Converged Network Adapter X540-T2 [8086:0001]
        Kernel driver in use: ixgbe
        Kernel modules: ixgbe
03:00.0 PCI bridge [0604]: ASMedia Technology Inc. ASM1083/1085 PCIe to PCI Bridge [1b21:1080] (rev 03)
05:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
        Subsystem: ASRock Incorporation I210 Gigabit Network Connection [1849:1533]
        Kernel driver in use: igb
        Kernel modules: igb
06:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
        Subsystem: ASRock Incorporation I210 Gigabit Network Connection [1849:1533]
        Kernel driver in use: igb
        Kernel modules: igb
07:00.0 PCI bridge [0604]: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge [1a03:1150] (rev 02)
08:00.0 VGA compatible controller [0300]: ASPEED Technology, Inc. ASPEED Graphics Family [1a03:2000] (rev 21)
        Subsystem: ASRock Incorporation ASPEED Graphics Family [1849:2000]
        Kernel driver in use: ast
        Kernel modules: ast

I tried these settings in /etc/pve/nodes/files/qemu-server/200.conf for the VM:
  • hostpci0: 02:00.0,pcie=1
  • hostpci0: 02:00,pcie=1
The VM boots successfully, but the PCI device is not found.
 
Have you checked that your device is in an isolated IOMMU group? (PCI bridges in a group are not a problem.)
Please show us your groups with:
Code:
for d in /sys/kernel/iommu_groups/*/devices/*; do n=${d#*/iommu_groups/*}; n=${n%%/*}; printf 'IOMMU group %s ' "$n"; lspci -nns "${d##*/}"; done
It looks like the Proxmox host loses some devices when you do passthrough. This commonly happens because devices in a single IOMMU group cannot be shared between VMs and the host. The best way to try to move a device into another group is to put it in a different PCIe slot.
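If you only care about one device, the same check can be narrowed down to its group. This is a minimal sketch, assuming the standard sysfs layout where each PCI device has an iommu_group link. The function name iommu_group_peers and the SYSFS_ROOT variable are made up for illustration; SYSFS_ROOT exists only so the function can be tried against a fake directory tree.

```shell
#!/bin/sh
# List every PCI function that shares an IOMMU group with the given device.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
iommu_group_peers() {
    addr="$1"                       # full address, e.g. 0000:02:00.0
    root="${SYSFS_ROOT:-/sys}"
    group_dir="$root/bus/pci/devices/$addr/iommu_group/devices"

    # No iommu_group link means the IOMMU is off or the address is wrong.
    if [ ! -d "$group_dir" ]; then
        echo "no IOMMU group found for $addr" >&2
        return 1
    fi

    # Each entry in the group's devices directory is one PCI function.
    for peer in "$group_dir"/*; do
        [ -e "$peer" ] || continue
        basename "$peer"
    done
}

# Example call (note the 0000: domain prefix used by sysfs):
#   iommu_group_peers 0000:02:00.0
```

If the list contains anything other than the device itself, its other functions, and PCI bridges, those extra devices will leave the host together with it.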
 
Hi, avw!
Code:
IOMMU group 0 00:00.0 Host bridge [0600]: Intel Corporation Xeon E3-1200 v3 Processor DRAM Controller [8086:0c08] (rev 06)
IOMMU group 10 00:1f.0 ISA bridge [0601]: Intel Corporation C224 Series Chipset Family Server Standard SKU LPC Controller [8086:8c54] (rev 05)
IOMMU group 10 00:1f.2 SATA controller [0106]: Intel Corporation 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode] [8086:8c02] (rev 05)
IOMMU group 10 00:1f.3 SMBus [0c05]: Intel Corporation 8 Series/C220 Series Chipset Family SMBus Controller [8086:8c22] (rev 05)
IOMMU group 11 03:00.0 PCI bridge [0604]: ASMedia Technology Inc. ASM1083/1085 PCIe to PCI Bridge [1b21:1080] (rev 03)
IOMMU group 12 05:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
IOMMU group 13 06:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
IOMMU group 14 07:00.0 PCI bridge [0604]: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge [1a03:1150] (rev 02)
IOMMU group 14 08:00.0 VGA compatible controller [0300]: ASPEED Technology, Inc. ASPEED Graphics Family [1a03:2000] (rev 21)
IOMMU group 1 00:01.0 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor PCI Express x16 Controller [8086:0c01] (rev 06)
IOMMU group 1 00:01.1 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor PCI Express x8 Controller [8086:0c05] (rev 06)
IOMMU group 1 01:00.0 RAID bus controller [0104]: LSI Logic / Symbios Logic MegaRAID SAS 2208 [Thunderbolt] [1000:005b] (rev 03)
IOMMU group 1 02:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 [8086:1528] (rev 01)
IOMMU group 1 02:00.1 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 [8086:1528] (rev 01)
IOMMU group 2 00:14.0 USB controller [0c03]: Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI [8086:8c31] (rev 05)
IOMMU group 3 00:16.0 Communication controller [0780]: Intel Corporation 8 Series/C220 Series Chipset Family MEI Controller #1 [8086:8c3a] (rev 04)
IOMMU group 3 00:16.1 Communication controller [0780]: Intel Corporation 8 Series/C220 Series Chipset Family MEI Controller #2 [8086:8c3b] (rev 04)
IOMMU group 4 00:1a.0 USB controller [0c03]: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #2 [8086:8c2d] (rev 05)
IOMMU group 5 00:1c.0 PCI bridge [0604]: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #1 [8086:8c10] (rev d5)
IOMMU group 6 00:1c.4 PCI bridge [0604]: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #5 [8086:8c18] (rev d5)
IOMMU group 7 00:1c.5 PCI bridge [0604]: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #6 [8086:8c1a] (rev d5)
IOMMU group 8 00:1c.7 PCI bridge [0604]: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #8 [8086:8c1e] (rev d5)
IOMMU group 9 00:1d.0 USB controller [0c03]: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #1 [8086:8c26] (rev 05)
 
Your Ethernet multi-function device (02:00) and your RAID device (01:00.0) are in the same group (1). This is the cause of your disk drive errors and the Proxmox freeze: they are both removed from the host when you pass one of them through to a VM.
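To spot this kind of conflict without reading the whole listing, the output of the group listing can be filtered. This is a rough sketch, assuming the "IOMMU group N address description" line format shown earlier in the thread. The function name shared_iommu_groups is made up for illustration. It ignores PCI bridges and prints only groups that hold more than one remaining device.

```shell
#!/bin/sh
# Read "IOMMU group N <addr> <description>" lines on stdin and print every
# group that contains more than one device, not counting PCI bridges.
shared_iommu_groups() {
    awk '
        $1 == "IOMMU" && $2 == "group" && !/PCI bridge/ {
            count[$3]++                          # devices seen in this group
            members[$3] = members[$3] " " $4     # remember their addresses
        }
        END {
            for (g in count)
                if (count[g] > 1)
                    print "group " g ":" members[g]
        }
    ' | sort -t" " -k2,2n                        # order by group number
}

# Example: save the listing to a file first, then filter it.
#   shared_iommu_groups < groups.txt
```

Two functions of the same card (such as 02:00.0 and 02:00.1) sharing a group is expected. The problem is an unrelated device, like the RAID controller here, appearing on the same line.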
 
How can this be fixed? I have one free slot left ...
 
Ah right, sorry, I did not understand that you have no options.
The motherboard and the BIOS determine the IOMMU groups, so there is not much you can do to change them. Devices in one group cannot interfere with devices in another group via the PCI bus. This isolation relies on PCIe ACS (Access Control Services), and the groups tell you what is safe and secure to pass through.
You can break (the security of) the groups by adding pcie_acs_override=downstream to the kernel parameters (add it to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub).
There is no guarantee that it will work and not make things worse. Please understand that, in principle, the VM can then access your disk drives and the host can access the network devices inside the VM. Don't use it if you do not trust the software inside the VM or if you allow untrusted people (like the internet) to access the VM.
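For anyone who prefers to script the change, here is a minimal sketch, assuming GNU sed and a GRUB_CMDLINE_LINUX_DEFAULT line in the double-quoted form shown in the /etc/default/grub posted above. The function name add_kernel_param is made up for illustration. It takes the file path as an argument so it can be tried on a copy first.

```shell
#!/bin/sh
# Append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT unless it is
# already there. Assumes the parameter contains no '|' or '&' characters,
# because it is pasted into a sed replacement.
add_kernel_param() {
    file="$1"
    param="$2"

    # Do nothing if the parameter is already on the line (idempotent).
    if grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" | grep -qF -- "$param"; then
        return 0
    fi

    # Insert the parameter just before the closing quote.
    sed -i "s|^GRUB_CMDLINE_LINUX_DEFAULT=\"\(.*\)\"|GRUB_CMDLINE_LINUX_DEFAULT=\"\1 $param\"|" "$file"
}

# Example on a copy of the real file:
#   cp /etc/default/grub /tmp/grub.test
#   add_kernel_param /tmp/grub.test pcie_acs_override=downstream
```

After changing the real file, run update-grub as the comment at the top of /etc/default/grub says, reboot, and check /proc/cmdline to confirm the parameter is active. Then list the IOMMU groups again to see whether the devices were actually split.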
 
Thank you so much! Works!
 