Issues with PCI SAS card passthrough

Poetart

Jul 14, 2020
I am moving from an R620 ESXi environment to another R620 running Proxmox.

The previous setup worked with PCI passthrough on ESXi. I moved the card back into the old machine to make sure it didn't die while being installed in the new machine, and it still works there.

Right now, I have gone through this & this for some context.

I could see the drives inside Proxmox before I blacklisted the driver, so I know that Proxmox is aware of and working with the SAS card.

Before I post all the command outputs, I'll list what I am seeing:

Currently, I have blacklisted the driver that the SAS card was using. Adding "options vfio-pci ids=1234:5678,4321:8765" to a conf file in /etc/modprobe.d/ did not remove the drives / PCI device from Proxmox, nor did it change the driver to vfio. Blacklisting mpt3sas did.

When I boot up the FreeNAS VM, the VM starts eating up all my RAM until it crashes the server (see image below). I have tried the rombar=0 option and changed the VM to q35, but the same thing happens. If I remove the card, the VM starts just fine.
RAM.PNG

Here is some configuration:
Code:
root@R620-2:~# dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
[    0.011627] ACPI: DMAR 0x000000007D3346F4 000138 (v01 DELL   PE_SC3   00000001 DELL 00000001)
[    0.912615] DMAR: IOMMU enabled
[    1.646109] DMAR: Host address width 46
[    1.646110] DMAR: DRHD base: 0x000000d6000000 flags: 0x0
[    1.646115] DMAR: dmar0: reg_base_addr d6000000 ver 1:0 cap d2078c106f0466 ecap f020de
[    1.646116] DMAR: DRHD base: 0x000000df900000 flags: 0x1
[    1.646118] DMAR: dmar1: reg_base_addr df900000 ver 1:0 cap d2078c106f0466 ecap f020de
[    1.646119] DMAR: RMRR base: 0x0000007f458000 end: 0x0000007f46ffff
[    1.646120] DMAR: RMRR base: 0x0000007f450000 end: 0x0000007f450fff
[    1.646120] DMAR: RMRR base: 0x0000007f452000 end: 0x0000007f452fff
[    1.646121] DMAR: ATSR flags: 0x0
[    1.646124] DMAR-IR: IOAPIC id 2 under DRHD base  0xd6000000 IOMMU 0
[    1.646125] DMAR-IR: IOAPIC id 0 under DRHD base  0xdf900000 IOMMU 1
[    1.646125] DMAR-IR: IOAPIC id 1 under DRHD base  0xdf900000 IOMMU 1
[    1.646126] DMAR-IR: HPET id 0 under DRHD base 0xdf900000
[    1.646126] DMAR-IR: x2apic is disabled because BIOS sets x2apic opt out bit.
[    1.646127] DMAR-IR: Use 'intremap=no_x2apic_optout' to override the BIOS setting.
[    1.646673] DMAR-IR: Enabled IRQ remapping in xapic mode
[    2.839384] DMAR: dmar0: Using Queued invalidation
[    2.839392] DMAR: dmar1: Using Queued invalidation
[    2.872734] DMAR: Intel(R) Virtualization Technology for Directed I/O

Code:
# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

I removed some of the output so I could fit it in the character limit of this forum.
Code:
root@R620-2:~# lspci -nnk
00:05.2 System peripheral [0880]: Intel Corporation Xeon E7 v2/Xeon E5 v2/Core i7 IIO RAS [8086:0e2a] (rev 04)
        Subsystem: Dell Xeon E7 v2/Xeon E5 v2/Core i7 IIO RAS [1028:04ce]
00:11.0 PCI bridge [0604]: Intel Corporation C600/X79 series chipset PCI Express Virtual Root Port [8086:1d3e] (rev 05)
        Kernel driver in use: pcieport
00:16.0 Communication controller [0780]: Intel Corporation C600/X79 series chipset MEI Controller #1 [8086:1d3a] (rev 05)
        Subsystem: Dell C600/X79 series chipset MEI Controller [1028:04ce]
        Kernel modules: mei_me
00:16.1 Communication controller [0780]: Intel Corporation C600/X79 series chipset MEI Controller #2 [8086:1d3b] (rev 05)
        Subsystem: Dell C600/X79 series chipset MEI Controller [1028:04ce]
00:1a.0 USB controller [0c03]: Intel Corporation C600/X79 series chipset USB2 Enhanced Host Controller #2 [8086:1d2d] (rev 05)
        Subsystem: Dell C600/X79 series chipset USB2 Enhanced Host Controller [1028:04ce]
        Kernel driver in use: ehci-pci
00:1c.0 PCI bridge [0604]: Intel Corporation C600/X79 series chipset PCI Express Root Port 1 [8086:1d10] (rev b5)
        Kernel driver in use: pcieport
00:1c.7 PCI bridge [0604]: Intel Corporation C600/X79 series chipset PCI Express Root Port 8 [8086:1d1e] (rev b5)
        Kernel driver in use: pcieport
00:1d.0 USB controller [0c03]: Intel Corporation C600/X79 series chipset USB2 Enhanced Host Controller #1 [8086:1d26] (rev 05)
        Subsystem: Dell C600/X79 series chipset USB2 Enhanced Host Controller [1028:04ce]
        Kernel driver in use: ehci-pci
00:1e.0 PCI bridge [0604]: Intel Corporation 82801 PCI Bridge [8086:244e] (rev a5)
00:1f.0 ISA bridge [0601]: Intel Corporation C600/X79 series chipset LPC Controller [8086:1d41] (rev 05)
        Subsystem: Dell C600/X79 series chipset LPC Controller [1028:04ce]
        Kernel driver in use: lpc_ich
        Kernel modules: lpc_ich
00:1f.2 SATA controller [0106]: Intel Corporation C600/X79 series chipset 6-Port SATA AHCI Controller [8086:1d02] (rev 05)
        Subsystem: Dell C600/X79 series chipset 6-Port SATA AHCI Controller [1028:04ce]
        Kernel driver in use: ahci
        Kernel modules: ahci
01:00.0 Ethernet controller [0200]: Broadcom Limited NetXtreme BCM5720 Gigabit Ethernet PCIe [14e4:165f]
        Subsystem: Dell NetXtreme BCM5720 2-port Gigabit Ethernet PCIe [1028:1f5b]
        Kernel driver in use: tg3
        Kernel modules: tg3
01:00.1 Ethernet controller [0200]: Broadcom Limited NetXtreme BCM5720 Gigabit Ethernet PCIe [14e4:165f]
        Subsystem: Dell NetXtreme BCM5720 2-port Gigabit Ethernet PCIe [1028:1f5b]
        Kernel driver in use: tg3
        Kernel modules: tg3
02:00.0 Ethernet controller [0200]: Broadcom Limited NetXtreme BCM5720 Gigabit Ethernet PCIe [14e4:165f]
        Subsystem: Dell NetXtreme BCM5720 2-port Gigabit Ethernet PCIe [1028:1f5b]
        Kernel driver in use: tg3
        Kernel modules: tg3
02:00.1 Ethernet controller [0200]: Broadcom Limited NetXtreme BCM5720 Gigabit Ethernet PCIe [14e4:165f]
        Subsystem: Dell NetXtreme BCM5720 2-port Gigabit Ethernet PCIe [1028:1f5b]
        Kernel driver in use: tg3
        Kernel modules: tg3
03:00.0 RAID bus controller [0104]: LSI Logic / Symbios Logic MegaRAID SAS 2208 [Thunderbolt] [1000:005b] (rev 05)
        Subsystem: Dell PERC H710P Mini (for monolithics) [1028:1f34]
        Kernel driver in use: megaraid_sas
        Kernel modules: megaraid_sas
08:00.0 PCI bridge [0604]: Renesas Technology Corp. SH7757 PCIe Switch [PS] [1912:0013]
        Kernel driver in use: pcieport
09:00.0 PCI bridge [0604]: Renesas Technology Corp. SH7757 PCIe Switch [PS] [1912:0013]
        Kernel driver in use: pcieport
09:01.0 PCI bridge [0604]: Renesas Technology Corp. SH7757 PCIe Switch [PS] [1912:0013]
        Kernel driver in use: pcieport
0a:00.0 PCI bridge [0604]: Renesas Technology Corp. SH7757 PCIe-PCI Bridge [PPB] [1912:0012]
0b:00.0 VGA compatible controller [0300]: Matrox Electronics Systems Ltd. G200eR2 [102b:0534]
        Subsystem: Dell G200eR2 [1028:04ce]
        Kernel driver in use: mgag200
        Kernel modules: mgag200
40:01.0 PCI bridge [0604]: Intel Corporation Xeon E7 v2/Xeon E5 v2/Core i7 PCI Express Root Port 1a [8086:0e02] (rev 04)
        Kernel driver in use: pcieport
40:03.0 PCI bridge [0604]: Intel Corporation Xeon E7 v2/Xeon E5 v2/Core i7 PCI Express Root Port 3a [8086:0e08] (rev 04)
        Kernel driver in use: pcieport
40:05.0 System peripheral [0880]: Intel Corporation Xeon E7 v2/Xeon E5 v2/Core i7 VTd/Memory Map/Misc [8086:0e28] (rev 04)
        Subsystem: Dell Xeon E7 v2/Xeon E5 v2/Core i7 VTd/Memory Map/Misc [1028:04ce]
40:05.2 System peripheral [0880]: Intel Corporation Xeon E7 v2/Xeon E5 v2/Core i7 IIO RAS [8086:0e2a] (rev 04)
        Subsystem: Dell Xeon E7 v2/Xeon E5 v2/Core i7 IIO RAS [1028:04ce]
41:00.0 Fibre Channel [0c04]: Brocade Communications Systems, Inc. 1010/1020/1007/1741 10Gbps CNA [1657:0014] (rev 01)
        Subsystem: Brocade Communications Systems, Inc. 1010/1020/1007/1741 10Gbps CNA - FCOE [1657:0014]
        Kernel driver in use: bfa
        Kernel modules: bfa
41:00.1 Fibre Channel [0c04]: Brocade Communications Systems, Inc. 1010/1020/1007/1741 10Gbps CNA [1657:0014] (rev 01)
        Subsystem: Brocade Communications Systems, Inc. 1010/1020/1007/1741 10Gbps CNA - FCOE [1657:0014]
        Kernel driver in use: bfa
        Kernel modules: bfa
41:00.2 Ethernet controller [0200]: Brocade Communications Systems, Inc. 1010/1020/1007/1741 10Gbps CNA [1657:0014] (rev 01)
        Subsystem: Brocade Communications Systems, Inc. 1010/1020/1007/1741 10Gbps CNA - LL [1657:0015]
        Kernel driver in use: bna
        Kernel modules: bna
41:00.3 Ethernet controller [0200]: Brocade Communications Systems, Inc. 1010/1020/1007/1741 10Gbps CNA [1657:0014] (rev 01)
        Subsystem: Brocade Communications Systems, Inc. 1010/1020/1007/1741 10Gbps CNA - LL [1657:0015]
        Kernel driver in use: bna
        Kernel modules: bna
42:00.0 Serial Attached SCSI controller [0107]: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] [1000:0072] (rev 03)
        Subsystem: Dell 6Gbps SAS HBA Adapter [1028:1f1c]
        Kernel modules: mpt3sas
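
In the listing above, 42:00.0 shows "Kernel modules: mpt3sas" but no "Kernel driver in use" line, so at that point neither mpt3sas nor vfio-pci is bound to the HBA. To re-check this after each config change, a small helper along these lines could be used. It is only a sketch: `driver_in_use` is a hypothetical name, and it assumes the default `lspci -nnk` layout shown in this post (device line at column 0, indented detail lines).

```shell
#!/bin/sh
# Hypothetical helper: read `lspci -nnk` text on stdin and print the
# "Kernel driver in use" value for the PCI address given as $1.
# Prints nothing when no driver is bound to that device.
driver_in_use() {
    awk -v addr="$1" '
        # A device header starts at column 0 with the bus address.
        /^[0-9a-f]/ { current = ($1 == addr) }
        # Inside the wanted device, strip the label and print the driver name.
        current && /Kernel driver in use:/ {
            sub(/^.*Kernel driver in use: */, "")
            print
        }
    '
}

# Intended use on the host (not run here):
#   lspci -nnk | driver_in_use 42:00.0
```

If passthrough is set up the way I expect, this would print vfio-pci for 42:00.0 once the VM is started; an empty result matches what the listing above shows.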

Code:
nano /etc/modprobe.d/vfio.conf

options vfio-pci ids=1000:0072,1028:1f1c,1000:005b
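
One thing worth checking in the line above: going by the lspci output, [1000:0072] is the HBA's vendor:device ID, [1028:1f1c] is its subsystem ID, and [1000:005b] belongs to the PERC H710P rather than the SAS HBA. A minimal sketch of an alternative, assuming only the HBA should go to vfio-pci and that a `softdep` line (so vfio-pci loads before mpt3sas) is acceptable instead of a blacklist. `vfio_conf` is a hypothetical helper name, and I have not verified this on this machine.

```shell
#!/bin/sh
# Hypothetical helper: print a modprobe.d config body that hands one PCI
# vendor:device ID ($1) to vfio-pci and asks modprobe to load vfio-pci
# before the driver that would normally claim the card ($2).
vfio_conf() {
    printf 'options vfio-pci ids=%s\n' "$1"
    printf 'softdep %s pre: vfio-pci\n' "$2"
}

# Intended use on the host (not run here):
#   vfio_conf 1000:0072 mpt3sas > /etc/modprobe.d/vfio.conf
#   update-initramfs -u -k all    # then reboot
```

With the IDs from this post, the generated file would contain `options vfio-pci ids=1000:0072` followed by `softdep mpt3sas pre: vfio-pci`.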


Code:
nano /etc/modprobe.d/pve-blacklist.conf

# This file contains a list of modules which are not supported by Proxmox VE

# nidiafb see bugreport https://bugzilla.proxmox.com/show_bug.cgi?id=701
blacklist nvidiafb
blacklist mpt3sas