HBA passthrough to FreeBSD fails after PVE 5.4 updated to Linux kernel 4.15.18-15 and above

Leif Julen

Member
Nov 7, 2017
16
0
6
Since PVE 5.0, I have had different FreeNAS VMs with HBA passthrough working flawlessly. However, after a PVE upgrade to Linux kernel 4.15.18-15 or anything higher, the VMs hang or throw errors during guest boot. I found one other forum post that seems related to my issue, but no solution was provided: https://forum.proxmox.com/threads/gpu-pass-through-no-longer-functioning-since-upgrade.55223/

I confirmed the HBA works normally on the host and in other machines. Selecting Linux kernel 4.15.18-14 from the advanced options at host boot has been a reliable workaround for continuing to use my VMs while I search for a more permanent solution. I had hoped upgrading to PVE 6.0 would fix everything, but no such luck. Unfortunately, it looks like with PVE 6.0 I'll also have to deal with other PCI passthrough issues in FreeBSD guests due to the QEMU v4.0 machine type, as described here:
https://forum.proxmox.com/threads/vm-w-pcie-passthrough-not-working-after-upgrading-to-6-0.56021/

My specifications:
Motherboard: ASUS C246 Pro (tried BIOS/UEFI version 0308 through most recent, 1003)
CPU: Intel Xeon E-2176G
Memory: 64GB ECC 2400 MHz
HBA: Broadcom 9400-16i (tried firmware phase 6 through most recent, phase 12)
Proxmox (versions 5.4-1 through 6.0-2)
FreeNAS (versions 11.0 through 11.2-U5)
Package versions:
proxmox-ve: 5.4-2 (running kernel: 4.15.18-14-pve)
pve-manager: 5.4-11 (running version: 5.4-11/6df3d8d0)
pve-kernel-4.15: 5.4-6
pve-kernel-4.15.18-18-pve: 4.15.18-44
pve-kernel-4.15.18-17-pve: 4.15.18-43
pve-kernel-4.15.18-16-pve: 4.15.18-41
pve-kernel-4.15.18-15-pve: 4.15.18-40
pve-kernel-4.15.18-14-pve: 4.15.18-39
pve-kernel-4.15.18-13-pve: 4.15.18-37
pve-kernel-4.15.18-12-pve: 4.15.18-36
pve-kernel-4.15.18-11-pve: 4.15.18-34
pve-kernel-4.15.18-10-pve: 4.15.18-32
pve-kernel-4.15.18-9-pve: 4.15.18-30
corosync: 2.4.4-pve1
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.1-12
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-53
libpve-guest-common-perl: 2.0-20
libpve-http-server-perl: 2.0-14
libpve-storage-perl: 5.0-44
libqb0: 1.0.3-1~bpo9
lvm2: 2.02.168-pve6
lxc-pve: 3.1.0-3
lxcfs: 3.0.3-pve1
novnc-pve: 1.0.0-3
proxmox-widget-toolkit: 1.0-28
pve-cluster: 5.0-37
pve-container: 2.0-39
pve-docs: 5.4-2
pve-edk2-firmware: 1.20190312-1
pve-firewall: 3.0-22
pve-firmware: 2.0-6
pve-ha-manager: 2.0-9
pve-i18n: 1.1-4
pve-libspice-server1: 0.14.1-2
pve-qemu-kvm: 3.0.1-4
pve-xtermjs: 3.12.0-1
qemu-server: 5.0-54
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.13-pve1~bpo2
Passing the HBA through to FreeBSD guests on Proxmox with Linux kernel 4.15.18-15 or higher produces what I believe are the relevant messages on the guest during boot:
...
mpr0: <Avago Technologies (LSI) SAS3416> port 0xd000-0xd0ff mem 0x1000100000-0x10001fffff,0x92000000-0x920fffff irq 16 at device 0.0 on pci1
mpr0: attempting to allocate 1 MSI-X vectors (16 supported)
msi: routing MSI-X IRQ 257 to local APIC 3 vector 49
mpr0: using IRQ 257 for MSI-X
mpr0: IOC in fault state 0x0
mpr0: IOC in fault state 0x0
mpr0: IOC in fault state 0x0
...
and
...
device_attach: mpr0 attach returned 6
...
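For anyone comparing guest boot logs across host kernels, here is a minimal sketch that summarizes the two symptoms quoted above. It assumes the verbose boot output has been saved to a text file; the function name `summarize_mpr_log` and the file name in the usage note are made up for illustration. For reference, errno 6 on FreeBSD is ENXIO ("Device not configured").

```shell
# Hypothetical helper: reads a FreeBSD boot log on stdin and reports
# how many "IOC in fault state" lines the mpr driver printed and what
# return code device_attach reported for it (or "none").
summarize_mpr_log() {
    awk '
        /mpr[0-9]+: IOC in fault state/            { faults++ }
        /device_attach: mpr[0-9]+ attach returned/ { rc = $NF }
        END {
            printf "fault_lines=%d attach_rc=%s\n", faults, (rc == "" ? "none" : rc)
        }
    '
}
```

Usage would be something like `summarize_mpr_log < boot.log`, where `boot.log` is whatever file the console output was captured to.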
Anyone savvier than me know what the issue might be, what I could do about it, or where I might look for more info? Thank you in advance!
 

dcsapak

Proxmox Staff Member
Staff member
Feb 1, 2016
4,019
365
88
31
Vienna
anything on the host logs/dmesg?
 

Leif Julen

I can't run dmesg right now but will when I get a chance! It may have to wait until next week when I'll have physical access to server (otherwise I can't remotely select linux kernel versions at boot and would lose remote access to my NAS).

Until then, all my best.
 

Leif Julen

dcsapak said: "anything on the host logs/dmesg?"
Sorry for the delay, I finally have a chance to follow up. I just did a clean install of PVE 6.0 without a hitch, booting from ZFS RAID1 SSDs, and I have two different HBAs (both confirmed working) for PCIe passthrough testing. The FreeNAS 11.2-U6 VM is newly installed, with the latest updates applied where possible. To my novice eyes, it looks like the host has no problems with the HBA devices.

PCIe passthrough on host:
# dmesg | grep -e DMAR -e IOMMU
[ 0.007541] ACPI: DMAR 0x0000000089F2EF28 0000C8 (v01 INTEL EDK2 00000002 01000013)
[ 0.249118] DMAR: IOMMU enabled
[ 0.369097] DMAR: Host address width 39
[ 0.369098] DMAR: DRHD base: 0x000000fed90000 flags: 0x0
[ 0.369103] DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap 1c0000c40660462 ecap 19e2ff0505e
[ 0.369105] DMAR: DRHD base: 0x000000fed91000 flags: 0x1
[ 0.369107] DMAR: dmar1: reg_base_addr fed91000 ver 1:0 cap d2008c40660462 ecap f050da
[ 0.369109] DMAR: RMRR base: 0x00000087de6000 end: 0x00000087e05fff
[ 0.369111] DMAR: RMRR base: 0x0000008b800000 end: 0x0000008fffffff
[ 0.369112] DMAR: RMRR base: 0x00000089e44000 end: 0x00000089ec3fff
[ 0.369114] DMAR-IR: IOAPIC id 2 under DRHD base 0xfed91000 IOMMU 1
[ 0.369115] DMAR-IR: HPET id 0 under DRHD base 0xfed91000
[ 0.369116] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[ 0.370598] DMAR-IR: Enabled IRQ remapping in x2apic mode
[ 1.116077] DMAR: No ATSR found
[ 1.116114] DMAR: dmar0: Using Queued invalidation
[ 1.116116] DMAR: dmar1: Using Queued invalidation
[ 1.116352] DMAR: Setting RMRR:
[ 1.116406] DMAR: Setting identity map for device 0000:00:02.0 [0x8b800000 - 0x8fffffff]
[ 1.116452] DMAR: Setting identity map for device 0000:00:14.0 [0x87de6000 - 0x87e05fff]
[ 1.116460] DMAR: Prepare 0-16MiB unity mapping for LPC
[ 1.116495] DMAR: Setting identity map for device 0000:00:1f.0 [0x0 - 0xffffff]
[ 1.116502] DMAR: Intel(R) Virtualization Technology for Directed I/O
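The two lines that matter most in that output are "DMAR: IOMMU enabled" and "DMAR-IR: Enabled IRQ remapping". A minimal sketch for checking them in one step, assuming an Intel host whose kernel log uses the same wording as above (the function name `check_iommu` is hypothetical):

```shell
# Hypothetical helper: reads kernel log text on stdin and reports
# whether the Intel IOMMU and interrupt remapping messages are present.
check_iommu() {
    log="$(cat)"
    if printf '%s\n' "$log" | grep -q 'DMAR: IOMMU enabled'; then
        echo "iommu=enabled"
    else
        echo "iommu=not-reported"
    fi
    if printf '%s\n' "$log" | grep -q 'DMAR-IR: Enabled IRQ remapping'; then
        echo "irq_remapping=enabled"
    else
        echo "irq_remapping=not-reported"
    fi
}
```

It would be run as `dmesg | check_iommu`. A "not-reported" result only means the message was not found in the text it was given, for example because the kernel ring buffer has already wrapped.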
# lspci -v
01:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2308 PCI-Express Fusion-MPT SAS-2 (rev 05)
Subsystem: LSI Logic / Symbios Logic 9207-8i SAS2.1 HBA
Flags: bus master, fast devsel, latency 0, IRQ 16
I/O ports at 5000 [size=256]
Memory at 90340000 (64-bit, non-prefetchable) [size=64K]
Memory at 90300000 (64-bit, non-prefetchable) [size=256K]
Expansion ROM at 90200000 [disabled] [size=1M]
Capabilities: [50] Power Management version 3
Capabilities: [68] Express Endpoint, MSI 00
Capabilities: [d0] Vital Product Data
Capabilities: [a8] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [c0] MSI-X: Enable- Count=16 Masked-
Capabilities: [100] Advanced Error Reporting
Capabilities: [1e0] #19
Capabilities: [1c0] Power Budgeting <?>
Capabilities: [190] #16
Capabilities: [148] Alternative Routing-ID Interpretation (ARI)
Kernel driver in use: vfio-pci
Kernel modules: mpt3sas

02:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS3416 Fusion-MPT Tri-Mode I/O Controller Chip (IOC) (rev 01)
Subsystem: LSI Logic / Symbios Logic SAS3416 Fusion-MPT Tri-Mode I/O Controller Chip (IOC)
Flags: bus master, fast devsel, latency 0, IRQ 17
Memory at 6000100000 (64-bit, prefetchable) [size=1M]
Memory at 6000000000 (64-bit, prefetchable) [size=1M]
Memory at 90000000 (32-bit, non-prefetchable) [size=1M]
I/O ports at 4000 [size=256]
Expansion ROM at 90100000 [disabled] [size=256K]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
Capabilities: [70] Express Endpoint, MSI 00
Capabilities: [b0] MSI-X: Enable- Count=16 Masked-
Capabilities: [100] Advanced Error Reporting
Capabilities: [148] Power Budgeting <?>
Capabilities: [158] Alternative Routing-ID Interpretation (ARI)
Capabilities: [168] #19
Capabilities: [254] #16
Capabilities: [284] Vendor Specific Information: ID=0002 Rev=1 Len=100 <?>
Capabilities: [384] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
Capabilities: [3bc] #15
Kernel driver in use: vfio-pci
Kernel modules: mpt3sas
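Both controllers show "Kernel driver in use: vfio-pci", which is what the host side of passthrough should look like. A minimal sketch for pulling out just the addresses of vfio-bound devices from `lspci -v` output, assuming addresses are printed in the short `bus:device.function` form shown above (the function name `vfio_bound_devices` is hypothetical):

```shell
# Hypothetical helper: reads `lspci -v` output on stdin and prints the
# PCI address of every device whose driver in use is vfio-pci.
# Assumes short addresses such as 01:00.0 (no PCI domain prefix).
vfio_bound_devices() {
    awk '
        /^[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]\.[0-7] / { addr = $1 }
        /Kernel driver in use: vfio-pci/             { print addr }
    '
}
```

Run as `lspci -v | vfio_bound_devices`; for the listing above it should print the two HBA addresses, one per line.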
# dmesg | grep 'sas'
[ 1.633551] mpt3sas version 27.101.00.00 loaded
[ 1.634125] mpt3sas 0000:01:00.0: enabling device (0000 -> 0002)
[ 1.634613] mpt2sas_cm0: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (65818320 kB)
[ 1.648999] mpt2sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
[ 1.649618] mpt2sas_cm0: MSI-X vectors supported: 16, no of cores: 12, max_msix_vectors: -1
[ 1.650508] mpt2sas0-msix0: PCI-MSI-X enabled: IRQ 150
[ 1.651066] mpt2sas0-msix1: PCI-MSI-X enabled: IRQ 151
[ 1.651507] mpt2sas0-msix2: PCI-MSI-X enabled: IRQ 152
[ 1.652016] mpt2sas0-msix3: PCI-MSI-X enabled: IRQ 153
[ 1.652475] mpt2sas0-msix4: PCI-MSI-X enabled: IRQ 154
[ 1.652907] mpt2sas0-msix5: PCI-MSI-X enabled: IRQ 155
[ 1.653367] mpt2sas0-msix6: PCI-MSI-X enabled: IRQ 156
[ 1.653701] mpt2sas0-msix7: PCI-MSI-X enabled: IRQ 157
[ 1.654017] mpt2sas0-msix8: PCI-MSI-X enabled: IRQ 158
[ 1.654334] mpt2sas0-msix9: PCI-MSI-X enabled: IRQ 159
[ 1.654658] mpt2sas0-msix10: PCI-MSI-X enabled: IRQ 160
[ 1.654976] mpt2sas0-msix11: PCI-MSI-X enabled: IRQ 161
[ 1.655292] mpt2sas_cm0: iomem(0x0000000090340000), mapped(0x00000000d1583808), size(65536)
[ 1.655619] mpt2sas_cm0: ioport(0x0000000000005000), size(256)
[ 1.664158] mpt2sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
[ 1.754330] mpt2sas_cm0: Allocated physical memory: size(6803 kB)
[ 1.754649] mpt2sas_cm0: Current Controller Queue Depth(10104),Max Controller Queue Depth(10240)
[ 1.754970] mpt2sas_cm0: Scatter Gather Elements per IO(128)
[ 1.801081] mpt2sas_cm0: LSISAS2308: FWVersion(20.00.07.00), ChipRevision(0x05), BiosVersion(07.39.02.00)
[ 1.801465] mpt2sas_cm0: Protocol=(Initiator,Target), Capabilities=(TLR,EEDP,Snapshot Buffer,Diag Trace Buffer,Task Set Full,NCQ)
[ 1.809376] mpt2sas_cm0: sending port enable !!
[ 1.809658] mpt3sas_cm0: 63 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (65818320 kB)
[ 1.869071] mpt3sas_cm0: MSI-X vectors supported: 128, no of cores: 12, max_msix_vectors: -1
[ 1.869654] mpt3sas0-msix0: PCI-MSI-X enabled: IRQ 162
[ 1.870044] mpt3sas0-msix1: PCI-MSI-X enabled: IRQ 163
[ 1.870424] mpt3sas0-msix2: PCI-MSI-X enabled: IRQ 164
[ 1.870799] mpt3sas0-msix3: PCI-MSI-X enabled: IRQ 165
[ 1.871178] mpt3sas0-msix4: PCI-MSI-X enabled: IRQ 166
[ 1.871556] mpt3sas0-msix5: PCI-MSI-X enabled: IRQ 167
[ 1.871933] mpt3sas0-msix6: PCI-MSI-X enabled: IRQ 168
[ 1.872308] mpt3sas0-msix7: PCI-MSI-X enabled: IRQ 169
[ 1.872685] mpt3sas0-msix8: PCI-MSI-X enabled: IRQ 170
[ 1.873073] mpt3sas0-msix9: PCI-MSI-X enabled: IRQ 171
[ 1.873481] mpt3sas0-msix10: PCI-MSI-X enabled: IRQ 172
[ 1.873859] mpt3sas0-msix11: PCI-MSI-X enabled: IRQ 173
[ 1.874240] mpt3sas_cm0: iomem(0x0000006000100000), mapped(0x0000000096cab74d), size(1048576)
[ 1.874637] mpt3sas_cm0: ioport(0x0000000000004000), size(256)
[ 1.929252] mpt3sas_cm0: sending message unit reset !!
[ 1.931237] mpt3sas_cm0: message unit reset: SUCCESS
[ 2.078930] mpt3sas_cm0: Allocated physical memory: size(30634 kB)
[ 2.079256] mpt3sas_cm0: Current Controller Queue Depth(6548),Max Controller Queue Depth(6656)
[ 2.079599] mpt3sas_cm0: Scatter Gather Elements per IO(128)
[ 2.205333] mpt3sas_cm0: _base_display_fwpkg_version: complete
[ 2.205663] mpt3sas_cm0: FW Package Version (12.00.00.00)
[ 2.206390] mpt3sas_cm0: SAS3416: FWVersion(12.00.00.00), ChipRevision(0x01), BiosVersion(09.23.00.00)
[ 2.206722] mpt3sas_cm0: Protocol=(Initiator,Target), Capabilities=(TLR,EEDP,Diag Trace Buffer,Task Set Full,NCQ)
[ 2.210939] mpt3sas_cm0: sending port enable !!
[ 2.213320] mpt3sas_cm0: host_add: handle(0x0001), sas_addr(0x500605b00d3bb9e0), phys(21)
[ 2.229100] mpt3sas_cm0: port enable: SUCCESS
[ 2.230707] scsi 5:0:0:0: SES: handle(0x0011), sas_addr(0x300705b00d3bb9e0), phy(16), device_name(0x300705b00d3bb9e0)
[ 2.231938] mpt3sas_cm0: log_info(0x31200206): originator(PL), code(0x20), sub_code(0x0206)
[ 4.388701] mpt2sas_cm0: host_add: handle(0x0001), sas_addr(0x500605b001bdf0f2), phys(8)
[ 9.529218] mpt2sas_cm0: port enable: SUCCESS
[ 130.610399] mpt3sas_cm0: removing handle(0x0011), sas_addr(0x300705b00d3bb9e0)
[ 130.610419] mpt3sas_cm0: enclosure logical id(0x300605b00d11b9e0), slot(16)
[ 130.610433] mpt3sas_cm0: enclosure level(0x0000), connector name( )
[ 130.610521] mpt3sas_cm0: sending message unit reset !!
[ 130.612068] mpt3sas_cm0: message unit reset: SUCCESS
[ 130.734289] mpt2sas_cm0: sending message unit reset !!
[ 130.735971] mpt2sas_cm0: message unit reset: SUCCESS
 

Leif Julen

As for the FreeNAS guest, monitoring its PCI devices from the host makes it look like both HBAs are being passed through. However, the guest doesn't seem to see either HBA.

Monitoring on host of FreeNAS guest:
Bus 1, device 0, function 0:
SAS controller: PCI device 1000:0087
PCI subsystem 1000:3020
IRQ 10.
BAR0: I/O at 0xd000 [0xd0ff].
BAR1: 64 bit memory at 0xc2040000 [0xc204ffff].
BAR3: 64 bit memory at 0xc2000000 [0xc203ffff].
id "hostpci0"

Bus 2, device 0, function 0:
SAS controller: PCI device 1000:00ac
PCI subsystem 1000:3000
IRQ 10.
BAR0: 64 bit prefetchable memory at 0x1000100000 [0x10001fffff].
BAR2: 64 bit prefetchable memory at 0x1000000000 [0x10000fffff].
BAR4: 32 bit memory at 0xc1e00000 [0xc1efffff].
BAR5: I/O at 0xc000 [0xc0ff].
id "hostpci1"
Running "pciconf -l -v" on the guest doesn't list either HBA.
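To compare what QEMU exposes with what the guest reports, the monitor listing can be reduced to one line per passed-through device. This is a minimal sketch assuming the monitor text has the same layout as the output above; the function name `passthrough_ids` is hypothetical.

```shell
# Hypothetical helper: reads QEMU monitor PCI listing text on stdin and
# prints "<hostpci id> <vendor:device>" for each passed-through device.
passthrough_ids() {
    awk '
        /PCI device [0-9a-f]+:[0-9a-f]+/ { dev = $NF }
        /^ *id "hostpci[0-9]+"/ {
            gsub(/"/, "", $2)   # strip the quotes around the id
            print $2, dev
        }
    '
}
```

The vendor:device pairs it prints (1000:0087 and 1000:00ac here) are the values to look for in the guest's `pciconf -l -v` output.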
 

Leif Julen

Changing the FreeNAS VM machine type to "pc-q35-3.1", as found in the forum post here, gets me back to where I was with PVE 5.4 and any Linux kernel version 4.15.18-15 or above. The VM starts to boot but then hangs at the following:

FreeNAS Verbose Boot:
mpr0: <Avago Technologies (LSI) SAS3416> port 0xd000-0xd0ff mem 0x1000100000-0x10001fffff,0x1000000000-0x10000fffff,0xc2000000-0xc20fffff irq 16 at device 0.0 on pci1
mpr0: attempting to allocate 1 MSI-X vectors (16 supported)
msi: routing MSI-X IRQ 256 to local APIC 2 vector 49
mpr0: using IRQ 256 for MSI-X
I also tried using the "kernel_irqchip=on" argument as mentioned in the same forum post, but it has no observed effect.
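For reference, a rough sketch of how those two attempts might be applied from the host shell. It only prints the commands rather than running them. The VM ID is a placeholder, the function name is hypothetical, and the exact `--machine` and `--args` values are assumptions based on the description above, so check `man qm` before using them.

```shell
# Hypothetical dry-run helper: prints (does not execute) the qm commands
# for the two attempts described above. Arguments: VM ID, machine type.
print_machine_type_cmds() {
    vmid="$1"
    machine="${2:-pc-q35-3.1}"
    # Attempt 1: pin the older q35 machine type.
    printf 'qm set %s --machine %s\n' "$vmid" "$machine"
    # Attempt 2: pass kernel_irqchip=on through to QEMU (assumed syntax).
    printf "qm set %s --args '-machine type=q35,kernel_irqchip=on'\n" "$vmid"
}
```

Calling `print_machine_type_cmds 100` prints both commands for a VM with ID 100. They were tried as separate experiments, not necessarily together.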

Anyone have any thoughts?
 

Leif Julen

...and just so everyone has all necessary information, here are my current PVE packages:
proxmox-ve: 6.0-2 (running kernel: 5.0.21-2-pve)
pve-manager: 6.0-7 (running version: 6.0-7/28984024)
pve-kernel-5.0: 6.0-8
pve-kernel-helper: 6.0-8
pve-kernel-5.0.21-2-pve: 5.0.21-6
pve-kernel-5.0.15-1-pve: 5.0.15-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.2-pve2
criu: 3.11-3
glusterfs-client: 5.5-3
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.12-pve1
libpve-access-control: 6.0-2
libpve-apiclient-perl: 3.0-2
libpve-common-perl: 6.0-5
libpve-guest-common-perl: 3.0-1
libpve-http-server-perl: 3.0-2
libpve-storage-perl: 6.0-9
libqb0: 1.0.5-1
lvm2: 2.03.02-pve3
lxc-pve: 3.1.0-65
lxcfs: 3.0.3-pve60
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.0-7
pve-cluster: 6.0-7
pve-container: 3.0-7
pve-docs: 6.0-4
pve-edk2-firmware: 2.20190614-1
pve-firewall: 4.0-7
pve-firmware: 3.0-2
pve-ha-manager: 3.0-2
pve-i18n: 2.0-3
pve-qemu-kvm: 4.0.0-5
pve-xtermjs: 3.13.2-1
qemu-server: 6.0-7
smartmontools: 7.0-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.1-pve2
 

Leif Julen

SOLVED: Using the test repository with linux kernel 5.3.7-1-pve works without problem for FreeNAS and HBA passthrough. I hope this solution can be made available with the non-test repository soon.
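To sum up what was observed in this thread, here is a minimal sketch that classifies a host kernel release string. The boundaries come only from the reports above: 4.15.18-14 and older worked, 4.15.18-15 through 5.0.21-2 failed, and 5.3.7-1 worked. Treating every version between those tested points the same way, and every version after 5.3.7-1 as working, is an assumption. The function names are hypothetical, and `sort -V` requires GNU coreutils.

```shell
# True when version string $1 >= $2 under GNU version ordering.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: classify a pve kernel release (as printed by
# `uname -r`) using the thresholds reported in this thread. Versions
# that were not actually tested are classified by assumption only.
kernel_status() {
    k="$1"
    if version_ge "$k" "5.3.7-1-pve"; then
        echo "ok-reported-from-5.3.7-1"
    elif version_ge "$k" "4.15.18-15-pve"; then
        echo "affected"
    else
        echo "ok-older"
    fi
}
```

On the host it would be called as `kernel_status "$(uname -r)"`.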
 
