Identical LSI HBAs: Can't Pass Both PCI Devices Through to TrueNAS VM

cosmo88

New Member
Feb 11, 2024
1
0
1
Hey everyone! Long-time lurker, but I've finally found a problem I can't fix. Haha.

Server Specs
pve-manager/8.3.2/3e76eec21c4a14a7 (running kernel: 6.8.12-5-pve)
SUPERMICRO MBD-X13SAE-F-O ATX Server Motherboard LGA 1700 Intel W680
20 x 13th Gen Intel(R) Core(TM) i5-13600K (1 Socket)
2 sticks of Kingston 32GB DDR5 4800MT/s ECC Unbuffered DIMM CL40 2RX8 1.1V 288-pin 16Gbit Hynix M

Two identical LSI SAS9305-16i HBAs (repurposed from a 5-year-old 45Drives work server)

Code:
root@pve:~# lspci -nnk
0000:01:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS3224 PCI-Express Fusion-MPT SAS-3 [1000:00c4] (rev 01)
        Subsystem: Broadcom / LSI SAS9305-16i [1000:3190]
        Kernel driver in use: mpt3sas
        Kernel modules: mpt3sas
0000:02:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS3224 PCI-Express Fusion-MPT SAS-3 [1000:00c4] (rev 01)
        Subsystem: Broadcom / LSI SAS9305-16i [1000:3190]
        Kernel driver in use: mpt3sas
        
root@pve:~# sas3flash -listall
Avago Technologies SAS3 Flash Utility
Version 15.00.00.00 (2016.11.17)
Copyright 2008-2016 Avago Technologies. All rights reserved.

        Adapter Selected is a Avago SAS: SAS3224(A1)

Num   Ctlr            FW Ver        NVDATA        x86-BIOS         PCI Addr
----------------------------------------------------------------------------

0  SAS3224(A1)  16.00.12.00    10.00.00.05    08.27.00.00     00:01:00:00
1  SAS3224(A1)  16.00.12.00    10.00.00.05    08.27.00.00     00:02:00:00

root@pve:~# pvesh get /nodes/pve/hardware/pci --pci-class-blacklist ""
┌──────────┬────────┬───────────────┬────────────┬────────┬─────────────────────────────────────────────────────────────────
│ class    │ device │ id            │ iommugroup │ vendor │ device_name                                                     
╞══════════╪════════╪═══════════════╪════════════╪════════╪═════════════════════════════════════════════════════════════════
│ 0x010400 │ 0xa77f │ 0000:00:0e.0  │          5 │ 0x8086 │ Volume Management Device NVMe RAID Controller Intel Corporation
├──────────┼────────┼───────────────┼────────────┼────────┼─────────────────────────────────────────────────────────────────
│ 0x010601 │ 0x7ae2 │ 10000:e0:17.0 │          5 │ 0x8086 │ Alder Lake-S PCH SATA Controller [AHCI Mode]                   
├──────────┼────────┼───────────────┼────────────┼────────┼─────────────────────────────────────────────────────────────────
│ 0x010700 │ 0x00c4 │ 0000:01:00.0  │         17 │ 0x1000 │ SAS3224 PCI-Express Fusion-MPT SAS-3                           
├──────────┼────────┼───────────────┼────────────┼────────┼─────────────────────────────────────────────────────────────────
│ 0x010700 │ 0x00c4 │ 0000:02:00.0  │         18 │ 0x1000 │ SAS3224 PCI-Express Fusion-MPT SAS-3




Code:
root@pve:~# dmesg | grep -e DMAR -e IOMMU
[    0.017892] ACPI: DMAR 0x00000000750A1000 000088 (v02 INTEL  EDK2     00000002      01000013)
[    0.017946] ACPI: Reserving DMAR table memory at [mem 0x750a1000-0x750a1087]
[    0.091262] DMAR: IOMMU enabled
[    0.205733] DMAR: Host address width 39
[    0.205734] DMAR: DRHD base: 0x000000fed90000 flags: 0x0
[    0.205742] DMAR: dmar0: reg_base_addr fed90000 ver 4:0 cap 1c0000c40660462 ecap 29a00f0505e
[    0.205744] DMAR: DRHD base: 0x000000fed91000 flags: 0x1
[    0.205748] DMAR: dmar1: reg_base_addr fed91000 ver 5:0 cap d2008c40660462 ecap f050da
[    0.205750] DMAR: RMRR base: 0x0000007c000000 end: 0x000000803fffff
[    0.205752] DMAR-IR: IOAPIC id 2 under DRHD base  0xfed91000 IOMMU 1
[    0.205753] DMAR-IR: HPET id 0 under DRHD base 0xfed91000
[    0.205754] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[    0.207315] DMAR-IR: Enabled IRQ remapping in x2apic mode
[    0.734147] pci 0000:00:02.0: DMAR: Skip IOMMU disabling for graphics
[    0.813031] DMAR: No ATSR found
[    0.813032] DMAR: No SATC found
[    0.813033] DMAR: IOMMU feature fl1gp_support inconsistent
[    0.813034] DMAR: IOMMU feature pgsel_inv inconsistent
[    0.813034] DMAR: IOMMU feature nwfs inconsistent
[    0.813035] DMAR: IOMMU feature dit inconsistent
[    0.813036] DMAR: IOMMU feature sc_support inconsistent
[    0.813036] DMAR: IOMMU feature dev_iotlb_support inconsistent
[    0.813037] DMAR: dmar0: Using Queued invalidation
[    0.813040] DMAR: dmar1: Using Queued invalidation
[    0.813774] DMAR: Intel(R) Virtualization Technology for Directed I/O

root@pve:~# dmesg | grep 'remapping'
[    0.205754] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[    0.207315] DMAR-IR: Enabled IRQ remapping in x2apic mode

root@pve:~# cat /etc/default/grub
# If you change this file, run 'update-grub' afterwards to update
# /boot/grub/grub.cfg.
# For full documentation of the options in this file, see:
#   info -f grub -n 'Simple configuration'

GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
GRUB_CMDLINE_LINUX=""

root@pve:~# cat /etc/modules
# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.
# Parameters can be specified after the module name.

vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

Please let me know if you need any more outputs, but please paste the exact command, as I'm still new to the CLI.

TRUENAS VM INFO (I know this screenshot only shows one PCI device passed through)
Screenshot 2024-12-28 at 9.22.39 PM.png


PROBLEM:
When I pass both PCI devices through, the VM hangs on startup and shows nothing in the console. After a long time it eventually starts, but only the HBA in IOMMU group 17 shows up. With either card passed through on its own, the TrueNAS VM starts instantly and shows the ZFS dataset, but with both it's a no-go. I've tried the things listed below and am stuck. Any help would be very much appreciated.
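To narrow down why the start stalls, one thing I can run on the host is a quick scan of the current boot's kernel log; the grep pattern here is just my guess at the subsystems relevant to this problem (vfio, DMAR, mpt3sas):

```shell
#!/bin/sh
# Count passthrough-related kernel messages from the current boot.
# The pattern is an assumption about which subsystems matter here.
PATTERN='vfio|dmar|mpt3sas'
if command -v journalctl >/dev/null 2>&1; then
    LINES=$(journalctl -k -b --no-pager 2>/dev/null | grep -icE "$PATTERN")
else
    # Fall back to the ring buffer on hosts without systemd-journald.
    LINES=$(dmesg 2>/dev/null | grep -icE "$PATTERN")
fi
LINES=${LINES:-0}
echo "passthrough-related kernel log lines this boot: $LINES"
```

Running this right after attempting to start the VM with both cards attached should surface any vfio or DMAR errors the console doesn't show.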

WHAT I'VE TRIED:
1. Firmware update via sas3flash: https://forum.proxmox.com/threads/solved-machine-wont-boot-with-2-identical-hba-it-mode-card.151314/
- Based on that post, I even tried updating only one of the cards to see if that would work, but no dice.
2. Binding one or both cards to the vfio-pci driver: https://forum.proxmox.com/threads/pci-passthrough-selection-with-identical-devices.63042/post-287937
3. Changing the BIOS type and machine type of the VM.
4. I haven't explored vfio-pci with a per-device bind yet, as I don't know enough about it.
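On point 4, for anyone searching later: since both cards share the same vendor:device ID (1000:00c4 in the lspci output above), `vfio-pci ids=` can only grab both or neither, but the kernel's per-device `driver_override` attribute works by PCI address. This is my untested sketch of what that might look like, with the address taken from the lspci output:

```shell
#!/bin/sh
# Untested sketch: reserve ONE of two identical HBAs for vfio-pci by
# PCI address. "ids=1000:00c4" can't do this, since it matches both.
DEV="0000:01:00.0"                      # address of the card to pass through
SYSDEV="/sys/bus/pci/devices/$DEV"
if [ -e "$SYSDEV" ]; then
    # Tell the PCI core that only vfio-pci may claim this device.
    echo vfio-pci > "$SYSDEV/driver_override"
    # Release it from mpt3sas if that driver already grabbed it.
    if [ -e "$SYSDEV/driver" ]; then
        echo "$DEV" > "$SYSDEV/driver/unbind"
    fi
    # Ask the PCI core to re-probe; the override makes vfio-pci win.
    echo "$DEV" > /sys/bus/pci/drivers_probe
    RESULT="requested vfio-pci bind for $DEV"
else
    RESULT="no device at $DEV on this host"
fi
echo "$RESULT"
```

The same thing can reportedly be done with `driverctl set-override 0000:01:00.0 vfio-pci` if the driverctl package is installed, which also persists the override across reboots.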

Any other ideas?
 
