No SAS2008 after upgrade

I've been lurking and troubleshooting this myself. I have a couple of these SAS2008 cards. Would be kinda sad for them to go to waste :(

This is on a Dell R630 that's been upgraded from Proxmox 7 to 8.

Here's the output of uname -a:
Code:
Linux pve 6.8.4-3-pve #1 SMP PREEMPT_DYNAMIC PMX 6.8.4-3 (2024-05-02T11:55Z) x86_64 GNU/Linux

Also here's my current kernel cmdline:
Code:
root@pve:~# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-6.8.4-3-pve root=/dev/mapper/pve-root ro quiet pci=realloc=off

I've also tried the other suggested fix, reserve=0x80000000,0xfffffff, but that did not help me either :(


Code:
root@pve:~# dmesg | grep mpt
[    0.009120]   Device   empty
[    0.317706] Dynamic Preempt: voluntary
[    0.317871] rcu: Preemptible hierarchical RCU implementation.
[    0.327181] MDS: Vulnerable: Clear CPU buffers attempted, no microcode
[    0.327182] MMIO Stale Data: Vulnerable: Clear CPU buffers attempted, no microcode
[    1.577646] mpt3sas version 43.100.00.00 loaded
[    1.577824] mpt3sas 0000:04:00.0: can't disable ASPM; OS doesn't have ASPM control
[    1.577956] mpt2sas_cm0: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (65612224 kB)
[    1.628338] mpt2sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
[    1.628344] mpt2sas_cm0: MSI-X vectors supported: 1
[    1.628346] mpt2sas_cm0:  0 1 1
[    1.628406] mpt2sas_cm0: High IOPs queues : disabled
[    1.628408] mpt2sas0-msix0: PCI-MSI-X enabled: IRQ 72
[    1.628409] mpt2sas_cm0: iomem(0x0000000091c40000), mapped(0x00000000851cf7ca), size(65536)
[    1.628411] mpt2sas_cm0: ioport(0x0000000000002000), size(256)
[    1.679002] mpt2sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
[    1.706477] mpt2sas_cm0: scatter gather: sge_in_main_msg(1), sge_per_chain(9), sge_per_io(128), chains_per_io(15)
[    1.706535] mpt2sas_cm0: request pool(0x0000000033adbe35) - dma(0xfff80000): depth(2942), frame_size(128), pool_size(367 kB)
[    1.709865] mpt2sas_cm0: sense pool(0x00000000204b8245) - dma(0xffa00000): depth(2811), element_size(96), pool_size (263 kB)
[    1.709937] mpt2sas_cm0: reply pool(0x0000000075e07ab5) - dma(0xff980000): depth(3006), frame_size(128), pool_size(375 kB)
[    1.709947] mpt2sas_cm0: config page(0x000000001191897b) - dma(0xff97b000): size(512)
[    1.709949] mpt2sas_cm0: Allocated physical memory: size(6336 kB)
[    1.709950] mpt2sas_cm0: Current Controller Queue Depth(2808),Max Controller Queue Depth(2879)
[    1.709951] mpt2sas_cm0: Scatter Gather Elements per IO(128)
[    1.754418] mpt2sas_cm0: log_info(0x30030100): originator(IOP), code(0x03), sub_code(0x0100)
[    1.754442] mpt2sas_cm0: log_info(0x30030100): originator(IOP), code(0x03), sub_code(0x0100)
[    1.754444] mpt2sas_cm0: LSISAS2008: FWVersion(07.15.08.00), ChipRevision(0x03)
[    1.754448] mpt2sas_cm0: Protocol=(Initiator,Target), Capabilities=(Raid,TLR,EEDP,Snapshot Buffer,Diag Trace Buffer,Task Set Full,NCQ)
[    1.755081] mpt2sas_cm0: sending port enable !!
[    4.321167] mpt2sas_cm0: hba_port entry: 0000000006c88bd5, port: 255 is added to hba_port list
[    4.322492] mpt2sas_cm0: host_add: handle(0x0001), sas_addr(0x5d4ae520735bfd00), phys(8)
[    9.447974] mpt2sas_cm0: port enable: SUCCESS
[   10.775456] systemd[1]: systemd-pstore.service - Platform Persistent Storage Archival was skipped because of an unmet condition check (ConditionDirectoryNotEmpty=/sys/fs/pstore).
[   38.247590] mpt2sas_cm0: sending diag reset !!
[   39.476543] mpt2sas_cm0: diag reset: SUCCESS
[   39.524898] mpt3sas 0000:04:00.0: can't disable ASPM; OS doesn't have ASPM control
[   39.525227] mpt2sas_cm1: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (65612224 kB)
[   39.527494] mpt2sas_cm1: sending diag reset !!
[   39.688108] mpt2sas_cm1: Invalid host diagnostic register value
[   39.688120] mpt2sas_cm1: System Register set:
[   39.757450] mpt2sas_cm1: diag reset: FAILED
[   39.758644] mpt2sas_cm1: failure at drivers/scsi/mpt3sas/mpt3sas_scsih.c:12329/_scsih_probe()!
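
A plain grep for "mpt" also matches unrelated lines such as "empty", "Preempt" and "attempted", as seen above. A tighter pattern keeps only the driver messages. This is a minimal sketch; filter_mpt is a made-up helper name, not part of any tool.

```shell
# Keep only kernel log lines that mention the mpt2sas/mpt3sas driver.
# filter_mpt is a hypothetical helper name; it reads log text on stdin.
filter_mpt() {
    grep -E 'mpt[23]sas'
}

# Typical use on the host (may need root to read the kernel log):
#   dmesg | filter_mpt
```

This drops the RCU, MDS and systemd lines while keeping everything logged by the controller instances (mpt2sas_cm0, mpt2sas_cm1) and by the mpt3sas module itself.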

Also noticed this in dmesg, very notable:
Code:
[   39.524801] pci 0000:04:00.0: ROM [mem 0x91c00000-0x91cfffff pref]: assigned
[   39.524810] pci 0000:04:00.0: BAR 3 [mem size 0x00040000 64bit]: can't assign; no space
[   39.524815] pci 0000:04:00.0: BAR 3 [mem size 0x00040000 64bit]: failed to assign
[   39.524819] pci 0000:04:00.0: BAR 1 [mem size 0x00010000 64bit]: can't assign; no space
[   39.524822] pci 0000:04:00.0: BAR 1 [mem size 0x00010000 64bit]: failed to assign
[   39.524825] pci 0000:04:00.0: BAR 0 [io  0x2000-0x20ff]: assigned
[   39.524898] mpt3sas 0000:04:00.0: can't disable ASPM; OS doesn't have ASPM control
[   39.525227] mpt2sas_cm1: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (65612224 kB)
[   39.527494] mpt2sas_cm1: sending diag reset !!
[   39.688108] mpt2sas_cm1: Invalid host diagnostic register value
[   39.688120] mpt2sas_cm1: System Register set:
[   39.688123] 00000000: ffffffff
[   39.689210] 00000004: ffffffff
[   39.690303] 00000008: ffffffff
[   39.691390] 0000000c: ffffffff
[   39.692471] 00000010: ffffffff
[   39.693556] 00000014: ffffffff
[   39.694644] 00000018: ffffffff
[   39.695732] 0000001c: ffffffff
[   39.696819] 00000020: ffffffff
[   39.697904] 00000024: ffffffff
[   39.698990] 00000028: ffffffff
[   39.700073] 0000002c: ffffffff
[   39.701159] 00000030: ffffffff
[   39.702250] 00000034: ffffffff
[   39.704410] 00000038: ffffffff
[   39.705477] 0000003c: ffffffff
[   39.706558] 00000040: ffffffff
[   39.706568] 00000044: ffffffff
[   39.707656] 00000048: ffffffff
[   39.708744] 0000004c: ffffffff
[   39.709818] 00000050: ffffffff
[   39.710897] 00000054: ffffffff
[   39.711972] 00000058: ffffffff
[   39.713055] 0000005c: ffffffff
[   39.714141] 00000060: ffffffff
[   39.715221] 00000064: ffffffff
[   39.716305] 00000068: ffffffff
[   39.717390] 0000006c: ffffffff
[   39.718472] 00000070: ffffffff
[   39.719557] 00000074: ffffffff
[   39.720647] 00000078: ffffffff
[   39.721728] 0000007c: ffffffff
[   39.722808] 00000080: ffffffff
[   39.723887] 00000084: ffffffff
[   39.724970] 00000088: ffffffff
[   39.726053] 0000008c: ffffffff
[   39.727143] 00000090: ffffffff
[   39.728228] 00000094: ffffffff
[   39.729310] 00000098: ffffffff
[   39.730392] 0000009c: ffffffff
[   39.731478] 000000a0: ffffffff
[   39.732560] 000000a4: ffffffff
[   39.733642] 000000a8: ffffffff
[   39.734725] 000000ac: ffffffff
[   39.735806] 000000b0: ffffffff
[   39.736892] 000000b4: ffffffff
[   39.737967] 000000b8: ffffffff
[   39.739045] 000000bc: ffffffff
[   39.740125] 000000c0: ffffffff
[   39.741213] 000000c4: ffffffff
[   39.742291] 000000c8: ffffffff
[   39.743379] 000000cc: ffffffff
[   39.744462] 000000d0: ffffffff
[   39.745545] 000000d4: ffffffff
[   39.746637] 000000d8: ffffffff
[   39.747715] 000000dc: ffffffff
[   39.748799] 000000e0: ffffffff
[   39.749875] 000000e4: ffffffff
[   39.750949] 000000e8: ffffffff
[   39.752039] 000000ec: ffffffff
[   39.753129] 000000f0: ffffffff
[   39.754216] 000000f4: ffffffff
[   39.755288] 000000f8: ffffffff
[   39.756371] 000000fc: ffffffff
[   39.757450] mpt2sas_cm1: diag reset: FAILED
[   39.758644] mpt2sas_cm1: failure at drivers/scsi/mpt3sas/mpt3sas_scsih.c:12329/_scsih_probe()!
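
The all-ffffffff register dump follows the two "failed to assign" BAR lines, so it can be handy to pull just those out of a long log. A rough sketch, assuming the log format shown above; failed_bars is a hypothetical helper name.

```shell
# Print "<pci-address> BAR <n>" for every BAR the kernel failed to assign.
# failed_bars is a hypothetical helper; it reads kernel log text on stdin.
failed_bars() {
    sed -n -E 's/.*pci ([0-9a-f:.]+): BAR ([0-9]+) .*failed to assign.*/\1 BAR \2/p'
}

# Typical use on the host:
#   dmesg | failed_bars
```

For the log above this would list BAR 3 and BAR 1 of 0000:04:00.0, which are the two memory regions missing from that card's lspci -v entry.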

Oh and of course the output of lspci -v:

Code:
03:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3008 [Fury] (rev 02)
        DeviceName: Integrated RAID
        Subsystem: Dell PERC H330 Mini
        Flags: bus master, fast devsel, latency 0, IRQ 33, NUMA node 0, IOMMU group 20
        I/O ports at 3000 [size=256]
        Memory at 91e00000 (64-bit, non-prefetchable) [size=64K]
        Memory at 91d00000 (64-bit, non-prefetchable) [size=1M]
        Expansion ROM at <ignored> [disabled]
        Capabilities: [50] Power Management version 3
        Capabilities: [68] Express Endpoint, MSI 00
        Capabilities: [d0] Vital Product Data
        Capabilities: [a8] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Capabilities: [c0] MSI-X: Enable+ Count=97 Masked-
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [1e0] Secondary PCI Express
        Capabilities: [1c0] Power Budgeting <?>
        Capabilities: [148] Alternative Routing-ID Interpretation (ARI)
        Kernel driver in use: megaraid_sas
        Kernel modules: megaraid_sas

04:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
        Subsystem: Dell 6Gbps SAS HBA Adapter
        Flags: fast devsel, IRQ 68, NUMA node 0, IOMMU group 21
        I/O ports at 2000 [size=256]
        Expansion ROM at 91c00000 [disabled] [size=1M]
        Capabilities: [50] Power Management version 3
        Capabilities: [68] Express Endpoint, MSI 00
        Capabilities: [d0] Vital Product Data
        Capabilities: [a8] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [c0] MSI-X: Enable- Count=15 Masked-
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [138] Power Budgeting <?>
        Kernel modules: mpt3sas

Any ideas? I thought about maybe getting a different driver somewhere, but I don't 100% know what I'm doing and would rather not break Proxmox lol
 
I think I posted this above or maybe in another thread, but just in case: they work fine on my Dell R715. Make sure you have the latest firmware, BIOS, etc.
Code:
05:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
    DeviceName: Integrated SAS                         
    Subsystem: Dell PERC H200 Integrated
    Flags: bus master, fast devsel, latency 0, IRQ 32, NUMA node 0, IOMMU group 18
    I/O ports at fc00 [size=256]
    Memory at ecff0000 (64-bit, non-prefetchable) [size=64K]
    Memory at ecf80000 (64-bit, non-prefetchable) [size=256K]
    Expansion ROM at ece00000 [disabled] [size=1M]
    Capabilities: [50] Power Management version 3
    Capabilities: [68] Express Endpoint, MSI 00
    Capabilities: [d0] Vital Product Data
    Capabilities: [a8] MSI: Enable- Count=1/1 Maskable- 64bit+
    Capabilities: [c0] MSI-X: Enable+ Count=15 Masked-
    Capabilities: [100] Advanced Error Reporting
    Capabilities: [138] Power Budgeting <?>
    Kernel driver in use: mpt3sas
    Kernel modules: mpt3sas
Code:
22:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2116 PCI-Express Fusion-MPT SAS-2 [Meteor] (rev 02)
    Subsystem: Broadcom / LSI 9201-16e 6Gb/s SAS/SATA PCIe x8 External HBA
    Flags: bus master, fast devsel, latency 0, IRQ 48, IOMMU group 24
    I/O ports at cc00 [size=256]
    Memory at cffc0000 (64-bit, non-prefetchable) [size=16K]
    Memory at cff80000 (64-bit, non-prefetchable) [size=256K]
    Expansion ROM at cff00000 [disabled] [size=512K]
    Capabilities: [50] Power Management version 3
    Capabilities: [68] Express Endpoint, MSI 00
    Capabilities: [d0] Vital Product Data
    Capabilities: [a8] MSI: Enable- Count=1/1 Maskable- 64bit+
    Capabilities: [c0] MSI-X: Enable+ Count=15 Masked-
    Capabilities: [100] Advanced Error Reporting
    Capabilities: [138] Power Budgeting <?>
    Capabilities: [150] Single Root I/O Virtualization (SR-IOV)
    Capabilities: [190] Alternative Routing-ID Interpretation (ARI)
    Kernel driver in use: mpt3sas
    Kernel modules: mpt3sas
Code:
lsmod |grep mpt
mptctl                 40960  1
mptbase               110592  1 mptctl
mpt3sas               364544  16
raid_class             12288  1 mpt3sas
scsi_transport_sas     53248  2 ses,mpt3sas

If you can, try removing or disabling the "LSI MegaRAID SAS-3 3008 [Fury] (rev 02)" and see whether the other card then works.
If you want to see anything else let me know.
 
Update: Fixed with Kernel: Linux 6.8.4-3-pve
Yes, SAS2008 works on kernels 6.8.8 and 6.8.4 and the disks are detected, but write performance is badly degraded: writes to RAIDZ drop to ~30MB/s.
The most interesting thing is that reads are fine, at ~600MB/s for a RAIDZ of four 4TB disks.
I have been searching the internet for two days and nothing helps. The disks themselves are fine.
The same controller in older hardware running RHEL8 works correctly.
So the problem seems to lie somewhere between mpt3sas and kernel 6.8.x.

I even tried the following kernel parameters:
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on mpt3sas.max_queue_depth=10000 pcie_acs_override=downstream,multifunction video=efifb:off"
but there was no improvement.
The disks are healthy; I've tested them with SMART.

As I said at the beginning, on older hardware and systems such as RHEL8, writing to these disks works correctly.
 
amd_iommu=on never does anything: it is not a valid value, and the AMD IOMMU is enabled by default anyway. video=efifb:off has not done anything on Proxmox for some time. Why do you feel the need for pcie_acs_override?
The main change between 6.8 and earlier kernels is that intel_iommu=on is now the default. Maybe try intel_iommu=off (or maybe iommu=pt is enough, since it uses identity mapping for devices that are not passed through). Or maybe I misunderstood your issue.
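
To try one of these parameters on a GRUB-booted host, it has to be added to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub. Below is a minimal sketch of doing that with a small helper; add_kernel_param is a hypothetical name, and the example assumes a single double-quoted GRUB_CMDLINE_LINUX_DEFAULT line in the file.

```shell
# add_kernel_param FILE PARAM
# Appends PARAM inside the quotes of GRUB_CMDLINE_LINUX_DEFAULT in FILE,
# unless that line already contains it. Hypothetical helper for illustration.
add_kernel_param() {
    file=$1
    param=$2
    # Already present (preceded and followed by a space or a quote)? Do nothing.
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*[\" ]${param}[\" ]" "$file"; then
        return 0
    fi
    tmp=$(mktemp) || return 1
    # Insert " PARAM" just before the closing quote of the line.
    sed -E "s|^(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*)\"|\1 ${param}\"|" "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Typical use on the host (as root), shown as comments so nothing runs by accident:
#   cp /etc/default/grub /etc/default/grub.bak
#   add_kernel_param /etc/default/grub iommu=pt
#   update-grub
#   reboot
#   cat /proc/cmdline    # confirm the parameter is active
```

Hosts that boot via systemd-boot keep the command line in /etc/kernel/cmdline and apply it with proxmox-boot-tool refresh instead, so the helper above would not apply there as written.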
 
