Problem with MegaRAID SAS3508 controller

Jan 19, 2026
Hi,
I’m running Proxmox VE 9.1.4 on Debian 13 on multiple Huawei 2288H V5 nodes. The servers use an OEM Broadcom/LSI MegaRAID SAS3508 controller.

Hardware / software details:
  • Server: Huawei 2288H V5
  • RAID controller: Broadcom / LSI SAS3508 (OEM Huawei)
  • RAID firmware: 5.140.00-3319 (Huawei confirmed this firmware is EOL, last supported on Debian 10)
  • Proxmox VE: 9.1.4
  • OS: Debian 13
  • Kernel: 6.17.4-2-pve
  • Driver: megaraid_sas (in-kernel driver from the Linux kernel, no out-of-tree module)
We are seeing random MegaRAID firmware crashes. The controller reports a fatal firmware error, goes into FAULT state, and performs an Online Controller Reset (OCR). On one node this caused a full reboot; on the others the controller reset and recovered without rebooting the host. All VMs run on shared SAN storage, and the local RAID is only used for the OS (two SSDs in RAID 1), so heavy local I/O is unlikely to be the trigger.

Key kernel logs:

Code:
megaraid_sas 0000:1c:00.0: Fatal firmware error: Line 169 in fw/raid/utils.c
megaraid_sas 0000:1c:00.0: FW in FAULT state Fault code:0x10000
megaraid_sas 0000:1c:00.0: resetting fusion adapter
megaraid_sas 0000:1c:00.0: Reset successful
megaraid_sas 0000:1c:00.0: Controller encountered an error and was reset
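For anyone who wants to check several nodes for the same pattern, here is a minimal sketch that counts fault and reset events per controller in kernel log text. The regular expressions are written against the exact lines quoted above; other firmware or driver versions may word these messages differently, and the names `summarize` and `SAMPLE` are only illustrative.

```python
import re

# Patterns derived from the log excerpt above (assumption: the wording is
# the same on other nodes and driver versions).
FAULT_RE = re.compile(
    r"megaraid_sas (?P<pci>\S+): FW in FAULT state Fault code:(?P<code>0x[0-9a-fA-F]+)"
)
RESET_RE = re.compile(r"megaraid_sas (?P<pci>\S+): Reset successful")

# The excerpt from this post, used as sample input.
SAMPLE = """\
megaraid_sas 0000:1c:00.0: Fatal firmware error: Line 169 in fw/raid/utils.c
megaraid_sas 0000:1c:00.0: FW in FAULT state Fault code:0x10000
megaraid_sas 0000:1c:00.0: resetting fusion adapter
megaraid_sas 0000:1c:00.0: Reset successful
megaraid_sas 0000:1c:00.0: Controller encountered an error and was reset
"""


def summarize(log_text):
    """Return {pci_address: {"faults": [fault codes], "resets": count}}."""
    summary = {}
    for line in log_text.splitlines():
        fault = FAULT_RE.search(line)
        if fault:
            entry = summary.setdefault(fault["pci"], {"faults": [], "resets": 0})
            entry["faults"].append(fault["code"])
            continue
        reset = RESET_RE.search(line)
        if reset:
            entry = summary.setdefault(reset["pci"], {"faults": [], "resets": 0})
            entry["resets"] += 1
    return summary
```

In practice the input would be the kernel log of each node, for example the text produced by `journalctl -k`, passed to `summarize()` instead of `SAMPLE`.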
At this point it looks like a compatibility issue between newer Linux kernels and old MegaRAID firmware, not something Proxmox-specific.

Has anyone seen the same SAS3508 + Proxmox (Debian 12/13) behavior? I’m considering rolling back to kernel 6.14.11-5-pve. I’ve also found multiple reports online where stability issues with MegaRAID controllers were mitigated by adding the following kernel parameters via GRUB:

pcie_aspm=off
pci=noaer
megaraid_sas.msix_disable=1

According to those reports, these parameters reduce firmware hangs and unexpected controller resets on older MegaRAID firmware when running newer kernels.
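On a GRUB-booted Debian system these parameters would normally be appended to `GRUB_CMDLINE_LINUX_DEFAULT` in `/etc/default/grub`, followed by `update-grub` and a reboot. A rough sketch for confirming after the reboot that they are actually active, by comparing against `/proc/cmdline`, could look like this. The helper names are hypothetical, and the check is a plain token comparison, so a combined option such as `pci=noaer,nomsi` would be reported as missing.

```python
from pathlib import Path

# The three parameters quoted above. Whether they help with this
# particular fault is not confirmed in this thread.
WANTED = ["pcie_aspm=off", "pci=noaer", "megaraid_sas.msix_disable=1"]


def missing_params(cmdline, wanted=WANTED):
    """Return the wanted parameters that are absent from a kernel command line."""
    active = set(cmdline.split())
    return [param for param in wanted if param not in active]


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    return Path(path).read_text()
```

Calling `missing_params(read_cmdline())` on a node returns an empty list once all three parameters are in effect.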
 
Hi Mira,

quick update from our side.

We’ve pinned the kernel to 6.14 on all affected nodes and since then the issue has not reoccurred. The systems have been stable so far.
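Pinning on Proxmox is typically done with `proxmox-boot-tool kernel pin <version>`. As a small sanity check that a node really booted into the pinned series afterwards, something along these lines could be run on each host. This is only a sketch: the function name `in_series` is made up for illustration, and it assumes release strings of the form `6.14.11-5-pve`.

```python
import platform


def in_series(release, series="6.14"):
    """True if a release like '6.14.11-5-pve' belongs to the given major.minor series."""
    version = release.split("-", 1)[0]  # e.g. "6.14.11"
    return version.split(".")[:2] == series.split(".")


def running_kernel_in_series(series="6.14"):
    """Check the kernel this host is currently running (same value as `uname -r`)."""
    return in_series(platform.release(), series)
```

Comparing the first two version components rather than a string prefix avoids treating a hypothetical `6.140.x` release as part of the 6.14 series.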

In the meantime, we also purchased Enterprise repository subscriptions for all servers.

Could you please let us know whether this issue is already resolved in the latest 6.17 kernel, or if you currently still recommend staying on 6.14 for setups with SAS3508 / older MegaRAID firmware?

Thanks for the update and your help.
 
We don't have a reproducer. So far we couldn't narrow it down to any of the changes in the kernel.
We're still trying, but for now we recommend staying on kernel 6.14 if you are affected by this issue.
 
I recently attempted to upgrade my PBS to kernel 6.17.9-1-pve and started having all kinds of issues. I assumed my controller card was going bad, but I now think it's an incompatibility with the kernel. I have a Supermicro Broadcom SAS 3408 card that locks everything up when I attempt a kernel upgrade.