Regression in kernel 5.15 with megaraid_sas when certain RAID cards have a VD in rebuild/consistency-check state

eider

This is more of a PSA than anything for anyone here who might run into this issue, as it took me some time to debug.

There is a regression in the megaraid_sas kernel module in the 5.15 kernel used by PVE. In combination with certain RAID cards (such as the `LSI MegaRAID SAS 2008`, for example the PERC H310), with the IOMMU in passthrough (pt) mode, and with the controller having at least one VD (virtual disk) in rebuild or consistency-check state, a DMA remapping fault occurs, resulting in a complete lockup of reads and writes to the RAID controller's PCIe device as well as the underlying virtual disks.

Code:
Jun  2 02:00:35 ako kernel: [2445880.628406] DMAR: DRHD: handling fault status reg 2
Jun  2 02:00:35 ako kernel: [2445880.634219] DMAR: [DMA Write NO_PASID] Request device [01:00.0] fault addr 0xcd42c000 [fault reason 0x05] PTE Write access is not set
Jun  2 02:00:35 ako kernel: [2445880.649218] DMAR: [DMA Write NO_PASID] Request device [01:00.0] fault addr 0xcd43c000 [fault reason 0x05] PTE Write access is not set
Jun  2 02:00:35 ako kernel: [2445880.664238] DMAR: [DMA Read NO_PASID] Request device [01:00.0] fault addr 0xcd43c000 [fault reason 0x06] PTE Read access is not set
Jun  2 02:00:35 ako kernel: [2445880.680116] DMAR: [DMA Read NO_PASID] Request device [01:00.0] fault addr 0xcd43c000 [fault reason 0x06] PTE Read access is not set
...
Jun  2 02:36:14 ako kernel: [2448019.908805] megaraid_sas 0000:01:00.0: megasas_wait_for_outstanding:2838 waiting_for_outstanding: before issue OCR. FW state = 0xc0000000, outstanding 0x1
Jun  2 02:36:14 ako kernel: [2448019.908819] megaraid_sas 0000:01:00.0: moving cmd[0]:0000000045d94299:1:0000000000000000 the defer queue as internal
Jun  2 02:36:14 ako kernel: [2448019.908826] megaraid_sas 0000:01:00.0: moving cmd[1]:0000000075f808c7:1:0000000000000000 the defer queue as internal
Jun  2 02:36:14 ako kernel: [2448019.908831] megaraid_sas 0000:01:00.0: moving cmd[2]:000000006e008791:1:0000000000000000 the defer queue as internal
Jun  2 02:36:14 ako kernel: [2448019.908835] megaraid_sas 0000:01:00.0: moving cmd[3]:000000002bc4af4d:0:000000003f8c698f the defer queue as internal
Jun  2 02:36:14 ako kernel: [2448019.908840] megaraid_sas 0000:01:00.0: moving cmd[4]:00000000abf734ba:1:0000000000000000 the defer queue as internal
Jun  2 02:36:14 ako kernel: [2448019.908844] megaraid_sas 0000:01:00.0: moving cmd[5]:000000004b9a4f7a:1:0000000000000000 the defer queue as internal
Jun  2 02:36:14 ako kernel: [2448019.908848] megaraid_sas 0000:01:00.0: FW detected to be in faultstate, restarting it...
Jun  2 02:36:15 ako kernel: [2448020.932860] megaraid_sas 0000:01:00.0: ADP_RESET_GEN2: HostDiag=a0
Jun  2 02:36:25 ako kernel: [2448030.948898] megaraid_sas 0000:01:00.0: FW restarted successfully,initiating next stage...
Jun  2 02:36:25 ako kernel: [2448030.948907] megaraid_sas 0000:01:00.0: HBA recovery state machine,state 2 starting...
Jun  2 02:36:56 ako kernel: [2448061.669090] megaraid_sas 0000:01:00.0: Waiting for FW to come to ready state
Jun  2 02:36:56 ako kernel: [2448061.725118] megaraid_sas 0000:01:00.0: FW now in Ready state
Jun  2 02:36:56 ako kernel: [2448062.276967] dmar_fault: 2454 callbacks suppressed
Jun  2 02:36:56 ako kernel: [2448062.276976] DMAR: DRHD: handling fault status reg 602
Jun  2 02:36:56 ako kernel: [2448062.297859] megaraid_sas 0000:01:00.0: command 0000000045d94299, 0000000000000000:1detected to be pending while HBA reset
Jun  2 02:36:56 ako kernel: [2448062.298032] megaraid_sas 0000:01:00.0: 0000000045d94299 synchronous cmdon the internal reset queue,issue it again.
Jun  2 02:36:56 ako kernel: [2448062.298031] DMAR: [DMA Write NO_PASID] Request device [01:00.0] fault addr 0xcd42c000 [fault reason 0x05] PTE Write access is not set
Jun  2 02:36:56 ako kernel: [2448062.298036] megaraid_sas 0000:01:00.0: command 0000000075f808c7, 0000000000000000:1detected to be pending while HBA reset
Jun  2 02:36:56 ako kernel: [2448062.345337] DMAR: [DMA Read NO_PASID] Request device [01:00.0] fault addr 0xcd43c000 [fault reason 0x06] PTE Read access is not set
Jun  2 02:36:56 ako kernel: [2448062.357438] DMAR: DRHD: handling fault status reg 2
Jun  2 02:36:56 ako kernel: [2448062.387781] megaraid_sas 0000:01:00.0: 0000000075f808c7 synchronous cmdon the internal reset queue,issue it again.
Jun  2 02:36:56 ako kernel: [2448062.412684] DMAR: [DMA Read NO_PASID] Request device [01:00.0] fault addr 0xcd43c000 [fault reason 0x06] PTE Read access is not set
Jun  2 02:36:56 ako kernel: [2448062.424433] DMAR: DRHD: handling fault status reg 102
Jun  2 02:36:56 ako kernel: [2448062.454570] megaraid_sas 0000:01:00.0: command 000000006e008791, 0000000000000000:1detected to be pending while HBA reset
Jun  2 02:36:56 ako kernel: [2448062.479538] DMAR: [DMA Read NO_PASID] Request device [01:00.0] fault addr 0xcd43c000 [fault reason 0x06] PTE Read access is not set
Jun  2 02:36:56 ako kernel: [2448062.531823] megaraid_sas 0000:01:00.0: 000000006e008791 synchronous cmdon the internal reset queue,issue it again.
Jun  2 02:36:56 ako kernel: [2448062.531829] megaraid_sas 0000:01:00.0: command 000000002bc4af4d, 000000003f8c698f:0detected to be pending while HBA reset
Jun  2 02:36:56 ako kernel: [2448062.531832] megaraid_sas 0000:01:00.0: 000000002bc4af4d scsi cmd [28]detected on the internal queue, issue again.
Jun  2 02:36:56 ako kernel: [2448062.531835] megaraid_sas 0000:01:00.0: command 00000000abf734ba, 0000000000000000:1detected to be pending while HBA reset
Jun  2 02:36:56 ako kernel: [2448062.531838] megaraid_sas 0000:01:00.0: 00000000abf734ba synchronous cmdon the internal reset queue,issue it again.
Jun  2 02:36:56 ako kernel: [2448062.531840] megaraid_sas 0000:01:00.0: command 000000004b9a4f7a, 0000000000000000:1detected to be pending while HBA reset
Jun  2 02:36:56 ako kernel: [2448062.531843] megaraid_sas 0000:01:00.0: 000000004b9a4f7a synchronous cmdon the internal reset queue,issue it again.
Jun  2 02:36:56 ako kernel: [2448062.531845] megaraid_sas 0000:01:00.0: aen_cmd in def process
Jun  2 02:36:56 ako kernel: [2448062.531849] megaraid_sas 0000:01:00.0: megasas_wait_for_outstanding:2850 waiting_for_outstanding: after issue OCR.
Jun  2 02:37:02 ako kernel: [2448067.278741] dmar_fault: 8901 callbacks suppressed
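For anyone trying to confirm they are hitting the same thing, these are the checks I used; the PCI address 01:00.0 comes from my logs above, so adjust it to wherever your controller sits.

Code:
# Confirm which device sits at the address named in the DMAR faults
lspci -nnk -s 01:00.0

# Pull the remapping faults and driver resets out of the kernel log
dmesg | grep -i -e "DMAR" -e "megaraid_sas"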

The issue is mitigated by downgrading to kernel 5.13 or by explicitly disabling the IOMMU. Please note that these cards have historically been problematic with the IOMMU already, requiring passthrough mode to work properly at all in kernels dating as far back as 3.x (which is the reason why so many people have been reporting issues with them on PVE 7.2, as kernel 5.15 now defaults the IOMMU to on).
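For completeness, this is roughly what the IOMMU-off workaround looks like on a stock PVE install; the exact files depend on whether the host boots via GRUB or systemd-boot, so treat this as a sketch and double-check it against your own setup.

Code:
# GRUB-booted hosts: add intel_iommu=off (Intel platforms) to the default command line
# in /etc/default/grub, e.g.
#   GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=off"
# then regenerate the configuration and reboot:
update-grub

# Hosts booting via systemd-boot keep the command line in /etc/kernel/cmdline instead;
# after editing that file, refresh the boot entries and reboot:
proxmox-boot-tool refresh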

This issue, however, is quite unique in that it manifests only in a very specific scenario. Note that it can show up on an already running system and will persist through reboots (until the task running on the VD is complete), which might falsely suggest a faulty RAID card (this was confirmed not to be the case, as I replaced the card with the same results). The issue does not affect patrol-read operation on these cards.
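To check whether a VD still has a rebuild or consistency check running (i.e. whether the trigger condition is still present), something along these lines should work with the LSI tooling; the binary name varies between packages (megacli, MegaCli64, ...), so adjust as needed.

Code:
# Show all virtual disks on adapter 0, including their current state
megacli -LDInfo -LALL -a0

# Show consistency-check progress, if one is running
megacli -LDCC -ShowProg -LALL -a0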
 
Thank you for your post! I have had this exact issue, which presented itself after a power failure knocked out some of my servers and I was rebuilding them. I thought I was losing it, as I had already replaced the RAID controller trying to figure this out.

Do you know if any bug reports have been filed against the Linux kernel to get this fixed? I cannot seem to locate one, but since Ubuntu 22.04 LTS (which is what I am using) also uses this kernel, the issue is present there as well.
 
I believe there was a report opened against Ubuntu's kernel on their bug tracker, but I can't seem to locate it anymore (or, to be more specific, it now leads to a 404). I'm not aware of any other reports of this issue, and I'm not really in a position to perform a bisect to find the offending change. One could try using an older out-of-tree megaraid_sas module to see if that helps (for reference, the last common commit present in 5.13 is fa60ce2cb4506701c43bd4cf3ca23d970daf1b9c); failing that, you'd need to try 5.14 to determine whether we are looking at a 5.15 or a 5.14 change. Do note, however, that due to the nature of the issue there is no guarantee that the offending commit belongs to the module itself; it might very well be another part of the kernel that deals with DMAR/IOMMU.
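If anyone does want to dig into it, the driver lives in a single directory of the kernel tree, so listing what changed there between the two releases is a cheap first pass before a full bisect (a sketch, assuming a checked-out mainline tree):

Code:
# megaraid_sas changes that went in between v5.13 and v5.15
git log --oneline v5.13..v5.15 -- drivers/scsi/megaraid/

# Confirm which release first contained the commit referenced above
git describe --contains fa60ce2cb4506701c43bd4cf3ca23d970daf1b9c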

Overall, I'm not really surprised this went unnoticed for so long: it requires specific older hardware configured in a specific non-default way (these cards suggest enabling patrol read but not consistency check) or in a very specific (rebuilding) condition. To top it off, any maintainer would be flooded with irrelevant issues arising from 5.15 switching the default IOMMU mode to on, making it that much harder to filter out this specific one.
 
(which is the reason why so many people have been reporting issues with them on PVE 7.2, as kernel 5.15 now defaults the IOMMU to on).
The default was changed back to off again (also in Ubuntu upstream) - see https://pve.proxmox.com/wiki/Roadmap#7.2-known-issues
(I assume due to the many unpleasant surprises in certain hardware configurations)

So I assume that if you upgrade to the latest released kernel, you can remove the kernel command-line option for turning it off.
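A quick way to verify what the booted kernel actually ended up using (the exact wording of the dmesg line can vary between kernel versions):

Code:
# Options the kernel was booted with
cat /proc/cmdline

# What the IOMMU layer reports as the default domain type (Translated vs. Passthrough)
dmesg | grep -i "default domain type"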
 
So I assume that if you upgrade to the latest released kernel, you can remove the kernel command-line option for turning it off.
In my specific case I've been running with pt since PVE 6.x, so this has no effect on me. It's a shame, though, that the default was reverted, as it will mask the underlying regression (the issue described did not happen on 5.13 when running in pt mode).
 
