Proxmox 8.x and megaraid_sas issue

Romainp

Active Member
Jan 23, 2018
Hi!
I am posting here in desperate need of help or advice.
I am the proud owner of a Proxmox lab that had been working fine for the last few months on my
SuperMicro X9DRE-TF+ with 200 GB of RAM.
I have a PERC H310 controller configured with 4 x 1 TB disks in RAID 10.

This setup was working fine until a few days ago. There may have been some kernel updates, but since then my controller no longer works.
I mean, I have a hard time booting correctly unless I set some parameters in GRUB (intel_iommu=on iommu=pt pcie_aspm=off acpi_enforce_resources=lax), and even then megaraid_sas keeps crashing with errors in the logs and, obviously, the mounts are not available.
I have tested with 2 other controllers and got the same results.
For fun, I booted SystemRescueCd 11.01 and was able to mount and access the drive with no problem, which leads me to think that something is not working correctly with the megaraid_sas driver shipped by Proxmox in the recent release.
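
One way to confirm that the boot parameters listed above actually reached the running kernel is to inspect the kernel command line. The following is a minimal Python sketch, assuming the usual space-separated `name=value` layout of `/proc/cmdline`. The function names `parse_cmdline` and `missing_params` are made up for illustration, and `pcie_aspm=off` is used because that is the spelling the kernel accepts.

```python
from pathlib import Path

# Parameters mentioned in the post above. "pcie_aspm=off" is assumed to be
# the intended spelling.
EXPECTED = {
    "intel_iommu": "on",
    "iommu": "pt",
    "pcie_aspm": "off",
    "acpi_enforce_resources": "lax",
}


def parse_cmdline(cmdline: str) -> dict:
    """Split a kernel command line into a {name: value} dict.

    Flags without '=' (e.g. "quiet") map to None. Quoted values that
    contain spaces are not handled by this sketch.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def missing_params(cmdline: str, expected: dict = EXPECTED) -> dict:
    """Return the expected parameters that are absent or set differently."""
    actual = parse_cmdline(cmdline)
    return {k: v for k, v in expected.items() if actual.get(k) != v}


if __name__ == "__main__":
    path = Path("/proc/cmdline")
    if path.exists():
        # An empty dict means every expected parameter is active.
        print(missing_params(path.read_text()))
```

If the result is not empty after editing the GRUB configuration, the change probably was not written to the boot loader that the machine actually uses.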

I can post logs if needed, but I am wondering what I can do from here.

Any advice is welcome.
Thanks
 
...and broken again with Linux 6.8.12-4-pve. I'm using the Lenovo ThinkSystem RAID 530-8i PCIe 12Gb Adapter (= megaraid_sas), which was working fine with Proxmox version 8.1-2 (kernel 6.5), but installing version 8.3-1 is not possible with this RAID adapter: the installer does not find a hard disk. So, back to kernel 6.5? Is that possible with Proxmox version 8.3?
 
Just tried: same with the latest kernel, Linux 6.8.12-7-pve. Very disappointing that Proxmox is not able to fix that for a really big group of RAID controllers!
 
Hi all. I'm having the same issue with my Lenovo ThinkSystem RAID 530-8i PCIe 12Gb Adapter:

Mar 11 05:02:18 local kernel: megaraid_sas 0000:04:00.0: 222031 (794957255s/0x0020/CRIT) - Disabling writes to flash due to a critical error. Reboot the system to enable writes to flash a>

Surely there must be some way to resolve this?
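
For anyone collecting these messages to compare across kernels, the megaraid_sas event line quoted above can be split into its parts. This is only a sketch in Python: the layout is inferred from that single sample line, and the field names (`seq`, `time`, `code`, `cls`) are illustrative guesses, not official driver terminology.

```python
import re

# Layout assumed from the one sample line in this thread:
# "... megaraid_sas <pci-addr>: <seq> (<time>s/<hex code>/<class>) - <message>"
EVENT_RE = re.compile(
    r"megaraid_sas (?P<pci>[0-9a-fA-F:.]+): "
    r"(?P<seq>\d+) \((?P<time>\d+)s/(?P<code>0x[0-9a-fA-F]+)/(?P<cls>[A-Z]+)\) - "
    r"(?P<message>.*)"
)


def parse_event(line: str):
    """Return a dict describing one megaraid_sas event line, or None."""
    m = EVENT_RE.search(line)
    if m is None:
        return None
    event = m.groupdict()
    event["seq"] = int(event["seq"])
    event["time"] = int(event["time"])
    event["code"] = int(event["code"], 16)  # hex field, meaning not verified
    return event


def critical_events(lines):
    """Yield only the events whose class field is CRIT."""
    for line in lines:
        event = parse_event(line)
        if event is not None and event["cls"] == "CRIT":
            yield event
```

Feeding it the saved output of `journalctl -k`, one line at a time, should list the controller events marked CRIT, provided the other lines follow the same layout as the sample.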
 
I was able to update one controller: I had to install a Windows OS on the server, then install the MegaRAID Storage Manager (you can download it from Broadcom for free), and then I was able to update it. On another machine that procedure did not work, but the server was still in service, so Lenovo replaced the controller (and after that the update via BOMC was no trouble).

It seems to me that Lenovo will give no advice on how to get into the controller; for what reason, I don't know.
 
  • Here's my conversation on Server Fault. I eventually came right by sacrificing the RAID controller and just using the onboard controller. I may try bjoster's advice below when I have the time again:

  • There's little chance that a firmware update changes the behavior of the RAID controller or its driver. You might need to upgrade.
    Zac67
    Commented 2 days ago

  • Thanks @Zac67. I've looked into it a little more, and it seems there is support for this controller/adapter from RHEL. The only problem is that Proxmox and XCP-ng both use Debian-based distros, so I'm out of luck on that front. However, not all is lost: I can still use the onboard SATA controllers to control the drives in both RAID 0 & RAID 1. That's my next step. I'll share my findings here.
    Jason van Wyk
    Commented 2 days ago


  • NOBODY likes write caches without NVM (= battery backup). The 530-8i uses flash, so the risk of corruption isn't concerning. But Linux doesn't know about this. You should be able to set write-back with hdparm -W.
    bjoster
    Commented yesterday

  • Thanks all. I did come right in the end. It turns out that I could use the onboard SATA controller and set it to RAID 1 (losing all the power of that very expensive RAID 530-8i PCIe 12Gb Adapter, which is just sitting pretty in my box doing nothing right now). Anyway, XCP-ng works great on the server now. @bjoster - when I get the energy again, I may still try your suggestion; if I do, I'll put my findings here again.
    Jason van Wyk
    Commented 17 hours ago