LSI MegaRAID issues - some potential fixes

marotori

Hello,

I have recently installed 6 new Supermicro servers.

Each of these has been experiencing various disk I/O issues under Proxmox 2.

I have been working through a variety of fixes for these issues and now have the following 'working' platform in place.

Still early days, but most of the issues seem to have improved.

Firstly...

My systems started out using the following card: LSI 9240-4I

My suggestion: avoid this card. It is rubbish.

I have changed 2 of my servers to the LSI 9260-4I

Much better! As an added bonus, it is a drop-in replacement and will pick up your existing RAID config :)

System config wise, in /etc/default/grub set the following:

Code:
GRUB_CMDLINE_LINUX_DEFAULT="quiet pcie_aspm=off idle=halt elevator=deadline"
Then run update-grub.

deadline is used as this is the scheduler recommended by LSI.
pcie_aspm=off and idle=halt switch off some of the power management features that seem to cause odd behaviour on the card (this may be specific to my servers' BIOS).
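
After the reboot it is worth confirming that the parameters actually reached the kernel. A minimal sketch, assuming a POSIX shell; has_kernel_param is a hypothetical helper written for this post, not part of any package:

```shell
#!/bin/sh
# has_kernel_param CMDLINE PARAM
# Succeeds if PARAM is one of the space-separated words in CMDLINE.
# (Hypothetical helper, named here for illustration only.)
has_kernel_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
    esac
    return 1
}

# /proc/cmdline holds the command line the running kernel was booted with.
cmdline=$(cat /proc/cmdline 2>/dev/null)
for param in pcie_aspm=off idle=halt elevator=deadline; do
    if has_kernel_param "$cmdline" "$param"; then
        echo "present: $param"
    else
        echo "MISSING: $param"
    fi
done
```

If a parameter shows as MISSING, the usual suspects are a forgotten update-grub or an edit to the wrong GRUB variable.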

The following settings get applied on startup of the system.

Code:
# these are recommended by LSI
echo "0" > /sys/block/sda/queue/rotational
echo "975" > /sys/block/sda/queue/nr_requests
echo "975" > /sys/block/sda/device/queue_depth

# this is a Red Hat kernel fix: keep more RAM free to prevent out-of-memory errors
echo 65536 > /proc/sys/vm/min_free_kbytes

# make sure we don't use power saving
echo performance > /sys/module/pcie_aspm/parameters/policy
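
One way to apply the settings above at startup without the script dying on a missing path is sketched below. Assumptions: the array is sda (override with DISK), and the function is called from something like /etc/rc.local, which is my choice of hook and not the only option. apply_setting and apply_all are hypothetical names.

```shell
#!/bin/sh
# apply_setting VALUE FILE
# Write VALUE to FILE only if FILE exists and is writable; otherwise
# report it on stderr and return non-zero. (Hypothetical helper.)
apply_setting() {
    if [ -w "$2" ]; then
        echo "$1" > "$2"
    else
        echo "skipped (missing or not writable): $2" >&2
        return 1
    fi
}

# apply_all: the same values as in the post, for one disk.
# DISK defaults to sda; that default is an assumption, adjust to your array.
apply_all() {
    disk=${DISK:-sda}
    apply_setting 0           "/sys/block/$disk/queue/rotational"
    apply_setting 975         "/sys/block/$disk/queue/nr_requests"
    apply_setting 975         "/sys/block/$disk/device/queue_depth"
    apply_setting 65536       /proc/sys/vm/min_free_kbytes
    apply_setting performance /sys/module/pcie_aspm/parameters/policy
}
```

Nothing runs until apply_all is called, so the file can be sourced safely; as root, `DISK=sda apply_all` applies everything and prints any path it had to skip.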

Now, no guarantees this will fix every issue, but it does seem to smooth out some of the hiccups.

The reality is: I believe there is some kind of low-level kernel bug in the megaraid drivers. Until that is fixed, I do not think there will be any 100% fix.

However, these do seem to help.

I will be posting in this thread if I make any further changes.

Rob
 
So..

After yet another crash, I started again, and this time, 1 hour on, things are looking OK!

To load test, I have multiple dd commands running, plus an rsync and a vzdump!
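
For anyone wanting to reproduce the dd part of the load, here is a rough sketch of parallel writers. It is not the exact set of commands used above; dd_load, the file names and the sizes are all made up for illustration.

```shell
#!/bin/sh
# dd_load DIR JOBS MIB
# Start JOBS background dd writers, each writing MIB mebibytes of zeros
# into DIR, then wait for all of them and flush to disk.
# (Hypothetical helper; names and sizes are illustrative only.)
dd_load() {
    dir=$1
    jobs=$2
    mib=$3
    i=1
    while [ "$i" -le "$jobs" ]; do
        dd if=/dev/zero of="$dir/ddtest.$i" bs=1048576 count="$mib" 2>/dev/null &
        i=$((i + 1))
    done
    wait    # block until every background dd has finished
    sync    # push the written data out of the page cache
}
```

Example: `dd_load /var/lib/vz 4 4096` runs four 4 GiB writers. Remember to delete the ddtest.* files afterwards.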

Settings as follows:

Code:
echo cfq > /sys/block/sda/queue/scheduler
echo 0 > /sys/block/sda/queue/iosched/slice_idle
echo 1 > /sys/block/sda/queue/iosched/group_idle
echo 65536 > /proc/sys/vm/min_free_kbytes

These settings seem to work well. The group_idle setting is essential!
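
Note that the slice_idle and group_idle files belong to cfq, so the scheduler has to be switched first. The kernel lists the available schedulers in the scheduler file with the active one in square brackets. A small sketch to read that back (active_scheduler is a hypothetical helper, and sda is assumed as above):

```shell
#!/bin/sh
# active_scheduler TEXT
# Print the name shown in [brackets] in the contents of a
# /sys/block/<disk>/queue/scheduler file. (Hypothetical helper.)
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

SCHED_FILE=/sys/block/sda/queue/scheduler   # assumes the array is sda
if [ -r "$SCHED_FILE" ]; then
    echo "active scheduler: $(active_scheduler "$(cat "$SCHED_FILE")")"
fi
```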

Code:
/sbin/blockdev --setra 16384 /dev/sda
/sbin/blockdev --setra 16384 /dev/mapper/pve-data
/sbin/blockdev --setra 16384 /dev/mapper/pve-root

These values depend on your hardware.
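
For reference, blockdev --setra takes its value in 512-byte sectors, so 16384 works out to 8 MiB of read-ahead. A quick sketch of the arithmetic (ra_sectors_to_kib is a hypothetical helper):

```shell
#!/bin/sh
# ra_sectors_to_kib SECTORS
# Convert a blockdev read-ahead value (512-byte sectors) to KiB.
# (Hypothetical helper, for illustration.)
ra_sectors_to_kib() {
    echo $(( $1 * 512 / 1024 ))
}

ra_sectors_to_kib 16384   # prints 8192 (KiB), i.e. 8 MiB
```

You can read the current value back with `blockdev --getra /dev/sda`; the same setting shows up in KiB in /sys/block/sda/queue/read_ahead_kb.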

RAID settings:
Code:
MegaCli -LDSetProp ADRA -LALL -aALL
MegaCli -LDSetProp EnDskCache -LAll -aAll
MegaCli -AdpAutoRbld -Dsply -a0
MegaCli -AdpSetProp NCQDsbl -aAll
MegaCli -LDSetProp CachedBadBBU -LALL -aALL
MegaCli -LDSetProp WB -LALL -aALL

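In short: these turn on adaptive read-ahead, the drive cache, and the write-back cache (kept on even if the BBU is bad or missing). The aim is also to have NCQ on, as my drives support it, but note that the property above is spelled NCQDsbl, which reads as the disable switch, so verify what it did on your controller.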
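
To see which cache policy actually took effect, the logical drive info can be filtered for the "Current Cache Policy" line. A sketch, assuming the binary is on the PATH as MegaCli (on some installs it is MegaCli64); current_cache_policy is a hypothetical helper:

```shell
#!/bin/sh
# current_cache_policy
# Read "MegaCli -LDInfo" output on stdin and print only the value of
# each "Current Cache Policy" line. (Hypothetical helper.)
current_cache_policy() {
    sed -n 's/^[[:space:]]*Current Cache Policy[[:space:]]*:[[:space:]]*//p'
}

# Only runs if MegaCli is installed under that name.
if command -v MegaCli >/dev/null 2>&1; then
    MegaCli -LDInfo -LAll -aAll | current_cache_policy
fi
```

With write-back forced on, you would expect WriteBack to appear in that line for every logical drive.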

Currently I don't have a battery backup unit. I should have that tomorrow!

I will also report back on whether it crashed or not!

Rob
 
How did you install the package on 64-bit Debian?
LSI says that only 32-bit Debian/Ubuntu is supported.

Did you do this with alien?
 
Hi,

Thank you for the fast reply...

We have a 9260-16i controller.

I tried the following:
https://plus.google.com/u/0/107102143611228129256/posts/Teoy9HoCNUx

Code:
# 1. set a root password
$ sudo passwd

# 2. install alien
$ sudo apt-get install alien

# 3. download software
$ wget http://www.lsi.com/downloads/Public/MegaRAID%20Common%20Files/11.12.01-01_Linux_MSM.tar.gz

# 4. convert rpms into deb
$ mkdir tmpdir
$ cp 11.12.01-01_Linux_MSM.tar.gz tmpdir/
$ cd tmpdir
$ tar xzvf 11.12.01-01_Linux_MSM.tar.gz 
$ tar xzvf MSM_linux_x64_installer-11.12.01-01.tar.gz 
$ cd disk
$ sudo alien --scripts *.rpm

# 5. install packages 
$ sudo dpkg --install lib-utils2_1.00-5_all.deb
$ sudo dpkg --install lib-utils_1.00-10_all.deb
$ sudo dpkg --install megaraid-storage-manager_11.12.01-2_all.deb

# 6. open second terminal and do:
$ sudo vi /etc/init.d/vivaldiframeworkd
# change first line from "#!/bin/sh" to "#!/bin/bash" & save file

$ sudo /etc/init.d/vivaldiframeworkd restart

It works, but I can't connect from the client:
tcp 0 0 0.0.0.0:49258 0.0.0.0:* LISTEN 2043/java
tcp 0 0 10.0.0.1:3071 10.0.0.16:54891 ESTABLISHED 2043/java
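
To list every port the server side is listening on (and so which ports a firewall between client and server would need to let through), something like the sketch below can help. It assumes Linux netstat output in the same column layout as above; listening_ports is a hypothetical helper.

```shell
#!/bin/sh
# listening_ports
# Read "netstat -tlnp" style lines on stdin and print the local port of
# every socket in LISTEN state. (Hypothetical helper.)
listening_ports() {
    awk '$6 == "LISTEN" { n = split($4, part, ":"); print part[n] }'
}

# Only runs if netstat is available; run as root to see all processes.
if command -v netstat >/dev/null 2>&1; then
    netstat -tlnp 2>/dev/null | listening_ports | sort -un
fi
```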

Hope you have a suggestion ^^
 
