Hello,
I have recently installed 6 new Supermicro servers.
Each of these has been experiencing various issues related to disk I/O under Proxmox 2.
I have been working through a variety of fixes for these issues and have now got the following 'working' platform in place.
Still early days, but most of the issues seem to have improved.
Firstly...
My systems started using the following card: LSI 9240-4I
My suggestion: avoid this card. It is rubbish.
I have changed 2 of my servers to the LSI 9260-4I
Much better! As an added bonus, it is a drop-in replacement and will pick up your existing RAID config.
System config wise, in /etc/default/grub set the following:
Code:
GRUB_CMDLINE_LINUX_DEFAULT="quiet pcie_aspm=off idle=halt elevator=deadline"
and then run update-grub
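The options only take effect after a reboot, so it may be worth checking that they actually reached the kernel. A minimal sketch, not part of my original setup: check_cmdline is a made-up helper name, and it reads /proc/cmdline unless you hand it another file.

```shell
#!/bin/sh
# Hypothetical helper: report whether each expected option appears on the
# kernel command line. Reads /proc/cmdline unless another file is given.
check_cmdline() {
    cmdline_file="${1:-/proc/cmdline}"
    missing=0
    for opt in pcie_aspm=off idle=halt elevator=deadline; do
        # Pad with spaces so only whole options match.
        case " $(cat "$cmdline_file" 2>/dev/null) " in
            *" $opt "*) echo "ok:      $opt" ;;
            *)          echo "missing: $opt"; missing=1 ;;
        esac
    done
    return "$missing"
}

check_cmdline || echo "some options are not active (reboot still pending?)"
```

If anything shows as missing after a reboot, the first things I would look at are whether update-grub was run and whether the line was edited in the right file.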
The deadline I/O scheduler is used as this is recommended by LSI.
pcie_aspm=off idle=halt switches off some of the power management stuff that seems to cause funnies on the card (this may be specific to my servers' BIOS).
The following settings get applied on startup of the system.
Code:
# these are recommended by LSI
echo "0" > /sys/block/sda/queue/rotational
echo "975" > /sys/block/sda/queue/nr_requests
echo "975" > /sys/block/sda/device/queue_depth
# this is a Red Hat kernel fix to keep more RAM spare to prevent out-of-memory errors
echo 65536 > /proc/sys/vm/min_free_kbytes
# make sure we don't use power saving.
echo performance > /sys/module/pcie_aspm/parameters/policy
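The plain echo lines assume the RAID volume is sda and that every file exists. A slightly more defensive version could look like the sketch below. set_tunable and apply_io_tunables are made-up names, and the extra sysfs/procfs root arguments are there only so the function can be tried against a scratch directory first.

```shell
#!/bin/sh
# Hypothetical helper: write a value to a kernel tunable only if the file
# exists and is writable, then print what the kernel reports back.
set_tunable() {
    value="$1"
    path="$2"
    if [ ! -w "$path" ]; then
        echo "skip: $path (missing or not writable)"
        return 0
    fi
    if { echo "$value" > "$path"; } 2>/dev/null; then
        echo "set:  $path = $(cat "$path")"
    else
        echo "fail: $path (write of $value was rejected)"
    fi
}

# Apply the settings from the post.
# $1 = disk name (assumed to be sda, as in the post)
# $2 = sysfs root, $3 = procfs root (assumptions, overridable for testing)
apply_io_tunables() {
    disk="${1:-sda}"
    sys="${2:-/sys}"
    proc="${3:-/proc}"
    set_tunable 0           "$sys/block/$disk/queue/rotational"
    set_tunable 975         "$sys/block/$disk/queue/nr_requests"
    set_tunable 975         "$sys/block/$disk/device/queue_depth"
    set_tunable 65536       "$proc/sys/vm/min_free_kbytes"
    set_tunable performance "$sys/module/pcie_aspm/parameters/policy"
}

# Only touch the live system when explicitly asked, e.g.: sh script.sh apply sda
if [ "${1:-}" = "apply" ]; then
    apply_io_tunables "${2:-sda}"
fi
```

Run as root with the apply argument, it prints one line per setting, so a missing file or a rejected write is visible instead of silently lost. With pcie_aspm=off already on the kernel command line, the kernel may refuse the policy write; the helper reports that as a fail line and carries on.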
Now, no guarantees this will fix every issue, but it does seem to smooth out some of the hiccups.
Reality is: I believe there is some kind of low-level kernel bug in relation to the megaraid drivers. Until this issue is fixed, I do not think there will be any 100% fix.
However, these do seem to help.
I will be posting in this thread if I make any further changes.
Rob