Hi All,
Anyone following my various posts on this forum will have noted that I have been having a lot of problems recently!
Poor disk performance... LVM snapshots not working... random disk I/O lockups, etc.
I have now finally got it all working: my machines are backing up without problems, the disks behave, etc.
The bottom line: the current Proxmox kernel has a very buggy LSI driver. I have managed to re-create the same issues across a range of machines - Super Micro, Dell, OEM - and the common factor is always the LSI card.
Symptoms and issues are always related to high I/O:
1. Load the disk with lots of writes - disk I/O grinds to a halt and there is no option but a hard reboot.
2. Take an LVM snapshot - 50/50 it works.
3. Remove an LVM snapshot - 50/50 it works. It normally results in the /var/lib/vz/ volume hanging, and then a hard reboot follows.
All of this is rather painful when running live systems!
The solution is rather simple: change the LSI driver, and the issues go away!
I hope that the working driver is integrated soon.
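If you want to see which megaraid_sas version your kernel is currently running before deciding, these standard commands should tell you (the exact version strings will vary by kernel):
modinfo megaraid_sas | grep -i version
dmesg | grep -i megaraid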

For those who want to upgrade the driver - and I highly recommend it if you use LSI cards - the following procedure applies.
--
- Download the megaraid driver for your card:
http://www.lsi.com/downloads/Public/MegaRAID Common Files/Debian5.0.x_05_30.zip
Unzip this file. Inside it you will find another file, probably called megaraid_sas-v00.00.05.30-src.tgz.
This is the one you want!
Extract the contents of this tgz file to /usr/local/src/
(you should end up with a folder: /usr/local/src/megaraid_sas-v00.00.05.30)
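As a rough sketch, fetching and unpacking from the shell could look like this (assuming wget and unzip are installed; the space in the URL needs quoting, and the tgz may land in a subfolder depending on the zip's layout):
cd /usr/local/src
wget "http://www.lsi.com/downloads/Public/MegaRAID Common Files/Debian5.0.x_05_30.zip"
unzip Debian5.0.x_05_30.zip
tar -xzf megaraid_sas-v00.00.05.30-src.tgz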
- Now go into the folder and do the following:
Rename Makefile to Makefile.orig (mv Makefile Makefile.orig)
Copy Makefile.standalone to Makefile (cp Makefile.standalone Makefile)
- Next step: install the development tools:
apt-get install build-essential pve-headers-2.6.32-10-pve
(note: the header package version may be different if you are on a different kernel)
- Now, compile time:
cd to /usr/local/src/megaraid_sas-v00.00.05.30 (you may well still be there!)
and run the following to compile the module:
make -C /usr/src/linux-headers-2.6.32-10-pve/ M=$PWD modules
Note! The header location should be changed to suit your kernel.
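Before installing it, it's worth confirming the build produced a module that reports the version you expect (the exact string depends on the source you downloaded):
modinfo ./megaraid_sas.ko | grep -i version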
- Time to replace the driver:
Back up this file: /lib/modules/2.6.32-10-pve/kernel/drivers/scsi/megaraid/megaraid_sas.ko (simply rename it)
Copy the new driver in:
cp /usr/local/src/megaraid_sas-v00.00.05.30/megaraid_sas.ko /lib/modules/2.6.32-10-pve/kernel/drivers/scsi/megaraid/megaraid_sas.ko
(again, note the potentially different kernel version in the lib path)
- We now need to make sure this is loaded at boot time. To do this we update the initial ram disk:
mv /boot/initrd.img-2.6.32-10-pve /boot/initrd.img-2.6.32-10-pve.bak (note different kernel versions)
Re-create this initial ram disk: update-initramfs -c -k 2.6.32-10-pve
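To double-check that the new module actually made it into the fresh image, you can list its contents - lsinitramfs should be available from the initramfs-tools package:
lsinitramfs /boot/initrd.img-2.6.32-10-pve | grep megaraid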
Apply the changes: update-grub
- While you are at it, edit /etc/rc.local and add the following:
echo "975" > /sys/block/sda/queue/nr_requests
#echo "975" > /sys/block/sda/device/queue_depth (this should work but is disabled in the proxmox kernel??)
/sbin/blockdev --setra 1024 /dev/sda
/sbin/blockdev --setra 1024 /dev/mapper/pve-data
/sbin/blockdev --setra 1024 /dev/mapper/pve-root
- Reboot!
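After the reboot, a quick sanity check that the tuning stuck might look like this (adjust device names to suit your setup; the /sys/module version file only exists if the module declares a version):
cat /sys/block/sda/queue/nr_requests
/sbin/blockdev --getra /dev/sda
cat /sys/module/megaraid_sas/version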
Since doing this on all my 'test' servers I have not killed them once. For the last 12 hours I have been running some live sites on 2 servers with the changes, and it is all working. Even backups are working with no problems again.

Note: As of kernel 2.6.32-10-pve this module is definitely needed. It may not be required beyond that, depending on whether the Proxmox dev team pick up what I am posting and update the driver.
Enjoy!