iSCSI + Proxmox 1.5 cluster + low performance?

copymaster

Member
Nov 25, 2009
Hi guys,

I have a production cluster up and running. The cluster consists of 4 servers; 3 are identical and one has SAS HDDs.

I have a NetApp FAS270 and an extension shelf. The shelf (SATA disks) is used as iSCSI storage for the cluster.

Each of the 4 servers runs 3-4 virtual machines, and the VMs' disks reside on the iSCSI storage.

Each server has one NIC connected to the iSCSI storage through a VLAN, and the other NIC is connected to the normal network.

Now I sometimes get very poor performance on the cluster. Most of the VMs are W2k3 terminal servers, and when performance drops, some people get disconnected from the server and it reacts very slowly; after clicking an icon I can wait about 30 seconds before the program opens.

I think it's an iSCSI problem.

Is there a way to tune the iSCSI?
I tested the performance with HD Tune and the average transfer rate is about 20 MB/s; it ranges from 1.3 MB/s to 30 MB/s.

Thank you
 
Is each of the servers connected to the same iSCSI target, or does each server have a connection to its own storage? I know iSCSI is capable of sharing storage space, but I believe it can cause issues depending on the file system. It may be worth testing the storage performance with just one server connected to see if your speed is still low.
 
Well, since it is a cluster, EACH of the servers is connected to the same VLAN, which carries the iSCSI storage.

I think a cluster only makes sense with this kind of setup, doesn't it?

When installing the cluster I tried connecting the storage only to the cluster master, but then all the other servers didn't know about the storage.

All VMs reside on that iSCSI storage and I have 4 servers in the cluster. I need to be able to quickly migrate VMs to another cluster node in case of a (physical) hardware failure.

So I think connecting the iSCSI storage to each of the cluster nodes is necessary.
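
For reference, the usual way this works is that every node logs in to the same target and the VM disks live as logical volumes in one shared volume group on that LUN; a rough sketch of the underlying steps, where the portal IP, IQN and device/VG names are only placeholders and not my real values (Proxmox normally handles the iSCSI login itself once the storage is defined):

# on each node: discover and log in to the NetApp target
iscsiadm -m discovery -t sendtargets -p 192.168.10.10
iscsiadm -m node -T iqn.1992-08.com.netapp:sn.XXXXXXX -p 192.168.10.10 --login

# on ONE node only: create an LVM volume group on the exported LUN
pvcreate /dev/sdb
vgcreate vg_san /dev/sdb

The iSCSI target and the LVM group on top of it are then added as shared storage in /etc/pve/storage.cfg (or through the web interface), which is what lets a VM be moved to another node without copying its disk.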
 
How much storage traffic do these servers generate regularly? You can easily tell from the network usage on the port going to the NAS. And what cabling do you have going to the NAS? If the speed is 30 MB/s for each of the 4 servers:

30 MB/s * 8 = 240 Mb/s
240 Mb/s * 4 servers = 960 Mb/s, which is right at gigabit speed.

If you have a single gigabit cable carrying all of that traffic to your NAS, that could be the problem.
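
If you want to rule that in or out, it is also worth checking what the storage-facing NIC on each host actually negotiated; a minimal sketch, assuming the iSCSI VLAN NIC is eth1 (adjust to your interface names):

# negotiated speed and duplex on the NIC facing the iSCSI VLAN
ethtool eth1 | grep -E 'Speed|Duplex'

# error and drop counters on the same interface
ifconfig eth1 | grep -E 'errors|dropped'

A port stuck at 100 Mb/s or half duplex, or a climbing error counter, would explain this kind of throughput on its own.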
 
Well, as I already explained:

Each of the 4 servers has two 1 Gbit network cards;
one of these NICs is connected to the VLAN and the other one to the network.

The NAS also has 2 NICs; one is connected to the VLAN and the other one to the network.

I tested the HDD performance in the KVM guests and got an average of 18 MB/s.

That is too slow for iSCSI storage over 1 Gbit, which should in theory manage roughly 100-120 MB/s.

Isn't it?
 
The possible problem is that you are overloading the NAS's single NIC that goes to the VLAN. You have one VM server measuring 18 MB/s, but it's competing with all the other servers for the disk bandwidth. You may want to find out how much traffic is going through that one cable attached to the VLAN on the NAS.

Also, have you tested the performance of the NAS alone, with nothing else attached? SATA disks are not going to provide the best performance, especially for workloads like this where multiple servers connect to one NAS and data is being pulled from different areas of the disks.
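
One low-impact way to get that baseline is a raw sequential read from the LUN's block device on a single host while the other nodes are quiet; a sketch, assuming the LUN shows up as /dev/sdb (check with fdisk -l first). It only reads, so it doesn't touch the data on the LUN:

# read 2 GB directly from the iSCSI LUN, bypassing the page cache
dd if=/dev/sdb of=/dev/null bs=1M count=2048 iflag=direct

If that already tops out around 20-30 MB/s, the bottleneck is the NAS or the network path rather than anything inside the VMs.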
 
Try doing the following (I did this very thing last night ;)):

- install bmon (a bandwidth monitor) on the host and start it
- run the dd if=/ .... test that we talked about in another post, inside the guest (a typical variant is sketched after this list)
- monitor the throughput on the NIC in question
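
A minimal sketch of those steps on a Debian-based Proxmox host; the dd line is just a typical write test, not necessarily the exact command from the other post, and it only applies to a Linux guest (for the W2k3 guests HD Tune does the same job):

# on the host: install and start the bandwidth monitor
apt-get install bmon
bmon

# inside a (Linux) guest: write 1 GB, force it to disk, then clean up
dd if=/dev/zero of=/root/ddtest.bin bs=1M count=1024 conv=fdatasync
rm /root/ddtest.bin

While the dd runs, bmon on the host shows how close the storage NIC gets to gigabit line rate.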

( * * * HACK HACK HACK - THIS MIGHT BURN YOUR HOUSE DOWN * * * )
Also on the host:

- create a VM that you don't care about with a disk on the SAN. That will create a logical volume for that disk.
- Format that LV (mkfs.ext4 /dev/<whatever>/<the disk id referenced in the conf file for the VM>)
- Mount that disk *on the host* (mkdir /mnt/IamSoBad; mount /dev/<whatever>/<the disk id referenced in the conf file for the VM> /mnt/IamSoBad)
- repeat the dd test, but point the of at the mounted SAN LV (/mnt/IamSoBad); the full sequence is sketched below.
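
Put together, the detour looks roughly like this; the volume group and LV names are placeholders, the LV normally follows the vm-<vmid>-disk-<n> naming of the throwaway VM (check with lvs):

lvs                                        # find the throwaway VM's logical volume
mkfs.ext4 /dev/<vg_name>/vm-999-disk-1     # placeholder VG/LV names
mkdir /mnt/IamSoBad
mount /dev/<vg_name>/vm-999-disk-1 /mnt/IamSoBad
dd if=/dev/zero of=/mnt/IamSoBad/ddtest.bin bs=1M count=1024 conv=fdatasync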

That will tell you your true maximum throughput from the host to the SAN.

Now unmount that LV (umount /mnt/IamSoBad) and delete the VM (which will tidy up).

Now go and have a shower for doing such hacky things :)

Post your results.
 