Cluster SAN Problems

copymaster

Member
Nov 25, 2009
Hi guys.
I am drowning in performance issues. Please help.

I have a cluster of four Proxmox 1.5 servers; each one has two NICs.
As the SAN I use a NetApp shelf, on which I created a 1.5 TB iSCSI storage.

The NetApp has two NICs: one carries the LAN IP, the other a dedicated IP for iSCSI (on its own VLAN).

Each server has one NIC (eth1) connected to that VLAN with an address from the VLAN IP range, and the other NIC (eth0/vmbr0) with an address from the LAN range.

Each server runs about four KVM machines, but lately some of these KVM machines freeze or respond very slowly, while other KVMs run fine.


Is this an iSCSI issue? Can I tune some settings to get better performance?
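
If it helps, I can also post the iSCSI session details from one of the nodes - I assume this open-iscsi command is the right way to dump the negotiated settings (nothing has been tuned here yet):

Code:
# show active sessions with their negotiated parameters and attached block devices
iscsiadm -m session -P 3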

Thanks
 
First - I have a similar setup: 3 servers with dual dedicated 1 Gb NICs to an HP SAN (which has four 1 Gb NICs altogether). (KVM) VMs on the cluster all work fine, but aren't exactly leaving blazing tyre tracks in their wake:

Code:
qfiadmin@wiki:~$ time `dd if=/dev/zero of=/tmp/2G bs=1M count=2048; sync`

2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB) copied, 53.5065 s, 40.1 MB/s

The SAN is RAID10 over 4x 600 GB 15K SAS disks.

All the servers are running Proxmox 1.5 with kernel 2.6.18, and the VMs are Ubuntu Lucid. On an ESXi cluster they were getting about 70-80 MB/s.

I have had one or two periods where the network drops out under very heavy load (like restoring 4 or 5 VMs at a time) but other than that things have been fine.

Can you post the following:

- /etc/network/interfaces from the host
- pveperf / from the host (just for fun)
- pveversion -v from the host
- time `dd if=/dev/zero of=/tmp/2G bs=1M count=2048; sync` on the host
- time `dd if=/dev/zero of=/tmp/2G bs=1M count=2048; sync` on a guest
- /etc/network/interfaces from the guest

Also, check /var/log/syslog and /var/log/messages (and anything else relevant) on the host, particularly when the network plays up.
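
For example, keep something like this running in a second terminal while you stress a guest - it should catch dropped iSCSI connections or link flaps (the grep pattern is just a rough guess, widen it as needed):

Code:
# watch both logs live and filter for storage/network related messages
tail -f /var/log/syslog /var/log/messages | grep -iE 'iscsi|eth1|link|conn'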
 
OK, here we go:

/etc/network/interfaces:
Code:
auto lo
iface lo inet loopback

iface eth0 inet manual

auto eth1
iface eth1 inet static
    address  172.16.0.30
    netmask  255.255.0.0

auto vmbr0
iface vmbr0 inet static
    address  192.168.0.72
    netmask  255.255.255.0
    gateway  192.168.0.20
    bridge_ports eth0
    bridge_stp off
    bridge_fd 0

pveperf /

Code:
CPU BOGOMIPS:      76604.74
REGEX/SECOND:      461086
HD SIZE:           94.49 GB (/dev/pve/root)
BUFFERED READS:    208.47 MB/sec
AVERAGE SEEK TIME: 7.38 ms
FSYNCS/SECOND:     848.61
DNS EXT:           40.19 ms
DNS INT:           0.72 ms (DOMAINNAME)

pveversion -v

Code:
pve-manager: 1.5-5 (pve-manager/1.5/4627)
running kernel: 2.6.24-9-pve
pve-kernel-2.6.24-9-pve: 2.6.24-18
pve-kernel-2.6.24-8-pve: 2.6.24-16
qemu-server: 1.1-11
pve-firmware: 1.0-3
libpve-storage-perl: 1.0-8
vncterm: 0.9-2
vzctl: 3.0.23-1pve7
vzdump: 1.2-5
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1


time command on the host:

Code:
time `dd if=/dev/zero of=/tmp/2G bs=1M count=2048; sync`
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB) copied, 6.93111 s, 310 MB/s

real    0m12.864s
user    0m0.004s
sys    0m4.084s


time command from a guest:

Code:
time `dd if=/dev/zero of=/tmp/2G bs=1M count=2048; sync`
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB) copied, 208.145 seconds, 10.3 MB/s

real    3m58.460s
user    0m0.000s
sys    0m3.716s



Where do I find the network config on SLES10? It's not under /etc/network.


##############################

Some comments:

I tested all commands on the cluster node which is serving the slow KVMs.
I only have one Linux (SLES10) KVM on this node; most of the KVMs in the cluster are Windows 2003 terminal servers.
 
Hi Tom,
thank you for the reply. Well, you are right, I only did an apt-get dist-upgrade.

I will install the missing packages ASAP.

But do you generally agree with my setup, or is there something wrong?

The question is: since I have only one iSCSI storage to which all servers in the cluster are connected, could that alone be causing the performance issues?

I only get a throughput of about 10-18 MB/s (the KVM disks are all located on the SAN).

I have altogether 16 KVMs with 20 disks, all on the iSCSI storage.
I think the performance should be higher?
Can I check the iSCSI config somehow? I think the error is somewhere in there...
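
So far the only thing I found are the open-iscsi defaults in /etc/iscsi/iscsid.conf - I guess these are the values worth looking at (I have not changed anything from the defaults):

Code:
# show the queue/session related settings open-iscsi is currently configured with
grep -E 'cmds_max|queue_depth|MaxRecvDataSegmentLength' /etc/iscsi/iscsid.conf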

Thanks
 
Setting up a high-performance iSCSI SAN is not that easy and there are a lot of things involved, so do not expect an easy answer like "click here and it gets fast ...".

You need to check a lot of different layers to see where the bottleneck resides, but if 16 KVMs access the same disk system in parallel, the performance is shared between them. So a good start is just testing with one guest.
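
For example, roughly along these lines, one layer at a time (the second node's IP and the device name are placeholders, adjust them to your setup):

Code:
# network layer: raw throughput across the storage VLAN between two cluster nodes
# (start "iperf -s" on the second node first; the filer itself cannot run iperf)
iperf -c 172.16.0.31 -t 30

# block layer on the host: read straight from the iSCSI LUN, bypassing the page cache
# (replace /dev/sdX with the LUN device; "iscsiadm -m session -P 3" shows which one it is)
dd if=/dev/sdX of=/dev/null bs=1M count=2048 iflag=direct

# guest layer: repeat your dd write test inside ONE guest while all the others are idle

If the host numbers look fine and only the guest is slow, the problem is in the virtualization layer (disk bus, cache mode) rather than in the SAN itself.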
 
By the way - I noticed that when I upgraded from kernel 2.6.18 to 2.6.32 I got quite a big boost. Hard to be definitive because there are so many things happening all at once on a busy server, but it might be worth considering (unless you need OpenVZ?). Also, using virtio disks in Ubuntu gave a significant speed increase...
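
Just to illustrate the virtio change: in my case it was only a matter of switching the disk bus in the VM config under /etc/qemu-server/ (the storage name and VM ID below are made up; the guest needs virtio drivers - Lucid ships them, Windows 2003 needs them installed separately):

Code:
# /etc/qemu-server/101.conf - before, emulated IDE disk
ide0: sanstore:vm-101-disk-1

# after, paravirtualized virtio disk (update /etc/fstab if the guest mounts by device name)
virtio0: sanstore:vm-101-disk-1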
 
