Big performance problem

bubbafish

Hi @all,

using Proxmox VE 1.7 (kernel 2.6.32) on an Intel server (2x E5506, Intel RS2BL040 RAID controller, 3x 1 TB WD RAID Edition HDDs, 8 GB RAM), I have serious performance problems:
the disks seem to be very slow - natively, in OpenVZ, and in KVM.

pveperf shows:

CPU BOGOMIPS: 34131.93
REGEX/SECOND: 779737
HD SIZE: 94.49 GB (/dev/mapper/pve-root)
BUFFERED READS: 285.26 MB/sec
AVERAGE SEEK TIME: 6.85 ms
FSYNCS/SECOND: 3108.49
DNS EXT: 62.63 ms
DNS INT: 0.83 ms (xxxxxxx.lan)

But if I run rsnapshot (on the host) to back up the data, I get less than 1 MB/s; rsync shows the same. I need up to 12 hours to save ~9 GB of data.
bonnie++ gives really good results, there is no load on the server (checked via top, htop, dstat), and there are no errors in the log files.
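
A plain copy between the same two disks should show whether rsnapshot/rsync itself is the bottleneck (a minimal sketch; the paths are placeholders):

# time one large sequential copy, bypassing rsync's delta/checksum logic
time cp /path/to/source/bigfile /path/to/destination/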

Any ideas?

Thanks in advance.
B.
 
Is rsnapshot the only "slow" program? Your pveperf results are fine.
 
So all I/O-related services are affected. I do not know this RAID controller, but take a deeper look at that component.
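
On LSI-based controllers like this one, the cache policy is a common culprit; MegaCli can show it (a sketch, assuming the megacli tool is installed; flag spelling varies between versions):

# show logical drive info including the current cache policy
# (WriteThrough instead of WriteBack makes RAID5 writes very slow)
MegaCli -LDInfo -Lall -aALL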
 
Hi Tom,

Thank you. It's similar to this controller: LSI MegaRAID SAS 9260-4i. I'm confused because bonnie++, pveperf and so on look fine. I'll follow your hint, as I had the same idea.
Best regards,
B.
 
Hi,
what values do you get with pveperf when the I/O is slow (samba in use)?
Perhaps a VM is generating a lot of I/O? Check with iostat (apt-get install sysstat).
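
For example (a sketch; the interval and the device selection are arbitrary):

# per-device throughput, one sample every 2 seconds
iostat -d 2
# or limited to the RAID device only
iostat -d sda 2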

Udo
 
Hi Udo,

thanks for your answer. Here's what happens with a normal rsync (from a USB device to the local HDD):

sending incremental file list
vzdump-qemu-300-2011_01_14-11_58_26.tgz
22380544 0% 338.62kB/s 7:52:01

The values are even getting worse...

Here's the output from iostat while running rsync:

Every 2.0s: iostat                                   Fri Jan 21 09:14:45 2011

Linux 2.6.32-4-pve (xxxxxxxxxxxx)    21.01.2011    _x86_64_

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.91    0.00    0.90    4.69    0.00   93.49

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read    Blk_wrtn
sda              19.70       868.50       422.49   347814106   169198748
sda1              0.01         2.64         0.00     1059218         120
sda2             19.62       865.85       422.49   346752888   169198628
sdb               3.80       233.34       271.79    93445930   108844472
sdb1              3.80       233.34       271.79    93445386   108844472
sr0               0.00         0.00         0.00        1440           0
dm-0              2.11        10.10         6.81     4043312     2729144
dm-1             34.67       235.07       258.62    94139138   103572528
dm-2             24.69       220.83       155.52    88438730    62283360
sdc               1.60         7.14       237.70     2858528    95193104
sdc1              1.60         7.14       237.70     2857672    95193104

sda is the RAID5 device, sdb my internal backup HDD, and sdc an external USB HDD.

Thanks in advance.
 
Additional Information:

Concerning: if I create a 1 GB file via dd, I lose my SSH connection.
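
A dropped session during a big buffered write often points at dirty-page writeback stalling the whole box; a direct-I/O write bypasses the page cache and shows the raw disk speed (a sketch, assuming GNU dd and enough space under /var/lib/vz):

# write 1 GB straight to disk, skipping the page cache
dd if=/dev/zero of=/var/lib/vz/testfile bs=1M count=1024 oflag=direct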
 
Hi,
perhaps a disk is faulty? And the good values from bonnie/pveperf came from the RAID cache??

By the way, if you use "iostat -dm 5 sda" you can see the throughput more easily (the first sample is not meaningful, it only shows the overall read/write since boot).
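
The physical disks behind the controller can also be checked with smartctl from the smartmontools package (a sketch; the megaraid slot number 0 is an assumption and depends on how the drives are attached):

# SMART health and error counters for the first drive behind the LSI controller
smartctl -a -d megaraid,0 /dev/sda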

Udo
 
Hi Udo,

I don't think it could be a disk. I've tried my RAID5 device and a single disk; both are slow. Here's the output while running rsync (speed 50-350 kB/s):

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda               2.40         0.00         0.20          0          0

The following values are nearly identical.
 
Hi,
0.2 MB/s is nothing... What does top show (wait, sys, user...) during rsync?
What does iostat show if you do a "dd if=/dev/zero of=/var/lib/vz/bigfile bs=1024k count=8192"?
Where does the data for the rsync come from?
Do you have similar transfer rates if you use dd instead of rsync?

Udo
 
top during dd:

top - 09:28:51 up 6 days, 15:28, 1 user, load average: 0.23, 0.63, 0.81
Tasks: 205 total, 2 running, 203 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.2%us, 12.6%sy, 0.0%ni, 87.2%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 8101208k total, 5900796k used, 2200412k free, 91108k buffers
Swap: 7340024k total, 484304k used, 6855720k free, 482584k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
14925 root 20 0 10356 1684 544 R 100 0.0 0:11.78 dd
32469 root 20 0 5311m 4.6g 1000 S 5 59.8 716:48.40 kvm

output from dd:

8192+0 records in
8192+0 records out
8589934592 bytes (8.6 GB) copied, 135.652 s, 63.3 MB/s

Hmm, these values seem to be slow but not as bad as before.

top during rsync

top - 09:35:48 up 6 days, 15:35, 1 user, load average: 0.19, 0.49, 0.71
Tasks: 206 total, 1 running, 205 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.3%us, 0.4%sy, 0.0%ni, 87.0%id, 12.3%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 8101208k total, 7980016k used, 121192k free, 97792k buffers
Swap: 7340024k total, 526936k used, 6813088k free, 2603020k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
32469 root 20 0 5311m 4.6g 984 S 7 59.3 717:15.33 kvm
15047 root 20 0 12508 1304 844 D 1 0.0 0:00.04 rsync
15049 root 20 0 12476 784 296 S 1 0.0 0:00.04 rsync

output from rsync:

18644992 0% 363.05kB/s 6:33:29

ok, these values really are bad.
 
Hi,
the 63.3 MB/s is not true (buffering) - the dd command was meant so you could read off the real write performance with iostat, but that information is missing.
The top output is for seeing how much I/O wait you have: that all looks OK, but your host is using swap space.
It looks like your I/O is slow because the node is swapping a lot?!
Can you give some VMs less memory? Or shut them down, and then look at the I/O again.

By the way, if you want reliable values with dd, use "dd if=/dev/zero of=bigfile bs=1024k count=8192 conv=fdatasync".
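
Whether the node really swaps during the transfer can be watched live (a sketch; vmstat ships with procps, so nothing extra to install):

# the si/so columns show pages swapped in/out per second; sustained
# nonzero values while rsync runs would confirm the swapping theory
vmstat 2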
What values do you get if you use dd instead of rsync?

Udo
 
Hi Udo,

thanks a lot for your help:
dd if=/dev/zero of=bigfile bs=1024k count=8192 conv=fdatasync

8192+0 records in
8192+0 records out
8589934592 bytes (8.6 GB) copied, 495.217 s, 17.3 MB/s --> much too slow


top - 21:28:27 up 7 days, 3:28, 1 user, load average: 0.24, 0.41, 0.33
Tasks: 204 total, 2 running, 202 sleeping, 0 stopped, 0 zombie
Cpu(s): 1.4%us, 16.2%sy, 0.0%ni, 82.4%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 8101208k total, 5947452k used, 2153756k free, 147084k buffers
Swap: 7340024k total, 554712k used, 6785312k free, 527124k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
22961 root 20 0 10356 1688 548 R 100 0.0 0:12.33 dd
32469 root 20 0 5311m 4.6g 984 S 5 58.9 765:48.79 kvm
1055 root 20 0 0 0 0 S 0 0.0 0:33.47 kjournald

Indeed, the host uses swap, but that could just have happened during a backup job (VM1: DNS/DHCP @ 512 MB, VM2: Linux mail server @ 512 MB, VM3: W2K8 TS @ 5 GB). I don't think this is critical. But it's strange, because there are still 2 GB of free memory left. Memtest (24 h) found nothing.
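
If swapping turns out to be the culprit, lowering vm.swappiness should make the kernel keep VM memory resident longer (a sketch; 10 is just a common conservative value, not a tested recommendation):

# runtime change, lost on reboot; persist it in /etc/sysctl.conf if it helps
sysctl -w vm.swappiness=10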

Rsync: tried it internally (RAID5 to the internal SATA disk and to USB 2.0, and vice versa) - same results.

B.
 
