Performance issues after switching to Proxmox (OpenVZ).

KuJoe

New Member
Sep 5, 2011
Hello, for some reason, when I migrated my OpenVZ container from my SolusVM node to my Proxmox server, we saw a dramatic decrease in performance, which is exactly the opposite of what was supposed to happen.

On the SolusVM node it was sharing the host with multiple other OpenVZ containers on slower hardware. On the Proxmox server it shares the host with just one other VPS (a KVM guest) on better hardware: multiple dual-core Xeons, multiple 10k SAS drives in hardware RAID1, 8GB of RAM (1GB dedicated to the KVM guest, 2GB to the OpenVZ container), and bonded 2Gbps ports.

I cannot for the life of me figure out what is causing the slowness. Out of the 2GB only 200MB is in use, the CPU sits at 99% idle, and disk I/O shows around 30MB/s, which would normally be terrible but should be fine for a basic web server with very little MySQL activity. Network download and upload are normally under 1Mbps. The OpenVZ container is running a few custom bash scripts, Lighttpd, MySQL, OpenSSH, and DenyHosts.
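For reference, a simple sequential dd test like the one below is a common way to get throughput numbers in that range (a rough sketch only; the test file path is just an example, and this is not necessarily the exact test behind the figure above):

Code:
# sequential write test (~1GB), flushing to disk before dd reports a speed
dd if=/dev/zero of=/var/tmp/ddtest bs=1M count=1024 conv=fdatasync

# drop the page cache, then time a sequential read of the same file
echo 3 > /proc/sys/vm/drop_caches
dd if=/var/tmp/ddtest of=/dev/null bs=1M
rm /var/tmp/ddtest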

The KVM VPS is running Apache and MySQL for a PHP-intensive site, but it has continued to run normally since the OpenVZ container was moved. The strange part is that when I ran the BYTE UNIX Benchmark on both, the KVM guest scored about half of what the OpenVZ container did. Also, when I SSH into the Proxmox server, the top process in top is "kvm".

My questions are these:
1) Does mixing KVM and OpenVZ affect performance?
2) If I decide not to run KVM, can I install Proxmox without it creating an LVM2 volume group? (I think this is why our hard drive performance is so slow; we had the same issue when running XenServer, but when not using LVM we can get 80MB/s write speeds with these drives.)
3) Can you think of anything that I can do to help determine where the slowness is coming from? I've tried every test I can think of without any luck.

I'm reluctant to convert my OpenVZ container into a KVM guest because I like being able to use vzmigrate to move my primary web server between hosts (I only have one server running Proxmox, but I have dozens of SolusVM servers using OpenVZ).

Any help or ideas are appreciated.
 
Please post the output of 'pveversion -v', and also the results of 'pveperf'.

When you talk about "slowness", what do you mean exactly? Please provide benchmark numbers.
 
By "slowness", I mean that my website used to load in about 6-10 seconds on the other node, now after moving it to Proxmox it takes about 80 seconds to load a single page. SSH also takes about 20 seconds to respond to any input now. Both nodes are in the same data center (actually they are both blades in the same chassis).

Code:
# pveversion -v
pve-manager: 1.8-18 (pve-manager/1.8/6070)
running kernel: 2.6.32-4-pve
proxmox-ve-2.6.32: 1.8-33
pve-kernel-2.6.32-4-pve: 2.6.32-33
qemu-server: 1.1-30
pve-firmware: 1.0-11
libpve-storage-perl: 1.0-17
vncterm: 0.9-2
vzctl: 3.0.28-1pve1
vzdump: 1.2-14
vzprocps: 2.0.11-2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.14.1-1
ksm-control-daemon: 1.0-6

# pveperf
CPU BOGOMIPS:      21280.35
REGEX/SECOND:      878546
HD SIZE:           16.49 GB (/dev/mapper/pve-root)
BUFFERED READS:    55.43 MB/sec
AVERAGE SEEK TIME: 5.92 ms
FSYNCS/SECOND:     45.90
DNS EXT:           2065.84 ms
DNS INT:           2064.50 ms
 
It seems that you have a DNS issue.
 
...
Code:
...
# pveperf
...
BUFFERED READS:    55.43 MB/sec
AVERAGE SEEK TIME: 5.92 ms
FSYNCS/SECOND:     45.90
DNS EXT:           2065.84 ms
DNS INT:           2064.50 ms
Hi,
Like meto wrote, your DNS lookups appear to be running into a timeout. Look at /etc/resolv.conf, and perhaps check the firewall?
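A quick way to confirm this (a sketch; it assumes the standard dig/nslookup tools from dnsutils are installed, and the lookup target is just an example) is to check which resolvers the host uses and time a lookup:

Code:
# which resolvers is the box actually using?
cat /etc/resolv.conf

# time a lookup; a dead resolver shows up as a multi-second delay or a timeout
dig www.proxmox.com | grep "Query time"
time nslookup www.proxmox.com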

And your I/O speed is not very good. 55 MB/s for a SAS drive is poor, and so is the fsync rate.
What kind of RAID controller do you have in the box?
For comparison, this is the result of two 2TB SATA disks in RAID1 on an Areca controller:
Code:
BUFFERED READS:    133.67 MB/sec
AVERAGE SEEK TIME: 8.86 ms
FSYNCS/SECOND:     2176.25
DNS EXT:           52.72 ms

Udo
 
Thanks everyone! I did notice some issues with our data center's DNS servers a while back and added Google's DNS as a backup, which fixed some of the problems (I guess they never got it entirely fixed). I will make some changes to my resolv.conf and test further.
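For example, a resolv.conf along these lines keeps the local resolver first with Google's public DNS as a fallback (the first address below is only a placeholder for the data center's resolver, and the timeout/attempts options are optional tuning):

Code:
# /etc/resolv.conf
# placeholder for the data center's own resolver
nameserver 203.0.113.53
# Google public DNS as fallback
nameserver 8.8.8.8
nameserver 8.8.4.4
# optional: fail over to the next resolver faster than the 5s default
options timeout:2 attempts:2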

As for the hard drives, we're using Dell's SAS 5/iR RAID controllers in all of our blades, with all Dell hard drives (expensive, but reliable). Unfortunately, once LVM is involved, our disk I/O drops to about 30% of our normal speed.
 
I get better performance with SATA drives in soft-RAID1. Maybe you ran the benchmark on a heavily loaded system?
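If load during the benchmark is in question, a quick check along these lines shows whether the disks or CPUs were already busy (a sketch; iostat comes from the sysstat package and may need to be installed first):

Code:
# load average, then CPU/memory activity sampled over five seconds
uptime
vmstat 1 5

# per-disk utilization; watch the %util and await columns
iostat -x 1 5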
 
Code:
pve-manager: 1.8-18 (pve-manager/1.8/6070)
running kernel: 2.6.35-1-pve
proxmox-ve-2.6.35: 1.8-11
pve-kernel-2.6.32-4-pve: 2.6.32-33
pve-kernel-2.6.35-1-pve: 2.6.35-11
qemu-server: 1.1-30
pve-firmware: 1.0-11
libpve-storage-perl: 1.0-17
vncterm: 0.9-2
vzctl: 3.0.28-1pve1
vzdump: 1.2-14
vzprocps: 2.0.11-2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.14.1-1
ksm-control-daemon: 1.0-6


CPU BOGOMIPS:      22666.17
REGEX/SECOND:      931043
HD SIZE:           94.49 GB (/dev/mapper/pve-root)
BUFFERED READS:    104.65 MB/sec
AVERAGE SEEK TIME: 11.61 ms
FSYNCS/SECOND:     2330.21
DNS EXT:           50.95 ms
DNS INT:           0.68 ms

This is on a quad-core Q9550 2.83GHz with two Samsung 5400rpm disks on a HighPoint RR3520 in hardware RAID1.
Not bad for the Samsungs...

My guess is that write caching on your RAID card is not enabled; otherwise I can't explain the poor I/O.
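One way to check from the OS side is sketched below (the device name /dev/sda is just an example; behind a hardware RAID controller like the SAS 5/iR these commands may not reach the physical disks, and the controller's own management tool, e.g. Dell OpenManage, is usually the authoritative place to enable the cache):

Code:
# SATA/ATA disks: show and enable the drive write cache
hdparm -W /dev/sda
hdparm -W1 /dev/sda

# SAS disks: query and set the write cache enable (WCE) bit via sdparm
sdparm --get=WCE /dev/sda
sdparm --set=WCE /dev/sda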
 
The system is far from overloaded (under high volumes of traffic we see a CPU load of 0.45). We get much better performance when we don't use LVM; I have no idea why LVM kills our disk I/O. We have a handful of different blades using different drives, and it seems our RAID card is just not LVM-friendly, but I guess that's the price we pay. :(
 
I've enabled the write cache on the drives and it has given a 30% increase. Not bad; now if I can just get rid of LVM I'll be on easy street. LoL. :)
 
An alignment problem, maybe? LVM should not cost that much performance...
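A quick way to check alignment (a sketch; the device name is an example) is to compare the partition start sectors and LVM's first physical extent offset against the RAID stripe size:

Code:
# partition start sectors -- ideally a multiple of the RAID stripe size
fdisk -lu /dev/sda

# offset of LVM's first physical extent on each PV, in sectors
pvs --units s -o +pe_start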

That said, I did reformat my system to get rid of LVM (the method was not very hard, but it is rather risky to do remotely): I installed a regular Debian system with KVM virtualization and installed Proxmox normally inside a VM. Next I mounted the VM as a disk image and copied the root and boot partitions out of it onto the physical disk (I also merged the root and boot partitions into one), then started Proxmox and formatted the rest of the disk as a data partition. No more LVM there. :D Some changes to config files are required, which I don't remember exactly. The risky part is making sure Grub is installed and is a version that can actually boot your system; that was especially critical for me because I use soft-RAID5, and some Grub versions, including the default one, did not support booting from software RAID.
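For the "mount the VM as a disk image and copy it out" step, the usual loop-device route looks roughly like this (a sketch only; the image path, loop device, and target partition /dev/sda1 are examples, and the Grub and config-file adjustments mentioned above still have to be done by hand):

Code:
# expose the raw VM image as a block device and map its partitions
losetup /dev/loop0 /var/lib/vz/images/100/vm-100-disk-1.raw
kpartx -av /dev/loop0

# copy the VM's root filesystem onto the prepared physical partition
mkdir -p /mnt/vmroot /mnt/newroot
mount /dev/mapper/loop0p1 /mnt/vmroot
mount /dev/sda1 /mnt/newroot
rsync -aHAX /mnt/vmroot/ /mnt/newroot/

# clean up
umount /mnt/vmroot /mnt/newroot
kpartx -d /dev/loop0
losetup -d /dev/loop0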
 
Possibly; I thought that was only an issue with Advanced Format 4K-sector drives (which we found out about rather quickly on our storage servers).

Thanks for the tip, but in all honesty the node is only being used for some internal servers where performance is not critical. Right now the system specs are way overkill for the server's function.
 
