CPU Performance Degradtion

adamb · Nov 25, 2013

Hey all. Cluster is proxmox 3.1 all up-to-date. I am running into an issue where the cpu performance is degrading. The only way I can get things back in order is to reboot the VM. Everything will be running great and then we take a significant hit on our software encryption performance and notice the process is running very hard 80-90% cpu within the VM. Once this happens I can see the performance difference when running a 7zip benchmark test. There is only one VM on the cluster which is a fresh standard CentOS6.4 load. I am using cpu type "Host" as I need to pass aes-ni to the guest.

When this takes place I am caching our database files into ram with a small utility called "vmtouch". I can reproduce it by also just simply copying the database to /dev/shm/.

Code:

root@testprox1:~# pveversion -v
proxmox-ve-2.6.32: 3.1-114 (running kernel: 2.6.32-26-pve)
pve-manager: 3.1-24 (running version: 3.1-24/060bd5a6)
pve-kernel-2.6.32-26-pve: 2.6.32-114
pve-kernel-2.6.32-23-pve: 2.6.32-109
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.0-2
pve-cluster: 3.0-8
qemu-server: 3.1-8
pve-firmware: 1.0-23
libpve-common-perl: 3.0-9
libpve-access-control: 3.0-8
libpve-storage-perl: 3.0-18
pve-libspice-server1: 0.12.4-2
vncterm: 1.1-6
vzctl: 4.0-1pve4
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.4-17
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.1-1

Before Performance Degrades

Code:

RAM size:   670445 MB,  # CPU hardware threads:  16
RAM usage:   6378 MB,  # Benchmark threads:     16


Dict        Compressing          |        Decompressing
      Speed Usage    R/U Rating  |    Speed Usage    R/U Rating
       KB/s     %   MIPS   MIPS  |     KB/s     %   MIPS   MIPS


22:   47823  2311   2013  46522  |   559050  2433   2071  50398
23:   47367  2470   1953  48261  |   581643  2403   2214  53196
24:   44950  2393   2019  48331  |   648811  2752   2186  60177
25:   46133  2721   1936  52673  |   620536  2735   2133  58343
----------------------------------------------------------------
Avr:         2474   1980  48947              2581   2151  55529
Tot:         2527   2066  52238

After Performance Degrades

Code:

7-Zip (A) 9.13 beta  Copyright (c) 1999-2010 Igor Pavlov  2010-04-15
p7zip Version 9.13 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,16 CPUs)


RAM size:   670447 MB,  # CPU hardware threads:  16
RAM usage:   3402 MB,  # Benchmark threads:     16


Dict        Compressing          |        Decompressing
      Speed Usage    R/U Rating  |    Speed Usage    R/U Rating
       KB/s     %   MIPS   MIPS  |     KB/s     %   MIPS   MIPS


22:   28145  1218   2248  27380  |   362930  1387   2360  32725
23:   30272  1320   2337  30844  |   342470  1323   2367  31324
24:   28649  1345   2290  30804  |   333863  1383   2238  30967
25:   28538  1402   2324  32584  |   329607  1354   2289  30992
----------------------------------------------------------------
Avr:         1321   2300  30403              1362   2313  31502
Tot:         1341   2307  30952

adamb · Nov 25, 2013

More details, here is an iozone test ran in RAM before the performance hit and after. Both tests the iozone process running at 100% cpu, something is just killing the cpu performance. I just looked at the host itself and it is set to "Host Controlled Performance" for the power. The hardware is new HP Prolient DL380p's.

Children see throughput for 5 initial writers = 1758982.53 KB/sec
Parent sees throughput for 5 initial writers = 1724847.01 KB/sec
Min throughput per process = 348266.19 KB/sec
Max throughput per process = 355669.41 KB/sec
Avg throughput per process = 351796.51 KB/sec
Min xfer = 13033556.00 KB

Here is a test when the cpu performance takes a dive

Children see throughput for 5 initial writers = 165586.14 KB/sec
Parent sees throughput for 5 initial writers = 164171.64 KB/sec
Min throughput per process = 32792.05 KB/sec
Max throughput per process = 33343.80 KB/sec
Avg throughput per process = 33117.23 KB/sec
Min xfer = 13091720.00 KB

adamb · Nov 26, 2013

UPDATE

Here is a recap of what I have found.

CPU Performance degrades when moving files into RAM
- This can be reproduced with either vmtouch or by simply copying the directory to /dev/shm/
- Once I have roughly 100GB cached, performance starts to degrade

- The VM itself has 680GB ram, the host has a total of 710GB ram.

I just mounted the space on one of the hosts them selves and used vmtouch to cache the entire database without an issue. This seems to only happen within the VM.

adamb · Nov 26, 2013

UPDATE #2

I have found that simply clearing the cache restores cpu performance. Not sure where to look next.

spirit · Nov 26, 2013

adamb said:
UPDATE #2

I have found that simply clearing the cache restores cpu performance. Not sure where to look next.

maybe can you try to enable transparent hugepage on proxmox host, I think it could help with so much memory.

echo always > /sys/kernel/mm/redhat_transparent_hugepage/enabled

Also, I don't known how much memory have your host, but if ksm is enable, it begin to scan memory when 80% is used, and it's use a lot of cpu with big memory.
you can try to stop in, on the host:

/etc/init.d/ksmtuned stop

adamb · Nov 26, 2013

spirit said:
maybe can you try to enable transparent hugepage on proxmox host, I think it could help with so much memory.

echo always > /sys/kernel/mm/redhat_transparent_hugepage/enabled

Also, I don't known how much memory have your host, but if ksm is enable, it begin to scan memory when 80% is used, and it's use a lot of cpu with big memory.
you can try to stop in, on the host:

/etc/init.d/ksmtuned stop

Really appreciate the input.

I have roughly 700Gb in the host and 680GB dedicated to the VM.

Gave it a try but still hitting the same barrier. Once 150-200GB is filled cpu performance goes down hill, so far my only way to get it back is to clear the memory.

adamb · Nov 26, 2013

A bit more findings. I installed a Ubuntu guest which is also behaving exactly the same as my CentOS6 guest. Figured it was worth proving out that it doesn't have to do with my guest.

I can reproduce this a number of ways, pretty much any process which utilizes cache will experience this issue.

Another interesting note. The first run of 7zip benchmark after performance has dropped will reflect the performance drop, if I run it a 2nd time performance is back to normal. If I kick off a copy or rsync within 5-10 seconds performance is degraded again.

Man this is an odd issue

dietmar · Nov 27, 2013

The VM does not have any info about the NUMA architecture of the host. To show that you can use

Code:

# apt-get install numactl
# numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 12 13 14 15 16 17
node 0 size: 16305 MB
node 0 free: 14790 MB
node 1 cpus: 6 7 8 9 10 11 18 19 20 21 22 23
node 1 size: 16384 MB
node 1 free: 15078 MB
node distances:
node   0   1 
  0:  10  21 
  1:  21  10

So if you use all memory from one node, your benchmark allocates memory from another NUMA node, which is slow.

One way to avoid such behavior is to set CPU affinity, so that a VM only runs on a single NUMA node ('man taskset').

Not sure if there are other options?

adamb · Nov 27, 2013

dietmar said:
The VM does not have any info about the NUMA architecture of the host. To show that you can use

Code:

# apt-get install numactl # numactl --hardware available: 2 nodes (0-1) node 0 cpus: 0 1 2 3 4 5 12 13 14 15 16 17 node 0 size: 16305 MB node 0 free: 14790 MB node 1 cpus: 6 7 8 9 10 11 18 19 20 21 22 23 node 1 size: 16384 MB node 1 free: 15078 MB node distances: node 0 1 0: 10 21 1: 21 10

So if you use all memory from one node, your benchmark allocates memory from another NUMA node, which is slow.

One way to avoid such behavior is to set CPU affinity, so that a VM only runs on a single NUMA node ('man taskset').

Not sure if there are other options?

Looks like memory from both nodes is being utilized.

root@testprox1:~# numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23
node 0 size: 360413 MB
node 0 free: 265915 MB
node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31
node 1 size: 360447 MB
node 1 free: 118270 MB
node distances:
node 0 1
0: 10 20
1: 20 10

I remeber we had a issue on bare metal when using large amounts of ram. We had to run something like "numactl --interleave all" but this doesn't seem to work, I think I am missing something simple.

adamb · Nov 27, 2013

Just noticed something that really might help.

The performance starts to degrade once the VM hits roughly 150GB of used ram. So I let my cache program finish caching the entire database (330GB), cleared cache within the VM and re-ran the cache program. Sure enough on the 2nd time around I don't see a performance hit. The issue definitely has to due with the host giving ram to the VM.

To illustrate here are my cache times.

1st run
[root@hp-test fcmg]# vmtouch -ft -m 50G fil/
Files: 2331
Directories: 3
Touched Pages: 86924653 (331G)
Elapsed: 1115.7 seconds

2nd Run
[root@hp-test fcmg]# vmtouch -ft -m 50G fil/
Files: 2331
Directories: 3
Touched Pages: 86924653 (331G)
Elapsed: 566.77 seconds

Quite the difference in time. If I shut the VM down, start it, im back at the performance issue. Very interesting!

adamb · Dec 2, 2013

adamb said:
Just noticed something that really might help.

The performance starts to degrade once the VM hits roughly 150GB of used ram. So I let my cache program finish caching the entire database (330GB), cleared cache within the VM and re-ran the cache program. Sure enough on the 2nd time around I don't see a performance hit. The issue definitely has to due with the host giving ram to the VM.

To illustrate here are my cache times.

1st run
[root@hp-test fcmg]# vmtouch -ft -m 50G fil/
Files: 2331
Directories: 3
Touched Pages: 86924653 (331G)
Elapsed: 1115.7 seconds

2nd Run
[root@hp-test fcmg]# vmtouch -ft -m 50G fil/
Files: 2331
Directories: 3
Touched Pages: 86924653 (331G)
Elapsed: 566.77 seconds

Quite the difference in time. If I shut the VM down, start it, im back at the performance issue. Very interesting!

Just as an update, this issue is reproducible on pve 2.3 and 3.+, along with completely different hardware (IBM vs HP).

spirit · Dec 2, 2013

adamb said:
Just as an update, this issue is reproducible on pve 2.3 and 3.+, along with completely different hardware (IBM vs HP).

qemu 1.7 should come soon in proxmox, should be interesting to see if you can reproduce the problem.

adamb · Dec 2, 2013

spirit said:
qemu 1.7 should come soon in proxmox, should be interesting to see if you can reproduce the problem.

I just reproduced this issue on a fresh CentOS6 load. So its not anything specific to the pve kernel.

qemu-kvm-0.12.1.2-2.209.el6_2.4.x86_64
qemu-img-0.12.1.2-2.209.el6_2.4.x86_64
gpxe-roms-qemu-0.9.7-6.9.el6.noarch
qemu-kvm-tools-0.12.1.2-2.209.el6_2.4.x86_64

adamb · Dec 3, 2013

Also worth noting that I can reproduce this issue directly in ram. I simply re-mounted /dev/shm in the guest with all available memory and wrote out a file with dd. Started at roughly 2GB/s and steadily declined in performance to a point of 50MB/s. I thought this was worth testing to take the storage out of the equation. It also helps in reproducing as my storage is only good for about 650-700MB/s vs the 2GB/s ram can pull.

spirit · Dec 3, 2013

about transparent hugepages,

they are 2 sysfs flag

echo "always" > /sys/kernel/mm/transparent_hugepage/enabled
echo "always" > /sys/kernel/mm/redhat_transparent_hugepage/enabled

(this is because of openvz patch on top of redhat kernel, I don't known which enable transparent hugepage).

Can you try to put always to both values

adamb · Dec 3, 2013

spirit said:
about transparent hugepages,

they are 2 sysfs flag

echo "always" > /sys/kernel/mm/transparent_hugepage/enabled
echo "always" > /sys/kernel/mm/redhat_transparent_hugepage/enabled

(this is because of openvz patch on top of redhat kernel, I don't known which enable transparent hugepage).

Can you try to put always to both values

Appreciate the input. I gave this a try but it didn't help. Also just tested with the latest pve kernel which also is still having the issue.

adamb · Dec 3, 2013

Well the next suggestion from the dev's was to try out hugetlbfs. Difference is night and day, performance is now 100%. Simply amazing the difference this little addition made.

http://www.linux-kvm.com/content/get-performance-boost-backing-your-kvm-guest-hugetlbfs

So this must be an issue of TLB misses.

cesarpk · Dec 4, 2013

spirit said:
about transparent hugepages,

they are 2 sysfs flag

echo "always" > /sys/kernel/mm/transparent_hugepage/enabled
echo "always" > /sys/kernel/mm/redhat_transparent_hugepage/enabled

(this is because of openvz patch on top of redhat kernel, I don't known which enable transparent hugepage).

Can you try to put always to both values

Hi Spirit

Please let me to do two questions:
1- Do you know if these values will be useful for KVM + Windows Server + MS-SQL with 128 or 256 Gb of RAM in the VM?
2- These values should be in the Host or the Guest? (believe that in the Host)

Best regards
Cesar

spirit · Dec 4, 2013

cesarpk said:
Hi Spirit

Please let me to do two questions:
1- Do you know if these values will be useful for KVM + Windows Server + MS-SQL with 128 or 256 Gb of RAM in the VM?
2- These values should be in the Host or the Guest? (believe that in the Host)

Best regards
Cesar

1. Should be usefull for any vm with big memory (doc said around 15-20% more cpu performance).
2. on the host

cesarpk · Dec 4, 2013

spirit said:
1. Should be usefull for any vm with big memory (doc said around 15-20% more cpu performance).
2. on the host

Thanks spirit for your answer

And let me to do 2 questions:
1- From how much memory is recommended?
2- For apply the changes, don't need reboot the Host, only the VM?

Best regards
Cesar

CPU Performance Degradtion

Famous Member

Famous Member

Famous Member

Famous Member

Distinguished Member

Famous Member

Famous Member

Proxmox Staff Member

Famous Member

Famous Member

Famous Member

Distinguished Member

Famous Member

Famous Member

Distinguished Member

Famous Member

Famous Member

Well-Known Member

Distinguished Member

Well-Known Member

We value your privacy