Performance of LXC vs KVM

selund

Renowned Member
May 18, 2010
In my mind, LXC should always be more efficient than KVM. However, I have a customer trying to run a large Elasticsearch cluster, and the tests they have run show almost double the performance on KVM vs. LXC. Going forward the Elasticsearch cluster will be moved to bare metal, but I would like to find out why we see much better performance on KVM, which makes no sense to me... For the record, there is no problem with I/O latency on either KVM or LXC. The problem seems to be CPU related. Any suggestions on what the problem might be?
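One thing worth checking first (purely a diagnostic sketch, not a fix): the CPU each environment actually exposes. An LXC container sees the host CPU unchanged, while a KVM guest sees whatever CPU type the VM is configured with, and that can change which instruction-set extensions the workload's JIT or libraries use:

```shell
# Run this on the host, inside the LXC container, and inside the KVM guest,
# then diff the results. A KVM guest with the default CPU type (kvm64) hides
# many host CPU flags; an LXC container sees the host CPU directly.
grep -m1 'model name' /proc/cpuinfo          # CPU model the guest sees
grep -m1 '^flags' /proc/cpuinfo | wc -w      # rough count of visible CPU flags
nproc                                        # logical CPUs visible
```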

Code:
# pveversion -v
proxmox-ve: 5.1-42 (running kernel: 4.13.16-2-pve)
pve-manager: 5.1-51 (running version: 5.1-51/96be5354)
pve-kernel-4.13: 5.1-44
pve-kernel-4.13.16-2-pve: 4.13.16-47
pve-kernel-4.13.16-1-pve: 4.13.16-46
pve-kernel-4.13.8-3-pve: 4.13.8-30
corosync: 2.4.2-pve4
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: not correctly installed
libjs-extjs: 6.0.1-2
libpve-access-control: 5.0-8
libpve-apiclient-perl: 2.0-4
libpve-common-perl: 5.0-30
libpve-guest-common-perl: 2.0-14
libpve-http-server-perl: 2.0-8
libpve-storage-perl: 5.0-18
libqb0: 1.0.1-1
lvm2: 2.02.168-pve6
lxc-pve: 3.0.0-2
lxcfs: 3.0.0-1
novnc-pve: 0.6-4
proxmox-widget-toolkit: 1.0-15
pve-cluster: 5.0-25
pve-container: 2.0-21
pve-docs: 5.1-17
pve-firewall: 3.0-8
pve-firmware: 2.0-4
pve-ha-manager: 2.0-5
pve-i18n: 1.0-4
pve-libspice-server1: 0.12.8-3
pve-qemu-kvm: 2.11.1-5
pve-xtermjs: 1.0-2
qemu-server: 5.0-25
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
 
The CPU usage is almost 100% in both cases. The Elasticsearch instances are running the percolator, so it's a bit different from normal Elasticsearch usage. We ran the tests on the same set of physical servers, and they were not running anything else during the tests.
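To separate raw CPU throughput from Elasticsearch's own behaviour, it can help to run an identical, trivial CPU-bound loop in all three environments. This is only a crude sketch; a real comparison would use something like sysbench or the percolator workload itself. But if a plain shell loop times the same everywhere, the gap is in the workload, not the virtualization layer:

```shell
# Crude single-core CPU benchmark: time a fixed-iteration arithmetic loop.
# Run the identical script on bare metal, in LXC, and in the KVM guest.
start=$(date +%s%N)   # nanoseconds since epoch (GNU date)
i=0
while [ "$i" -lt 500000 ]; do
    i=$((i + 1))
done
end=$(date +%s%N)
echo "loop_ms=$(( (end - start) / 1000000 ))"
```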
 
To more or less close this issue: LXC isn't the problem here. What we actually see is that Elasticsearch (it might even be Java) runs about 60% faster in a KVM guest than on native hardware in our case. Bare metal and LXC show the same performance, while running in KVM boosts performance by about 60%. We'll continue to chase the rabbit, and I'll post our findings if we catch it...
 

Hello,

This is pretty interesting. Did you get any further with this case?

Thanks

Talion
 
We haven't found the root cause, and it only seems to have an effect when you run Elasticsearch inside KVM and use percolation. Other use cases of Elasticsearch don't show the same behaviour. My suspicion is that we are seeing a bug in Elasticsearch or Java that only manifests itself on physical hardware and not in a VM.
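If a JVM-level difference is the suspect, one place to look (again, only a diagnostic sketch) is HotSpot's ergonomics: the JVM sizes its heap, GC thread counts, and NUMA behaviour from the hardware it detects, so the same Elasticsearch configuration can end up with different effective flags on bare metal than inside a VM. Comparing the final flag values in each environment would show that:

```shell
# Dump HotSpot's auto-tuned ("ergonomic") flag values and diff the output
# between bare metal, LXC, and the KVM guest. Differences in GC threading
# or NUMA handling are plausible suspects for a CPU-bound gap.
if command -v java >/dev/null 2>&1; then
    flags=$(java -XX:+PrintFlagsFinal -version 2>/dev/null \
        | grep -E 'ParallelGCThreads|MaxHeapSize|UseNUMA|UseCompressedOops')
    echo "$flags"
else
    flags="java not installed on this node"
    echo "$flags"
fi
```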
 
Which storage setup are you using for the KVM, and which one for the LXC/bare-metal?
It could be related to cache/sync settings for the disks
 
I don't think it's disk related, but for what it's worth, we used LVM for all of them, with no cache set for KVM. For our part, we are not going to do anything more with this, and will end up running one KVM instance on each physical server.
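For reference, the cache mode Proxmox applies is visible on the disk line of the VM's config (`/etc/pve/qemu-server/<vmid>.conf`, or via `qm config <vmid>`). In a hypothetical line like this (the VMID, storage name, and size are placeholders), the absence of a `cache=` option means the default, i.e. no host page cache:

```
# hypothetical disk line from /etc/pve/qemu-server/101.conf;
# no cache= option present => Proxmox default ("No cache")
scsi0: local-lvm:vm-101-disk-0,size=200G
# an explicit setting would look like:
# scsi0: local-lvm:vm-101-disk-0,cache=writeback,size=200G
```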