Why is LXC Disk I/O Slower than KVM?

aychprox

Renowned Member
Oct 27, 2015
Maybe this is not the right place to ask, but hopefully members here can point me in the right direction to find out the cause of these differences.

I created two guests (one KVM, one LXC) to benchmark.
It seems LXC disk I/O is much slower than KVM with virtio. What should I look into to get the same performance from both on the same hardware and the same storage pool?

Ceph storage network setup:

4 LAN ports in bond0 with balance-rr mode for the 4 nodes.
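
For reference, a balance-rr bond on Debian/Proxmox is usually declared along these lines in /etc/network/interfaces (interface names and the address below are placeholders, not my actual config):

Code:
auto bond0
iface bond0 inet static
        address 10.10.10.1
        netmask 255.255.255.0
        bond-slaves eth0 eth1 eth2 eth3
        bond-mode balance-rr
        bond-miimon 100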

Hardware setup as listed here: https://forum.proxmox.com/threads/usable-space-on-ceph-storage.25219/

Under KVM VM (virtio):

user@ubuntu:~$ sudo echo 3 | sudo tee /proc/sys/vm/drop_caches && sudo sync

user@ubuntu:~$ dd if=/dev/zero of=here bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 14.7675 s, 72.7 MB/s

user@ubuntu:~$ dd if=/dev/zero of=here bs=4M count=1000 oflag=direct
1000+0 records in
1000+0 records out
4194304000 bytes (4.2 GB) copied, 11.9827 s, 350 MB/s



Under LXC container:

user@ubuntu:~$ sudo echo 3 | sudo tee /proc/sys/vm/drop_caches && sudo sync

user@ubuntu:~$ dd if=/dev/zero of=here bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 20.2207 s, 53.1 MB/s

user@ubuntu:~$ dd if=/dev/zero of=here bs=4M count=1000 oflag=direct
1000+0 records in
1000+0 records out
4194304000 bytes (4.2 GB) copied, 228.281 s, 18.4 MB/s
 
@aychprox,

In my experience, the only way to accurately gauge the true write speed to storage for a system under KVM has been to download a file several times bigger than the host's RAM from a server with SSDs over a Gbit network connection and time it. Attempts at using benchmark tools have consistently thrown up useless results with no relation whatsoever to true performance, whether using sync, O_DIRECT flags or whatever.
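
For example, something along these lines (URL, file name and paths are placeholders):

Code:
# Placeholder URL: pick a file several times larger than the host's RAM,
# served from an SSD-backed machine on the same Gbit network.
URL=http://192.0.2.10/bigfile-64G.bin

sync                               # settle the page cache first
time wget -q -O /data/bigfile.bin "$URL"
# throughput in MB/s is roughly (file size in MB) / (real seconds from time)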

Just my two cents.
 
Thanks for sharing.
I am trying to select which is best for an application server (web apps). In that case, what are the best criteria for choosing between KVM and LXC in Proxmox, if not the benchmark above?
 
@aychprox, I think you should start a new topic for that, I mean, if you want to do a general comparison. In terms of disk performance there is no reason to expect a very big difference, at least if you do NOT use files for the storage of KVM disks.
 
Some have already commented on the difficulties of getting good measurements.

I'd also mention that LXC containers share the host's own kernel, so I/O for an LXC container should be essentially identical to I/O for the host OS itself.

So I'm not sure what could cause LXC not to out-perform KVM hardware VMs, other than a difference between the KVM and LXC backing-store choices, or possibly certain "CAP" limits (see http://man7.org/linux/man-pages/man7/capabilities.7.html) set for the LXC container that rate-limit such things. A KVM VM uses virtual I/O, so no matter how optimized it is, there is some delay between it and the host kernel's drivers. LXC wouldn't experience that.

You should also check which backing store the LXC container is set to use. There are multiple choices, and you need to make sure that both the KVM VM and the LXC container are using the same one to get an apples-to-apples comparison; a quick way to check is shown below.
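
On Proxmox VE 4.x the container's backing storage shows up in its config, so something like this should tell you what the container actually sits on (CTID 100 is just an example):

Code:
# the rootfs line shows which storage/backing store the container uses
pct config 100 | grep -i rootfs

# or read the config file directly
cat /etc/pve/lxc/100.conf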

The lxc-create command's -B option says...
-B backingstore
'backingstore' is one of 'dir', 'lvm', 'loop', 'btrfs', 'zfs', 'rbd', or 'best'. The default is 'dir', meaning that the container root filesystem will be a directory under /var/lib/lxc/container/rootfs. This backing store type allows the optional --dir ROOTFS to be specified, meaning that the container rootfs should be placed under the specified path, rather than the default. (The 'none' backingstore type is an alias for 'dir'.)

If 'btrfs' is specified, then the target filesystem must be btrfs, and the container rootfs will be created as a new subvolume. This allows snapshotted clones to be created, but also causes rsync --one-filesystem to treat it as a separate filesystem.

If backingstore is 'lvm', then an lvm block device will be used and the following further options are available: --lvname lvname1 will create an LV named lvname1 rather than the default, which is the container name. --vgname vgname1 will create the LV in volume group vgname1 rather than the default, lxc. --thinpool thinpool1 will create the LV as a thin-provisioned volume in the pool named thinpool1 rather than the default, lxc. --fstype FSTYPE will create an FSTYPE filesystem on the LV, rather than the default, which is ext4. --fssize SIZE will create a LV (and filesystem) of size SIZE rather than the default, which is 1G.

If backingstore is 'loop', you can use --fstype FSTYPE and --fssize SIZE as 'lvm'. The default values for these options are the same as 'lvm'.

If backingstore is 'rbd', then you will need to have a valid configuration in ceph.conf and a ceph.client.admin.keyring defined. You can specify the following options: --rbdname RBDNAME will create a blockdevice named RBDNAME rather than the default, which is the container name. --rbdpool POOL will create the blockdevice in the pool named POOL, rather than the default, which is 'lxc'.

If backingstore is 'best', then lxc will try, in order, btrfs, zfs, lvm, and finally a directory backing store.
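
So, going by the man page above, an apples-to-apples test against a Ceph-backed KVM disk would mean putting the container rootfs on RBD as well. Roughly something like this (container name, template and pool name are placeholders):

Code:
# place the container rootfs on the same Ceph pool the KVM disk image uses
lxc-create -n bench-ct -t ubuntu -B rbd --rbdname bench-ct-rootfs --rbdpool rbd

On Proxmox itself the equivalent is simply picking the same storage when the container is created.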
 
qemu has optimisations for writing zeroes (it skips zero writes).
I'm not sure LXC is able to do that with raw files on loop devices.

Also, try a true benchmark tool like "fio" (with random data writes, higher queue depth, ...).
 
Hello Spirit - regarding fio: could you suggest options to use? fio has as many options as rsync.
 
You can use this job file as input:

Code:
# This job file tries to mimic the Intel IOMeter File Server Access Pattern
[global]
description=Emulation of Intel IOmeter File Server Access Pattern

[iometer]
bssplit=512/10:1k/5:2k/5:4k/60:8k/2:16k/4:32k/4:64k/10
rw=randrw
rwmixread=80
direct=1
size=4g
ioengine=libaio
# IOMeter defines the server loads as the following:
# iodepth=1     Linear
# iodepth=4     Very Light
# iodepth=8     Light
# iodepth=64    Moderate
# iodepth=256   Heavy
iodepth=64
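
Assuming you save that as, say, iometer.fio (the filename is arbitrary), run it inside both the KVM guest and the LXC container and compare the results:

Code:
fio iometer.fio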
 
