Slow Disk Performance inside VM

patrick7

Hi there

Currently I have some performance problems in VMs running on my Proxmox node.

Specs of the node:
- 1x Xeon E5-2620v4
- 64 GB RAM
- 4x 1TB WD Gold
- Software RAID 10
- Full Disk Encryption
- LVM Thin for VMs

On the host, I get around 200MB/s:
Code:
(zrh1)root@vms1:~# dd if=/dev/zero of=tempfile bs=1M count=1024 conv=fdatasync,notrunc 
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 5.13808 s, 209 MB/s

VMs get ~50 MB/s:
Code:
(zrh1)root@mail:~# dd if=/dev/zero of=tempfile bs=1M count=1024 conv=fdatasync,notrunc 
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 19.8337 s, 54.1 MB/s

Caching is set to the default. The HDD write cache seems to be off and cannot be enabled, for unknown reasons.
Code:
(zrh1)root@vms1:~# hdparm -W /dev/sd[abcd]

/dev/sda:
 write-caching =  0 (off)

/dev/sdb:
 write-caching =  0 (off)

/dev/sdc:
 write-caching =  0 (off)

/dev/sdd:
 write-caching =  0 (off)
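
For reference, the usual way to try to turn the write cache back on is hdparm -W1 (the same applies to sdb-sdd):
Code:
hdparm -W1 /dev/sda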

pveperf:
Code:
(zrh1)root@vms1:~# pveperf
CPU BOGOMIPS:      67200.16
REGEX/SECOND:      2140664
HD SIZE:           48.97 GB (/dev/mapper/vg0-host--root)
BUFFERED READS:    176.00 MB/sec
AVERAGE SEEK TIME: 10.77 ms
FSYNCS/SECOND:     43.43
DNS EXT:           59.73 ms
DNS INT:           10.34 ms (xyz)

Any ideas how I can improve the performance inside the VMs?

Thanks and best regards
Patrick
 
First, make sure the LVM-thin disk is already fully allocated when doing benchmarks. Also, what kind of FS/options do you use inside the guest?
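
For the allocation, a simple way (just a sketch; file name and LV names are examples) is to fill the guest filesystem once and then check on the host that the thin LV is fully allocated:
Code:
# inside the guest: write the filesystem full once so the thin LV allocates all its blocks
dd if=/dev/zero of=/fillfile bs=1M status=progress || true
rm /fillfile

# on the host: the Data% column of the VM's disk LV should now be close to 100
lvs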
 
Thanks for your fast reply.
The guest uses ext4.
Code:
UUID={UUID} /    ext4    errors=remount-ro 0    1

I use fstrim/discard to clean up unused space.
I will try with a non-LVM-thin VM later. But I cannot believe the difference of 150 MB/s.
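
The trim itself is nothing special, roughly this (just for reference):
Code:
# inside the guest: trim all mounted filesystems
# (the vDisk also needs the discard option in Proxmox so the trims reach the thin pool)
fstrim -av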
 
But I cannot believe the difference of 150 MB/s.

Again, this is usually due to LVM-thin block allocation. This only needs to be done once per block, so the benchmark will run much faster as soon as the blocks are allocated.
 
Just filled up the disk of one VM to 100% and ran dd again. Now it's around 80-90 MB/s, but in my opinion that's still too slow.

Another problem is that a Windows VM freezes all the time, e.g. if I open the Task Manager.
As soon as it's opened, I see 100% disk usage at some 500 kB/s :O
 
What kind of "Software RAID 10" do you run?
 
I understand.
But are you sure the problem is caused by the RAID? On the host I get good performance; the problem exists only in the virtual machines.
 
I am sure that a ZFS RAID10 with a fast SSD cache is a really fast and recommended storage setup, with countless advantages and features which you all miss out on with mdraid.
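
Just as a rough sketch of what such a pool could look like (device names are examples only; normally the Proxmox installer sets this up for you):
Code:
# striped mirrors (RAID10) over the four data disks
zpool create -o ashift=12 tank mirror /dev/sda /dev/sdb mirror /dev/sdc /dev/sdd

# add a fast SSD partition as read cache (L2ARC) and another as log device for sync writes
zpool add tank cache /dev/nvme0n1p1
zpool add tank log /dev/nvme0n1p2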
 
The problem is that the server only has 4 slots, which are all used, so an SSD is not possible. Also, I don't see any possibility for full disk encryption with ZFS.
It's also strange that the host has good performance.
 
@Patrick: Is this a test server? Would you have 4 extra same-size drives to swap out your existing Proxmox install for a test install of Proxmox ZFS RAID10 (using the official 5.2 setup ISO)?
 
Unfortunately the server is in production and at the moment I don't see a possibility to test ZFS.
 
Hi,

Very interesting case!

Some observations:

Your test is not very relevant, because it is very unlikely that you only have synchronous writes to the disks. This may be true only if you have some DB VM; for other cases, your test has little relevance.

I do not understand how your full disk encryption is set up (with LUKS?).

What I can only guess/think is this:

- on the host server, a 1 M block will be split into multiple 4 k sectors -> 2 k on each disk, which is OK for LUKS (mdraid stripe?)
- in a VM, let's say with ext4, the same 1 M block will be split into 4 k sectors for your vDisk, then each 4 k sector becomes 2x2 k for the mirror and then 1 k for each disk that is a member of the RAID 10
- LUKS can encrypt 4 k of data in a single operation, so this makes the difference in your tests.
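
If you want to see how much the sync writes matter, you can compare a plain sequential test with a sync-heavy one, for example with fio (file name and sizes are only examples):
Code:
# sequential 1M writes with a single sync at the end - roughly what the dd test does
fio --name=seq --rw=write --bs=1M --size=1G --end_fsync=1 --filename=/root/fio.tmp

# 4k writes with an fdatasync after every write - worst case for luks/mdraid/thin
fio --name=sync4k --rw=write --bs=4k --size=256M --fdatasync=1 --filename=/root/fio.tmp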
 
Thanks for your answer
In the meantime I found out that the VMs reach 200 MB/s if I enable the writeback cache.
And the LVM on the host is using writeback too, so I think the performance is bad on both the host and the guests without any caching.
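
For reference, the cache mode is set per vDisk, roughly like this (VM ID and disk name are only examples; it can also be changed in the GUI under the VM's hardware options):
Code:
qm set 100 --virtio0 local-lvm:vm-100-disk-0,cache=writeback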

I'm using the following setup:

LVM -> LUKS -> MDRAID -> HDDs
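
Roughly, bottom-up, the stack was created like this (device names and sizes are from memory / only examples):
Code:
# 4-disk mdraid 10
mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/sd[abcd]1

# LUKS on top of the array
cryptsetup luksFormat /dev/md0
cryptsetup open /dev/md0 md0_crypt

# LVM on top of LUKS, including the thin pool for the VM disks
pvcreate /dev/mapper/md0_crypt
vgcreate vg0 /dev/mapper/md0_crypt
lvcreate -L 500G --thinpool data vg0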
 
Update: I have replaced all disks (3-year-old WD RE) with brand new WD Gold and I think the performance has increased. Without any cache I'm reaching 100-150 MB/s in a VM, and FSYNCS/SECOND on the host is now 5 times higher than before the change.
 
Hi @patrick7, after you changed the disks, did you ever re-run the speed test on the host? 100-150 MB/s is a big improvement, but compared with the speed on the host, the IO overhead still seems significant.
 
