Poor ZFS storage performance in VM compared to host

ufear

Active Member
Aug 19, 2017
Hi,

This has been bothering me for a while. After spending countless hours trying to figure out how to resolve it, I figured I'd ask here to see if anybody has helpful pointers.

Situation:
- 1 server, currently running Proxmox 4.3.1
-- Dual E5-2683 v4 on a Supermicro X10DRD-iNT with 128GB of ECC Reg DDR4-2400
-- 1x Supermicro SuperDOM 16GB containing the Proxmox install
-- 1x 512GB Samsung SM961 SSD
-- 3x 4TB 7200rpm disks
- Using ZFS configured within Proxmox (a sketch of the equivalent pool commands follows this list)
-- 1 pool, compression off, dedup off, currently 39% fragmentation
-- Using 100GB of the SM961 as ZIL
-- Using the remaining 377GB of the SM961 as L2ARC
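
For reference, the pool layout described above corresponds roughly to the commands below. This is only a sketch: the pool name poolz1 is taken from the fio run further down, but the vdev layout (stripe vs. raidz) and the device names are assumptions, since they are not stated here.

Code:
# assumed device names: sdb/sdc/sdd for the 4TB disks, nvme0n1p1/p2 for the SM961 partitions
zpool create poolz1 raidz sdb sdc sdd    # vdev layout is an assumption; could also be a stripe or mirror
zpool add poolz1 log nvme0n1p1           # ~100GB partition used as ZIL/SLOG
zpool add poolz1 cache nvme0n1p2         # remaining ~377GB used as L2ARC
zfs set compression=off poolz1
zfs set dedup=off poolz1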

Problem:
I feel that disk I/O within VMs is rather slow/sluggish; for example, installing a bunch of packages through apt-get can take rather long. To investigate, I've turned to fio. This is the output on the host:

Code:
root@proxmox:/poolz1/media# fio --randrepeat=1 --ioengine=libaio --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75
test: (g=0): rw=randrw, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=64
fio-2.1.11
Starting 1 process
Jobs: 1 (f=1): [m(1)] [88.9% done] [492.3MB/163.8MB/0KB /s] [126K/41.1K/0 iops] [eta 00m:01s]
test: (groupid=0, jobs=1): err= 0: pid=26096: Sat Aug 19 15:38:50 2017
  read : io=3071.7MB, bw=404864KB/s, iops=101215, runt=  7769msec
  write: io=1024.4MB, bw=135013KB/s, iops=33753, runt=  7769msec
  cpu          : usr=6.28%, sys=92.17%, ctx=1686, majf=0, minf=349
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued    : total=r=786347/w=262229/d=0, short=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: io=3071.7MB, aggrb=404863KB/s, minb=404863KB/s, maxb=404863KB/s, mint=7769msec, maxt=7769msec
  WRITE: io=1024.4MB, aggrb=135013KB/s, minb=135013KB/s, maxb=135013KB/s, mint=7769msec, maxt=7769msec

Now, within a fresh VM:

Code:
root@perf4:~# fio --randrepeat=1 --ioengine=libaio --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75
test: (g=0): rw=randrw, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=64
fio-2.2.10
Starting 1 process
test: Laying out IO file(s) (1 file(s) / 4096MB)
Jobs: 1 (f=1): [m(1)] [100.0% done] [53144KB/18136KB/0KB /s] [13.3K/4534/0 iops] [eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=1562: Sat Aug 19 15:25:27 2017
  read : io=3071.7MB, bw=59448KB/s, iops=14861, runt= 52910msec
  write: io=1024.4MB, bw=19825KB/s, iops=4956, runt= 52910msec
  cpu          : usr=3.60%, sys=24.51%, ctx=786374, majf=0, minf=10
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued    : total=r=786347/w=262229/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: io=3071.7MB, aggrb=59447KB/s, minb=59447KB/s, maxb=59447KB/s, mint=52910msec, maxt=52910msec
  WRITE: io=1024.4MB, aggrb=19824KB/s, minb=19824KB/s, maxb=19824KB/s, mint=52910msec, maxt=52910msec

Disk stats (read/write):
  sda: ios=784380/168548, merge=11/142, ticks=38564/112752, in_queue=151468, util=74.22%

Performance seems to be roughly 1/6th of the host's. However, if I run fio with direct=1:

Code:
root@perf4:~# fio --direct=1 --randrepeat=1 --ioengine=libaio --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75
test: (g=0): rw=randrw, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=64
fio-2.2.10
Starting 1 process
Jobs: 1 (f=1): [m(1)] [100.0% done] [466.1MB/155.5MB/0KB /s] [120K/39.8K/0 iops] [eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=1611: Sat Aug 19 15:40:43 2017
  read : io=3071.7MB, bw=444704KB/s, iops=111175, runt=  7073msec
  write: io=1024.4MB, bw=148299KB/s, iops=37074, runt=  7073msec
  cpu          : usr=9.39%, sys=89.88%, ctx=768, majf=0, minf=9
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued    : total=r=786347/w=262229/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: io=3071.7MB, aggrb=444703KB/s, minb=444703KB/s, maxb=444703KB/s, mint=7073msec, maxt=7073msec
  WRITE: io=1024.4MB, aggrb=148298KB/s, minb=148298KB/s, maxb=148298KB/s, mint=7073msec, maxt=7073msec

Disk stats (read/write):
  sda: ios=766145/255529, merge=0/1, ticks=33420/11544, in_queue=46396, util=93.90%

Everything seems fine, but as far as I know there is no way to force the guest to always access the disk using O_DIRECT.
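
For what it's worth, the cache mode of the VM disk controls how QEMU opens the backing zvol on the host: with cache=none it is opened with O_DIRECT on the host side, although that cannot force O_DIRECT inside the guest. A sketch, assuming VMID 100 and a disk on a zfspool storage named local-zfs; copy the exact disk value from qm config rather than the names used here.

Code:
# show the current disk line and cache mode of the VM
qm config 100
# switch the disk to cache=none (host-side O_DIRECT on the backing zvol);
# 'local-zfs:vm-100-disk-1' is an assumption - use the value shown by qm config
qm set 100 --scsi0 local-zfs:vm-100-disk-1,cache=none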

I have the feeling that something is sitting in between my ZFS setup and the VM.

Solutions I've attempted:
- Played around with all emulation modes (IDE/SCSI/SATA/VirtIO) and caching modes
- Played around with the controller emulation (VirtIO SCSI, VirtIO SCSI-Single, LSI)
- Looked at the ZFS volblocksize and increased it from 8k to 128k (matching the recordsize of the dataset where I performed the host test); this did not change anything

While changing these settings has some minor effect, either positive or negative, the IOPS in a VM remain around 15k read / 5k write, while on the host they are consistently around 100k read / 35k write.
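
For reference, this is how the relevant properties can be compared between the zvol backing the VM and the dataset used for the host test. A sketch: only poolz1/media is taken from the fio run above, the zvol name is an assumption, and volblocksize is fixed at zvol creation time, so changing it means recreating the VM disk.

Code:
# zvol backing the VM disk (name is an assumption - check with 'zfs list')
zfs get volblocksize,compression,sync poolz1/vm-100-disk-1
# dataset where the host-side fio test ran
zfs get recordsize,compression,sync poolz1/media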

Does anybody have a pointer on where to look next, other than throwing the entire setup in the garbage and setting up a separate NAS/SAN attached through 10GbE?

Thanks already!

ufear
 
Hi,
ZFS is hard to test with tools like fio. Some questions:
1. What does zpool status -v show (command line)?
2. If I understand correctly, you test with fio at bs=4k?
3. Does your VM use the raw format or not?
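
(Something like the following should cover questions 1 and 3; the VMID 100 is an assumption, substitute your own.)

Code:
zpool status -v        # pool layout, log/cache devices, any errors
zpool list -v          # capacity, fragmentation, per-vdev details
qm config 100          # disk line: raw zvol or image file, cache mode, controller type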
 
