IO is slow on the VM, but fast on the host.

jic5760

The physical disks are NVMe and provisioned with lvmthin.
Benchmarking a thin volume assigned to a VM is very fast on the host, but inside the VM it is quite slow.
Please advise on how to improve disk IO performance.
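For reference, an lvmthin layout like this is typically created along these lines (a sketch only; the pool name and sizes are assumed, while the volume group "datastore" and the volume name are taken from the device paths below):

```
# thin pool on the NVMe-backed volume group (pool name and size assumed)
lvcreate --type thin-pool -L 900G -n pool datastore
# thin volume handed to the VM as its disk
lvcreate -V 120G --thinpool datastore/pool -n vm-12205-disk-0
```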

VM Options:
```
-drive file=/dev/datastore/vm-12205-disk-0,if=none,id=drive-scsi1,cache=writethrough,discard=on,format=raw,aio=io_uring,detect-zeroes=unmap
```

Physical Device:
```
# dd if=/dev/nvme0n1 of=/dev/null bs=1M skip=200000 count=6000 iflag=direct,sync status=progress
5241831424 bytes (5.2 GB, 4.9 GiB) copied, 2 s, 2.6 GB/s
6000+0 records in
6000+0 records out
6291456000 bytes (6.3 GB, 5.9 GiB) copied, 2.40037 s, 2.6 GB/s
```

LVM-Thin Volume:
```
# dd if=/dev/datastore/vm-12205-disk-0 of=/dev/null bs=1M skip=100000 count=6000 iflag=direct,sync status=progress
3716153344 bytes (3.7 GB, 3.5 GiB) copied, 1 s, 3.7 GB/s
6000+0 records in
6000+0 records out
6291456000 bytes (6.3 GB, 5.9 GiB) copied, 1.77287 s, 3.5 GB/s
```

In Virtual Machine:
```
# dd if=/dev/sdb of=/dev/null bs=1M skip=50000 count=3000 iflag=direct,sync status=progress
2679111680 bytes (2.7 GB, 2.5 GiB) copied, 5 s, 536 MB/s
3000+0 records in
3000+0 records out
3145728000 bytes (3.1 GB, 2.9 GiB) copied, 5.85316 s, 537 MB/s

```
 
Please show us your VM config, i.e. the output of `qm config VMID`.

Also, dd is really not that great for benchmarking performance.
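For example, something along these lines would be more representative (the device path is taken from your post; the remaining parameters are only a suggested starting point):

```
# sequential read from the virtual disk, bypassing the page cache
fio --name=seqread --filename=/dev/sdb --rw=read --bs=1M \
    --direct=1 --ioengine=libaio --iodepth=16 \
    --runtime=30 --time_based --readonly
```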
 
I know dd is not a "good benchmark", but I want to make sure that sequential performance is good.

```
agent: 1
balloon: 40960
bios: ovmf
boot: order=scsi0;ide2;net0
cores: 16
cpu: host,flags=+aes
efidisk0: local-lvm,efitype=4m,pre-enrolled-keys=1,size=4M
ide2: local:iso/ubuntu-22.04.1-live-server-amd64.iso,media=cdrom,size=1440306K
machine: q35
memory: 65536
meta: creation-qemu=7.1.0,ctime=1670985996
name: ...
net0: virtio=...,bridge=vmbr1,firewall=1
numa: 0
onboot: 1
ostype: l26
scsi0: local-lvm:vm-12205-disk-2,cache=writethrough,size=64G
scsi1: datastore:vm-12205-disk-0,backup=0,cache=writethrough,discard=on,size=120G,ssd=1
scsihw: virtio-scsi-pci
smbios1: uuid=...
sockets: 1
startup: up=10
tpmstate0: local-lvm:vm-12205-disk-1,size=4M,version=v2.0
vmgenid: ...
```
 
Things you can try:
* Use VirtIO SCSI single as the SCSI controller and enable iothread on your (virtual) disk (see the sketch after this list)
* Use a VirtIO Block disk
* Test different cache settings
* With fio, you could increase the iodepth
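As a minimal sketch, assuming VMID 12205 and the scsi1 line from the config above, the first point could be applied like this:

```
# switch to the single-controller variant so the disk can get its own IO thread
qm set 12205 --scsihw virtio-scsi-single
# re-set the disk with iothread enabled (other options kept from the existing config)
qm set 12205 --scsi1 datastore:vm-12205-disk-0,backup=0,cache=writethrough,discard=on,size=120G,ssd=1,iothread=1
```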
 
Thank you! Using virtio-blk has improved performance!

config:
```
virtio1: datastore:vm-12205-disk-0,backup=0,cache=writeback,discard=on,iothread=1,size=120G
```

```
# dd if=/dev/vda of=/dev/null bs=1M skip=2000 count=6000 iflag=direct,sync status=progress
6090129408 bytes (6.1 GB, 5.7 GiB) copied, 5 s, 1.2 GB/s
6000+0 records in
6000+0 records out
6291456000 bytes (6.3 GB, 5.9 GiB) copied, 5.33521 s, 1.2 GB/s

# fio --name RWRITE --rw=randwrite --filename=fio.tmp --size=128m --blocksize=4k --iodepth=1 --direct=0 --numjobs=1 --ioengine=posixaio
RWRITE: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=posixaio, iodepth=1
fio-3.28
Starting 1 process
RWRITE: Laying out IO file (1 file / 128MiB)
Jobs: 1 (f=1)
RWRITE: (groupid=0, jobs=1): err= 0: pid=39250: Thu Dec 15 03:32:43 2022
write: IOPS=14.0k, BW=54.6MiB/s (57.2MB/s)(128MiB/2346msec); 0 zone resets
slat (usec): min=2, max=477, avg= 8.33, stdev= 4.37
clat (usec): min=22, max=1823, avg=60.89, stdev=34.40
lat (usec): min=27, max=1831, avg=69.23, stdev=34.96
clat percentiles (usec):
| 1.00th=[ 40], 5.00th=[ 42], 10.00th=[ 48], 20.00th=[ 53],
| 30.00th=[ 55], 40.00th=[ 57], 50.00th=[ 59], 60.00th=[ 61],
| 70.00th=[ 65], 80.00th=[ 68], 90.00th=[ 75], 95.00th=[ 81],
| 99.00th=[ 97], 99.50th=[ 110], 99.90th=[ 400], 99.95th=[ 562],
| 99.99th=[ 1696]
bw ( KiB/s): min=50496, max=62152, per=99.53%, avg=55610.00, stdev=4939.85, samples=4
iops : min=12626, max=15538, avg=13903.00, stdev=1234.27, samples=4
lat (usec) : 50=14.25%, 100=84.97%, 250=0.62%, 500=0.07%, 750=0.05%
lat (usec) : 1000=0.01%
lat (msec) : 2=0.03%
cpu : usr=13.94%, sys=19.02%, ctx=32785, majf=0, minf=25
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,32768,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
WRITE: bw=54.6MiB/s (57.2MB/s), 54.6MiB/s-54.6MiB/s (57.2MB/s-57.2MB/s), io=128MiB (134MB), run=2346-2346msec

Disk stats (read/write):
vda: ios=0/2, merge=0/0, ticks=0/1, in_queue=1, util=0.18%

```
 
Note that VirtIO Block is about to be deprecated, as per the official Proxmox page. Also, your test is so small that it was likely served entirely from the RAM cache; --rw=read --size=2g --io_size=10g would give a minimally useful read test. I'm not sure the VirtIO Block option itself helps here, since it's only one VM on a single disk; if anything, the iothread is what might have had an impact. But again: one drive and one VM. What software will actually be using it?
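A run along those lines might look like this (a sketch; the filename and the flags beyond the three suggested above are assumed):

```
# 2 GiB file but 10 GiB of total IO, so the page cache cannot serve everything
fio --name=seqread --rw=read --size=2g --io_size=10g \
    --bs=1M --direct=1 --ioengine=libaio --iodepth=16 \
    --filename=fio.tmp
```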