pve 4.2 and iothread

mir

Hi all,

I have done some IO tests on Proxmox 4.2 with iothread. Results below:
Code:
fio --description="Emulation of Intel IOmeter File Server Access Pattern"
--name=iometer --bssplit=512/10:1k/5:2k/5:4k/60:8k/2:16k/4:32k/4:64k/10
--rw=randrw --rwmixread=80 --direct=1 --size=4g --ioengine=libaio --iodepth=8
                   read          write
virtio-blk
iothread=1:        13603         3403
iothread=0:         9795         2450

virtio-scsi-single
iothread=1:         5797         1450
iothread=0:         7011         1754

virtio-scsi
iothread=1:         2560          640
iothread=0:         9379         2346

fio --description="Emulation of Intel IOmeter File Server Access Pattern"
--name=iometer --bssplit=512/10:1k/5:2k/5:4k/60:8k/2:16k/4:32k/4:64k/10
--rw=randrw --rwmixread=80 --direct=1 --size=4g --ioengine=libaio --iodepth=64
                   read          write
virtio-blk
iothread=1:        11699         2927
iothread=0:         8710         2179

virtio-scsi-single
iothread=1:        16745         4189
iothread=0:         3233          808

virtio-scsi
iothread=1:        10982         2747
iothread=0:         4943         1236

Conclusion:
1) The iothread option gives a larger performance increase when used with virtio-scsi than with virtio-blk
2) At a low iodepth (8), virtio-blk performs significantly better
3) At a low iodepth, enabling iothread reduces performance for virtio-scsi
4) At a higher iodepth (64), use the iothread option with virtio-scsi
5) If using virtio-scsi, always select virtio-scsi-single (see the config sketch below)
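
For anyone who wants to reproduce this, the combinations above can be configured with qm roughly like this (just a sketch; the VM id 101 and the local-lvm storage/disk names are placeholders, not necessarily what I used):
Code:
# virtio-blk disk with a dedicated iothread
qm set 101 --virtio0 local-lvm:vm-101-disk-1,iothread=1

# virtio-scsi-single: one controller (and one iothread) per disk
qm set 101 --scsihw virtio-scsi-single --scsi0 local-lvm:vm-101-disk-1,iothread=1

# plain virtio-scsi: one controller shared by all scsiX disks
qm set 101 --scsihw virtio-scsi-pci --scsi0 local-lvm:vm-101-disk-1,iothread=1
For the iothread=0 runs, set iothread=0 or simply drop the option from the disk line.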

Why virtio-scsi-single performs significantly better than virtio-scsi is a mystery to me!?
 
Plain VirtIO

Code:
# fio --description="Emulation of Intel IOmeter File Server Access Pattern" --name=iometer --bssplit=512/10:1k/5:2k/5:4k/60:8k/2:16k/4:32k/4:64k/10 --rw=randrw --rwmixread=80 --direct=1 --size=4g --ioengine=libaio --iodepth=8
iometer: (g=0): rw=randrw, bs=512-64K/512-64K/512-64K, ioengine=libaio, iodepth=8
fio-2.0.13
Starting 1 process
iometer: Laying out IO file(s) (1 file(s) / 4096MB)
Jobs: 1 (f=1): [m] [100.0% done] [160.2M/41741K/0K /s] [43.8K/11.2K/0  iops] [eta 00m:00s]
iometer: (groupid=0, jobs=1): err= 0: pid=1718: Tue May 17 19:20:56 2016
  Description  : [Emulation of Intel IOmeter File Server Access Pattern]
  read : io=3277.3MB, bw=192490KB/s, iops=43139 , runt= 17434msec
    slat (usec): min=4 , max=5213 , avg=13.36, stdev=11.48
    clat (usec): min=21 , max=7459 , avg=130.09, stdev=63.89
     lat (usec): min=29 , max=7484 , avg=143.98, stdev=65.66
    clat percentiles (usec):
     |  1.00th=[   67],  5.00th=[   86], 10.00th=[   96], 20.00th=[  107],
     | 30.00th=[  114], 40.00th=[  119], 50.00th=[  123], 60.00th=[  129],
     | 70.00th=[  137], 80.00th=[  147], 90.00th=[  165], 95.00th=[  189],
     | 99.00th=[  266], 99.50th=[  302], 99.90th=[  490], 99.95th=[  732],
     | 99.99th=[ 3120]
    bw (KB/s)  : min=159070, max=229360, per=100.00%, avg=193056.76, stdev=21445.89
  write: io=838441KB, bw=48092KB/s, iops=10807 , runt= 17434msec
    slat (usec): min=6 , max=2370 , avg=18.07, stdev=12.33
    clat (usec): min=28 , max=7949 , avg=134.93, stdev=78.14
     lat (usec): min=43 , max=7969 , avg=153.56, stdev=79.77
    clat percentiles (usec):
     |  1.00th=[   75],  5.00th=[   94], 10.00th=[  102], 20.00th=[  112],
     | 30.00th=[  117], 40.00th=[  121], 50.00th=[  126], 60.00th=[  133],
     | 70.00th=[  141], 80.00th=[  151], 90.00th=[  169], 95.00th=[  193],
     | 99.00th=[  278], 99.50th=[  322], 99.90th=[  636], 99.95th=[ 1080],
     | 99.99th=[ 3344]
    bw (KB/s)  : min=39135, max=58022, per=100.00%, avg=48245.71, stdev=5422.83
    lat (usec) : 50=0.12%, 100=11.89%, 250=86.58%, 500=1.30%, 750=0.05%
    lat (usec) : 1000=0.01%
    lat (msec) : 2=0.01%, 4=0.03%, 10=0.01%
  cpu          : usr=17.97%, sys=78.63%, ctx=45033, majf=0, minf=23
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=752086/w=188418/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
   READ: io=3277.3MB, aggrb=192489KB/s, minb=192489KB/s, maxb=192489KB/s, mint=17434msec, maxt=17434msec
  WRITE: io=838440KB, aggrb=48092KB/s, minb=48092KB/s, maxb=48092KB/s, mint=17434msec, maxt=17434msec

Disk stats (read/write):
    dm-0: ios=751031/188138, merge=0/0, ticks=28730/15536, in_queue=45233, util=98.89%, aggrios=752087/188430, aggrmerge=0/2, aggrticks=27899/15328, aggrin_queue=43783, aggrutil=98.61%
  vda: ios=752087/188430, merge=0/2, ticks=27899/15328, in_queue=43783, util=98.61%
How do my results compare to yours?

I can't compare them with your output format.
 
Why virtio-scsi-single performs significantly better than virtio-scsi is a mystery to me!?

Well, this is pretty simple: iothread is not enabled with virtio-scsi, only with virtio-scsi-single.

In QEMU, we can have one iothread per disk controller.

virtio-blk has one controller per disk.

But virtio-scsi, by default, uses one controller for X disks, so the iothread would be shared between the disks.

virtio-scsi-single implements one virtio-scsi controller per disk.


Also, if you benchmark multiple disks at the same time, it should scale better with iothread enabled.
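
To illustrate, at the QEMU level it looks roughly like this (just a sketch of the relevant options, not the exact command line Proxmox generates; the ids and disk paths are placeholders):
Code:
# one iothread object, attached to the virtio-blk device (one controller per disk)
-object iothread,id=iothread-virtio0 \
-drive file=/path/to/vm-disk-1.raw,format=raw,if=none,id=drive-virtio0,cache=none,aio=native \
-device virtio-blk-pci,drive=drive-virtio0,iothread=iothread-virtio0

# virtio-scsi-single: one virtio-scsi controller per disk, each controller with its own iothread
-object iothread,id=iothread-scsi0 \
-device virtio-scsi-pci,id=scsihw0,iothread=iothread-scsi0 \
-drive file=/path/to/vm-disk-1.raw,format=raw,if=none,id=drive-scsi0,cache=none,aio=native \
-device scsi-hd,drive=drive-scsi0,bus=scsihw0.0

# plain virtio-scsi: a single virtio-scsi controller, all scsi-hd disks sit behind it,
# so there is no way to give each disk its own iothread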
 
virtio-scsi-single implements one virtio-scsi controller per disk.
I see. I thought it was the other way around: virtio-scsi-single meaning one controller for X disks, and virtio-scsi meaning one controller per disk. That was the reason for my confusion.