ZFS or hardware Raid - new hardware setup

Hello!
I have a question regarding a new hardware setup.
The new hardware I got is:
Supermicro server with an X10DRC-LN4+ mainboard;
onboard it has 10x SATA 6G and an LSI 3108 SAS chip with 8x SAS3 (12 Gbps) ports
(like the LSI MegaRAID SAS3 9361-8i 12Gb/s RAID controller)
2x Intel Xeon E5-2609v3 6-core 1.9 GHz 15 MB 6.4 GT/s; 64 GB RAM
HDDs:
2x Intel SSD DC S3700, 100 GB, with full end-to-end data protection and enhanced
power-loss data protection
4x HGST Ultrastar 7K4000 HUS724020ALS640 2 TB 3.5" SAS-2 7200 rpm 64 MB

My question is: what is the "best practice" for the hard disk/RAID setup?
As far as I can see, there are two options:
1:
I use the onboard LSI RAID controller and build a RAID 10 from the four HGST
SAS drives, on which I install Proxmox as a standard installation
(with ext4 and so on...).
With the two 100 GB SSDs I build no real RAID; to get two JBOD-like disks I put each
one individually into a single-drive RAID 0, add the two SSDs in Proxmox as storage for disk images, and
use them for the most heavily loaded partitions of the main VM.

2:
I also use the onboard LSI RAID controller, but in contrast to option 1 I build
no real RAID. I put each hard drive and SSD individually into a single-drive RAID 0, giving 6x JBOD-like disks.
With the four HGST SAS drives I build a ZFS RAID 10.
The two SSDs I use either as a cache for ZFS or, as described above, for the most heavily loaded
partitions.
2b:
I think using the onboard SATA ports for the above scenario is the worst idea?

Suggestions, comments, or constructive criticism are greatly appreciated!
Best regards,

maxprox
 

LnxBil

Famous Member
Feb 21, 2015
5,468
596
133
Germany
Hi maxprox,

1) You can use the two SSDs with LVM cache or dm-cache to speed up your ext4 disks. That will work fine. You then have a kind of hybrid disk, with the frequently read and written data on your SSDs, but you see only "one" storage.
2) Never use a RAID controller in RAID 0 mode for ZFS. This is in fact worse than using SATA directly, because the controller's caching/management tier sits in between. Try to get real JBOD working on the controller (e.g. by flashing an IT firmware version). You have too little memory for a good L2ARC on your SSDs. (You can also use the caching approach from 1 to get a really fast ZFS without L2ARC.)

Maybe 3)
Use a RAID controller that does SSD caching and let the controller handle everything, if you do not want to go into detail on LVM cache/dm-cache and ZFS.
 
Thanks for the answer.
I will have a look at your suggestion to use the SSDs for caching.
I have never used SSDs for LVM caching and found, for example, this post:
https://rwmj.wordpress.com/2014/05/22/using-lvms-new-cache-feature/
I will also check whether the onboard LSI RAID controller can use the SSDs for caching
(like this: https://www.thomas-krenn.com/de/wiki/MegaRAID_CacheCade_SSD_Cache)

EDIT:
Yes, the onboard controller has the option for "CacheCade Pro 2.0", but you have to buy it for about 250 euros,
and for now I cannot get more software or hardware for the server.
So I will look at LVM caching in more detail at
 
Yes, I will test it in the next few days and will post the results.
But in this setup I do not use software RAID [EDIT:] or ZFS:
....
First of all, I used the onboard LSI hardware RAID controller, AVAGO 3108 MegaRAID
...
I built a RAID 10 from the four 2 TB SAS HDDs.
...
and with the same RAID controller I built a RAID 0... from the two Intel SSDs; this is for the dm-cache
 

SwampRabbit

New Member
Dec 4, 2015
18
1
3
maxprox,

I am still testing a similar set up using dm-cache.

I found the first link you posted, but also used
http://blog-vpodzime.rhcloud.com/?p=45
and it helped me a bit to understand it all.

Connected to an IBM M5015 RAID controller (onboard cache disabled):
2x HGST Travelstar HTS725050A7E630 500GB SATA RAID1 - Proxmox + ISO storage
6x HGST Travelstar HTS721010A9E630 1TB SATA RAID10 - VM data

I tested dm-cache with a cheap 120GB SSD hooked up straight to a SATA port on the motherboard and I saw a decent performance increase in random reads over the 6-drive RAID10. No real change on anything else.

I would recommend getting good baseline benchmarks without the dm-cache. This is why I started over: I believe I was getting skewed numbers towards the end, and I wanted to make sure I could test it systematically and properly.

I would NOT recommend creating a cache pool with automatic metadata; this also caused me problems at first.

Right now I am doing the following:
Node testing - pveperf, hdparm, fio
Windows VMs (qcow2 and RAW) - AIDA64 disk benchmarks and fio tests
Debian VMs (qcow2 and RAW) - fio tests

Interested in hearing your results, and I will share mine if you want.
 

LnxBil

You will have very good write performance if you use the write-back cache, and of course for everything that is already in your cache. Depending on the size of your cache, you need a warm-up phase of several non-sequential reads (sequential reads are normally bypassed and not cached). This is most noticeable for directory listings and metadata, which are accessed, and therefore cached, more often.

The biggest problem I encountered was integrating it into jessie via systemd units so that it is ready at the right moment. If someone has a better setup or a better idea, please share. This is my current setup:

I created a new service in /lib/systemd/system/zfs-dmsetup.service:
Code:
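[Unit]
Description=DMsetup create cache device
Requires=systemd-udev-settle.service
# Requires= alone does not order the units, so wait for udev explicitly.
After=systemd-udev-settle.service
Before=zfs-import-scan.service

[Service]
Type=oneshot
RemainAfterExit=yes
# Table: <start> <length> cache <metadata dev> <cache dev> <origin dev> <block size> <#features> <features> <policy> <#policy args>
ExecStart=/sbin/dmsetup create sda-cached --table "0 976773168 cache /dev/sdb2 /dev/sdb3 /dev/sda 512 1 writeback default 0"
# On stop: switch to the "cleaner" policy to write dirty blocks back to /dev/sda, then remove the device.
# systemd on jessie needs absolute paths in ExecStop=, just as in ExecStart=.
ExecStop=/sbin/dmsetup suspend sda-cached
ExecStop=/sbin/dmsetup reload sda-cached --table "0 976773168 cache /dev/sdb2 /dev/sdb3 /dev/sda 512 0 cleaner 0"
ExecStop=/sbin/dmsetup resume sda-cached
ExecStop=/sbin/dmsetup wait sda-cached
ExecStop=/sbin/dmsetup suspend sda-cached
ExecStop=/sbin/dmsetup clear sda-cached
ExecStop=/sbin/dmsetup remove sda-cached

[Install]
WantedBy=multi-user.target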

And I also needed to add a requirement to /lib/systemd/system/zfs-import-scan.service:
Code:
Requires=zfs-dmsetup.service
After=zfs-dmsetup.service

I still need something cool to monitor the cache, e.g. stats like arcstat for dm-cache.
 
SwampRabbit wrote:
"I am still testing a similar setup using dm-cache. I found the first link you posted, but also used http://blog-vpodzime.rhcloud.com/?p=45 and it helped me a bit to understand it all."

Yes, I also used this site and it was very helpful; it is the second link in my "main sources" ;-)


SwampRabbit wrote:
"I would recommend getting good baseline benchmarks without the dm-cache. This is why I started over because I believe I was getting skewed numbers towards the end and I wanted to systematically make sure I could test it properly."

That is certainly useful. Can you tell me how you ran this benchmark without the dm-cache?
Was it before you installed the dm-cache?
Or do you mean the command
Code:
  dmsetup suspend device_name
(... reload, resume and wait )
which I just found?

SwampRabbit wrote:
"I would NOT recommend creating a cache pool with automatic meta data, this also caused me problems at first."

I do not understand that. For dm-cache you need two LVs: DataLVcache and DataLVcacheMeta (size ratio 1000:1). Or what do you mean by "automatic"?
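
For readers following along, the two-LV layout mentioned here can be sketched as a small planner that only builds the LVM command lines and executes nothing. This is a minimal sketch under stated assumptions: the VG name `pve` and origin LV `data` are taken from the `lvs` output later in the thread, while the SSD device `/dev/sdb`, the LV names and the cache size are hypothetical. The 1/1000 metadata rule with an 8 MiB minimum follows the lvmcache(7) recommendation.

```python
import math


def plan_lvm_cache(vg, origin_lv, ssd_pv, cache_mib):
    """Return the LVM commands (as strings) for attaching an SSD cache; nothing is executed."""
    # Rule of thumb: metadata LV is about 1/1000 of the cache data LV, at least 8 MiB.
    meta_mib = max(8, math.ceil(cache_mib / 1000))
    return [
        # Add the SSD to the volume group that holds the slow origin LV.
        f"vgextend {vg} {ssd_pv}",
        # Create the metadata LV and the cache data LV, both placed on the SSD.
        f"lvcreate -L {meta_mib}M -n CacheMetaLV {vg} {ssd_pv}",
        f"lvcreate -L {cache_mib}M -n CacheDataLV {vg} {ssd_pv}",
        # Combine both LVs into a cache pool, then attach the pool to the origin LV.
        f"lvconvert --type cache-pool --poolmetadata {vg}/CacheMetaLV {vg}/CacheDataLV",
        f"lvconvert --type cache --cachepool {vg}/CacheDataLV {vg}/{origin_lv}",
    ]


if __name__ == "__main__":
    # Hypothetical example: a 90 GiB cache on /dev/sdb for pve/data.
    for command in plan_lvm_cache("pve", "data", "/dev/sdb", 90 * 1024):
        print(command)
```

Creating the metadata LV by hand, as above, is the manual counterpart to letting LVM size it automatically; review the printed commands against your own device names before running any of them.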

SwampRabbit wrote:
"Right now I am doing the following:
Node testing - pveperf, hdparm, fio
Windows VMs (qcow2 and RAW) - AIDA64 disk benchmarks and fio tests
Debian VMs (qcow2 and RAW) - fio tests
Interested in hearing your results and will share mine if you want."

Yes, definitely, I would like that.
First of all I installed a Debian jessie VM with the phoronix-test-suite, as described here:
https://smartystreets.com/blog/2015/10/performance-testing-with-phoronix
With that I will run some first benchmarks. I am afraid that it will not reflect the real server scenario with 10 or maybe 100 users/clients.
Do you have a solution or a tip for an objective benchmark?

best regards,
maxprox
 
Hi LnxBil,

regarding your ZFS setup I cannot contribute anything, sorry.

LnxBil wrote:
"... I still need something cool to monitor the cache, e.g. stats like arcstat for dm-cache."

With the command
Code:
lvs -a -o +devices
...
data  pve  Cwi-aoC---  3.59t [CacheDataLV] [data_corig] 5.83  5.95  0.00  data_corig(0)
...
you can see which LV (=> "data") is cached,
and with
Code:
dmsetup status /dev/mapper/pve-data
you see something like what is described on this site (under "Testen und Monitoring", i.e. "Testing and monitoring"):
https://www.thomas-krenn.com/de/wiki/Dm-cache
(I found no site in English)
 

LnxBil

Yes maxprox,

this is also the first site I find when I google around ... Today I tried to figure out what each column of dmsetup status means and found it in the kernel documentation (unfortunately there is nothing in the manpage). So I need to set up collectd to generate RRD graphs easily.
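
As a rough illustration of what such monitoring could look like, here is a minimal sketch that turns one `dmsetup status` line of a cache target into hit ratios. The field order is an assumption based on the kernel's device-mapper cache documentation (metadata block size, used/total metadata blocks, cache block size, used/total cache blocks, read hits, read misses, write hits, write misses, demotions, promotions, dirty); check it against your kernel version. The sample line and all its numbers are made up for the example.

```python
def parse_dm_cache_status(line):
    """Extract usage and hit ratios from one 'dmsetup status' line of a cache target."""
    fields = line.split()
    # Without a device argument, dmsetup prefixes each line with "<name>:".
    if fields[0].endswith(":"):
        fields = fields[1:]
    if fields[2] != "cache":
        raise ValueError("not a dm-cache target: " + fields[2])
    stats = fields[3:]  # everything after "<start> <length> cache"
    used_blocks, total_blocks = (int(x) for x in stats[3].split("/"))
    read_hits, read_misses, write_hits, write_misses = (int(x) for x in stats[4:8])
    dirty = int(stats[10])

    def ratio(hits, misses):
        total = hits + misses
        return hits / total if total else 0.0

    return {
        "cache_used": used_blocks / total_blocks,
        "read_hit_ratio": ratio(read_hits, read_misses),
        "write_hit_ratio": ratio(write_hits, write_misses),
        "dirty_blocks": dirty,
    }


# Made-up sample line, not taken from a real system.
SAMPLE = "0 1000 cache 8 10/100 128 50/200 300 100 90 10 0 50 5 1 writeback 2 migration_threshold 2048 mq 0"

if __name__ == "__main__":
    print(parse_dm_cache_status(SAMPLE))
```

The counters are cumulative since the device was created, so for graphs (collectd or otherwise) you would sample periodically and plot the differences.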
 
Hello,

with the hardware mentioned above (hardware(!) RAID 10)
and the additionally installed dm-cache as described here
https://forum.proxmox.com/threads/can-we-use-lvm-cache-in-proxmox-4-x.25636/#post-134061
I ran some first tests with fio.
In Proxmox:
Debian Jessie VM
Memory: 4 GB
CPUs: 4 (2x2)
System on hard disk virtio0, cache=unsafe (writeback), size 11G
Test partition on hard disk virtio1, cache=unsafe (writeback), size 12G

Tested in this Debian Jessie VM on the second, empty disk /dev/vdb:

Code:
cat jobfile_tk_01
; -- start job file --
[global]
rw=randread
size=2G
filename=/dev/vdb
direct=1
bs=1m
runtime=60
group_reporting
name=tk_4job_2g
[tk_4job_2G_01]
[tk_4job_2G_02]
[tk_4job_2G_03]
[tk_4job_2G_04]

; -- end job file --

With the following result:
Code:
root@DebianJessie-164:~/data/fio# fio jobfile_tk_01
tk_4job_2G_01: (g=0): rw=randread, bs=1M-1M/1M-1M/1M-1M, ioengine=sync, iodepth=1
tk_4job_2G_02: (g=0): rw=randread, bs=1M-1M/1M-1M/1M-1M, ioengine=sync, iodepth=1
tk_4job_2G_03: (g=0): rw=randread, bs=1M-1M/1M-1M/1M-1M, ioengine=sync, iodepth=1
tk_4job_2G_04: (g=0): rw=randread, bs=1M-1M/1M-1M/1M-1M, ioengine=sync, iodepth=1
fio-2.1.11
Starting 4 processes
Jobs: 4 (f=4)
tk_4job_2G_01: (groupid=0, jobs=4): err= 0: pid=1238: Thu Mar 31 11:27:32 2016
  read : io=8192.0MB, bw=6390.2MB/s, iops=6390, runt=  1282msec
    clat (usec): min=206, max=3685, avg=567.65, stdev=201.74
     lat (usec): min=207, max=3685, avg=567.96, stdev=201.75
    clat percentiles (usec):
     |  1.00th=[  258],  5.00th=[  334], 10.00th=[  370], 20.00th=[  410],
     | 30.00th=[  438], 40.00th=[  478], 50.00th=[  524], 60.00th=[  580],
     | 70.00th=[  628], 80.00th=[  708], 90.00th=[  828], 95.00th=[  948],
     | 99.00th=[ 1176], 99.50th=[ 1256], 99.90th=[ 1752], 99.95th=[ 2128],
     | 99.99th=[ 3696]
    bw (MB  /s): min= 1556, max= 1896, per=27.58%, avg=1762.25, stdev=114.80
    lat (usec) : 250=0.56%, 500=44.56%, 750=39.17%, 1000=12.12%
    lat (msec) : 2=3.53%, 4=0.06%
  cpu          : usr=0.60%, sys=14.87%, ctx=8284, majf=0, minf=1049
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=8192/w=0/d=0, short=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: io=8192.0MB, aggrb=6390.2MB/s, minb=6390.2MB/s, maxb=6390.2MB/s, mint=1282msec, maxt=1282msec

Disk stats (read/write):
  vdb: ios=23886/0, merge=0/0, ticks=11232/0, in_queue=11220, util=91.49%



The same test as above; the only difference was:
direct=0 (the same as omitting the option; I thought it should be faster?)
Code:
 Run status group 0 (all jobs):
   READ: io=8192.0MB, aggrb=2367.7MB/s, minb=2367.7MB/s, maxb=2367.7MB/s, mint=3460msec, maxt=3460msec

Disk stats (read/write):
  vdb: ios=524222/0, merge=66/0, ticks=307408/0, in_queue=308596, util=87.50%

- - - changed Proxmox VM HDDs to no cache - - -
The same two tests as above, but I changed one thing in Proxmox:
I switched the hard disk cache from writeback (unsafe) to default "no cache"
System on Hard Disk virtio0 cache= default "no cache", size 11G
TestPartition on Hard Disk virtio1 cache= default "no cache", size 12G


Code:
Run status group 0 (all jobs):
   READ: io=8192.0MB, aggrb=5520.3MB/s, minb=5520.3MB/s, maxb=5520.3MB/s, mint=1484msec, maxt=1484msec

Disk stats (read/write):
  vdb: ios=23199/0, merge=0/0, ticks=13844/0, in_queue=13896, util=87.70%

Also the same test as mentioned above, the only difference was: direct=0
Code:
Run status group 0 (all jobs):
   READ: io=8192.0MB, aggrb=1844.7MB/s, minb=1844.7MB/s, maxb=1844.7MB/s, mint=4441msec, maxt=4441msec

Disk stats (read/write):
  vdb: ios=524273/0, merge=15/0, ticks=430364/0, in_queue=430368, util=91.10%

I think the results are actually too high. Does anyone have suggestions for realistic values?
I am always grateful for objections and suggestions.
My main sources (mostly in German):
https://www.thomas-krenn.com/de/wik...on_Festplatten_mit_SSDs_und_Fusion-io_ioDrive
and
https://www.thomas-krenn.com/de/wiki/Fio_Grundlagen
and
http://bluestop.org/files/fio/HOWTO.txt
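
One way to sanity-check the four read results above is to recompute the bandwidth from the reported I/O volume and runtime, and to look at the average request size that actually reached /dev/vdb according to the "Disk stats" lines. This is a minimal sketch using only numbers posted above; reading the smaller buffered requests as one reason for the lower direct=0 throughput is a guess, not something the output proves.

```python
# (total MB read, runtime in ms, read requests seen by vdb), copied from the fio output above.
RUNS = {
    "writeback, direct=1": (8192, 1282, 23886),
    "writeback, direct=0": (8192, 3460, 524222),
    "no cache, direct=1": (8192, 1484, 23199),
    "no cache, direct=0": (8192, 4441, 524273),
}


def summarize(io_mb, runtime_ms, device_reads):
    """Return (bandwidth in MB/s, average request size at the device in KB)."""
    bandwidth = io_mb / (runtime_ms / 1000)
    avg_request_kb = io_mb * 1024 / device_reads
    return bandwidth, avg_request_kb


if __name__ == "__main__":
    for name, values in RUNS.items():
        bandwidth, avg_request_kb = summarize(*values)
        print(f"{name}: {bandwidth:.0f} MB/s, about {avg_request_kb:.0f} KB per device request")
```

Several GB/s of random reads is far beyond what four 7,200 rpm disks deliver, which suggests these runs mostly measure caches rather than the disks. The runs also finish in a few seconds because each job stops once its 2G has been read, long before runtime=60 is reached.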

best regards,
maxprox
 
Now I have switched the fio jobfile from rw=randread to rw=randwrite:

Code:
cat jobfile_tk_02
; -- start job file --
[global]
rw=randwrite
size=2G
filename=/dev/vdb
direct=1
bs=1m
runtime=60
group_reporting
name=tk_4job_2g
[tk_4job_2G_01]
[tk_4job_2G_02]
[tk_4job_2G_03]
[tk_4job_2G_04]

; -- end job file --

The write values are already more realistic

Code:
fio jobfile_tk_02
...
Starting 4 processes
...
tk_4job_2G_01: (groupid=0, jobs=4): err= 0: pid=1348: Thu Mar 31 14:09:39 2016
  write: io=3939.0MB, bw=67151KB/s, iops=65, runt= 60067msec
..
Run status group 0 (all jobs):
  WRITE: io=3939.0MB, aggrb=67150KB/s, minb=67150KB/s, maxb=67150KB/s, mint=60067msec, maxt=60067msec

Disk stats (read/write):
  vdb: ios=230/11802, merge=0/0, ticks=28/714648, in_queue=717940, util=99.98%

About 65 MB/s seems OK;
it is in the range that Thomas-Krenn also measured.
 

LnxBil

In practice you will not see a block size of 1 MB. Depending on your workload, 4, 8 or 16 KB is more realistic. If you are hosting e.g. Linux VMs, ext4 usually uses a 4 KB block size, so you should benchmark that.
 
Okay LnxBil,

I have now switched the fio jobfile from bs=1m to bs=4K (which is fio's default value).
The new jobfile:
Code:
; -- start job file --
[global]
rw=randwrite
size=2G
filename=/dev/vdb
direct=1
bs=4K
runtime=60
group_reporting
name=tk_4job_2g
[tk_4job_2G_01]
[tk_4job_2G_02]
[tk_4job_2G_03]
[tk_4job_2G_04]

; -- end job file --
The four sections starting with [tk_4job_2G_... have the same effect as numjobs=4.
The result is:

Code:
fio jobfile_tk_02
tk_4job_2G_01: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
...
tk_4job_2G_04: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
fio-2.1.11
Starting 4 processes
Jobs: 4 (f=4): [w(4)] [100.0% done] [0KB/4408KB/0KB /s] [0/1102/0 iops] [eta 00m:00s]
tk_4job_2G_01: (groupid=0, jobs=4): err= 0: pid=1385: Fri Apr  1 00:56:21 2016
  write: io=253948KB, bw=4232.2KB/s, iops=1058, runt= 60005msec
    clat (usec): min=251, max=110654, avg=3774.41, stdev=3354.77
     lat (usec): min=251, max=110655, avg=3775.02, stdev=3354.76
....
  lat (msec) : 100=0.04%, 250=0.01%
  cpu          : usr=0.28%, sys=0.73%, ctx=63662, majf=0, minf=25
...

Run status group 0 (all jobs):
  WRITE: io=253948KB, aggrb=4232KB/s, minb=4232KB/s, maxb=4232KB/s, mint=60005msec, maxt=60005msec

Disk stats (read/write):
  vdb: ios=114/63456, merge=0/0, ticks=20/237240, in_queue=237376, util=100.00%
root@DebianJessie-164:~/data/fio#

Around 4 MB/s is not very fast.
The result with bs=8K:
Code:
Run status group 0 (all jobs):
  WRITE: io=497768KB, aggrb=8295KB/s, minb=8295KB/s, maxb=8295KB/s, mint=60004msec, maxt=60004msec

Disk stats (read/write):
  vdb: ios=114/62182, merge=0/0, ticks=20/237296, in_queue=237652, util=99.93%

and with bs=16K:

Code:
Run status group 0 (all jobs):
  WRITE: io=978432KB, aggrb=16306KB/s, minb=16306KB/s, maxb=16306KB/s, mint=60004msec, maxt=60004msec

Disk stats (read/write):
  vdb: ios=114/61116, merge=0/0, ticks=8/237220, in_queue=237216, util=99.95%

Interesting:
bs=4K => 4 MB/s; bs=8K => 8 MB/s; bs=16K => 16 MB/s

Each test was run at least twice, and the values did not change significantly.
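
The pattern above (throughput doubling with the block size) is what you get when the number of operations per second stays roughly constant, since throughput = IOPS x block size. A minimal sketch that recomputes the IOPS from the aggrb values posted above:

```python
# Block size in KB -> aggregate write bandwidth in KB/s, from the three VM runs above.
RESULTS_KB_S = {4: 4232, 8: 8295, 16: 16306}


def iops_from_bandwidth(bandwidth_kb_s, block_size_kb):
    """Operations per second implied by a bandwidth at a fixed block size."""
    return bandwidth_kb_s / block_size_kb


if __name__ == "__main__":
    for block_size_kb, bandwidth in RESULTS_KB_S.items():
        iops = iops_from_bandwidth(bandwidth, block_size_kb)
        print(f"bs={block_size_kb}K: {bandwidth} KB/s -> {iops:.0f} IOPS")
```

All three runs land at a little over 1,000 IOPS, so in this range the storage path is limited by the number of operations it can complete, not by the amount of data moved.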
 
Changing the HDD caching in Proxmox also has a very large effect.
I have now enabled the disk cache in Proxmox again, switching from "no cache" to writeback "unsafe".
The fio jobfile is the same as above, and I get completely different values:

bs=4K
Code:
Run status group 0 (all jobs):
  WRITE: io=8192.0MB, aggrb=171925KB/s, minb=171925KB/s, maxb=171925KB/s, mint=48792msec, maxt=48792msec
Disk stats (read/write):
  vdb: ios=114/2093157, merge=0/0, ticks=8/160492, in_queue=160176, util=95.99%


bs=8K

Code:
Run status group 0 (all jobs):
  WRITE: io=8192.0MB, aggrb=337868KB/s, minb=337868KB/s, maxb=337868KB/s, mint=24828msec, maxt=24828msec

Disk stats (read/write):
  vdb: ios=114/1043699, merge=0/0, ticks=4/82192, in_queue=82060, util=95.12%


bs=16K

Code:
Run status group 0 (all jobs):
  WRITE: io=8192.0MB, aggrb=644781KB/s, minb=644781KB/s, maxb=644781KB/s, mint=13010msec, maxt=13010msec

Disk stats (read/write):
  vdb: ios=222/522411, merge=0/0, ticks=12/44456, in_queue=44624, util=98.28%

Can anyone tell me whether there is a connection between the caching settings in Proxmox and the installed dm-cache?

OT:
Until Monday, I'm on my mountain bike ;-)
 

LnxBil

Storage performance is normally measured in IOPS, not in MB/s, so you should use those numbers. Please have a look at the German IOPS article https://de.wikipedia.org/wiki/Input/Output_operations_Per_Second

Benchmarks are normally run on bare metal with nothing else running. If you test inside a VM, you will most probably test your different caching tiers and not the hardware itself. If you have enough RAM and everything is cached outside of your guest, you will have the fastest VM on earth.
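
If you want to collect the IOPS figure from several runs without reading the text output by hand, fio can emit a JSON report via `--output-format=json`. The following is a minimal sketch, assuming the report has a top-level "jobs" list whose entries carry "jobname" and per-direction "iops" values; verify the key names against your fio version. The sample report is hand-written with made-up numbers.

```python
import json
import subprocess


def run_fio_json(jobfile):
    """Run fio on a job file and return its JSON report as text (requires fio to be installed)."""
    result = subprocess.run(
        ["fio", "--output-format=json", jobfile],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def extract_iops(report_text, direction="write"):
    """Map each reported job name to its IOPS for the given direction ("read" or "write")."""
    report = json.loads(report_text)
    return {job["jobname"]: job[direction]["iops"] for job in report["jobs"]}


# Hand-written, trimmed-down report with invented values, used only to show the parsing.
SAMPLE_REPORT = '{"jobs": [{"jobname": "example_job", "write": {"iops": 1000, "bw": 4000}}]}'

if __name__ == "__main__":
    print(extract_iops(SAMPLE_REPORT))
```

With group_reporting set, as in the job files in this thread, the report contains a single aggregated entry instead of one entry per job.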
 
Thank you, LnxBil, for your valuable comments. I have now run further tests accordingly.
Bare metal, directly on the Proxmox host:

bs=4K
Code:
cat fiojobfile01

; -- start job file --
[global]
rw=randwrite
size=2G
filename=/var/lib/vz/dump/fiotestfile
direct=1
bs=4K
runtime=60
name=fiox4job_2g
numjobs=4
group_reporting
[fio_4job_2G]

; -- end job file --

The complete output:

Code:
fio fiojobfile01
fio_4job_2G: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
...
fio-2.1.11
Starting 4 processes
Jobs: 4 (f=4): [w(4)] [100.0% done] [0KB/2441KB/0KB /s] [0/610/0 iops] [eta 00m:00s]
fio_4job_2G: (groupid=0, jobs=4): err= 0: pid=19755: Mon Apr 11 14:09:03 2016
  write: io=129124KB, bw=2151.6KB/s, iops=537, runt= 60016msec
  clat (usec): min=189, max=205544, avg=7431.15, stdev=16046.96
  lat (usec): min=189, max=205544, avg=7431.64, stdev=16046.94
  clat percentiles (usec):
  |  1.00th=[  237],  5.00th=[  251], 10.00th=[  258], 20.00th=[  274],
  | 30.00th=[  354], 40.00th=[ 1080], 50.00th=[ 1704], 60.00th=[ 2192],
  | 70.00th=[ 3536], 80.00th=[ 6624], 90.00th=[24960], 95.00th=[43264],
  | 99.00th=[78336], 99.50th=[92672], 99.90th=[123392], 99.95th=[134144],
  | 99.99th=[187392]
  bw (KB  /s): min=  221, max= 1025, per=25.07%, avg=539.30, stdev=139.55
  lat (usec) : 250=4.60%, 500=26.83%, 750=5.64%, 1000=2.36%
  lat (msec) : 2=17.82%, 4=14.73%, 10=12.03%, 20=4.54%, 50=7.72%
  lat (msec) : 100=3.36%, 250=0.36%
  cpu  : usr=0.13%, sys=0.74%, ctx=64576, majf=0, minf=29
  IO depths  : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
  submit  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
  complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
  issued  : total=r=0/w=32281/d=0, short=r=0/w=0/d=0
  latency  : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: io=129124KB, aggrb=2151KB/s, minb=2151KB/s, maxb=2151KB/s, mint=60016msec, maxt=60016msec

Disk stats (read/write):
  dm-5: ios=0/32307, merge=0/0, ticks=0/59256, in_queue=59256, util=98.37%, aggrios=0/21457, aggrmerge=0/0, aggrticks=0/19549, aggrin_queue=19553, aggrutil=94.81%
  dm-2: ios=0/32046, merge=0/0, ticks=0/1472, in_queue=1484, util=2.47%, aggrios=0/32046, aggrmerge=0/0, aggrticks=0/1404, aggrin_queue=1388, aggrutil=2.31%
  sdb: ios=0/32046, merge=0/0, ticks=0/1404, in_queue=1388, util=2.31%
  dm-3: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
  dm-4: ios=0/32327, merge=0/0, ticks=0/57176, in_queue=57176, util=94.81%, aggrios=0/32360, aggrmerge=0/39, aggrticks=0/57808, aggrin_queue=57780, aggrutil=94.68%
  sda: ios=0/32360, merge=0/39, ticks=0/57808, in_queue=57780, util=94.68%


bs=8K

Code:
cat fiojobfile01
; -- start job file --
[global]
rw=randwrite
size=2G
filename=/var/lib/vz/dump/fiotestfile
direct=1
bs=8K
runtime=60
name=fiox4job_2g
numjobs=4
group_reporting
[fio_4job_2G]

; -- end job file --

Code:
....
write: io=256416KB, bw=4273.3KB/s, iops=534, runt= 60008msec
....
Run status group 0 (all jobs):
  WRITE: io=256416KB, aggrb=4273KB/s, minb=4273KB/s, maxb=4273KB/s, mint=60008msec, maxt=60008msec
....

bs=16K

Code:
root@bsprox01:/var/lib/vz/dump# cat fiojobfile01
; -- start job file --
[global]
rw=randwrite
size=2G
filename=/var/lib/vz/dump/fiotestfile
direct=1
bs=16K
runtime=60
name=fiox4job_2g
numjobs=4
group_reporting
[fio_4job_2G]
; -- end job file --

Code:
....
write: io=495424KB, bw=8256.9KB/s, iops=516, runt= 60002msec
....
Run status group 0 (all jobs):
  WRITE: io=495424KB, aggrb=8256KB/s, minb=8256KB/s, maxb=8256KB/s, mint=60002msec, maxt=60002msec
....

bs=1M

Code:
 cat fiojobfile01
; -- start job file --
[global]
rw=randwrite
size=2G
filename=/var/lib/vz/dump/fiotestfile
direct=1
bs=1M
runtime=60
name=fiox4job_2g
numjobs=4
group_reporting
[fio_4job_2G]
; -- end job file --

Code:
....
write: io=5384.0MB, bw=91862KB/s, iops=89, runt= 60016msec
....
Run status group 0 (all jobs):
  WRITE: io=5384.0MB, aggrb=91862KB/s, minb=91862KB/s, maxb=91862KB/s, mint=60016msec, maxt=60016msec
....

With my hard drives, 4x HGST 2 TB 7,200 rpm SAS drives (HUS724020ALS640) in one RAID 10 volume plus the SSD cache, IOPS=537 (bs=4K) and IOPS=534 (bs=8K) seem to be good values compared with the Wikipedia article:
https://en.wikipedia.org/wiki/IOPS
It gives 175 - 210 IOPS as a typical value for a single 15,000 rpm SAS drive.
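
To put the measured value in context, a common rule of thumb estimates RAID 10 random-write IOPS as (number of drives x per-drive IOPS) / 2, because every write goes to both halves of a mirror. The sketch below applies that rule; the 75 to 100 IOPS range per 7,200 rpm drive is an assumption (a commonly quoted ballpark, not a measurement of these HGST drives), and controller or SSD caching is ignored.

```python
def raid10_write_iops(drives, iops_per_drive):
    """Rule-of-thumb random-write IOPS of a RAID 10 (write penalty of 2, no caching)."""
    return drives * iops_per_drive / 2


MEASURED_IOPS = 537   # bare-metal result with bs=4K from the test above
DRIVES = 4            # four HGST drives in the RAID 10

if __name__ == "__main__":
    # Assumed ballpark for one 7,200 rpm drive; adjust for your own disks.
    for per_drive in (75, 100):
        estimate = raid10_write_iops(DRIVES, per_drive)
        factor = MEASURED_IOPS / estimate
        print(f"{per_drive} IOPS/drive -> about {estimate:.0f} IOPS uncached, measured is {factor:.1f}x that")
```

Under these assumptions the measured 537 IOPS is roughly 2.7 to 3.6 times the uncached estimate, which would be consistent with the SSD cache absorbing part of the random writes.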
 
