ZFS or hardware Raid - new hardware setup

maxprox · Mar 18, 2016

Hello!
I have a question regarding a new hardware setup.
The new hardware I got is:
Supermicro server with an X10DRC-LN4+ mainboard;
onboard it has 10x SATA 6G and an LSI 3108 SAS chip with 8x SAS3 (12 Gbps) ports
(like the LSI MegaRAID SAS3 9361-8i 12Gb/s RAID controller)
2x Intel Xeon E5-2609v3 6-core 1.9 GHz 15MB 6.4GT/s; 64 GB RAM
HDDs:
2x Intel SSD DC S3700, 100GB, with full end-to-end data protection and enhanced
power-loss data protection
4x HGST Ultrastar 7K4000 HUS724020ALS640 2TB 3.5" SAS-2 7200 rpm 64MB

My question is: what is the "best practice" for the hard disk/RAID setup?
As far as I can see, there are two options:
1:
I use the onboard LSI RAID controller and build a RAID 10 from the four HGST
SAS drives, on which I install the Proxmox system as a standard installation
(with ext4 and so on...).
With the two 100GB SSDs I build no real RAID; to get two JBOD-like disks I put each
one individually into a single-drive RAID 0, add the two SSDs in Proxmox as storage for disk images, and
use them for the most stressed partitions of the main VM.

2:
I also use the onboard LSI RAID controller, but unlike option 1 I build
no real RAID. I put each hard drive and SSD individually into a single-drive RAID 0, giving 6x JBOD-like disks.
With the four HGST SAS drives I build a ZFS RAID 10.
I use the two SSDs either as a cache for ZFS or, as described above, for the most stressed
partitions.
2b:
I think using the onboard SATA ports for the above scenario is the worst idea?

Suggestions, comments, and constructive criticism are greatly appreciated!
Best regards,

maxprox

LnxBil · Mar 19, 2016

Hi maxprox,

1) You can use the two SSDs with LVM cache or dm-cache to speed up your ext4 disks. That will work fine. You then have a kind of hybrid disk with the frequently read and written data on your SSDs, but you see only "one" storage.
2) Never use a RAID controller in RAID 0 mode for ZFS. This is in fact worse than using SATA directly, because you have the controller's caching/management tier in between. Try to get real JBOD working on the controller (e.g. by flashing an IT firmware version). You have too little memory for a good L2ARC on your SSDs. (You can also use the caching approach from 1 to get a really fast ZFS without L2ARC.)

Maybe 3)
Use a RAID controller that does SSD caching and let the controller handle everything, if you do not want to go into detail on LVM cache/dm-cache and ZFS.

maxprox · Mar 19, 2016

Thanks for the answer.
I will have a look at your suggestion to use the SSDs for caching.
I have never used SSDs for LVM caching and found, for example, this post:
https://rwmj.wordpress.com/2014/05/22/using-lvms-new-cache-feature/
I will also check whether the onboard LSI RAID controller can use the SSDs for caching
(like this: https://www.thomas-krenn.com/de/wiki/MegaRAID_CacheCade_SSD_Cache)

EDIT:
Yes, the onboard controller has the option for "CacheCade Pro 2.0", but you have to buy it for about 250 euros;
for now I cannot get more software or hardware for the server.
So I will look at the LVM caching matter in more detail.

maxprox · Mar 28, 2016

With the hardware above I decided to use the LVM cache (dm-cache),
as I describe in this post:
https://forum.proxmox.com/threads/can-we-use-lvm-cache-in-proxmox-4-x.25636/#post-134061
Now I have to test it.
best regards,
maxprox

sdinet · Mar 28, 2016

Let us know how it goes. I have had nothing but bugs with the software RAID.

maxprox · Mar 29, 2016

Yes, I will test it in the next few days and will post the results.
But in this setup I do not use software RAID [EDIT:] nor ZFS:

....
First of all I used the onboard LSI hardware RAID controller, AVAGO 3108 MegaRAID
...
I built a RAID 10 based on the four 2TB SAS HDDs.
...
and with the same RAID controller I built a RAID 0... with two Intel SSDs; this is for the dm-cache

SwampRabbit · Mar 29, 2016

maxprox,

I am still testing a similar set up using dm-cache.

I found the first link you posted, but also used
http://blog-vpodzime.rhcloud.com/?p=45
and it helped me a bit to understand it all.

Connected to an IBM M5015 RAID controller (onboard cache disabled):
2x HGST Travelstar HTS725050A7E630 500GB SATA RAID1 - Proxmox + ISO storage
6x HGST Travelstar HTS721010A9E630 1TB SATA RAID10 - VM data

I tested dm-cache with a cheap 120GB SSD hooked up straight to a SATA port on the motherboard and I saw a decent performance increase in random reads over the 6-drive RAID10. No real change on anything else.

I would recommend getting good baseline benchmarks without the dm-cache. This is why I started over: I believe I was getting skewed numbers towards the end, and I wanted to make sure I could test it systematically and properly.

I would NOT recommend creating a cache pool with automatic metadata; this also caused me problems at first.

Right now I am doing the following:
Node testing - pveperf, hdparm, fio
Windows VMs (qcow2 and RAW) - AIDA64 disk benchmarks and fio tests
Debian VMs (qcow2 and RAW) - fio tests

Interested in hearing your results and will share mine if you want.

LnxBil · Mar 29, 2016

You will have very good write performance if you use a write-back cache, and of course for everything that is already in your cache. Depending on the size of your cache, you need a warm-up phase of several non-sequential reads (sequential reads are normally bypassed and not cached). This is most noticeable for directory listings and metadata, which are accessed and therefore cached more often.

The biggest problem I encountered was integrating it into jessie via systemd units so that it is ready at the right moment. If someone has a better setup or a better idea, please share. This is my current setup:

I created a new service in /lib/systemd/system/zfs-dmsetup.service:

Code:

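[Unit]
Description=DMsetup create cache device
# Wait for udev to settle so that /dev/sda and /dev/sdb* exist before dmsetup runs
Requires=systemd-udev-settle.service
After=systemd-udev-settle.service
Before=zfs-import-scan.service

[Service]
Type=oneshot
RemainAfterExit=yes
# Table: start, length (512-byte sectors), target, metadata dev, cache dev, origin dev,
# cache block size (sectors), feature count, features, policy, policy argument count
ExecStart=/sbin/dmsetup create sda-cached --table "0 976773168 cache /dev/sdb2 /dev/sdb3 /dev/sda 512 1 writeback default 0"
# On stop: switch to the cleaner policy to write back dirty blocks, then remove the device.
# Absolute paths are used because older systemd versions require them in Exec lines.
ExecStop=/sbin/dmsetup suspend sda-cached
ExecStop=/sbin/dmsetup reload sda-cached --table "0 976773168 cache /dev/sdb2 /dev/sdb3 /dev/sda 512 0 cleaner 0"
ExecStop=/sbin/dmsetup resume sda-cached
ExecStop=/sbin/dmsetup wait sda-cached
ExecStop=/sbin/dmsetup suspend sda-cached
ExecStop=/sbin/dmsetup clear sda-cached
ExecStop=/sbin/dmsetup remove sda-cached

[Install]
WantedBy=multi-user.target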

And I also needed to add a requirement to /lib/systemd/system/zfs-import-scan.service:

Code:

Requires=zfs-dmsetup.service
After=zfs-dmsetup.service

I still need something cool to monitor the cache, e.g. stats like arcstat for dm-cache.
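To make the dmsetup table in the unit above easier to read, here is a minimal Python sketch that splits such a line into named fields. The field order is an assumption taken from the kernel's device-mapper cache documentation (start, length, target, metadata device, cache device, origin device, block size, features, policy); the function name `parse_cache_table` and the dictionary keys are made up for illustration, so check the layout against the documentation for your kernel.

```python
SECTOR_BYTES = 512  # device-mapper tables count in 512-byte sectors


def parse_cache_table(table):
    """Split a dm-cache table line into named fields (layout is an assumption)."""
    tok = table.split()
    if len(tok) < 10 or tok[2] != "cache":
        raise ValueError("not a dm-cache table line")
    n_features = int(tok[7])            # number of feature arguments that follow
    features = tok[8:8 + n_features]    # e.g. ["writeback"]
    policy_pos = 8 + n_features         # policy name comes right after the features
    n_policy_args = int(tok[policy_pos + 1])
    return {
        "start_sector": int(tok[0]),
        "length_bytes": int(tok[1]) * SECTOR_BYTES,
        "metadata_dev": tok[3],
        "cache_dev": tok[4],
        "origin_dev": tok[5],
        "cache_block_bytes": int(tok[6]) * SECTOR_BYTES,
        "features": features,
        "policy": tok[policy_pos],
        "policy_args": tok[policy_pos + 2:policy_pos + 2 + n_policy_args],
    }


if __name__ == "__main__":
    info = parse_cache_table(
        "0 976773168 cache /dev/sdb2 /dev/sdb3 /dev/sda 512 1 writeback default 0")
    print(info["length_bytes"] / 1e9, "GB mapped,",
          info["cache_block_bytes"] // 1024, "KiB cache blocks,",
          info["features"], info["policy"])
```

Read this way, the ExecStart table maps a device of roughly 500 GB with 256 KiB cache blocks in writeback mode, and the table used on stop differs only in having no feature arguments and the cleaner policy.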

maxprox · Mar 30, 2016

SwampRabbit said:
maxprox,

I am still testing a similar set up using dm-cache.

I found the first link you posted, but also used
http://blog-vpodzime.rhcloud.com/?p=45
and it helped me a bit to understand it all.

Yes, I also used this site and it was very helpful; it is already listed as the second link in my "main sources" ;-)

SwampRabbit said:
I would recommend getting good baseline benchmarks without the dm-cache. This is why I started over because I believe I was getting skewed numbers towards the end and I wanted to systematically make sure I could test it properly.

That is certainly useful. Can you please tell me how you ran this benchmark without the dm-cache?
Or was it before you installed the dm-cache?
Or do you mean the command

Code:

  dmsetup suspend device_name
(... reload, resume and wait )

which I just found?

SwampRabbit said:
I would NOT recommend creating a cache pool with automatic meta data, this also caused me problems at first.

I do not understand that. For dm-cache you need two LVs: DataLVcache and DataLVcacheMeta (size ratio 1000:1). Or what do you mean by "automatic"?
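As a rough illustration of the 1000:1 ratio between the cache data LV and the cache metadata LV, here is a small Python sketch that computes a metadata LV size. The 8 MiB minimum and the 4 MiB extent size are assumptions (the minimum as I recall it from the lvmcache man page, the extent size being the usual LVM default), and the function name `cache_meta_size_bytes` is made up; please verify both values for your installation.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def cache_meta_size_bytes(cache_data_bytes, ratio=1000,
                          minimum=8 * MIB, extent=4 * MIB):
    """Metadata LV size: data size / ratio, at least `minimum`, rounded up to whole extents."""
    wanted = -(-cache_data_bytes // ratio)   # ceiling division
    wanted = max(wanted, minimum)
    extents = -(-wanted // extent)           # round up to full LVM extents
    return extents * extent


if __name__ == "__main__":
    # Hypothetical example: a 180 GiB cache data LV on the SSDs
    size = cache_meta_size_bytes(180 * GIB)
    print(size // MIB, "MiB for the metadata LV")
```

With these assumptions a 180 GiB cache data LV gets a 188 MiB metadata LV, and very small caches fall back to the 8 MiB minimum.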

SwampRabbit said:
Right now I am doing the following:
Node testing - pveperf, hdparm, fio
Windows VMs (qcow2 and RAW) - AIDA64 disk benchmarks and fio tests
Debian VMs (qcow2 and RAW) - fio tests
Interested in hearing your results and will share mine if you want.

Yes, definitely, I would like that.
First of all I installed a Debian jessie VM with the phoronix-test-suite, as described here:
https://smartystreets.com/blog/2015/10/performance-testing-with-phoronix
With that I will do some first benchmarks. I am afraid that it will not reflect the real server scenario with 10 or maybe 100 users/clients.
Do you have a solution or a tip for an objective benchmark?

best regards,
maxprox

maxprox · Mar 30, 2016

Hi LnxBil,

Regarding your ZFS setup, I cannot contribute anything, sorry.

LnxBil said:
.......
I still need something cool to monitor the cache, e.g. stats like arcstat for dm-cache.

With the command

Code:

lvs -a -o +devices
...
data  pve  Cwi-aoC---  3.59t [CacheDataLV] [data_corig] 5.83  5.95  0.00  data_corig(0)
...

You can see which LV (=> "data") is cached
and with

Code:

dmsetup status /dev/mapper/pve-data

you see something like what is described on this site (under "Testen und Monitoring", i.e. "Testing and monitoring")
https://www.thomas-krenn.com/de/wiki/Dm-cache
(I found no site in English)

LnxBil · Mar 30, 2016

Yes maxprox,

this is also the first site that comes up when I google around ... Today I tried to figure out what each column of dmsetup status means and found it in the kernel documentation (unfortunately there is nothing in the manpage). So I need to set up collectd to generate rrd graphs easily.
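As a starting point for such monitoring, here is a minimal Python sketch that turns one line of `dmsetup status` output for a cache target into hit ratios. The column order (metadata block size, used/total metadata blocks, cache block size, used/total cache blocks, read hits, read misses, write hits, write misses, demotions, promotions, dirty) is an assumption based on the kernel's device-mapper cache documentation and may differ between kernel versions; the function name and the sample line are made up for illustration.

```python
def parse_cache_status(line):
    """Extract usage and hit ratios from a dm-cache status line (column order is an assumption)."""
    tok = line.split()
    i = tok.index("cache")          # skips an optional "name:" prefix plus start/length
    f = tok[i + 1:]
    used_cache, total_cache = (int(x) for x in f[3].split("/"))
    read_hits, read_misses, write_hits, write_misses = (int(x) for x in f[4:8])

    def ratio(hits, misses):
        total = hits + misses
        return hits / total if total else 0.0

    return {
        "cache_used": used_cache / total_cache if total_cache else 0.0,
        "read_hit_ratio": ratio(read_hits, read_misses),
        "write_hit_ratio": ratio(write_hits, write_misses),
        "dirty_blocks": int(f[10]),
    }


if __name__ == "__main__":
    # Made-up sample line, not measured on a real system
    sample = ("0 976773168 cache 8 72/1024 512 3000/4000 "
              "900 100 750 250 5 20 12 1 writeback 2 migration_threshold 2048 mq 0")
    print(parse_cache_status(sample))
```

The returned values could be fed to collectd or simply printed in a loop, similar to what arcstat does for the ZFS ARC.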

maxprox · Mar 31, 2016

Hello,

with the above mentioned hardware (hardware(!) Raid10)
and the additional installed dm-cache as described here
https://forum.proxmox.com/threads/can-we-use-lvm-cache-in-proxmox-4-x.25636/#post-134061
I did some first tests with fio:
In Proxmox:
Debian Jessie VM
Memory: 4 GB
CPUs: 4 (2x2)
System on Hard Disk virtio0 cache=unsafe (writeback), size 11G
TestPartition on Hard Disk virtio1 cache=unsafe (writeback), size 12G

Tested in this Debian Jessie VM on the second empty HDD: /dev/vdb:

Code:

cat jobfile_tk_01
; -- start job file --
[global]
rw=randread
size=2G
filename=/dev/vdb
direct=1
bs=1m
runtime=60
group_reporting
name=tk_4job_2g
[tk_4job_2G_01]
[tk_4job_2G_02]
[tk_4job_2G_03]
[tk_4job_2G_04]

; -- end job file --

With the following result:

Code:

root@DebianJessie-164:~/data/fio# fio jobfile_tk_01
tk_4job_2G_01: (g=0): rw=randread, bs=1M-1M/1M-1M/1M-1M, ioengine=sync, iodepth=1
tk_4job_2G_02: (g=0): rw=randread, bs=1M-1M/1M-1M/1M-1M, ioengine=sync, iodepth=1
tk_4job_2G_03: (g=0): rw=randread, bs=1M-1M/1M-1M/1M-1M, ioengine=sync, iodepth=1
tk_4job_2G_04: (g=0): rw=randread, bs=1M-1M/1M-1M/1M-1M, ioengine=sync, iodepth=1
fio-2.1.11
Starting 4 processes
Jobs: 4 (f=4)
tk_4job_2G_01: (groupid=0, jobs=4): err= 0: pid=1238: Thu Mar 31 11:27:32 2016
  read : io=8192.0MB, bw=6390.2MB/s, iops=6390, runt=  1282msec
    clat (usec): min=206, max=3685, avg=567.65, stdev=201.74
     lat (usec): min=207, max=3685, avg=567.96, stdev=201.75
    clat percentiles (usec):
     |  1.00th=[  258],  5.00th=[  334], 10.00th=[  370], 20.00th=[  410],
     | 30.00th=[  438], 40.00th=[  478], 50.00th=[  524], 60.00th=[  580],
     | 70.00th=[  628], 80.00th=[  708], 90.00th=[  828], 95.00th=[  948],
     | 99.00th=[ 1176], 99.50th=[ 1256], 99.90th=[ 1752], 99.95th=[ 2128],
     | 99.99th=[ 3696]
    bw (MB  /s): min= 1556, max= 1896, per=27.58%, avg=1762.25, stdev=114.80
    lat (usec) : 250=0.56%, 500=44.56%, 750=39.17%, 1000=12.12%
    lat (msec) : 2=3.53%, 4=0.06%
  cpu          : usr=0.60%, sys=14.87%, ctx=8284, majf=0, minf=1049
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=8192/w=0/d=0, short=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: io=8192.0MB, aggrb=6390.2MB/s, minb=6390.2MB/s, maxb=6390.2MB/s, mint=1282msec, maxt=1282msec

Disk stats (read/write):
  vdb: ios=23886/0, merge=0/0, ticks=11232/0, in_queue=11220, util=91.49%

The same test as above; the only difference was:
direct=0 (which is the same as omitting the option; I thought it would be faster(?))

Code:

 Run status group 0 (all jobs):
   READ: io=8192.0MB, aggrb=2367.7MB/s, minb=2367.7MB/s, maxb=2367.7MB/s, mint=3460msec, maxt=3460msec

Disk stats (read/write):
  vdb: ios=524222/0, merge=66/0, ticks=307408/0, in_queue=308596, util=87.50%

- - - changed Proxmox VM HDDs to no cache - - -

The same two tests as above, but I changed one thing in Proxmox:
I switched the hard disk cache from writeback (unsafe) to default "no cache"
System on Hard Disk virtio0 cache= default "no cache", size 11G
TestPartition on Hard Disk virtio1 cache= default "no cache", size 12G

Code:

Run status group 0 (all jobs):
   READ: io=8192.0MB, aggrb=5520.3MB/s, minb=5520.3MB/s, maxb=5520.3MB/s, mint=1484msec, maxt=1484msec

Disk stats (read/write):
  vdb: ios=23199/0, merge=0/0, ticks=13844/0, in_queue=13896, util=87.70%

Also the same test as mentioned above, the only difference was: direct=0

Code:

Run status group 0 (all jobs):
   READ: io=8192.0MB, aggrb=1844.7MB/s, minb=1844.7MB/s, maxb=1844.7MB/s, mint=4441msec, maxt=4441msec

Disk stats (read/write):
  vdb: ios=524273/0, merge=15/0, ticks=430364/0, in_queue=430368, util=91.10%

I think the results are actually too high. Are there any suggestions for realistic values?
I am always grateful for objections and suggestions.
The main information sources (in German):
https://www.thomas-krenn.com/de/wik...on_Festplatten_mit_SSDs_und_Fusion-io_ioDrive
and
https://www.thomas-krenn.com/de/wiki/Fio_Grundlagen
and
http://bluestop.org/files/fio/HOWTO.txt

best regards,
maxprox

maxprox · Mar 31, 2016

Now I switched the fio job file from rw=randread to rw=randwrite:

Code:

cat jobfile_tk_02
; -- start job file --
[global]
rw=randwrite
size=2G
filename=/dev/vdb
direct=1
bs=1m
runtime=60
group_reporting
name=tk_4job_2g
[tk_4job_2G_01]
[tk_4job_2G_02]
[tk_4job_2G_03]
[tk_4job_2G_04]

; -- end job file --

The write values are already more realistic

Code:

fio jobfile_tk_02
...
Starting 4 processes
...
tk_4job_2G_01: (groupid=0, jobs=4): err= 0: pid=1348: Thu Mar 31 14:09:39 2016
  write: io=3939.0MB, bw=67151KB/s, iops=65, runt= 60067msec
..
Run status group 0 (all jobs):
  WRITE: io=3939.0MB, aggrb=67150KB/s, minb=67150KB/s, maxb=67150KB/s, mint=60067msec, maxt=60067msec

Disk stats (read/write):
  vdb: ios=230/11802, merge=0/0, ticks=28/714648, in_queue=717940, util=99.98%

About 65 MB/s seems OK;
that is in the range that Thomas-Krenn also measured.

LnxBil · Mar 31, 2016

You do not have a blocksize of 1 MB in practice. Depending on your workload, 4, 8 or 16 KB is more realistic. If you're hosting e.g. VMs for Linux, ext4 is usually using 4 KB blocksize, so you should benchmark that.

maxprox · Apr 1, 2016

okay LnxBil,

Now I switched the fio job file from bs=1M to bs=4K (which is the default value).
The new job file:

Code:

; -- start job file --
[global]
rw=randwrite
size=2G
filename=/dev/vdb
direct=1
bs=4K
runtime=60
group_reporting
name=tk_4job_2g
[tk_4job_2G_01]
[tk_4job_2G_02]
[tk_4job_2G_03]
[tk_4job_2G_04]

; -- end job file --

The four sections starting with [tk_4job_2G_... have the same effect as numjobs=4.
The result is:

Code:

fio jobfile_tk_02
tk_4job_2G_01: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
...
tk_4job_2G_04: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
fio-2.1.11
Starting 4 processes
Jobs: 4 (f=4): [w(4)] [100.0% done] [0KB/4408KB/0KB /s] [0/1102/0 iops] [eta 00m:00s]
tk_4job_2G_01: (groupid=0, jobs=4): err= 0: pid=1385: Fri Apr  1 00:56:21 2016
  write: io=253948KB, bw=4232.2KB/s, iops=1058, runt= 60005msec
    clat (usec): min=251, max=110654, avg=3774.41, stdev=3354.77
     lat (usec): min=251, max=110655, avg=3775.02, stdev=3354.76
....
  lat (msec) : 100=0.04%, 250=0.01%
  cpu          : usr=0.28%, sys=0.73%, ctx=63662, majf=0, minf=25
...

Run status group 0 (all jobs):
  WRITE: io=253948KB, aggrb=4232KB/s, minb=4232KB/s, maxb=4232KB/s, mint=60005msec, maxt=60005msec

Disk stats (read/write):
  vdb: ios=114/63456, merge=0/0, ticks=20/237240, in_queue=237376, util=100.00%
root@DebianJessie-164:~/data/fio#

Around 4 MB/s is not very fast.
The result with bs=8K:

Code:

Run status group 0 (all jobs):
  WRITE: io=497768KB, aggrb=8295KB/s, minb=8295KB/s, maxb=8295KB/s, mint=60004msec, maxt=60004msec

Disk stats (read/write):
  vdb: ios=114/62182, merge=0/0, ticks=20/237296, in_queue=237652, util=99.93%

and with bs=16K:

Code:

Run status group 0 (all jobs):
  WRITE: io=978432KB, aggrb=16306KB/s, minb=16306KB/s, maxb=16306KB/s, mint=60004msec, maxt=60004msec

Disk stats (read/write):
  vdb: ios=114/61116, merge=0/0, ticks=8/237220, in_queue=237216, util=99.95%

interesting:
bs=4K => 4MB/s; bs=8K => 8MB/s; bs=16K => 16MB/s

Each test was run at least twice, and the values did not change significantly.
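The pattern above (throughput doubling with the block size) is what you would expect if the number of I/O operations per second stays roughly constant, since throughput is IOPS times block size. A small Python sketch using the aggrb values from the three runs above; the helper name `iops_from_bandwidth` is made up for illustration:

```python
def iops_from_bandwidth(bw_kb_per_s, bs_kb):
    """IOPS = throughput / block size (both in KB)."""
    return bw_kb_per_s / bs_kb


# aggrb values (KB/s) from the three randwrite runs above, keyed by block size in KB
runs = {4: 4232, 8: 8295, 16: 16306}

if __name__ == "__main__":
    for bs_kb, bw in runs.items():
        print("bs=%dK: %d KB/s -> about %d IOPS"
              % (bs_kb, bw, round(iops_from_bandwidth(bw, bs_kb))))
```

All three runs come out at a little over 1000 IOPS, so in this range the storage appears to be limited by the number of operations, not by the amount of data per operation.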

maxprox · Apr 1, 2016

The Proxmox disk cache setting also has a very large effect.
I have now enabled the disk cache in Proxmox again, switching from "no cache" back to writeback (unsafe).
The fio job file is the same as above, and I get completely different values:

bs=4K

Code:

Run status group 0 (all jobs):
  WRITE: io=8192.0MB, aggrb=171925KB/s, minb=171925KB/s, maxb=171925KB/s, mint=48792msec, maxt=48792msec
Disk stats (read/write):
  vdb: ios=114/2093157, merge=0/0, ticks=8/160492, in_queue=160176, util=95.99%

bs=8K

Code:

Run status group 0 (all jobs):
  WRITE: io=8192.0MB, aggrb=337868KB/s, minb=337868KB/s, maxb=337868KB/s, mint=24828msec, maxt=24828msec

Disk stats (read/write):
  vdb: ios=114/1043699, merge=0/0, ticks=4/82192, in_queue=82060, util=95.12%

bs=16K

Code:

Run status group 0 (all jobs):
  WRITE: io=8192.0MB, aggrb=644781KB/s, minb=644781KB/s, maxb=644781KB/s, mint=13010msec, maxt=13010msec

Disk stats (read/write):
  vdb: ios=222/522411, merge=0/0, ticks=12/44456, in_queue=44624, util=98.28%

Can anyone tell me whether there is a connection between the caching settings in Proxmox and the installed dm-cache?

OT:
Until Monday, I'm on my mountain bike ;-)

LnxBil · Apr 1, 2016

Storage performance is normally measured in IOPS, not in MB/sec, so you have to use these numbers. Please have a look at the German IOPS article https://de.wikipedia.org/wiki/Input/Output_operations_Per_Second

Benchmarks are normally done on bare metal with nothing else running. If you test inside a VM, you will most probably test your different caching tiers and not the hardware itself. If you have enough RAM and everything is cached outside of your guest, you will have the fastest VM on earth.

maxprox · Apr 11, 2016

LnxBil said:
Storage performance is normally measured in IOPS, not in MB/sec, so you have to use these numbers. Please have a look at the German IOPS article https://de.wikipedia.org/wiki/Input/Output_operations_Per_Second

Benchmarks are normally done on bare metal with nothing else running. If you test inside a VM, you will most probably test your different caching tiers and not the hardware itself. If you have enough RAM and everything is cached outside of your guest, you will have the fastest VM on earth.

Thank you, LnxBil, for your valuable comments. I have now run further tests accordingly.
Bare metal, directly on the Proxmox host:

bs=4K

Code:

cat fiojobfile01

; -- start job file --
[global]
rw=randwrite
size=2G
filename=/var/lib/vz/dump/fiotestfile
direct=1
bs=4K
runtime=60
name=fiox4job_2g
numjobs=4
group_reporting
[fio_4job_2G]

; -- end job file --

The complete output:

Code:

fio fiojobfile01
fio_4job_2G: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
...
fio-2.1.11
Starting 4 processes
Jobs: 4 (f=4): [w(4)] [100.0% done] [0KB/2441KB/0KB /s] [0/610/0 iops] [eta 00m:00s]
fio_4job_2G: (groupid=0, jobs=4): err= 0: pid=19755: Mon Apr 11 14:09:03 2016
  write: io=129124KB, bw=2151.6KB/s, iops=537, runt= 60016msec
  clat (usec): min=189, max=205544, avg=7431.15, stdev=16046.96
  lat (usec): min=189, max=205544, avg=7431.64, stdev=16046.94
  clat percentiles (usec):
  |  1.00th=[  237],  5.00th=[  251], 10.00th=[  258], 20.00th=[  274],
  | 30.00th=[  354], 40.00th=[ 1080], 50.00th=[ 1704], 60.00th=[ 2192],
  | 70.00th=[ 3536], 80.00th=[ 6624], 90.00th=[24960], 95.00th=[43264],
  | 99.00th=[78336], 99.50th=[92672], 99.90th=[123392], 99.95th=[134144],
  | 99.99th=[187392]
  bw (KB  /s): min=  221, max= 1025, per=25.07%, avg=539.30, stdev=139.55
  lat (usec) : 250=4.60%, 500=26.83%, 750=5.64%, 1000=2.36%
  lat (msec) : 2=17.82%, 4=14.73%, 10=12.03%, 20=4.54%, 50=7.72%
  lat (msec) : 100=3.36%, 250=0.36%
  cpu  : usr=0.13%, sys=0.74%, ctx=64576, majf=0, minf=29
  IO depths  : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
  submit  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
  complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
  issued  : total=r=0/w=32281/d=0, short=r=0/w=0/d=0
  latency  : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: io=129124KB, aggrb=2151KB/s, minb=2151KB/s, maxb=2151KB/s, mint=60016msec, maxt=60016msec

Disk stats (read/write):
  dm-5: ios=0/32307, merge=0/0, ticks=0/59256, in_queue=59256, util=98.37%, aggrios=0/21457, aggrmerge=0/0, aggrticks=0/19549, aggrin_queue=19553, aggrutil=94.81%
  dm-2: ios=0/32046, merge=0/0, ticks=0/1472, in_queue=1484, util=2.47%, aggrios=0/32046, aggrmerge=0/0, aggrticks=0/1404, aggrin_queue=1388, aggrutil=2.31%
  sdb: ios=0/32046, merge=0/0, ticks=0/1404, in_queue=1388, util=2.31%
  dm-3: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
  dm-4: ios=0/32327, merge=0/0, ticks=0/57176, in_queue=57176, util=94.81%, aggrios=0/32360, aggrmerge=0/39, aggrticks=0/57808, aggrin_queue=57780, aggrutil=94.68%
  sda: ios=0/32360, merge=0/39, ticks=0/57808, in_queue=57780, util=94.68%

bs=8K

Code:

cat fiojobfile01
; -- start job file --
[global]
rw=randwrite
size=2G
filename=/var/lib/vz/dump/fiotestfile
direct=1
bs=8K
runtime=60
name=fiox4job_2g
numjobs=4
group_reporting
[fio_4job_2G]

; -- end job file --

Code:

....
write: io=256416KB, bw=4273.3KB/s, iops=534, runt= 60008msec
....
Run status group 0 (all jobs):
  WRITE: io=256416KB, aggrb=4273KB/s, minb=4273KB/s, maxb=4273KB/s, mint=60008msec, maxt=60008msec
....

bs=16K

Code:

root@bsprox01:/var/lib/vz/dump# cat fiojobfile01
; -- start job file --
[global]
rw=randwrite
size=2G
filename=/var/lib/vz/dump/fiotestfile
direct=1
bs=16K
runtime=60
name=fiox4job_2g
numjobs=4
group_reporting
[fio_4job_2G]
; -- end job file --

Code:

....
write: io=495424KB, bw=8256.9KB/s, iops=516, runt= 60002msec
....
Run status group 0 (all jobs):
  WRITE: io=495424KB, aggrb=8256KB/s, minb=8256KB/s, maxb=8256KB/s, mint=60002msec, maxt=60002msec
....

bs=1M

Code:

 cat fiojobfile01
; -- start job file --
[global]
rw=randwrite
size=2G
filename=/var/lib/vz/dump/fiotestfile
direct=1
bs=1M
runtime=60
name=fiox4job_2g
numjobs=4
group_reporting
[fio_4job_2G]
; -- end job file --

Code:

....
write: io=5384.0MB, bw=91862KB/s, iops=89, runt= 60016msec
....
Run status group 0 (all jobs):
  WRITE: io=5384.0MB, aggrb=91862KB/s, minb=91862KB/s, maxb=91862KB/s, mint=60016msec, maxt=60016msec
....

With my hard drives, 4x HGST 2TB 7,200 rpm SAS drives (HUS724020ALS640) in one RAID 10 volume plus the SSD cache, IOPS=537 (bs=4K) and IOPS=534 (bs=8K) seem to be good values compared with the Wikipedia article:
https://en.wikipedia.org/wiki/IOPS
It says that 175 - 210 IOPS is an average value for 15,000 rpm SAS drives.
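As a cross-check of the reported IOPS: with ioengine=sync and iodepth=1 each fio job has one request in flight at a time, so the total IOPS should be roughly the number of jobs divided by the average latency. The following Python sketch applies this to the average "lat" values from the fio outputs in this thread; it is only an approximation that ignores the time a job spends between requests, and the function name is made up.

```python
def estimated_iops(num_jobs, avg_latency_usec):
    """Approximate IOPS for synchronous jobs with one request in flight each."""
    return num_jobs / (avg_latency_usec / 1_000_000)


if __name__ == "__main__":
    # Average "lat" values (usec) taken from the 4K randwrite outputs in this thread
    bare_metal = estimated_iops(4, 7431.64)   # fio reported iops=537
    in_vm = estimated_iops(4, 3775.02)        # fio reported iops=1058 (VM, "no cache")
    print("bare metal: about %.0f IOPS, inside VM: about %.0f IOPS" % (bare_metal, in_vm))
```

Both estimates land within about one percent of what fio reported, which suggests that the measured IOPS are determined by the per-request latency of the storage stack at this queue depth.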
