[TUTORIAL] Proxmox ZFS raid1 performance

yarii

I have performance problems with Proxmox.

I have got:
- Proxmox 6.x (clean install from ISO),
- H220 SAS card in IT mode,
- 6x different SSD drives:
  2x Goodram 120GB (system)
  2x Samsung 860 PRO 256GB (pool1)
  2x IRDM Pro 1TB (pool2)

hdparm -t -T /dev/sd[a-f]
reports about 520MB/s read

but when I run pveperf I only get:

for system pool:
FSYNCS/SECOND: 2100

for pool1:
FSYNCS/SECOND: 315

for pool2:
FSYNCS/SECOND: 355

What went wrong?

I created pool1 with:
zpool create pool1 mirror /dev/disk/by-id/ata-SAMSUNG-* -o ashift=12
 
After upgrading to ZFS 0.8.3, the results are a bit better:
FSYNCS/SECOND: 2811.57
FSYNCS/SECOND: 394.92
FSYNCS/SECOND: 359.10

But why the hell is the system pool almost 7 times faster on the same controller?
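
The FSYNCS/SECOND figure from pveperf is about how many small writes per second the storage can commit with an fsync after each one, so it can be looked at separately from sequential throughput. A rough sketch of a comparable fio job follows; the function name fsync_bench, the job name, and the 256M size are illustrative assumptions, not something taken from this thread.

```shell
# Sketch: approximate an fsync-rate test with fio.
# fsync_bench, the job name and --size=256M are illustrative choices.
fsync_bench() {
    dir=$1
    # --fsync=1 makes fio call fsync after every single write
    fio --name=fsync-test --directory="$dir" \
        --rw=write --bs=4k --size=256M \
        --ioengine=psync --fsync=1
}

# Usage (creates a 256M test file in the given directory):
#   fsync_bench /pool1
```

Running it once per pool gives write IOPS figures that can be set side by side with the pveperf values.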
 
Try destroying the pool and running the benchmark on a single disk with an ext4 file system.
 
fio --filename=test --sync=1 --rw=randread --bs=4k --numjobs=1 --iodepth=4 --group_reporting --name=test --filesize=1G --runtime=300 && rm test

READ: bw=526MiB/s (552MB/s), 526MiB/s-526MiB/s (552MB/s-552MB/s), io=1024MiB (1074MB), run=1945-1945msec, iops avg=149850.67 (system)
READ: bw=526MiB/s (552MB/s), 526MiB/s-526MiB/s (552MB/s-552MB/s), io=1024MiB (1074MB), run=1945-1945msec, iops avg=126670.00 (pool1)
READ: bw=611MiB/s (640MB/s), 611MiB/s-611MiB/s (640MB/s-640MB/s), io=1024MiB (1074MB), run=1677-1677msec, iops avg=146170.67 (pool2)

fio --filename=test --sync=1 --rw=randwrite --bs=4k --numjobs=1 --iodepth=4 --group_reporting --name=test --filesize=1G --runtime=300 && rm test

WRITE: bw=15.1MiB/s (15.8MB/s), 15.1MiB/s-15.1MiB/s (15.8MB/s-15.8MB/s), io=1024MiB (1074MB), run=67968-67968msec, iops avg=3913.05 (system)
WRITE: bw=3560KiB/s (3646kB/s), 3560KiB/s-3560KiB/s (3646kB/s-3646kB/s), io=1024MiB (1074MB), run=294523-294523msec, iops avg=329.61 (pool1)
WRITE: bw=1328KiB/s (1360kB/s), 1328KiB/s-1328KiB/s (1360kB/s-1360kB/s), io=12.8MiB (13.5MB), run=9901-9901msec, iops avg=334.32 (pool2)


EXT4 (one drive SAMSUNG 860 Pro):
READ: bw=31.7MiB/s (33.3MB/s), 31.7MiB/s-31.7MiB/s (33.3MB/s-33.3MB/s), io=1024MiB (1074MB), run=32265-32265msec, iops avg=8122.25 (one disk SAMSUNG)
WRITE: bw=522KiB/s (535kB/s), 522KiB/s-522KiB/s (535kB/s-535kB/s), io=153MiB (160MB), run=300005-300005msec, iops avg=130.56 (one disk SAMSUNG)
 
Could you show the output of these commands, please?

lshw -class disk -class storage
or
hwinfo --disk

dmidecode -s baseboard-product-name

I have a server with a similar configuration, so we could compare the performance.
Try my script for the benchmark.

usage:

./fio.sh <Directory> <uniq name>
 


LSHW

# lshw -class disk -class storage
*-sas
description: Serial Attached SCSI controller
product: SAS2308 PCI-Express Fusion-MPT SAS-2
vendor: LSI Logic / Symbios Logic
physical id: 0
bus info: pci@0000:04:00.0
logical name: scsi0
version: 05
width: 64 bits
clock: 33MHz
capabilities: sas pm pciexpress vpd msi msix bus_master cap_list
configuration: driver=mpt3sas latency=0
resources: irq:51 ioport:5000(size=256) memory:f78f0000-f78fffff memory:f7880000-f78bffff
*-disk:0
description: ATA Disk
product: SSDPR-CL100-120-
physical id: 0.0.0
bus info: scsi@0:0.0.0
logical name: /dev/sda
version: 3.S
serial: GVA014359
size: 111GiB (120GB)
capacity: 111GiB (120GB)
capabilities: 15000rpm gpt-1.00 partitioned partitioned:gpt
configuration: ansiversion=6 guid=df0d9041-3a6f-44c1-965b-0e7579c787ca logicalsectorsize=512 sectorsize=512
*-disk:1
description: ATA Disk
product: Samsung SSD 860
physical id: 0.1.0
bus info: scsi@0:0.1.0
logical name: /dev/sdb
version: 1B6Q
serial: S5GANE0MC15824M
size: 238GiB (256GB)
capacity: 238GiB (256GB)
capabilities: 15000rpm gpt-1.00 partitioned partitioned:gpt
configuration: ansiversion=6 guid=f3457a08-00e9-9344-9832-b8c73a76cc7f logicalsectorsize=512 sectorsize=512
*-disk:2
description: ATA Disk
product: Samsung SSD 860
physical id: 0.2.0
bus info: scsi@0:0.2.0
logical name: /dev/sdc
version: 1B6Q
serial: S42VNF0MC14213V
size: 238GiB (256GB)
capacity: 238GiB (256GB)
capabilities: 15000rpm gpt-1.00 partitioned partitioned:gpt
configuration: ansiversion=6 guid=a94e24ff-f437-1045-bf01-fe98027d4fb5 logicalsectorsize=512 sectorsize=512
*-disk:3
description: ATA Disk
product: IRP-SSDPR-S25C-0
physical id: 0.3.0
bus info: scsi@0:0.3.0
logical name: /dev/sdd
version: 13.2
serial: GV9034815
size: 953GiB (1024GB)
capacity: 953GiB (1024GB)
capabilities: 15000rpm gpt-1.00 partitioned partitioned:gpt
configuration: ansiversion=6 guid=d6a46163-e0d8-7e46-9a60-b909e277727f logicalsectorsize=512 sectorsize=512
*-disk:4
description: ATA Disk
product: IRP-SSDPR-S25C-0
physical id: 0.4.0
bus info: scsi@0:0.4.0
logical name: /dev/sde
version: 13.2
serial: GV6014767
size: 953GiB (1024GB)
capacity: 953GiB (1024GB)
capabilities: 15000rpm gpt-1.00 partitioned partitioned:gpt
configuration: ansiversion=6 guid=a1600d41-4b17-8a4e-b513-d8ae9e4791a2 logicalsectorsize=512 sectorsize=512
*-disk:5
description: ATA Disk
product: SSDPR-CL100-120-
physical id: 0.5.0
bus info: scsi@0:0.5.0
logical name: /dev/sdf
version: 3.S
serial: GVA014015
size: 111GiB (120GB)
capacity: 111GiB (120GB)
capabilities: 15000rpm gpt-1.00 partitioned partitioned:gpt
configuration: ansiversion=6 guid=f89085bf-6e6c-4d55-be6d-2a71843008d9 logicalsectorsize=512 sectorsize=512

Thanks for the script. I changed the test size from 64G to 10G.

My ZFS settings:
- compression = off,
- atime = off,
- recordsize=128k

- ncq enabled for all disks (cat /sys/block/sda/device/queue_depth = 32)
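
To confirm the queue depth on every disk rather than only sda, a small loop over sysfs is enough. This is a minimal sketch: show_queue_depth and the SYSFS_ROOT override are hypothetical names, the latter only there so the function can be tried against a scratch directory.

```shell
# Sketch: print device/queue_depth for each named disk.
# SYSFS_ROOT is a hypothetical override; it defaults to the real /sys.
show_queue_depth() {
    root=${SYSFS_ROOT:-/sys}
    for dev in "$@"; do
        depth=$(cat "$root/block/$dev/device/queue_depth")
        echo "$dev queue_depth=$depth"
    done
}

# Usage:
#   show_queue_depth sda sdb sdc sdd sde sdf
```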

FIO TEST (system-pool)

/root/fio-test.sh /rpool/data/ (system pool)
write bs=1mb iodepth 4
write: IOPS=131, BW=131MiB/s (138MB/s)(10.0GiB/78055msec); 0 zone resets
clat (usec): min=7, max=22082k, avg=22869.87, stdev=607409.37
read bs=1mb iodepth 4
read: IOPS=2126, BW=2126MiB/s (2230MB/s)(10.0GiB/4816msec)
clat (usec): min=7, max=4815, avg=1411.77, stdev=533.38
read bs=4k iodepth 128
read: IOPS=172k, BW=672MiB/s (705MB/s)(10.0GiB/15228msec)
clat (usec): min=2, max=2410, avg=738.37, stdev=141.45
write bs=4k iodepth 128
write: IOPS=33.4k, BW=131MiB/s (137MB/s)(10.0GiB/78449msec); 0 zone resets
clat (usec): min=3, max=10069k, avg=3801.57, stdev=131882.82
randread bs=4k iodepth 128
read: IOPS=28.5k, BW=111MiB/s (117MB/s)(10.0GiB/92123msec)
clat (usec): min=2, max=11695, avg=4463.70, stdev=847.54
randwrite bs=4k iodepth 128
write: IOPS=1130, BW=4522KiB/s (4630kB/s)(810MiB/183507msec); 0 zone resets
clat (usec): min=33, max=17086k, avg=112349.50, stdev=1161099.46
emulation webserver randrw 70/30 512b-512kb
read: IOPS=2223, BW=16.4MiB/s (17.2MB/s)(2977MiB/181186msec)
clat (msec): min=8, max=20088, avg=78.72, stdev=950.71
write: IOPS=955, BW=7143KiB/s (7315kB/s)(1264MiB/181186msec); 0 zone resets
clat (usec): min=40, max=20088k, avg=83743.95, stdev=992524.02



FIO TEST (pool1)
/root/fio-test.sh /zfs1-vps1/test123/
write bs=1mb iodepth 4
write: IOPS=174, BW=174MiB/s (183MB/s)(10.0GiB/58831msec); 0 zone resets
clat (usec): min=11, max=108877, avg=17238.81, stdev=18745.24
read bs=1mb iodepth 4
read: IOPS=3022, BW=3022MiB/s (3169MB/s)(10.0GiB/3388msec)
clat (usec): min=3, max=4590, avg=993.00, stdev=478.09
read bs=4k iodepth 128
read: IOPS=151k, BW=592MiB/s (620MB/s)(10.0GiB/17306msec)
clat (usec): min=2, max=2758, avg=839.07, stdev=225.18
write bs=4k iodepth 128
write: IOPS=11.6k, BW=45.5MiB/s (47.7MB/s)(8191MiB/180001msec); 0 zone resets
clat (usec): min=8, max=29847, avg=10902.91, stdev=2153.88
randread bs=4k iodepth 128
read: IOPS=28.7k, BW=112MiB/s (118MB/s)(10.0GiB/91203msec)
clat (usec): min=2, max=16174, avg=4419.12, stdev=1016.97
randwrite bs=4k iodepth 128
write: IOPS=880, BW=3520KiB/s (3605kB/s)(619MiB/180001msec); 0 zone resets
clat (usec): min=14, max=316310, avg=144298.93, stdev=109633.85
emulation webserver randrw 70/30 512b-512kb
read: IOPS=1013, BW=9722KiB/s (9955kB/s)(1709MiB/180001msec)
clat (usec): min=11, max=261842, avg=176025.64, stdev=22820.02
write: IOPS=434, BW=4107KiB/s (4206kB/s)(722MiB/180001msec); 0 zone resets
clat (usec): min=878, max=261953, avg=176022.13, stdev=22792.43



FIO TEST (system pool, ncq off/queue_depth=1)

/root/fio-test.sh /rpool/data/
write bs=1mb iodepth 4
write: IOPS=171, BW=172MiB/s (180MB/s)(10.0GiB/59660msec); 0 zone resets
clat (usec): min=3, max=16354k, avg=17480.24, stdev=423993.20
read bs=1mb iodepth 4
read: IOPS=2502, BW=2502MiB/s (2624MB/s)(10.0GiB/4092msec)
clat (usec): min=3, max=4543, avg=1199.29, stdev=281.17
read bs=4k iodepth 128
read: IOPS=168k, BW=656MiB/s (688MB/s)(10.0GiB/15604msec)
clat (usec): min=2, max=2721, avg=756.62, stdev=168.75
write bs=4k iodepth 128
write: IOPS=33.1k, BW=129MiB/s (136MB/s)(10.0GiB/79173msec); 0 zone resets
clat (usec): min=3, max=13992k, avg=3836.70, stdev=166870.66
randread bs=4k iodepth 128
read: IOPS=28.6k, BW=112MiB/s (117MB/s)(10.0GiB/91552msec)
clat (usec): min=3, max=11409, avg=4436.00, stdev=850.21
randwrite bs=4k iodepth 128
write: IOPS=1182, BW=4731KiB/s (4845kB/s)(832MiB/180001msec); 0 zone resets
clat (usec): min=9, max=17207k, avg=107364.51, stdev=1136028.09
emulation webserver randrw 70/30 512b-512kb
read: IOPS=2222, BW=16.5MiB/s (17.3MB/s)(2963MiB/180001msec)
clat (usec): min=28, max=20190k, avg=80256.11, stdev=948934.40
write: IOPS=955, BW=7160KiB/s (7332kB/s)(1259MiB/180001msec); 0 zone resets
clat (usec): min=547, max=20190k, avg=80170.91, stdev=957988.32


FIO TEST (pool1, ncq off/queue_depth=1)
/root/fio-test.sh /zfs1-vps1/test123/
write bs=1mb iodepth 4
write: IOPS=211, BW=211MiB/s (222MB/s)(10.0GiB/48417msec); 0 zone resets
clat (usec): min=8, max=117745, avg=14187.79, stdev=9615.54
read bs=1mb iodepth 4
read: IOPS=2873, BW=2873MiB/s (3013MB/s)(10.0GiB/3564msec)
clat (usec): min=3, max=4568, avg=1044.60, stdev=403.51
read bs=4k iodepth 128
read: IOPS=164k, BW=640MiB/s (671MB/s)(10.0GiB/16005msec)
clat (usec): min=2, max=3845, avg=775.99, stdev=182.62
write bs=4k iodepth 128
write: IOPS=12.2k, BW=47.8MiB/s (50.1MB/s)(8603MiB/180001msec); 0 zone resets
clat (usec): min=8, max=19875, avg=10381.98, stdev=2864.57
randread bs=4k iodepth 128
read: IOPS=28.9k, BW=113MiB/s (118MB/s)(10.0GiB/90823msec)
clat (usec): min=2, max=15603, avg=4400.74, stdev=972.05
randwrite bs=4k iodepth 128
write: IOPS=814, BW=3257KiB/s (3335kB/s)(573MiB/180003msec); 0 zone resets
clat (usec): min=13, max=333106, avg=155972.82, stdev=111567.94
emulation webserver randrw 70/30 512b-512kb
read: IOPS=949, BW=9268KiB/s (9491kB/s)(1629MiB/180001msec)
clat (msec): min=2, max=291, avg=187.80, stdev=26.38
write: IOPS=407, BW=3918KiB/s (4012kB/s)(689MiB/180001msec); 0 zone resets
clat (usec): min=30, max=289095, avg=187855.45, stdev=26346.03
 
The controller firmware version is 15.10.01.00. Is it worth updating to P19 or P20?
 
I'd say yes. I have had issues with everything below v18 / P18 - in combination with MDADM though. ZFS wasn't for me back then.
 
Unfortunately, my server configuration is not like yours.

Supermicro X11DDW-NT BIOS 3.3
SATA controller: Intel Corporation Lewisburg SATA Controller [AHCI mode] (rev 09)
RAIDZ 4 x Samsung SSD 860 PRO 4TB Firmware Version: RVM01B6Q

Code:
write bs=1mb iodepth 4
  write: IOPS=1143, BW=1143MiB/s (1199MB/s)(64.0GiB/57323msec); 0 zone resets
    clat (usec): min=3, max=19860, avg=2625.57, stdev=338.50
read bs=1mb iodepth 4
  read: IOPS=2293, BW=2294MiB/s (2405MB/s)(64.0GiB/28574msec)
    clat (usec): min=2, max=15247, avg=1308.93, stdev=393.33
read bs=4k iodepth 128
  read: IOPS=205k, BW=801MiB/s (840MB/s)(64.0GiB/81853msec)
    clat (usec): min=2, max=8222, avg=620.22, stdev=70.17
write bs=4k iodepth 128
  write: IOPS=111k, BW=435MiB/s (456MB/s)(64.0GiB/150559msec); 0 zone resets
    clat (usec): min=3, max=102057, avg=1140.40, stdev=481.99
randread bs=4k iodepth 128
  read: IOPS=22.6k, BW=88.1MiB/s (92.4MB/s)(15.5GiB/180001msec)
    clat (usec): min=2, max=15660, avg=5630.11, stdev=387.70
randwrite bs=4k iodepth 128
  write: IOPS=12.7k, BW=49.7MiB/s (52.2MB/s)(8955MiB/180001msec); 0 zone resets
    clat (usec): min=5, max=123727, avg=9973.03, stdev=3069.27
emulation webserver randrw 70/30 512b-512kb
  read: IOPS=11.6k, BW=91.2MiB/s (95.6MB/s)(16.0GiB/180001msec)
    clat (usec): min=94, max=82661, avg=15432.63, stdev=3386.68
  write: IOPS=4957, BW=39.0MiB/s (40.9MB/s)(7022MiB/180001msec); 0 zone resets
    clat (usec): min=28, max=82616, avg=15437.90, stdev=3393.87

pveperf FSYNCS/SECOND: 726

The benchmark with a single SSD, ext4 mounted with -o noatime,discard:

Code:
write bs=1mb iodepth 4
  write: IOPS=512, BW=513MiB/s (538MB/s)(64.0GiB/127844msec); 0 zone resets
    clat (usec): min=2086, max=30517, avg=7640.86, stdev=286.06
read bs=1mb iodepth 4
  read: IOPS=540, BW=541MiB/s (567MB/s)(64.0GiB/121165msec)
    clat (usec): min=2122, max=10914, avg=7249.80, stdev=80.91
read bs=4k iodepth 128
  read: IOPS=126k, BW=490MiB/s (514MB/s)(64.0GiB/133640msec)
    clat (usec): min=277, max=4012, avg=1017.21, stdev=27.55
write bs=4k iodepth 128
  write: IOPS=116k, BW=455MiB/s (477MB/s)(64.0GiB/144093msec); 0 zone resets
    clat (usec): min=163, max=5934, avg=1096.71, stdev=53.29
randread bs=4k iodepth 128
  read: IOPS=96.5k, BW=377MiB/s (395MB/s)(64.0GiB/173880msec)
    clat (usec): min=575, max=9601, avg=1317.19, stdev=211.90
randwrite bs=4k iodepth 128
  write: IOPS=81.3k, BW=318MiB/s (333MB/s)(55.8GiB/180011msec); 0 zone resets
    clat (usec): min=810, max=19487, avg=1562.29, stdev=373.08
emulation webserver randrw 70/30 512b-512kb
  read: IOPS=34.7k, BW=183MiB/s (192MB/s)(32.2GiB/180002msec)
    clat (usec): min=938, max=27300, avg=5138.51, stdev=1614.50
  write: IOPS=14.9k, BW=78.4MiB/s (82.2MB/s)(13.8GiB/180002msec); 0 zone resets
    clat (usec): min=1103, max=30256, avg=5159.62, stdev=1711.65
 
Testing variables:
- firmware P20.0.7.0 (Crossflash from H220 to Generic LSI 9207-8e),
- compression = off,
- atime = off,
- recordsize=128k,
- ncq disabled (cat /sys/block/sda/device/queue_depth = 1),
- ashift = 12.

After upgrading the H220 to P20 (crossflashed to LSI 9207-8e firmware):

FIO TEST (system pool, ncq off/queue_depth=1)

write bs=1mb iodepth 4
write: IOPS=118, BW=119MiB/s (124MB/s)(10.0GiB/86367msec); 0 zone resets
clat (usec): min=4, max=20830k, avg=25305.85, stdev=588079.02
read bs=1mb iodepth 4
read: IOPS=2512, BW=2513MiB/s (2635MB/s)(10.0GiB/4075msec)
clat (usec): min=3, max=4488, avg=1194.31, stdev=281.53
read bs=4k iodepth 128
read: IOPS=165k, BW=644MiB/s (676MB/s)(10.0GiB/15891msec)
clat (usec): min=2, max=2350, avg=770.55, stdev=147.48
write bs=4k iodepth 128
write: IOPS=28.5k, BW=111MiB/s (117MB/s)(10.0GiB/92117msec); 0 zone resets
clat (usec): min=3, max=14180k, avg=4463.72, stdev=175793.89
randread bs=4k iodepth 128
read: IOPS=27.0k, BW=109MiB/s (115MB/s)(10.0GiB/93695msec)
clat (usec): min=2, max=11616, avg=4539.83, stdev=868.79
randwrite bs=4k iodepth 128
write: IOPS=872, BW=3488KiB/s (3572kB/s)(634MiB/186158msec); 0 zone resets
clat (usec): min=26, max=25074k, avg=145621.18, stdev=1573254.69
emulation webserver randrw 70/30 512b-512kb
read: IOPS=2088, BW=15.4MiB/s (16.1MB/s)(2996MiB/194708msec)
clat (msec): min=8, max=26055, avg=85.40, stdev=1152.56
write: IOPS=896, BW=6689KiB/s (6850kB/s)(1272MiB/194708msec); 0 zone resets
clat (usec): min=39, max=26055k, avg=85443.87, stdev=1142500.79


FIO TEST (pool1, ncq off/queue_depth=1)
write bs=1mb iodepth 4
write: IOPS=520, BW=521MiB/s (546MB/s)(10.0GiB/19673msec); 0 zone resets
clat (usec): min=10, max=9296, avg=5766.45, stdev=2663.81
read bs=1mb iodepth 4
read: IOPS=2324, BW=2324MiB/s (2437MB/s)(10.0GiB/4406msec)
clat (usec): min=3, max=4773, avg=1291.27, stdev=249.54
read bs=4k iodepth 128
read: IOPS=169k, BW=662MiB/s (694MB/s)(10.0GiB/15466msec)
clat (usec): min=2, max=3695, avg=749.92, stdev=155.66
write bs=4k iodepth 128
write: IOPS=86.0k, BW=336MiB/s (352MB/s)(10.0GiB/30470msec); 0 zone resets
clat (usec): min=3, max=5180, avg=1477.03, stdev=299.92
randread bs=4k iodepth 128
read: IOPS=28.0k, BW=110MiB/s (115MB/s)(10.0GiB/93515msec)
clat (usec): min=2, max=12255, avg=4531.08, stdev=938.18
randwrite bs=4k iodepth 128
write: IOPS=3651, BW=14.3MiB/s (14.0MB/s)(2568MiB/180001msec); 0 zone resets
clat (usec): min=9, max=132642, avg=34780.66, stdev=5308.62
emulation webserver randrw 70/30 512b-512kb
read: IOPS=9865, BW=40.8MiB/s (42.8MB/s)(7177MiB/175700msec)
clat (usec): min=67, max=38403, avg=18082.66, stdev=6228.58
write: IOPS=4229, BW=17.4MiB/s (18.3MB/s)(3063MiB/175700msec); 0 zone resets
clat (usec): min=12, max=38673, avg=18112.79, stdev=6227.13


FIO TEST (pool2 ncq off/queue_depth=1)
write bs=1mb iodepth 4
write: IOPS=473, BW=473MiB/s (496MB/s)(10.0GiB/21640msec); 0 zone resets
clat (usec): min=9, max=30601, avg=6341.10, stdev=2224.91
read bs=1mb iodepth 4
read: IOPS=960, BW=961MiB/s (1008MB/s)(10.0GiB/10656msec)
clat (usec): min=2, max=9407, avg=3122.13, stdev=1102.09
read bs=4k iodepth 128
read: IOPS=131k, BW=511MiB/s (535MB/s)(10.0GiB/20057msec)
clat (usec): min=2, max=4795, avg=972.25, stdev=159.78
write bs=4k iodepth 128
write: IOPS=60.5k, BW=236MiB/s (248MB/s)(10.0GiB/43342msec); 0 zone resets
clat (usec): min=3, max=7089, avg=2100.76, stdev=917.96
randread bs=4k iodepth 128
read: IOPS=63.4k, BW=248MiB/s (260MB/s)(10.0GiB/41349msec)
clat (usec): min=2, max=12485, avg=2003.65, stdev=459.32
randwrite bs=4k iodepth 128
write: IOPS=21.7k, BW=84.8MiB/s (88.9MB/s)(10.0GiB/120747msec); 0 zone resets
clat (usec): min=3, max=13451, avg=5851.61, stdev=2222.43
emulation webserver randrw 70/30 512b-512kb
read: IOPS=26.5k, BW=110MiB/s (115MB/s)(7177MiB/65355msec)
clat (usec): min=24, max=25868, avg=6727.14, stdev=3649.66
write: IOPS=11.4k, BW=46.9MiB/s (49.1MB/s)(3063MiB/65355msec); 0 zone resets
clat (usec): min=10, max=25834, avg=6736.05, stdev=3657.01

FIO TEST (ext4 single drive, ncq off/queue_depth=1)
write bs=1mb iodepth 4
write: IOPS=473, BW=473MiB/s (496MB/s)(10.0GiB/21639msec); 0 zone resets
clat (usec): min=2334, max=15815, avg=8308.61, stdev=194.15
read bs=1mb iodepth 4
read: IOPS=503, BW=503MiB/s (528MB/s)(10.0GiB/20340msec)
clat (usec): min=3953, max=11802, avg=7834.96, stdev=120.67
read bs=4k iodepth 128
read: IOPS=106k, BW=416MiB/s (436MB/s)(10.0GiB/24645msec)
clat (usec): min=570, max=3907, avg=1196.66, stdev=151.64
write bs=4k iodepth 128
write: IOPS=102k, BW=400MiB/s (419MB/s)(10.0GiB/25626msec); 0 zone resets
clat (usec): min=679, max=11261, avg=1244.52, stdev=229.33
randread bs=4k iodepth 128
read: IOPS=9733, BW=38.0MiB/s (39.9MB/s)(6844MiB/180013msec)
clat (usec): min=187, max=215402, avg=13131.91, stdev=9060.26
randwrite bs=4k iodepth 128
write: IOPS=22.2k, BW=86.8MiB/s (90.0MB/s)(10.0GiB/118032msec); 0 zone resets
clat (usec): min=75, max=92003, avg=5744.31, stdev=4091.72
emulation webserver randrw 70/30 512b-512kb
read: IOPS=6985, BW=33.3MiB/s (34.9MB/s)(5990MiB/180026msec)
clat (usec): min=177, max=152845, avg=30506.63, stdev=26359.95
write: IOPS=2994, BW=14.2MiB/s (14.9MB/s)(2553MiB/180026msec); 0 zone resets
clat (usec): min=64, max=116782, avg=14234.82, stdev=13641.12
 
The next test I ran compared different ashift values:

ASHIFT = 0
write bs=1mb iodepth 4
write: IOPS=509, BW=510MiB/s (535MB/s)(10.0GiB/20087msec); 0 zone resets
clat (usec): min=8, max=107857, avg=5888.93, stdev=2487.83
read bs=1mb iodepth 4
read: IOPS=2244, BW=2244MiB/s (2353MB/s)(10.0GiB/4563msec)
clat (usec): min=3, max=4786, avg=1337.35, stdev=255.68
read bs=4k iodepth 128
read: IOPS=166k, BW=649MiB/s (681MB/s)(10.0GiB/15774msec)
clat (usec): min=2, max=2873, avg=764.83, stdev=187.12
write bs=4k iodepth 128
write: IOPS=87.2k, BW=341MiB/s (357MB/s)(10.0GiB/30050msec); 0 zone resets
clat (usec): min=3, max=4311, avg=1456.65, stdev=344.98
randread bs=4k iodepth 128
read: IOPS=26.9k, BW=105MiB/s (110MB/s)(10.0GiB/97360msec)
clat (usec): min=3, max=12494, avg=4717.36, stdev=894.87
randwrite bs=4k iodepth 128
write: IOPS=3653, BW=14.3MiB/s (14.0MB/s)(2569MiB/180001msec); 0 zone resets
clat (usec): min=10, max=125955, avg=34761.49, stdev=5461.84
emulation webserver randrw 70/30 512b-512kb
read: IOPS=9878, BW=40.9MiB/s (42.9MB/s)(7177MiB/175485msec)
clat (usec): min=39, max=35740, avg=18064.05, stdev=6126.58
write: IOPS=4234, BW=17.5MiB/s (18.3MB/s)(3063MiB/175485msec); 0 zone resets
clat (usec): min=8, max=36173, avg=18085.19, stdev=6120.08


ASHIFT = 9
write bs=1mb iodepth 4
write: IOPS=513, BW=514MiB/s (539MB/s)(10.0GiB/19939msec); 0 zone resets
clat (usec): min=16, max=110592, avg=5845.04, stdev=3643.51
read bs=1mb iodepth 4
read: IOPS=2224, BW=2224MiB/s (2332MB/s)(10.0GiB/4604msec)
clat (usec): min=3, max=4693, avg=1349.42, stdev=323.85
read bs=4k iodepth 128
read: IOPS=172k, BW=671MiB/s (704MB/s)(10.0GiB/15253msec)
clat (usec): min=3, max=2374, avg=739.55, stdev=137.63
write bs=4k iodepth 128
write: IOPS=83.5k, BW=326MiB/s (342MB/s)(10.0GiB/31399msec); 0 zone resets
clat (usec): min=3, max=5196, avg=1522.05, stdev=359.36
randread bs=4k iodepth 128
read: IOPS=25.8k, BW=101MiB/s (105MB/s)(10.0GiB/101778msec)
clat (usec): min=2, max=12178, avg=4931.37, stdev=958.71
randwrite bs=4k iodepth 128
write: IOPS=3650, BW=14.3MiB/s (14.9MB/s)(2566MiB/180001msec); 0 zone resets
clat (usec): min=10, max=123027, avg=34796.32, stdev=5219.97
emulation webserver randrw 70/30 512b-512kb
read: IOPS=9869, BW=40.9MiB/s (42.8MB/s)(7177MiB/175639msec)
clat (usec): min=59, max=37972, avg=18077.05, stdev=6205.07
write: IOPS=4231, BW=17.4MiB/s (18.3MB/s)(3063MiB/175639msec); 0 zone resets
clat (usec): min=43, max=37919, avg=18106.69, stdev=6200.01



ASHIFT=12
write bs=1mb iodepth 4
write: IOPS=508, BW=509MiB/s (533MB/s)(10.0GiB/20137msec); 0 zone resets
clat (usec): min=16, max=11319, avg=5903.43, stdev=1819.80
read bs=1mb iodepth 4
read: IOPS=2424, BW=2424MiB/s (2542MB/s)(10.0GiB/4224msec)
clat (usec): min=3, max=4757, avg=1238.06, stdev=245.12
read bs=4k iodepth 128
read: IOPS=172k, BW=671MiB/s (704MB/s)(10.0GiB/15258msec)
clat (usec): min=2, max=2485, avg=739.78, stdev=142.23
write bs=4k iodepth 128
write: IOPS=81.5k, BW=318MiB/s (334MB/s)(10.0GiB/32180msec); 0 zone resets
clat (usec): min=3, max=4915, avg=1559.91, stdev=389.57
randread bs=4k iodepth 128
read: IOPS=27.0k, BW=109MiB/s (115MB/s)(10.0GiB/93770msec)
clat (usec): min=2, max=12628, avg=4543.48, stdev=905.01
randwrite bs=4k iodepth 128
write: IOPS=3662, BW=14.3MiB/s (15.0MB/s)(2575MiB/180001msec); 0 zone resets
clat (usec): min=10, max=130044, avg=34678.70, stdev=5288.31
emulation webserver randrw 70/30 512b-512kb
read: IOPS=9911, BW=41.0MiB/s (43.0MB/s)(7177MiB/174901msec)
clat (usec): min=70, max=35210, avg=18000.66, stdev=6298.07
write: IOPS=4249, BW=17.5MiB/s (18.4MB/s)(3063MiB/174901msec); 0 zone resets
clat (usec): min=12, max=35153, avg=18030.61, stdev=6285.10


ASHIFT=13
write bs=1mb iodepth 4
write: IOPS=509, BW=510MiB/s (535MB/s)(10.0GiB/20082msec); 0 zone resets
clat (usec): min=8, max=10417, avg=5887.34, stdev=1840.02
read bs=1mb iodepth 4
read: IOPS=2294, BW=2295MiB/s (2406MB/s)(10.0GiB/4462msec)
clat (usec): min=3, max=4633, avg=1307.79, stdev=252.53
read bs=4k iodepth 128
read: IOPS=175k, BW=684MiB/s (717MB/s)(10.0GiB/14970msec)
clat (usec): min=2, max=2432, avg=725.86, stdev=132.07
write bs=4k iodepth 128
write: IOPS=80.5k, BW=315MiB/s (330MB/s)(10.0GiB/32545msec); 0 zone resets
clat (usec): min=3, max=5157, avg=1577.59, stdev=398.41
randread bs=4k iodepth 128
read: IOPS=26.8k, BW=105MiB/s (110MB/s)(10.0GiB/97940msec)
clat (usec): min=2, max=12043, avg=4745.48, stdev=1000.95
randwrite bs=4k iodepth 128
write: IOPS=3658, BW=14.3MiB/s (14.0MB/s)(2572MiB/180001msec); 0 zone resets
clat (usec): min=10, max=42706, avg=34718.98, stdev=4880.64
emulation webserver randrw 70/30 512b-512kb
read: IOPS=9869, BW=40.9MiB/s (42.8MB/s)(7177MiB/175631msec)
clat (usec): min=58, max=37680, avg=18077.16, stdev=6217.01
write: IOPS=4231, BW=17.4MiB/s (18.3MB/s)(3063MiB/175631msec); 0 zone resets
clat (usec): min=22, max=37568, avg=18104.19, stdev=6213.95
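
For readers unfamiliar with the setting: ashift is the base-2 logarithm of the sector size ZFS assumes for a vdev, and ashift=0 asks ZFS to auto-detect it. A small sketch of the mapping for the explicit values tested above (the helper name ashift_to_bytes is made up for illustration):

```shell
# Sketch: convert an ashift value into the sector size in bytes (2^ashift).
ashift_to_bytes() {
    echo $((1 << $1))
}

for a in 9 12 13; do
    echo "ashift=$a -> $(ashift_to_bytes "$a") bytes"
done
# prints:
#   ashift=9 -> 512 bytes
#   ashift=12 -> 4096 bytes
#   ashift=13 -> 8192 bytes
```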
 
I also ran some tests with different recordsize values, for others to compare.

recordsize 32K, pool1
write bs=1mb iodepth 4
write: IOPS=499, BW=500MiB/s (524MB/s)(10.0GiB/20485msec); 0 zone resets
clat (usec): min=8, max=12788, avg=6004.62, stdev=2669.45
read bs=1mb iodepth 4
read: IOPS=2323, BW=2324MiB/s (2436MB/s)(10.0GiB/4407msec)
clat (usec): min=3, max=4742, avg=1291.60, stdev=259.51
read bs=4k iodepth 128
read: IOPS=163k, BW=635MiB/s (666MB/s)(10.0GiB/16115msec)
clat (usec): min=2, max=2432, avg=781.36, stdev=163.83
write bs=4k iodepth 128
write: IOPS=81.9k, BW=320MiB/s (335MB/s)(10.0GiB/32011msec); 0 zone resets
clat (usec): min=3, max=3737, avg=1551.72, stdev=318.67
randread bs=4k iodepth 128
read: IOPS=27.0k, BW=106MiB/s (111MB/s)(10.0GiB/97059msec)
clat (usec): min=2, max=12486, avg=4702.73, stdev=904.91
randwrite bs=4k iodepth 128
write: IOPS=3645, BW=14.2MiB/s (14.9MB/s)(2563MiB/180001msec); 0 zone resets
clat (usec): min=8, max=121420, avg=34840.43, stdev=5102.03
emulation webserver randrw 70/30 512b-512kb
read: IOPS=9848, BW=40.8MiB/s (42.8MB/s)(7177MiB/176007msec)
clat (usec): min=67, max=42084, avg=18114.50, stdev=6203.74
write: IOPS=4222, BW=17.4MiB/s (18.2MB/s)(3063MiB/176007msec); 0 zone resets
clat (usec): min=46, max=42067, avg=18146.24, stdev=6205.03

recordsize 16K, pool1
write bs=1mb iodepth 4
write: IOPS=449, BW=450MiB/s (472MB/s)(10.0GiB/22762msec); 0 zone resets
clat (usec): min=9, max=25053, avg=6670.56, stdev=2414.50
read bs=1mb iodepth 4
read: IOPS=967, BW=968MiB/s (1015MB/s)(10.0GiB/10581msec)
clat (usec): min=2, max=14562, avg=3100.03, stdev=966.29
read bs=4k iodepth 128
read: IOPS=128k, BW=501MiB/s (525MB/s)(10.0GiB/20447msec)
clat (usec): min=2, max=4831, avg=991.14, stdev=183.55
write bs=4k iodepth 128
write: IOPS=55.6k, BW=217MiB/s (228MB/s)(10.0GiB/47161msec); 0 zone resets
clat (usec): min=3, max=7235, avg=2285.88, stdev=1130.46
randread bs=4k iodepth 128
read: IOPS=56.3k, BW=220MiB/s (231MB/s)(10.0GiB/46563msec)
clat (usec): min=3, max=14885, avg=2256.44, stdev=487.97
randwrite bs=4k iodepth 128
write: IOPS=20.4k, BW=79.5MiB/s (83.4MB/s)(10.0GiB/128752msec); 0 zone resets
clat (usec): min=7, max=13696, avg=6239.55, stdev=2259.08
emulation webserver randrw 70/30 512b-512kb
read: IOPS=25.4k, BW=105MiB/s (110MB/s)(7177MiB/68113msec)
clat (usec): min=20, max=23596, avg=7014.93, stdev=3802.11
write: IOPS=10.9k, BW=44.0MiB/s (47.2MB/s)(3063MiB/68113msec); 0 zone resets
clat (usec): min=10, max=23563, avg=7010.79, stdev=3795.70



recordsize 8K, pool1
write bs=1mb iodepth 4
write: IOPS=347, BW=348MiB/s (365MB/s)(10.0GiB/29442msec); 0 zone resets
clat (usec): min=6, max=23186, avg=8626.54, stdev=2963.52
read bs=1mb iodepth 4
read: IOPS=705, BW=705MiB/s (740MB/s)(10.0GiB/14515msec)
clat (usec): min=3, max=13876, avg=4251.95, stdev=1026.51
read bs=4k iodepth 128
read: IOPS=107k, BW=416MiB/s (437MB/s)(10.0GiB/24586msec)
clat (usec): min=2, max=7246, avg=1191.54, stdev=195.43
write bs=4k iodepth 128
write: IOPS=45.8k, BW=179MiB/s (188MB/s)(10.0GiB/57222msec); 0 zone resets
clat (usec): min=3, max=9356, avg=2773.39, stdev=1639.92
randread bs=4k iodepth 128
read: IOPS=67.3k, BW=263MiB/s (276MB/s)(10.0GiB/38972msec)
clat (usec): min=3, max=15985, avg=1888.53, stdev=610.87
randwrite bs=4k iodepth 128
write: IOPS=24.7k, BW=96.4MiB/s (101MB/s)(10.0GiB/106265msec); 0 zone resets
clat (usec): min=3, max=14988, avg=5149.83, stdev=2673.49
emulation webserver randrw 70/30 512b-512kb
read: IOPS=25.8k, BW=107MiB/s (112MB/s)(7177MiB/67209msec)
clat (usec): min=18, max=35565, avg=6921.36, stdev=4596.51
write: IOPS=11.1k, BW=45.6MiB/s (47.8MB/s)(3063MiB/67209msec); 0 zone resets
clat (usec): min=7, max=35492, avg=6919.02, stdev=4587.26
 
@sa10, could you give me information about your test environment?

For a really good comparison I need the ZFS dataset parameters:
- recordsize,
- ashift,
- compression,
- atime.
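
The four values asked for above can be read with zpool get (ashift is a pool property) and zfs get (the other three are dataset properties). A minimal sketch, assuming a hypothetical helper name show_zfs_params and placeholder pool and dataset names:

```shell
# Sketch: print ashift for a pool and recordsize/compression/atime for a dataset.
# show_zfs_params is a hypothetical helper; pass your own pool and dataset names.
show_zfs_params() {
    pool=$1
    dataset=$2
    # -H drops the header line, -o selects the columns to print
    zpool get -H -o property,value ashift "$pool"
    zfs get -H -o property,value recordsize,compression,atime "$dataset"
}

# Usage:
#   show_zfs_params pool1 pool1/somedataset
```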
 
I changed the script a bit to bring the results closer to reality.
I added the ability to use test data that compresses by about 30% (PERC="30"), as on my production system.
That changed the results, but only a little.
 


From my point of view:
- flashing the generic LSI P20.0.7.0 firmware onto the H220 gave about 30% !!!
- turning off NCQ (for i in a b c d e f; do echo 1 > /sys/block/sd$i/device/queue_depth; done) gave about 3-6%,
- changing the scheduler from mq-deadline to none (for i in a b c d e f; do echo none > /sys/block/sd$i/queue/scheduler; done) gave another few percent,
- creating the pool with ashift = 12 also gave a few percent.
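
The two sysfs changes from the list can be wrapped in one function so they are applied together. This is only a sketch: apply_disk_tunables and the SYSFS_ROOT override are hypothetical names, and the override exists purely so the function can be dry-run against a scratch directory.

```shell
# Sketch: set queue_depth=1 and scheduler=none for each named disk.
# SYSFS_ROOT is a hypothetical override; it defaults to the real /sys.
apply_disk_tunables() {
    root=${SYSFS_ROOT:-/sys}
    for dev in "$@"; do
        echo 1    > "$root/block/$dev/device/queue_depth"   # queue depth 1, NCQ effectively off
        echo none > "$root/block/$dev/queue/scheduler"      # no I/O scheduler
    done
}

# Usage (as root):
#   apply_disk_tunables sda sdb sdc sdd sde sdf
```

Values written to /sys this way are lost on reboot, so they have to be reapplied at boot if they should stick.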

I think that I won't get more from my SSD drives ;-)

I marked this as TUTORIAL because many people here regularly ask about ZFS performance.