[TUTORIAL] Proxmox ZFS raid1 performance

yarii

I have performance problems with Proxmox.

My setup:
- Proxmox 6.x (clean install from ISO),
- H220 SAS card in IT mode,
- 6x different SSD drives:
2x Goodram 120GB (system)
2x Samsung 860 PRO 256GB (pool1)
2x IRDM Pro 1TB (pool2)

hdparm -t -T /dev/sd[a-f]
reports about 520MB/s read

but when I run pveperf I get only:

for system pool:
FSYNC/SECOND: 2100

for pool1:
FSYNC/SECOND: 315

for pool2:
FSYNC/SECOND: 355

What went wrong?

I created pool1 with:
zpool create -o ashift=12 pool1 mirror /dev/disk/by-id/ata-SAMSUNG-*
 
After upgrading to ZFS 0.8.3, the results are slightly higher:
FSYNCS/SECOND: 2811.57
FSYNCS/SECOND: 394.92
FSYNCS/SECOND: 359.10

But why the hell is the system pool almost 7 times faster on the same controller?
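
As a quick check of the "almost 7 times" figure, the ratio can be computed from the pveperf numbers quoted above. This is only a sketch; the helper name fsync_ratio is made up for illustration and is not part of pveperf.

```shell
#!/bin/sh
# Ratio of two FSYNCS/SECOND readings, rounded to two decimals.
# fsync_ratio is a hypothetical helper name used only for this example.
fsync_ratio() {
    awk -v fast="$1" -v slow="$2" 'BEGIN { printf "%.2f\n", fast / slow }'
}

# Values from the pveperf runs after the ZFS 0.8.3 upgrade.
fsync_ratio 2811.57 394.92   # system pool vs pool1, prints 7.12
fsync_ratio 2811.57 359.10   # system pool vs pool2, prints 7.83
```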
 
Try destroying the pool and running the benchmark on a single disk with an ext4 file system.
 
fio --filename=test --sync=1 --rw=randread --bs=4k --numjobs=1 --iodepth=4 --group_reporting --name=test --filesize=1G --runtime=300 && rm test

READ: bw=526MiB/s (552MB/s), 526MiB/s-526MiB/s (552MB/s-552MB/s), io=1024MiB (1074MB), run=1945-1945msec, iops avg=149850.67 (system)
READ: bw=526MiB/s (552MB/s), 526MiB/s-526MiB/s (552MB/s-552MB/s), io=1024MiB (1074MB), run=1945-1945msec, iops avg=126670.00 (pool1)
READ: bw=611MiB/s (640MB/s), 611MiB/s-611MiB/s (640MB/s-640MB/s), io=1024MiB (1074MB), run=1677-1677msec, iops avg=146170.67 (pool2)

fio --filename=test --sync=1 --rw=randwrite --bs=4k --numjobs=1 --iodepth=4 --group_reporting --name=test --filesize=1G --runtime=300 && rm test

WRITE: bw=15.1MiB/s (15.8MB/s), 15.1MiB/s-15.1MiB/s (15.8MB/s-15.8MB/s), io=1024MiB (1074MB), run=67968-67968msec, iops avg=3913.05 (system)
WRITE: bw=3560KiB/s (3646kB/s), 3560KiB/s-3560KiB/s (3646kB/s-3646kB/s), io=1024MiB (1074MB), run=294523-294523msec, iops avg=329.61 (pool1)
WRITE: bw=1328KiB/s (1360kB/s), 1328KiB/s-1328KiB/s (1360kB/s-1360kB/s), io=12.8MiB (13.5MB), run=9901-9901msec, iops avg=334.32 (pool2)
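
The bandwidth fio prints can be cross-checked from the io= and run= fields of the same line, since average bandwidth is just data written divided by run time. A minimal sketch, with bw_mib_s as a made-up helper name:

```shell
#!/bin/sh
# Average bandwidth in MiB/s from fio's "io=" (in MiB) and "run=" (in msec) fields.
# bw_mib_s is a hypothetical helper name used only for this example.
bw_mib_s() {
    awk -v io_mib="$1" -v run_msec="$2" \
        'BEGIN { printf "%.1f\n", io_mib / (run_msec / 1000) }'
}

bw_mib_s 1024 67968    # system pool: matches the reported 15.1MiB/s
bw_mib_s 1024 294523   # pool1: about 3.5MiB/s, i.e. the reported 3560KiB/s
```

Both results agree with what fio printed, so the low write numbers are not a reporting glitch: pool1 really needed almost five minutes to write 1 GiB.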


EXT4 (one drive SAMSUNG 860 Pro):
READ: bw=31.7MiB/s (33.3MB/s), 31.7MiB/s-31.7MiB/s (33.3MB/s-33.3MB/s), io=1024MiB (1074MB), run=32265-32265msec, iops avg=8122.25 (one disk SAMSUNG)
WRITE: bw=522KiB/s (535kB/s), 522KiB/s-522KiB/s (535kB/s-535kB/s), io=153MiB (160MB), run=300005-300005msec, iops avg=130.56 (one disk SAMSUNG)
 
Could you show the output of these commands, please?

lshw -class disk -class storage
or
hwinfo --disk

dmidecode -s baseboard-product-name

I have a server with a similar configuration, so we could compare the performance.
Try my script for the benchmark.

usage:

./fio.sh <Directory> <uniq name>
 


LSHW

# lshw -class disk -class storage
*-sas
description: Serial Attached SCSI controller
product: SAS2308 PCI-Express Fusion-MPT SAS-2
vendor: LSI Logic / Symbios Logic
physical id: 0
bus info: pci@0000:04:00.0
logical name: scsi0
version: 05
width: 64 bits
clock: 33MHz
capabilities: sas pm pciexpress vpd msi msix bus_master cap_list
configuration: driver=mpt3sas latency=0
resources: irq:51 ioport:5000(size=256) memory:f78f0000-f78fffff memory:f7880000-f78bffff
*-disk:0
description: ATA Disk
product: SSDPR-CL100-120-
physical id: 0.0.0
bus info: scsi@0:0.0.0
logical name: /dev/sda
version: 3.S
serial: GVA014359
size: 111GiB (120GB)
capacity: 111GiB (120GB)
capabilities: 15000rpm gpt-1.00 partitioned partitioned:gpt
configuration: ansiversion=6 guid=df0d9041-3a6f-44c1-965b-0e7579c787ca logicalsectorsize=512 sectorsize=512
*-disk:1
description: ATA Disk
product: Samsung SSD 860
physical id: 0.1.0
bus info: scsi@0:0.1.0
logical name: /dev/sdb
version: 1B6Q
serial: S5GANE0MC15824M
size: 238GiB (256GB)
capacity: 238GiB (256GB)
capabilities: 15000rpm gpt-1.00 partitioned partitioned:gpt
configuration: ansiversion=6 guid=f3457a08-00e9-9344-9832-b8c73a76cc7f logicalsectorsize=512 sectorsize=512
*-disk:2
description: ATA Disk
product: Samsung SSD 860
physical id: 0.2.0
bus info: scsi@0:0.2.0
logical name: /dev/sdc
version: 1B6Q
serial: S42VNF0MC14213V
size: 238GiB (256GB)
capacity: 238GiB (256GB)
capabilities: 15000rpm gpt-1.00 partitioned partitioned:gpt
configuration: ansiversion=6 guid=a94e24ff-f437-1045-bf01-fe98027d4fb5 logicalsectorsize=512 sectorsize=512
*-disk:3
description: ATA Disk
product: IRP-SSDPR-S25C-0
physical id: 0.3.0
bus info: scsi@0:0.3.0
logical name: /dev/sdd
version: 13.2
serial: GV9034815
size: 953GiB (1024GB)
capacity: 953GiB (1024GB)
capabilities: 15000rpm gpt-1.00 partitioned partitioned:gpt
configuration: ansiversion=6 guid=d6a46163-e0d8-7e46-9a60-b909e277727f logicalsectorsize=512 sectorsize=512
*-disk:4
description: ATA Disk
product: IRP-SSDPR-S25C-0
physical id: 0.4.0
bus info: scsi@0:0.4.0
logical name: /dev/sde
version: 13.2
serial: GV6014767
size: 953GiB (1024GB)
capacity: 953GiB (1024GB)
capabilities: 15000rpm gpt-1.00 partitioned partitioned:gpt
configuration: ansiversion=6 guid=a1600d41-4b17-8a4e-b513-d8ae9e4791a2 logicalsectorsize=512 sectorsize=512
*-disk:5
description: ATA Disk
product: SSDPR-CL100-120-
physical id: 0.5.0
bus info: scsi@0:0.5.0
logical name: /dev/sdf
version: 3.S
serial: GVA014015
size: 111GiB (120GB)
capacity: 111GiB (120GB)
capabilities: 15000rpm gpt-1.00 partitioned partitioned:gpt
configuration: ansiversion=6 guid=f89085bf-6e6c-4d55-be6d-2a71843008d9 logicalsectorsize=512 sectorsize=512

Thanks for the script. I changed the test size from 64G to 10G.

My ZFS settings:
- compression = off,
- atime = off,
- recordsize=128k

- ncq enabled for all disks (cat /sys/block/sda/device/queue_depth = 32)
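
To check the queue depth on all six disks at once, rather than one cat at a time, something like the following could be used. It is a sketch: report_queue_settings and the SYSFS_ROOT override are names invented here (the override only exists so the function can be tried against a copy of the tree), while the two sysfs paths are the ones used elsewhere in this thread.

```shell
#!/bin/sh
# Print queue depth and active I/O scheduler for each named disk.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
report_queue_settings() {
    root="${SYSFS_ROOT:-/sys}"
    for disk in "$@"; do
        depth_file="$root/block/$disk/device/queue_depth"
        sched_file="$root/block/$disk/queue/scheduler"
        [ -r "$depth_file" ] || { echo "$disk: not found" >&2; continue; }
        depth=$(cat "$depth_file")
        # The active scheduler is the bracketed entry, e.g. "[mq-deadline] none".
        sched=$(sed -n 's/.*\[\(.*\)\].*/\1/p' "$sched_file" 2>/dev/null || true)
        echo "$disk queue_depth=$depth scheduler=${sched:-unknown}"
    done
}

report_queue_settings sda sdb sdc sdd sde sdf
```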

FIO TEST (system-pool)

/root/fio-test.sh /rpool/data/ (system pool)
write bs=1mb iodepth 4
write: IOPS=131, BW=131MiB/s (138MB/s)(10.0GiB/78055msec); 0 zone resets
clat (usec): min=7, max=22082k, avg=22869.87, stdev=607409.37
read bs=1mb iodepth 4
read: IOPS=2126, BW=2126MiB/s (2230MB/s)(10.0GiB/4816msec)
clat (usec): min=7, max=4815, avg=1411.77, stdev=533.38
read bs=4k iodepth 128
read: IOPS=172k, BW=672MiB/s (705MB/s)(10.0GiB/15228msec)
clat (usec): min=2, max=2410, avg=738.37, stdev=141.45
write bs=4k iodepth 128
write: IOPS=33.4k, BW=131MiB/s (137MB/s)(10.0GiB/78449msec); 0 zone resets
clat (usec): min=3, max=10069k, avg=3801.57, stdev=131882.82
randread bs=4k iodepth 128
read: IOPS=28.5k, BW=111MiB/s (117MB/s)(10.0GiB/92123msec)
clat (usec): min=2, max=11695, avg=4463.70, stdev=847.54
randwrite bs=4k iodepth 128
write: IOPS=1130, BW=4522KiB/s (4630kB/s)(810MiB/183507msec); 0 zone resets
clat (usec): min=33, max=17086k, avg=112349.50, stdev=1161099.46
emulation webserver randrw 70/30 512b-512kb
read: IOPS=2223, BW=16.4MiB/s (17.2MB/s)(2977MiB/181186msec)
clat (msec): min=8, max=20088, avg=78.72, stdev=950.71
write: IOPS=955, BW=7143KiB/s (7315kB/s)(1264MiB/181186msec); 0 zone resets
clat (usec): min=40, max=20088k, avg=83743.95, stdev=992524.02



FIO TEST (pool1)
/root/fio-test.sh /zfs1-vps1/test123/
write bs=1mb iodepth 4
write: IOPS=174, BW=174MiB/s (183MB/s)(10.0GiB/58831msec); 0 zone resets
clat (usec): min=11, max=108877, avg=17238.81, stdev=18745.24
read bs=1mb iodepth 4
read: IOPS=3022, BW=3022MiB/s (3169MB/s)(10.0GiB/3388msec)
clat (usec): min=3, max=4590, avg=993.00, stdev=478.09
read bs=4k iodepth 128
read: IOPS=151k, BW=592MiB/s (620MB/s)(10.0GiB/17306msec)
clat (usec): min=2, max=2758, avg=839.07, stdev=225.18
write bs=4k iodepth 128
write: IOPS=11.6k, BW=45.5MiB/s (47.7MB/s)(8191MiB/180001msec); 0 zone resets
clat (usec): min=8, max=29847, avg=10902.91, stdev=2153.88
randread bs=4k iodepth 128
read: IOPS=28.7k, BW=112MiB/s (118MB/s)(10.0GiB/91203msec)
clat (usec): min=2, max=16174, avg=4419.12, stdev=1016.97
randwrite bs=4k iodepth 128
write: IOPS=880, BW=3520KiB/s (3605kB/s)(619MiB/180001msec); 0 zone resets
clat (usec): min=14, max=316310, avg=144298.93, stdev=109633.85
emulation webserver randrw 70/30 512b-512kb
read: IOPS=1013, BW=9722KiB/s (9955kB/s)(1709MiB/180001msec)
clat (usec): min=11, max=261842, avg=176025.64, stdev=22820.02
write: IOPS=434, BW=4107KiB/s (4206kB/s)(722MiB/180001msec); 0 zone resets
clat (usec): min=878, max=261953, avg=176022.13, stdev=22792.43



FIO TEST (system pool, ncq off/queue_depth=1)

/root/fio-test.sh /rpool/data/
write bs=1mb iodepth 4
write: IOPS=171, BW=172MiB/s (180MB/s)(10.0GiB/59660msec); 0 zone resets
clat (usec): min=3, max=16354k, avg=17480.24, stdev=423993.20
read bs=1mb iodepth 4
read: IOPS=2502, BW=2502MiB/s (2624MB/s)(10.0GiB/4092msec)
clat (usec): min=3, max=4543, avg=1199.29, stdev=281.17
read bs=4k iodepth 128
read: IOPS=168k, BW=656MiB/s (688MB/s)(10.0GiB/15604msec)
clat (usec): min=2, max=2721, avg=756.62, stdev=168.75
write bs=4k iodepth 128
write: IOPS=33.1k, BW=129MiB/s (136MB/s)(10.0GiB/79173msec); 0 zone resets
clat (usec): min=3, max=13992k, avg=3836.70, stdev=166870.66
randread bs=4k iodepth 128
read: IOPS=28.6k, BW=112MiB/s (117MB/s)(10.0GiB/91552msec)
clat (usec): min=3, max=11409, avg=4436.00, stdev=850.21
randwrite bs=4k iodepth 128
write: IOPS=1182, BW=4731KiB/s (4845kB/s)(832MiB/180001msec); 0 zone resets
clat (usec): min=9, max=17207k, avg=107364.51, stdev=1136028.09
emulation webserver randrw 70/30 512b-512kb
read: IOPS=2222, BW=16.5MiB/s (17.3MB/s)(2963MiB/180001msec)
clat (usec): min=28, max=20190k, avg=80256.11, stdev=948934.40
write: IOPS=955, BW=7160KiB/s (7332kB/s)(1259MiB/180001msec); 0 zone resets
clat (usec): min=547, max=20190k, avg=80170.91, stdev=957988.32


FIO TEST (pool1, ncq off/queue_depth=1)
/root/fio-test.sh /zfs1-vps1/test123/
write bs=1mb iodepth 4
write: IOPS=211, BW=211MiB/s (222MB/s)(10.0GiB/48417msec); 0 zone resets
clat (usec): min=8, max=117745, avg=14187.79, stdev=9615.54
read bs=1mb iodepth 4
read: IOPS=2873, BW=2873MiB/s (3013MB/s)(10.0GiB/3564msec)
clat (usec): min=3, max=4568, avg=1044.60, stdev=403.51
read bs=4k iodepth 128
read: IOPS=164k, BW=640MiB/s (671MB/s)(10.0GiB/16005msec)
clat (usec): min=2, max=3845, avg=775.99, stdev=182.62
write bs=4k iodepth 128
write: IOPS=12.2k, BW=47.8MiB/s (50.1MB/s)(8603MiB/180001msec); 0 zone resets
clat (usec): min=8, max=19875, avg=10381.98, stdev=2864.57
randread bs=4k iodepth 128
read: IOPS=28.9k, BW=113MiB/s (118MB/s)(10.0GiB/90823msec)
clat (usec): min=2, max=15603, avg=4400.74, stdev=972.05
randwrite bs=4k iodepth 128
write: IOPS=814, BW=3257KiB/s (3335kB/s)(573MiB/180003msec); 0 zone resets
clat (usec): min=13, max=333106, avg=155972.82, stdev=111567.94
emulation webserver randrw 70/30 512b-512kb
read: IOPS=949, BW=9268KiB/s (9491kB/s)(1629MiB/180001msec)
clat (msec): min=2, max=291, avg=187.80, stdev=26.38
write: IOPS=407, BW=3918KiB/s (4012kB/s)(689MiB/180001msec); 0 zone resets
clat (usec): min=30, max=289095, avg=187855.45, stdev=26346.03
 
The firmware version is 15.10.01.00. Is it worth updating to P19 or P20?
 
I'd say yes. I have had issues with everything below v18 / P18 - in combination with MDADM though. ZFS wasn't for me back then.
 
Unfortunately, my server configuration is not like yours.

Supermicro X11DDW-NT BIOS 3.3
SATA controller: Intel Corporation Lewisburg SATA Controller [AHCI mode] (rev 09)
RAIDZ 4 x Samsung SSD 860 PRO 4TB Firmware Version: RVM01B6Q

Code:
write bs=1mb iodepth 4
  write: IOPS=1143, BW=1143MiB/s (1199MB/s)(64.0GiB/57323msec); 0 zone resets
    clat (usec): min=3, max=19860, avg=2625.57, stdev=338.50
read bs=1mb iodepth 4
  read: IOPS=2293, BW=2294MiB/s (2405MB/s)(64.0GiB/28574msec)
    clat (usec): min=2, max=15247, avg=1308.93, stdev=393.33
read bs=4k iodepth 128
  read: IOPS=205k, BW=801MiB/s (840MB/s)(64.0GiB/81853msec)
    clat (usec): min=2, max=8222, avg=620.22, stdev=70.17
write bs=4k iodepth 128
  write: IOPS=111k, BW=435MiB/s (456MB/s)(64.0GiB/150559msec); 0 zone resets
    clat (usec): min=3, max=102057, avg=1140.40, stdev=481.99
randread bs=4k iodepth 128
  read: IOPS=22.6k, BW=88.1MiB/s (92.4MB/s)(15.5GiB/180001msec)
    clat (usec): min=2, max=15660, avg=5630.11, stdev=387.70
randwrite bs=4k iodepth 128
  write: IOPS=12.7k, BW=49.7MiB/s (52.2MB/s)(8955MiB/180001msec); 0 zone resets
    clat (usec): min=5, max=123727, avg=9973.03, stdev=3069.27
emulation webserver randrw 70/30 512b-512kb
  read: IOPS=11.6k, BW=91.2MiB/s (95.6MB/s)(16.0GiB/180001msec)
    clat (usec): min=94, max=82661, avg=15432.63, stdev=3386.68
  write: IOPS=4957, BW=39.0MiB/s (40.9MB/s)(7022MiB/180001msec); 0 zone resets
    clat (usec): min=28, max=82616, avg=15437.90, stdev=3393.87

pveperf FSYNCS/SECOND: 726

The same benchmark on a single SSD with ext4, mounted with -o noatime,discard:

Code:
write bs=1mb iodepth 4
  write: IOPS=512, BW=513MiB/s (538MB/s)(64.0GiB/127844msec); 0 zone resets
    clat (usec): min=2086, max=30517, avg=7640.86, stdev=286.06
read bs=1mb iodepth 4
  read: IOPS=540, BW=541MiB/s (567MB/s)(64.0GiB/121165msec)
    clat (usec): min=2122, max=10914, avg=7249.80, stdev=80.91
read bs=4k iodepth 128
  read: IOPS=126k, BW=490MiB/s (514MB/s)(64.0GiB/133640msec)
    clat (usec): min=277, max=4012, avg=1017.21, stdev=27.55
write bs=4k iodepth 128
  write: IOPS=116k, BW=455MiB/s (477MB/s)(64.0GiB/144093msec); 0 zone resets
    clat (usec): min=163, max=5934, avg=1096.71, stdev=53.29
randread bs=4k iodepth 128
  read: IOPS=96.5k, BW=377MiB/s (395MB/s)(64.0GiB/173880msec)
    clat (usec): min=575, max=9601, avg=1317.19, stdev=211.90
randwrite bs=4k iodepth 128
  write: IOPS=81.3k, BW=318MiB/s (333MB/s)(55.8GiB/180011msec); 0 zone resets
    clat (usec): min=810, max=19487, avg=1562.29, stdev=373.08
emulation webserver randrw 70/30 512b-512kb
  read: IOPS=34.7k, BW=183MiB/s (192MB/s)(32.2GiB/180002msec)
    clat (usec): min=938, max=27300, avg=5138.51, stdev=1614.50
  write: IOPS=14.9k, BW=78.4MiB/s (82.2MB/s)(13.8GiB/180002msec); 0 zone resets
    clat (usec): min=1103, max=30256, avg=5159.62, stdev=1711.65
 
Testing variables:
- firmware P20.0.7.0 (Crossflash from H220 to Generic LSI 9207-8e),
- compression = off,
- atime = off,
- recordsize=128k,
- ncq disabled (cat /sys/block/sda/device/queue_depth = 1),
- ashift = 12,

After upgrading the H220 to P20 (crossflashed to LSI 9207-8e firmware):

FIO TEST (system pool, ncq off/queue_depth=1)

write bs=1mb iodepth 4
write: IOPS=118, BW=119MiB/s (124MB/s)(10.0GiB/86367msec); 0 zone resets
clat (usec): min=4, max=20830k, avg=25305.85, stdev=588079.02
read bs=1mb iodepth 4
read: IOPS=2512, BW=2513MiB/s (2635MB/s)(10.0GiB/4075msec)
clat (usec): min=3, max=4488, avg=1194.31, stdev=281.53
read bs=4k iodepth 128
read: IOPS=165k, BW=644MiB/s (676MB/s)(10.0GiB/15891msec)
clat (usec): min=2, max=2350, avg=770.55, stdev=147.48
write bs=4k iodepth 128
write: IOPS=28.5k, BW=111MiB/s (117MB/s)(10.0GiB/92117msec); 0 zone resets
clat (usec): min=3, max=14180k, avg=4463.72, stdev=175793.89
randread bs=4k iodepth 128
read: IOPS=27.0k, BW=109MiB/s (115MB/s)(10.0GiB/93695msec)
clat (usec): min=2, max=11616, avg=4539.83, stdev=868.79
randwrite bs=4k iodepth 128
write: IOPS=872, BW=3488KiB/s (3572kB/s)(634MiB/186158msec); 0 zone resets
clat (usec): min=26, max=25074k, avg=145621.18, stdev=1573254.69
emulation webserver randrw 70/30 512b-512kb
read: IOPS=2088, BW=15.4MiB/s (16.1MB/s)(2996MiB/194708msec)
clat (msec): min=8, max=26055, avg=85.40, stdev=1152.56
write: IOPS=896, BW=6689KiB/s (6850kB/s)(1272MiB/194708msec); 0 zone resets
clat (usec): min=39, max=26055k, avg=85443.87, stdev=1142500.79


FIO TEST (pool1, ncq off/queue_depth=1)
write bs=1mb iodepth 4
write: IOPS=520, BW=521MiB/s (546MB/s)(10.0GiB/19673msec); 0 zone resets
clat (usec): min=10, max=9296, avg=5766.45, stdev=2663.81
read bs=1mb iodepth 4
read: IOPS=2324, BW=2324MiB/s (2437MB/s)(10.0GiB/4406msec)
clat (usec): min=3, max=4773, avg=1291.27, stdev=249.54
read bs=4k iodepth 128
read: IOPS=169k, BW=662MiB/s (694MB/s)(10.0GiB/15466msec)
clat (usec): min=2, max=3695, avg=749.92, stdev=155.66
write bs=4k iodepth 128
write: IOPS=86.0k, BW=336MiB/s (352MB/s)(10.0GiB/30470msec); 0 zone resets
clat (usec): min=3, max=5180, avg=1477.03, stdev=299.92
randread bs=4k iodepth 128
read: IOPS=28.0k, BW=110MiB/s (115MB/s)(10.0GiB/93515msec)
clat (usec): min=2, max=12255, avg=4531.08, stdev=938.18
randwrite bs=4k iodepth 128
write: IOPS=3651, BW=14.3MiB/s (14.0MB/s)(2568MiB/180001msec); 0 zone resets
clat (usec): min=9, max=132642, avg=34780.66, stdev=5308.62
emulation webserver randrw 70/30 512b-512kb
read: IOPS=9865, BW=40.8MiB/s (42.8MB/s)(7177MiB/175700msec)
clat (usec): min=67, max=38403, avg=18082.66, stdev=6228.58
write: IOPS=4229, BW=17.4MiB/s (18.3MB/s)(3063MiB/175700msec); 0 zone resets
clat (usec): min=12, max=38673, avg=18112.79, stdev=6227.13


FIO TEST (pool2 ncq off/queue_depth=1)
write bs=1mb iodepth 4
write: IOPS=473, BW=473MiB/s (496MB/s)(10.0GiB/21640msec); 0 zone resets
clat (usec): min=9, max=30601, avg=6341.10, stdev=2224.91
read bs=1mb iodepth 4
read: IOPS=960, BW=961MiB/s (1008MB/s)(10.0GiB/10656msec)
clat (usec): min=2, max=9407, avg=3122.13, stdev=1102.09
read bs=4k iodepth 128
read: IOPS=131k, BW=511MiB/s (535MB/s)(10.0GiB/20057msec)
clat (usec): min=2, max=4795, avg=972.25, stdev=159.78
write bs=4k iodepth 128
write: IOPS=60.5k, BW=236MiB/s (248MB/s)(10.0GiB/43342msec); 0 zone resets
clat (usec): min=3, max=7089, avg=2100.76, stdev=917.96
randread bs=4k iodepth 128
read: IOPS=63.4k, BW=248MiB/s (260MB/s)(10.0GiB/41349msec)
clat (usec): min=2, max=12485, avg=2003.65, stdev=459.32
randwrite bs=4k iodepth 128
write: IOPS=21.7k, BW=84.8MiB/s (88.9MB/s)(10.0GiB/120747msec); 0 zone resets
clat (usec): min=3, max=13451, avg=5851.61, stdev=2222.43
emulation webserver randrw 70/30 512b-512kb
read: IOPS=26.5k, BW=110MiB/s (115MB/s)(7177MiB/65355msec)
clat (usec): min=24, max=25868, avg=6727.14, stdev=3649.66
write: IOPS=11.4k, BW=46.9MiB/s (49.1MB/s)(3063MiB/65355msec); 0 zone resets
clat (usec): min=10, max=25834, avg=6736.05, stdev=3657.01

FIO TEST (ext4 single drive, ncq off/queue_depth=1)
write bs=1mb iodepth 4
write: IOPS=473, BW=473MiB/s (496MB/s)(10.0GiB/21639msec); 0 zone resets
clat (usec): min=2334, max=15815, avg=8308.61, stdev=194.15
read bs=1mb iodepth 4
read: IOPS=503, BW=503MiB/s (528MB/s)(10.0GiB/20340msec)
clat (usec): min=3953, max=11802, avg=7834.96, stdev=120.67
read bs=4k iodepth 128
read: IOPS=106k, BW=416MiB/s (436MB/s)(10.0GiB/24645msec)
clat (usec): min=570, max=3907, avg=1196.66, stdev=151.64
write bs=4k iodepth 128
write: IOPS=102k, BW=400MiB/s (419MB/s)(10.0GiB/25626msec); 0 zone resets
clat (usec): min=679, max=11261, avg=1244.52, stdev=229.33
randread bs=4k iodepth 128
read: IOPS=9733, BW=38.0MiB/s (39.9MB/s)(6844MiB/180013msec)
clat (usec): min=187, max=215402, avg=13131.91, stdev=9060.26
randwrite bs=4k iodepth 128
write: IOPS=22.2k, BW=86.8MiB/s (90.0MB/s)(10.0GiB/118032msec); 0 zone resets
clat (usec): min=75, max=92003, avg=5744.31, stdev=4091.72
emulation webserver randrw 70/30 512b-512kb
read: IOPS=6985, BW=33.3MiB/s (34.9MB/s)(5990MiB/180026msec)
clat (usec): min=177, max=152845, avg=30506.63, stdev=26359.95
write: IOPS=2994, BW=14.2MiB/s (14.9MB/s)(2553MiB/180026msec); 0 zone resets
clat (usec): min=64, max=116782, avg=14234.82, stdev=13641.12
 
The next test I ran compared different ashift values:

ASHIFT = 0
write bs=1mb iodepth 4
write: IOPS=509, BW=510MiB/s (535MB/s)(10.0GiB/20087msec); 0 zone resets
clat (usec): min=8, max=107857, avg=5888.93, stdev=2487.83
read bs=1mb iodepth 4
read: IOPS=2244, BW=2244MiB/s (2353MB/s)(10.0GiB/4563msec)
clat (usec): min=3, max=4786, avg=1337.35, stdev=255.68
read bs=4k iodepth 128
read: IOPS=166k, BW=649MiB/s (681MB/s)(10.0GiB/15774msec)
clat (usec): min=2, max=2873, avg=764.83, stdev=187.12
write bs=4k iodepth 128
write: IOPS=87.2k, BW=341MiB/s (357MB/s)(10.0GiB/30050msec); 0 zone resets
clat (usec): min=3, max=4311, avg=1456.65, stdev=344.98
randread bs=4k iodepth 128
read: IOPS=26.9k, BW=105MiB/s (110MB/s)(10.0GiB/97360msec)
clat (usec): min=3, max=12494, avg=4717.36, stdev=894.87
randwrite bs=4k iodepth 128
write: IOPS=3653, BW=14.3MiB/s (14.0MB/s)(2569MiB/180001msec); 0 zone resets
clat (usec): min=10, max=125955, avg=34761.49, stdev=5461.84
emulation webserver randrw 70/30 512b-512kb
read: IOPS=9878, BW=40.9MiB/s (42.9MB/s)(7177MiB/175485msec)
clat (usec): min=39, max=35740, avg=18064.05, stdev=6126.58
write: IOPS=4234, BW=17.5MiB/s (18.3MB/s)(3063MiB/175485msec); 0 zone resets
clat (usec): min=8, max=36173, avg=18085.19, stdev=6120.08


ASHIFT = 9
write bs=1mb iodepth 4
write: IOPS=513, BW=514MiB/s (539MB/s)(10.0GiB/19939msec); 0 zone resets
clat (usec): min=16, max=110592, avg=5845.04, stdev=3643.51
read bs=1mb iodepth 4
read: IOPS=2224, BW=2224MiB/s (2332MB/s)(10.0GiB/4604msec)
clat (usec): min=3, max=4693, avg=1349.42, stdev=323.85
read bs=4k iodepth 128
read: IOPS=172k, BW=671MiB/s (704MB/s)(10.0GiB/15253msec)
clat (usec): min=3, max=2374, avg=739.55, stdev=137.63
write bs=4k iodepth 128
write: IOPS=83.5k, BW=326MiB/s (342MB/s)(10.0GiB/31399msec); 0 zone resets
clat (usec): min=3, max=5196, avg=1522.05, stdev=359.36
randread bs=4k iodepth 128
read: IOPS=25.8k, BW=101MiB/s (105MB/s)(10.0GiB/101778msec)
clat (usec): min=2, max=12178, avg=4931.37, stdev=958.71
randwrite bs=4k iodepth 128
write: IOPS=3650, BW=14.3MiB/s (14.9MB/s)(2566MiB/180001msec); 0 zone resets
clat (usec): min=10, max=123027, avg=34796.32, stdev=5219.97
emulation webserver randrw 70/30 512b-512kb
read: IOPS=9869, BW=40.9MiB/s (42.8MB/s)(7177MiB/175639msec)
clat (usec): min=59, max=37972, avg=18077.05, stdev=6205.07
write: IOPS=4231, BW=17.4MiB/s (18.3MB/s)(3063MiB/175639msec); 0 zone resets
clat (usec): min=43, max=37919, avg=18106.69, stdev=6200.01



ASHIFT=12
write bs=1mb iodepth 4
write: IOPS=508, BW=509MiB/s (533MB/s)(10.0GiB/20137msec); 0 zone resets
clat (usec): min=16, max=11319, avg=5903.43, stdev=1819.80
read bs=1mb iodepth 4
read: IOPS=2424, BW=2424MiB/s (2542MB/s)(10.0GiB/4224msec)
clat (usec): min=3, max=4757, avg=1238.06, stdev=245.12
read bs=4k iodepth 128
read: IOPS=172k, BW=671MiB/s (704MB/s)(10.0GiB/15258msec)
clat (usec): min=2, max=2485, avg=739.78, stdev=142.23
write bs=4k iodepth 128
write: IOPS=81.5k, BW=318MiB/s (334MB/s)(10.0GiB/32180msec); 0 zone resets
clat (usec): min=3, max=4915, avg=1559.91, stdev=389.57
randread bs=4k iodepth 128
read: IOPS=27.0k, BW=109MiB/s (115MB/s)(10.0GiB/93770msec)
clat (usec): min=2, max=12628, avg=4543.48, stdev=905.01
randwrite bs=4k iodepth 128
write: IOPS=3662, BW=14.3MiB/s (15.0MB/s)(2575MiB/180001msec); 0 zone resets
clat (usec): min=10, max=130044, avg=34678.70, stdev=5288.31
emulation webserver randrw 70/30 512b-512kb
read: IOPS=9911, BW=41.0MiB/s (43.0MB/s)(7177MiB/174901msec)
clat (usec): min=70, max=35210, avg=18000.66, stdev=6298.07
write: IOPS=4249, BW=17.5MiB/s (18.4MB/s)(3063MiB/174901msec); 0 zone resets
clat (usec): min=12, max=35153, avg=18030.61, stdev=6285.10


ASHIFT=13
write bs=1mb iodepth 4
write: IOPS=509, BW=510MiB/s (535MB/s)(10.0GiB/20082msec); 0 zone resets
clat (usec): min=8, max=10417, avg=5887.34, stdev=1840.02
read bs=1mb iodepth 4
read: IOPS=2294, BW=2295MiB/s (2406MB/s)(10.0GiB/4462msec)
clat (usec): min=3, max=4633, avg=1307.79, stdev=252.53
read bs=4k iodepth 128
read: IOPS=175k, BW=684MiB/s (717MB/s)(10.0GiB/14970msec)
clat (usec): min=2, max=2432, avg=725.86, stdev=132.07
write bs=4k iodepth 128
write: IOPS=80.5k, BW=315MiB/s (330MB/s)(10.0GiB/32545msec); 0 zone resets
clat (usec): min=3, max=5157, avg=1577.59, stdev=398.41
randread bs=4k iodepth 128
read: IOPS=26.8k, BW=105MiB/s (110MB/s)(10.0GiB/97940msec)
clat (usec): min=2, max=12043, avg=4745.48, stdev=1000.95
randwrite bs=4k iodepth 128
write: IOPS=3658, BW=14.3MiB/s (14.0MB/s)(2572MiB/180001msec); 0 zone resets
clat (usec): min=10, max=42706, avg=34718.98, stdev=4880.64
emulation webserver randrw 70/30 512b-512kb
read: IOPS=9869, BW=40.9MiB/s (42.8MB/s)(7177MiB/175631msec)
clat (usec): min=58, max=37680, avg=18077.16, stdev=6217.01
write: IOPS=4231, BW=17.4MiB/s (18.3MB/s)(3063MiB/175631msec); 0 zone resets
clat (usec): min=22, max=37568, avg=18104.19, stdev=6213.95
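
For readers unsure what the ashift values above stand for: ashift is the base-2 logarithm of the sector size ZFS assumes for the vdev, and ashift=0 asks ZFS to detect it from the device. A small sketch (ashift_to_bytes is a made-up helper name) that spells out the four tested values:

```shell
#!/bin/sh
# Sector size in bytes implied by a ZFS ashift value (2^ashift).
# ashift=0 is not a size: it asks ZFS to auto-detect from the device.
# ashift_to_bytes is a hypothetical helper name used only for this example.
ashift_to_bytes() {
    if [ "$1" -eq 0 ]; then
        echo "auto"
    else
        echo $((1 << $1))
    fi
}

for a in 0 9 12 13; do
    echo "ashift=$a -> $(ashift_to_bytes "$a")"
done
```

According to the lshw output earlier in the thread, all six drives report 512-byte logical and physical sectors.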
 
I also ran some tests with different recordsize values, for others to compare.

recordsize 32K, pool1
write bs=1mb iodepth 4
write: IOPS=499, BW=500MiB/s (524MB/s)(10.0GiB/20485msec); 0 zone resets
clat (usec): min=8, max=12788, avg=6004.62, stdev=2669.45
read bs=1mb iodepth 4
read: IOPS=2323, BW=2324MiB/s (2436MB/s)(10.0GiB/4407msec)
clat (usec): min=3, max=4742, avg=1291.60, stdev=259.51
read bs=4k iodepth 128
read: IOPS=163k, BW=635MiB/s (666MB/s)(10.0GiB/16115msec)
clat (usec): min=2, max=2432, avg=781.36, stdev=163.83
write bs=4k iodepth 128
write: IOPS=81.9k, BW=320MiB/s (335MB/s)(10.0GiB/32011msec); 0 zone resets
clat (usec): min=3, max=3737, avg=1551.72, stdev=318.67
randread bs=4k iodepth 128
read: IOPS=27.0k, BW=106MiB/s (111MB/s)(10.0GiB/97059msec)
clat (usec): min=2, max=12486, avg=4702.73, stdev=904.91
randwrite bs=4k iodepth 128
write: IOPS=3645, BW=14.2MiB/s (14.9MB/s)(2563MiB/180001msec); 0 zone resets
clat (usec): min=8, max=121420, avg=34840.43, stdev=5102.03
emulation webserver randrw 70/30 512b-512kb
read: IOPS=9848, BW=40.8MiB/s (42.8MB/s)(7177MiB/176007msec)
clat (usec): min=67, max=42084, avg=18114.50, stdev=6203.74
write: IOPS=4222, BW=17.4MiB/s (18.2MB/s)(3063MiB/176007msec); 0 zone resets
clat (usec): min=46, max=42067, avg=18146.24, stdev=6205.03

recordsize 16K, pool1
write bs=1mb iodepth 4
write: IOPS=449, BW=450MiB/s (472MB/s)(10.0GiB/22762msec); 0 zone resets
clat (usec): min=9, max=25053, avg=6670.56, stdev=2414.50
read bs=1mb iodepth 4
read: IOPS=967, BW=968MiB/s (1015MB/s)(10.0GiB/10581msec)
clat (usec): min=2, max=14562, avg=3100.03, stdev=966.29
read bs=4k iodepth 128
read: IOPS=128k, BW=501MiB/s (525MB/s)(10.0GiB/20447msec)
clat (usec): min=2, max=4831, avg=991.14, stdev=183.55
write bs=4k iodepth 128
write: IOPS=55.6k, BW=217MiB/s (228MB/s)(10.0GiB/47161msec); 0 zone resets
clat (usec): min=3, max=7235, avg=2285.88, stdev=1130.46
randread bs=4k iodepth 128
read: IOPS=56.3k, BW=220MiB/s (231MB/s)(10.0GiB/46563msec)
clat (usec): min=3, max=14885, avg=2256.44, stdev=487.97
randwrite bs=4k iodepth 128
write: IOPS=20.4k, BW=79.5MiB/s (83.4MB/s)(10.0GiB/128752msec); 0 zone resets
clat (usec): min=7, max=13696, avg=6239.55, stdev=2259.08
emulation webserver randrw 70/30 512b-512kb
read: IOPS=25.4k, BW=105MiB/s (110MB/s)(7177MiB/68113msec)
clat (usec): min=20, max=23596, avg=7014.93, stdev=3802.11
write: IOPS=10.9k, BW=44.0MiB/s (47.2MB/s)(3063MiB/68113msec); 0 zone resets
clat (usec): min=10, max=23563, avg=7010.79, stdev=3795.70



recordsize 8K, pool1
write bs=1mb iodepth 4
write: IOPS=347, BW=348MiB/s (365MB/s)(10.0GiB/29442msec); 0 zone resets
clat (usec): min=6, max=23186, avg=8626.54, stdev=2963.52
read bs=1mb iodepth 4
read: IOPS=705, BW=705MiB/s (740MB/s)(10.0GiB/14515msec)
clat (usec): min=3, max=13876, avg=4251.95, stdev=1026.51
read bs=4k iodepth 128
read: IOPS=107k, BW=416MiB/s (437MB/s)(10.0GiB/24586msec)
clat (usec): min=2, max=7246, avg=1191.54, stdev=195.43
write bs=4k iodepth 128
write: IOPS=45.8k, BW=179MiB/s (188MB/s)(10.0GiB/57222msec); 0 zone resets
clat (usec): min=3, max=9356, avg=2773.39, stdev=1639.92
randread bs=4k iodepth 128
read: IOPS=67.3k, BW=263MiB/s (276MB/s)(10.0GiB/38972msec)
clat (usec): min=3, max=15985, avg=1888.53, stdev=610.87
randwrite bs=4k iodepth 128
write: IOPS=24.7k, BW=96.4MiB/s (101MB/s)(10.0GiB/106265msec); 0 zone resets
clat (usec): min=3, max=14988, avg=5149.83, stdev=2673.49
emulation webserver randrw 70/30 512b-512kb
read: IOPS=25.8k, BW=107MiB/s (112MB/s)(7177MiB/67209msec)
clat (usec): min=18, max=35565, avg=6921.36, stdev=4596.51
write: IOPS=11.1k, BW=45.6MiB/s (47.8MB/s)(3063MiB/67209msec); 0 zone resets
clat (usec): min=7, max=35492, avg=6919.02, stdev=4587.26
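
To make the recordsize runs easier to compare, the 4k randwrite IOPS from each run can be put next to the recordsize=128k result for pool1 after the firmware update (IOPS=3651). This is only a sketch; iops_to_number and speedup are names invented here, and the IOPS values are copied from the outputs above.

```shell
#!/bin/sh
# Convert a fio IOPS field such as "3651" or "20.4k" to a plain number.
# iops_to_number and speedup are hypothetical helper names.
iops_to_number() {
    awk -v v="$1" 'BEGIN {
        n = v + 0                      # numeric prefix, e.g. 20.4 from "20.4k"
        if (v ~ /[kK]$/) n *= 1000     # fio abbreviates thousands with "k"
        printf "%.0f\n", n
    }'
}

# Ratio of a result to the baseline, rounded to one decimal.
speedup() {
    awk -v new="$1" -v base="$2" 'BEGIN { printf "%.1f\n", new / base }'
}

base=$(iops_to_number 3651)   # recordsize=128k, pool1, after the P20 firmware
for entry in 32K:3645 16K:20.4k 8K:24.7k; do
    rs=${entry%%:*}
    iops=$(iops_to_number "${entry#*:}")
    echo "recordsize=$rs randwrite_iops=$iops speedup=$(speedup "$iops" "$base")x"
done
```

In these runs the 32K result is on par with 128k, while 16K and 8K are several times faster for 4k random writes, at the cost of lower 1mb sequential throughput.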
 
@sa10 could you give me some information about your test environment?

For a proper comparison I need the ZFS dataset parameters:
- recordsize,
- ashift,
- compression,
- atime.
 
I changed the script a bit to bring the results closer to reality.
I added the option to make the test data about 30% compressible (PERC="30"), as on my production system.
That changed the results, but only slightly.
 


From my point of view:
- flashing the generic LSI P20.0.7.0 firmware onto the H220 gives about 30%,
- turning off NCQ (for i in a b c d e f; do echo 1 > /sys/block/sd$i/device/queue_depth; done) gives about 3-6%,
- changing the scheduler from mq-deadline to none (for i in a b c d e f; do echo none > /sys/block/sd$i/queue/scheduler; done) gives another few percent,
- creating the pool with ashift = 12 also gives a few percent.
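
The two sysfs loops from the list can be combined into one helper. This is a sketch under assumptions: apply_queue_tuning and the SYSFS_ROOT override are names invented here (the override lets the function be tried on a scratch directory first), and it has to run as root on the real /sys. Values written to sysfs this way do not survive a reboot, so they would have to be reapplied at boot by whatever mechanism you prefer.

```shell
#!/bin/sh
# Set queue_depth=1 (NCQ effectively off) and scheduler=none for each named disk.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
apply_queue_tuning() {
    root="${SYSFS_ROOT:-/sys}"
    for disk in "$@"; do
        depth_file="$root/block/$disk/device/queue_depth"
        sched_file="$root/block/$disk/queue/scheduler"
        if [ ! -w "$depth_file" ] || [ ! -w "$sched_file" ]; then
            echo "$disk: skipped (not found or not writable)" >&2
            continue
        fi
        echo 1    > "$depth_file"
        echo none > "$sched_file"
    done
}

# Example call (as root), matching the disks in this thread:
# apply_queue_tuning sda sdb sdc sdd sde sdf
```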

I don't think I will get any more out of my SSD drives ;-)

I marked this thread as TUTORIAL because many people regularly ask here about ZFS performance.
 
