Fresh install 4.4: ZFS performance 6x Samsung 850 evo

Bagwani

Dec 26, 2016
Hi,

Currently I am testing my new Proxmox 4.4 home server with the following configuration:
  • 1x motherboard - ASRock C2750D4I
  • 6x SSD - Samsung 850 EVO 500GB
  • 4x memory 8GB ECC - Kingston Technology 8GB 1600MHz DDR3L Module KVR16LE11/8
  • 1x power supply - be quiet! Pure Power L8 - 300W BN220
  • 1x case - Fractal Design Node 304
Starting conditions:
  1. Tests were performed with no VMs running.
  2. ashift is 12
  3. I use the whole SSD
  4. the ZFS pool includes root
  5. compression is on (lz4)
  6. atime is off
  7. drives are connected directly to the onboard SATA3 ports

Running the following bonnie++ test gives me this result:
Code:
root@pve1:~# bonnie++ -u root -r 1024 -s 16384 -d /rpool -f -b -n 1 -c 8
Using uid:0, gid:0.
Writing intelligently...done
Rewriting...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   8     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
pve1            16G           341123  91 255745  98           1011765  99  5241  81
Latency                       10201us   26746us             54778us     450ms
Version  1.97       ------Sequential Create------ --------Random Create--------
pve1                -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                  1   131   2 +++++ +++   160   3   203   4 +++++ +++   204   3
Latency              4624us     144us   22984us    4590us      12us    4595us
1.97,1.97,pve1,8,1482783936,16G,,,,341123,91,255745,98,,,1011765,99,5241,81,1,,,,,131,2,+++++,+++,160,3,203,4,+++++,+++,204,3,,10201us,26746us,,54778us,450ms,4624us,144us,22984us,4590us,12us,4595us
Summarized:
Write: 333 MByte/sec
Rewrite: 249 MByte/sec
Read: 988 MByte/sec
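
The summary figures can be reproduced from the CSV line that bonnie++ prints last. A minimal sketch, assuming the field positions seen in this 1.97 output (block write in field 10, rewrite in field 12, block read in field 16, all in K/sec); the function name `bonnie_summary` is made up for illustration:

```shell
#!/bin/sh
# Convert the sequential block results of a bonnie++ 1.97 CSV line to MByte/sec.
# Assumption: fields 10, 12 and 16 hold block write, rewrite and block read in
# K/sec, as in the CSV line of this run. Other versions may order fields differently.
bonnie_summary() {
    awk -F, '{
        printf "Write: %d MByte/sec\n",   int($10 / 1024)
        printf "Rewrite: %d MByte/sec\n", int($12 / 1024)
        printf "Read: %d MByte/sec\n",    int($16 / 1024)
    }'
}

# Shortened copy of the CSV line from the pve1 run (fields after the seek results omitted)
echo '1.97,1.97,pve1,8,1482783936,16G,,,,341123,91,255745,98,,,1011765,99,5241,81' | bonnie_summary
```

The values are truncated rather than rounded, which matches the numbers quoted in the summary.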

And these are my pveperf results
Code:
root@pve1:~# pveperf /rpool
CPU BOGOMIPS:  38401.52
REGEX/SECOND:  394533
HD SIZE:  1639.99 GB (rpool)
FSYNCS/SECOND:  637.96
DNS EXT:  45.46 ms
DNS INT:  16.91 ms (somedomain.com)  note: domain is obfuscated with somedomain.com

I was expecting more performance from 6 SSDs in RAIDZ2.

I found a test where the person got twice the performance out of 6x 256GB SSD drives
source: https://calomel.org/zfs_raid_speed_capacity.html
6x 256GB raid6, raidz2 933 gigabytes ( w= 721MB/s , rw=530MB/s , r=1754MB/s )

Questions:
1) Is this the performance I can expect from 6x Samsung 850 EVO in a RAIDZ2 configuration?
2) If not, what should I change in my ZFS/Proxmox settings?
3) How can I set metaslab_lba_weighting_enabled to 0 and make it persist across reboots?
4) I am using only SSDs; shouldn't that give more performance?

# By default metaslab_lba_weighting_enabled=1 on my system
Code:
root@pve1:~# cat /sys/module/zfs/parameters/metaslab_lba_weighting_enabled
1

What I tried in order to set metaslab_lba_weighting_enabled to 0 (neither attempt worked):
1) put options zfs metaslab_lba_weighting_enabled=0 in /lib/modules-load.d/zfs.conf
2) put options zfs metaslab_lba_weighting_enabled=0 in /etc/modprobe.d/zfs.conf

Below you can find detailed information about my tests and setup
Code:
root@pve1:~# pveversion -v
proxmox-ve: 4.4-76 (running kernel: 4.4.35-1-pve)
pve-manager: 4.4-2 (running version: 4.4-2/80259e05)
pve-kernel-4.4.35-1-pve: 4.4.35-76
lvm2: 2.02.116-pve3
corosync-pve: 2.4.0-1
libqb0: 1.0-1
pve-cluster: 4.0-48
qemu-server: 4.0-102
pve-firmware: 1.1-10
libpve-common-perl: 4.0-84
libpve-access-control: 4.0-19
libpve-storage-perl: 4.0-70
pve-libspice-server1: 0.12.8-1
vncterm: 1.2-1
pve-docs: 4.4-1
pve-qemu-kvm: 2.7.0-9
pve-container: 1.0-89
pve-firewall: 2.0-33
pve-ha-manager: 1.0-38
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u2
lxc-pve: 2.0.6-2
lxcfs: 2.0.5-pve1
criu: 1.6.0-1
novnc-pve: 0.5-8
smartmontools: 6.5+svn4324-1~pve80
zfsutils: 0.6.5.8-pve13~bpo80

Code:
root@pve1:~# zpool status -v
  pool: rpool
state: ONLINE
  scan: scrub repaired 0 in 0h5m with 0 errors on Mon Dec 26 14:09:15 2016
config:

  NAME  STATE  READ WRITE CKSUM
  rpool  ONLINE  0  0  0
  raidz2-0  ONLINE  0  0  0
  ata-Samsung_SSD_850_EVO_500GB_S2XXXXXXXXXXXXX-part2  ONLINE  0  0  0
  ata-Samsung_SSD_850_EVO_500GB_S2XXXXXXXXXXXXX-part2  ONLINE  0  0  0
  ata-Samsung_SSD_850_EVO_500GB_S2XXXXXXXXXXXXX-part2  ONLINE  0  0  0
  ata-Samsung_SSD_850_EVO_500GB_S2XXXXXXXXXXXXX-part2  ONLINE  0  0  0
  ata-Samsung_SSD_850_EVO_500GB_S2XXXXXXXXXXXXX-part2  ONLINE  0  0  0
  ata-Samsung_SSD_850_EVO_500GB_S2XXXXXXXXXXXXX-part2  ONLINE  0  0  0
note: serial number is obfuscated with XXXXXXXXXXXXX

Code:
root@pve1:~# zpool list
NAME  SIZE  ALLOC  FREE  EXPANDSZ  FRAG  CAP  DEDUP  HEALTH  ALTROOT
rpool  2.72T  222G  2.50T  -  3%  7%  1.00x  ONLINE  -

Code:
root@pve1:~# sysctl vm.swappiness
vm.swappiness = 10

Code:
root@pve1:~# service ksmtuned status
● ksmtuned.service - Kernel Samepage Merging (KSM) Tuning Daemon
  Loaded: loaded (/lib/systemd/system/ksmtuned.service; disabled)
  Active: inactive (dead)

Code:
root@pve1:~# vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r  b  swpd  free  buff  cache  si  so  bi  bo  in  cs us sy id wa st
0  0  0 30465440  0 152460  0  0  3157  1  324  28  1  3 96  0  0

Code:
root@pve1:~# zfs get all rpool
NAME  PROPERTY  VALUE  SOURCE
rpool  type  filesystem  -
rpool  creation  Sun Dec 25 13:23 2016  -
rpool  used  156G  -
rpool  available  1.60T  -
rpool  referenced  192K  -
rpool  compressratio  1.12x  -
rpool  mounted  yes  -
rpool  quota  none  default
rpool  reservation  none  default
rpool  recordsize  128K  default
rpool  mountpoint  /rpool  default
rpool  sharenfs  off  default
rpool  checksum  on  default
rpool  compression  lz4  local
rpool  atime  off  local
rpool  devices  on  default
rpool  exec  on  default
rpool  setuid  on  default
rpool  readonly  off  default
rpool  zoned  off  default
rpool  snapdir  hidden  default
rpool  aclinherit  restricted  default
rpool  canmount  on  default
rpool  xattr  sa  local
rpool  copies  1  default
rpool  version  5  -
rpool  utf8only  off  -
rpool  normalization  none  -
rpool  casesensitivity  sensitive  -
rpool  vscan  off  default
rpool  nbmand  off  default
rpool  sharesmb  off  default
rpool  refquota  none  default
rpool  refreservation  none  default
rpool  primarycache  all  default
rpool  secondarycache  all  default
rpool  usedbysnapshots  0  -
rpool  usedbydataset  192K  -
rpool  usedbychildren  156G  -
rpool  usedbyrefreservation  0  -
rpool  logbias  latency  default
rpool  dedup  off  default
rpool  mlslabel  none  default
rpool  sync  standard  local
rpool  refcompressratio  1.00x  -
rpool  written  192K  -
rpool  logicalused  111G  -
rpool  logicalreferenced  40K  -
rpool  filesystem_limit  none  default
rpool  snapshot_limit  none  default
rpool  filesystem_count  none  default
rpool  snapshot_count  none  default
rpool  snapdev  hidden  default
rpool  acltype  off  default
rpool  context  none  default
rpool  fscontext  none  default
rpool  defcontext  none  default
rpool  rootcontext  none  default
rpool  relatime  off  default
rpool  redundant_metadata  all  default
rpool  overlay  off  default
 
Some more info

iostat
Code:
root@pve1:~# iostat -x 1
Linux 4.4.35-1-pve (pve1)  12/26/2016  _x86_64_  (8 CPU)

avg-cpu:  %user  %nice %system %iowait  %steal  %idle
  1.37  0.00  2.78  0.16  0.00  95.68

Device:  rrqm/s  wrqm/s  r/s  w/s  rkB/s  wkB/s avgrq-sz avgqu-sz  await r_await w_await  svctm  %util
sda  0.00  0.00  0.08  0.00  2.71  0.00  71.36  0.00  0.15  0.15  0.00  0.13  0.00
sdb  1.85  0.00  150.49  30.65  3684.30  305.60  44.05  0.05  0.26  0.28  0.17  0.17  3.03
sdc  2.10  0.00  148.88  29.99  3714.39  300.94  44.90  0.05  0.27  0.29  0.17  0.17  3.05
sdd  1.79  0.00  165.27  29.79  3770.14  298.58  41.72  0.05  0.25  0.26  0.17  0.15  3.00
sde  1.92  0.00  146.89  31.03  3663.15  305.65  44.61  0.05  0.26  0.28  0.17  0.16  2.91
sdf  2.01  0.00  143.51  30.40  3643.63  302.49  45.38  0.05  0.27  0.29  0.17  0.17  2.87
sdg  1.63  0.00  160.37  29.68  3719.64  301.06  42.31  0.05  0.25  0.27  0.16  0.16  2.96
zd0  0.00  0.00  0.01  0.00  0.11  0.00  16.90  0.00  0.03  0.03  0.00  0.03  0.00
zd16  0.00  0.00  1.19  0.94  43.47  3.54  44.10  0.00  0.46  0.65  0.23  0.35  0.07
zd32  0.00  0.00  1.22  0.95  43.65  3.54  43.40  0.00  0.54  0.78  0.23  0.38  0.08
zd48  0.00  0.00  12.42  0.23  52.48  3.69  8.88  0.00  0.10  0.09  0.55  0.10  0.12
zd64  0.00  0.00  0.02  0.00  0.13  0.00  16.00  0.00  0.20  0.20  0.00  0.20  0.00
dm-0  0.00  0.00  0.01  0.00  0.29  0.00  74.59  0.00  0.11  0.11  0.00  0.11  0.00
dm-1  0.00  0.00  0.01  0.00  0.29  0.00  74.59  0.00  0.16  0.16  0.00  0.16  0.00
dm-2  0.00  0.00  0.01  0.00  0.29  0.00  74.59  0.00  0.11  0.11  0.00  0.11  0.00

avg-cpu:  %user  %nice %system %iowait  %steal  %idle
  0.12  0.00  0.12  0.00  0.00  99.75

Device:  rrqm/s  wrqm/s  r/s  w/s  rkB/s  wkB/s avgrq-sz avgqu-sz  await r_await w_await  svctm  %util
sda  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
sdb  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
sdc  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
sdd  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
sde  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
sdf  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
sdg  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
zd0  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
zd16  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
zd32  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
zd48  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
zd64  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
dm-0  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
dm-1  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
dm-2  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
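
To see how much the six pool members read and write in total while a benchmark runs, the per-device lines can be summed. A sketch with awk, assuming the column layout of this iostat output (rkB/s in column 6, wkB/s in column 7), which differs between sysstat versions; the function name is made up and the device pattern sd[b-g] is specific to this setup:

```shell
#!/bin/sh
# Sum rkB/s and wkB/s over the pool member disks in `iostat -x` output.
# Assumption: column 6 is rkB/s and column 7 is wkB/s, as in this thread's output.
sum_pool_io() {
    awk '$1 ~ /^sd[b-g]$/ { r += $6; w += $7 }
         END { printf "read %.2f kB/s, write %.2f kB/s\n", r, w }'
}

# Two device lines copied from the first iostat report
sum_pool_io <<'EOF'
sdb  1.85  0.00  150.49  30.65  3684.30  305.60  44.05  0.05  0.26  0.28  0.17  0.17  3.03
sdc  2.10  0.00  148.88  29.99  3714.39  300.94  44.90  0.05  0.27  0.29  0.17  0.17  3.05
EOF
```

Feeding it a single report captured during the bonnie++ read phase would show whether the drives themselves come anywhere near their limits.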



Smart info Drive1
Code:
root@pve1:~# smartctl -i /dev/sdb
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.4.35-1-pve] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:  Samsung based SSDs
Device Model:  Samsung SSD 850 EVO 500GB
Serial Number:  S2XXXXXXXXXXXXX
LU WWN Device Id: 5 002538 d417e3469
Firmware Version: EMT02B6Q
User Capacity:  500,107,862,016 bytes [500 GB]
Sector Size:  512 bytes logical/physical
Rotation Rate:  Solid State Device
Form Factor:  2.5 inches
Device is:  In smartctl database [for details use: -P show]
ATA Version is:  ACS-2, ATA8-ACS T13/1699-D revision 4c
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:  Mon Dec 26 15:24:38 2016 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

Smart info Drive2
Code:
root@pve1:~# smartctl -i /dev/sdc
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.4.35-1-pve] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:  Samsung based SSDs
Device Model:  Samsung SSD 850 EVO 500GB
Serial Number:  S2XXXXXXXXXXXXX
LU WWN Device Id: 5 002538 d417e32d0
Firmware Version: EMT02B6Q
User Capacity:  500,107,862,016 bytes [500 GB]
Sector Size:  512 bytes logical/physical
Rotation Rate:  Solid State Device
Form Factor:  2.5 inches
Device is:  In smartctl database [for details use: -P show]
ATA Version is:  ACS-2, ATA8-ACS T13/1699-D revision 4c
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:  Mon Dec 26 15:26:42 2016 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

Smart info Drive3
Code:
root@pve1:~# smartctl -i /dev/sdd
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.4.35-1-pve] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:  Samsung based SSDs
Device Model:  Samsung SSD 850 EVO 500GB
Serial Number:  S2XXXXXXXXXXXXX
LU WWN Device Id: 5 002538 d417e3344
Firmware Version: EMT02B6Q
User Capacity:  500,107,862,016 bytes [500 GB]
Sector Size:  512 bytes logical/physical
Rotation Rate:  Solid State Device
Form Factor:  2.5 inches
Device is:  In smartctl database [for details use: -P show]
ATA Version is:  ACS-2, ATA8-ACS T13/1699-D revision 4c
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:  Mon Dec 26 15:27:02 2016 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

Smart info Drive4
Code:
root@pve1:~# smartctl -i /dev/sde
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.4.35-1-pve] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:  Samsung based SSDs
Device Model:  Samsung SSD 850 EVO 500GB
Serial Number:  S2XXXXXXXXXXXXX
LU WWN Device Id: 5 002538 d417e3309
Firmware Version: EMT02B6Q
User Capacity:  500,107,862,016 bytes [500 GB]
Sector Size:  512 bytes logical/physical
Rotation Rate:  Solid State Device
Form Factor:  2.5 inches
Device is:  In smartctl database [for details use: -P show]
ATA Version is:  ACS-2, ATA8-ACS T13/1699-D revision 4c
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:  Mon Dec 26 15:27:17 2016 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

Smart info Drive5
Code:
root@pve1:~# smartctl -i /dev/sdf
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.4.35-1-pve] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:  Samsung based SSDs
Device Model:  Samsung SSD 850 EVO 500GB
Serial Number:  S2XXXXXXXXXXXXX
LU WWN Device Id: 5 002538 d417e33b5
Firmware Version: EMT02B6Q
User Capacity:  500,107,862,016 bytes [500 GB]
Sector Size:  512 bytes logical/physical
Rotation Rate:  Solid State Device
Form Factor:  2.5 inches
Device is:  In smartctl database [for details use: -P show]
ATA Version is:  ACS-2, ATA8-ACS T13/1699-D revision 4c
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:  Mon Dec 26 15:27:33 2016 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

Smart info Drive6
Code:
root@pve1:~# smartctl -i /dev/sdg
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.4.35-1-pve] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:  Samsung based SSDs
Device Model:  Samsung SSD 850 EVO 500GB
Serial Number:  S2XXXXXXXXXXXXX
LU WWN Device Id: 5 002538 d417e359d
Firmware Version: EMT02B6Q
User Capacity:  500,107,862,016 bytes [500 GB]
Sector Size:  512 bytes logical/physical
Rotation Rate:  Solid State Device
Form Factor:  2.5 inches
Device is:  In smartctl database [for details use: -P show]
ATA Version is:  ACS-2, ATA8-ACS T13/1699-D revision 4c
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:  Mon Dec 26 15:27:46 2016 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
 
(1)
Unfortunately I can't say anything about the performance issue, but I know that most current Samsung SSDs use 8K blocks internally (ashift=13). I guess that won't help performance much, though.
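
For reference, ashift is the base-2 logarithm of the sector size ZFS assumes for a vdev, so the values discussed here translate as follows. A small illustration; the function name is made up:

```shell
#!/bin/sh
# ashift is log2 of the sector size in bytes that ZFS uses for a vdev
ashift_to_bytes() {
    echo $(( 1 << $1 ))
}

for a in 9 12 13; do
    echo "ashift=$a -> $(ashift_to_bytes "$a") bytes"
done
```

ashift is fixed when a vdev is created, so switching from 12 to 13 would mean recreating the pool.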

(2)
I guess you can change metaslab_lba_weighting_enabled at runtime with:
# echo 0 > /sys/module/zfs/parameters/metaslab_lba_weighting_enabled
Some ZFS module parameters need an update of the initramfs to take effect at boot. But let's wait and see what other people say.
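
A sketch of how the setting might be made persistent, assuming the usual Debian/Proxmox layout: module options live in /etc/modprobe.d/, and because root is on ZFS the module is loaded from the initramfs, which therefore has to be regenerated. The helper name `set_zfs_option` is hypothetical, and it takes the config file as an argument so it can be tried on a scratch file first:

```shell
#!/bin/sh
# Write (or replace) an "options zfs <name>=<value>" line in a modprobe config file.
# Only handles one parameter per line, which is an assumption of this sketch.
# Usage: set_zfs_option <conf-file> <parameter> <value>
set_zfs_option() {
    conf=$1; name=$2; value=$3
    touch "$conf"
    # Drop any earlier line for the same parameter, then append the new one
    grep -v "^options zfs ${name}=" "$conf" > "${conf}.tmp" || true
    echo "options zfs ${name}=${value}" >> "${conf}.tmp"
    mv "${conf}.tmp" "$conf"
}

# Try it on a scratch file; on the real system the file would be /etc/modprobe.d/zfs.conf
scratch=$(mktemp)
set_zfs_option "$scratch" metaslab_lba_weighting_enabled 0
cat "$scratch"
rm -f "$scratch"
```

On the real system this would be followed by `update-initramfs -u` and a reboot, after which `cat /sys/module/zfs/parameters/metaslab_lba_weighting_enabled` should print 0.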
 
Thanks bogo22. That is really helpful! Maybe ashift=13 could explain why I only see half of the performance.

Here is some more information on how the SSDs are connected to the controllers.

The C2750D4I has the following SATA controllers:
  1. Intel® C2750 : 2 x SATA3 6.0 Gb/s, 4 x SATA2 3.0 Gb/s
  2. Marvell SE9172: 2 x SATA3 6.0 Gb/s, support RAID 0, 1
  3. Marvell SE9230: 4 x SATA3 6.0 Gb/s, support RAID 0, 1, 10
source: http://www.asrockrack.com/general/productdetail.asp?Model=C2750D4I#Manual

I connected 2 SSDs to each controller, always on SATA3 ports:
  • 2 drives on Intel® C2750 which has 2 x SATA3 ports
  • 2 drives on Marvell SE9172 which has 2 x SATA3 ports
  • 2 drives on Marvell SE9230 which has 4 x SATA3 ports

Possible causes of the slower than expected speed:
1) SATA cables. I ordered 6 brand new SATA3 6 Gb/s cables just to make sure it is not the cables. I will give an update when they arrive.
2) Bandwidth/performance of the onboard controllers. I could solve this with an add-in controller; the C2750D4I has room for a PCI Express 2.0 x8 card.
3) Cowboy Neal shooting x-rays in direction of my homeserver :p
 
Thanks Nemesiz. That explains slower write speed in relation to read speed.

What I do not understand is this: I have 6 SSDs, each capable of around 421 MByte/sec sequential read, and each drive is on its own SATA3 6 Gbit/sec channel. That should total 421 x 6 = 2526 MByte/sec sequential read, because the drives can be read in parallel in RAIDZ2. Or is this not correct? Right now I get a read speed of 988 MByte/sec.
 
Be careful: EVOs are not enterprise SSDs. It is not a good idea to use them in production. They are not designed to run 24/7, or with data sets of more than 40GB per day, or with files bigger than 12GB. Use 2 RAIDZ for better performance, not Z2.
 
You are comparing apples to pears. Your reference uses the 850 PRO while you are using the 850 EVO. What your test shows is the difference between consumer and enterprise grade SSDs.
 
Checksum - another read cycle in ZFS
 
You forget that you are only reading data simultaneously from 4 disks, the rest is parity from the remaining 2 disks, since you are running RAIDZ2. So the theoretical limit of an uncompressed fully sequential read is closer to 1600MBytes/sec. The rest of the difference could easily come from your low frequency Atom processor, since RAIDZ2 I/O involves considerable CPU time for parity calculation - you might want to check CPU usage during benchmarking. Also checksums, (de)compression use your CPU heavily. And don't forget the fact that your reads are never fully sequential, ZFS being copy-on-write fragments the disk image considerably, so you can't count on maximum sequential read speed of your disks.
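
The back-of-the-envelope numbers in this exchange can be checked with plain shell arithmetic. A rough sketch, using the 421 MByte/sec per-drive figure and the measured 988 MByte/sec quoted in the thread; the function name is made up, and the model ignores CPU, checksum and fragmentation overhead:

```shell
#!/bin/sh
# Upper bound for sequential read from a RAIDZ vdev, counting only the data share.
# Usage: raidz_read_limit <disks> <parity-disks> <per-disk MByte/sec>
raidz_read_limit() {
    echo $(( ($1 - $2) * $3 ))
}

naive=$(( 6 * 421 ))                 # all six drives counted as data
limit=$(raidz_read_limit 6 2 421)    # RAIDZ2: two drives' worth of parity
measured=988                         # bonnie++ block read from the first post

echo "naive sum:      $naive MByte/sec"
echo "data-only sum:  $limit MByte/sec"
echo "measured/limit: $(( measured * 100 / limit ))%"
```

The data-only figure of 1684 MByte/sec is the "closer to 1600" limit mentioned in the reply; the measured read speed is a bit under 60% of it.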

"Be careful: EVOs are not enterprise SSDs. It is not a good idea to use them in production. They are not designed to run 24/7, or with data sets of more than 40GB per day, or with files bigger than 12GB. Use 2 RAIDZ for better performance, not Z2."

Yeah, none of that is true. While EVOs are not enterprise grade, that only means they write slower and wear out sooner (due to TLC flash). There is no limit on data set size (what does that even mean?) or file size (that would be stupid). So while you are better off using them for read-intensive workloads, as they are not designed for heavy writes, they will work just as well for any workload. We have been using them 24/7 in production in RAIDZ for more than a year without any problems, and web-scale companies do so as well: DigitalOcean has been putting 840 and 850 EVOs in RAID5 arrays for quite some time.
 
Depending on your workload, a bunch of enterprise disks with an Intel DC3700S as SLOG can give a large boost over a set of 850s.

I had a setup with 4x Samsung EVO 850 1TB in RAID10. Burst speed was good, but fsync was around 1800, which is really bad. I then built a test setup with 4x 8TB Seagate Enterprise NAS in RAID10 plus an Intel DC3700S 100GB as SLOG, and fsync went up to 4800. Fsync performance is very important for Windows KVM guests, databases and NFS. So depending on your workload, a real enterprise SSD as SLOG can help a lot.
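
A very rough way to get a feel for synchronous write rates, in the spirit of the FSYNCS/SECOND figure, is to time small writes that are each flushed to stable storage. This sketch uses GNU dd with oflag=dsync and GNU date; it is not how pveperf measures, so the numbers are only comparable between runs of the same script. The function name and the write counts are arbitrary choices:

```shell
#!/bin/sh
# Time <count> 4K writes, each flushed with O_DSYNC, and print writes per second.
# Usage: sync_writes_per_sec <directory> [count]
sync_writes_per_sec() {
    dir=$1; count=${2:-1000}
    file="$dir/syncwrite.$$"
    start=$(date +%s%N)                      # nanoseconds since epoch (GNU date)
    dd if=/dev/zero of="$file" bs=4k count="$count" oflag=dsync 2>/dev/null
    end=$(date +%s%N)
    rm -f "$file"
    awk -v n="$count" -v s="$start" -v e="$end" \
        'BEGIN { t = (e - s) / 1e9; if (t <= 0) t = 1e-9; printf "%.0f\n", n / t }'
}

# Example run against a scratch directory; point it at a dataset on the pool to test ZFS
sync_writes_per_sec "${TMPDIR:-/tmp}" 200
```

Running it once against the pool with and without a separate log device would show the kind of difference described here.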
 
I have a RAIDZ here with 3 enterprise SSDs, and it gets more than 6700 fsyncs/second.
 
Honestly, I suspect the motherboard + CPU is the choke point.

Asrock C2750D4I

That's an Atom-based board, and they're not known for high I/O speeds; quite the opposite, in fact. I'd be curious to see you try the same hardware on something non-Atom: something i7-based with faster SATA/SAS cards and more PCIe lanes.

For reference, here's a couple of pveperfs from production machines:

12x 4TB Seagate fixed disks on an Opteron server motherboard:
Code:
CPU BOGOMIPS:      134399.84
REGEX/SECOND:      844901
HD SIZE:           9050.54 GB (bigZ/IN)
FSYNCS/SECOND:     1240.64
DNS EXT:           17.39 ms
DNS INT:           18.46 ms ()

4x72G 15k SAS disks, 2xOldXeon
Code:
CPU BOGOMIPS:      23940.00
REGEX/SECOND:      1211288
HD SIZE:           122.28 GB (rpool/ROOT/pve-1)
FSYNCS/SECOND:     280.29
DNS EXT:           55.99 ms
DNS INT:           31.46 ms ()

NFS-to-SAN, 2xOldXeon
Code:
CPU BOGOMIPS:      23940.00
REGEX/SECOND:      1207106
HD SIZE:           5393.97 GB (192.168.X.X:/mnt/NFS/)
FSYNCS/SECOND:     685.47
DNS EXT:           72.68 ms
DNS INT:           31.33 ms ()
 
And a couple of bonnie++ results with your cmd-line options:

12x4TB Seagate, Opterons
Code:
root@pmx8:~# bonnie++ -u root -r 1024 -s 16384 -d /bigZ/IN -f -b -n 1 -c 8
Using uid:0, gid:0.
Writing intelligently...done
Rewriting...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   8     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
pmx8            16G           118777  56 251053  97           879364  99 +++++ +++
Latency                       51637us    5960us               320us   46947us
Version  1.97       ------Sequential Create------ --------Random Create--------
pmx8                -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                  1   278   4 +++++ +++   210   4   208   4 +++++ +++   203   4
Latency              2550us     198us    1112us    1137us      36us   33286us
1.97,1.97,pmx8,8,1483008612,16G,,,,118777,56,251053,97,,,879364,99,+++++,+++,1,,,,,278,4,+++++,+++,210,4,208,4,+++++,+++,203,4,,51637us,5960us,,320us,46947us,2550us,198us,1112us,1137us,36us,33286us


4x72G 15k local disks, 2xOldXeon:
Code:
root@pmx1:~# bonnie++ -u root -r 1024 -s 16384 -d /rpool -f -b -n 1 -c 8
Using uid:0, gid:0.
Writing intelligently...done
Rewriting...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   8     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
pmx1            16G           438137  91 298126  78           816651  75  1414  35
Latency                       32364us   58215us             31634us   43174us
Version  1.97       ------Sequential Create------ --------Random Create--------
pmx1                -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                  1   412   4 +++++ +++   162   1   206   2 +++++ +++   205   2
Latency              6645us      94us    8038us   10706us      20us   10644us
1.97,1.97,pmx1,8,1483006624,16G,,,,438137,91,298126,78,,,816651,75,1414,35,1,,,,,412,4,+++++,+++,162,1,206,2,+++++,+++,205,2,,32364us,58215us,,31634us,43174us,6645us,94us,8038us,10706us,20us,10644us



NFS-to-SAN, 2xOldXeon
Code:
root@pmx1:~# bonnie++ -u root -r 1024 -s 16384 -d /mnt/pve/NFS/ -f -b -n 1 -c 8
Using uid:0, gid:0.
Writing intelligently...done
Rewriting...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   8     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
pmx1            16G           111573  11 55481  11           114191   7  1045  98
Latency                        3631ms   13798ms             34845us    1232ms
Version  1.97       ------Sequential Create------ --------Random Create--------
pmx1                -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                  1   225   1 +++++ +++   205   0   204   1 +++++ +++   214   0
Latency              1044us    1368us    3999us     744us     991us     845us
1.97,1.97,pmx1,8,1483006376,16G,,,,111573,11,55481,11,,,114191,7,1045,98,1,,,,,225,1,+++++,+++,205,0,204,1,+++++,+++,214,0,,3631ms,13798ms,,34845us,1232ms,1044us,1368us,3999us,744us,991us,845us
 
Hey, I see exactly the same thing on every consumer-grade SSD I have ever tested: Samsung 840/850 EVO/PRO, Intel 600p and so on.
Even worse, they all accumulate “reallocated sector counts” and their wear-out indicators rise fairly quickly. I even managed to destroy an Intel 600p in 2 months as a root pool on a server without much workload on it.

I am not sure whether I am doing something wrong, but in every test I have ever made, ZFS on these drives is a complete joke: the drives perform terribly and create massive I/O waits, and that is with CPU and RAM usage never above 10% on Xeon systems.

My solution was to buy better hardware. I now run my setups on Intel S3710 and P3700 drives, and the difference is night and day.
They have been running under high load for 2 years and counting, without any issues.

See my post here, where I was trying to get help with the exact same problem:
https://forum.proxmox.com/threads/performance-issues-on-raid-1-ssd.28592/#post-159499
 