Poor vDisk performance

MisterY

Renowned Member
Oct 10, 2016
Hi,

First, my system:
Opteron 3280
16GB ECC
2*250GB SSD in ZFS-Raid1 (for Host- and Guest-OS)
3*1TB HDD in ZFS-RaidZ1 (for Data-vDisks)

I have some problems with the performance of the disks inside my VMs (both Windows and Linux).

With dd or CrystalDiskMark ("CDM") I get very good values (around 400-1000 MB/s writing and reading). If I mount a Samba share and test it with CDM over LAN, I get 118 MB/s reading and 60 MB/s writing. So far so good. But when I try to copy something with "cp" or over Samba, I only get something like 20 MB/s reading and 50 MB/s writing. Changing the cache mode to writethrough increases the read performance over Samba up to 112 MB/s, but the write rate breaks down to under 1 MB/s. Changing to "unsafe" gives me 20/60 MB/s for read/write.
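For reference, the cache mode mentioned above is the per-disk cache option of the VM's virtual disk. A minimal sketch of switching it from the host, with a hypothetical VM ID 101, disk scsi1 and storage name "tank" (all placeholders), would be:
Code:
# switch the vDisk cache mode; valid values include none, writethrough, writeback, directsync, unsafe
qm set 101 --scsi1 tank:vm-101-disk-1,cache=writethrough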

Here are some tests on the host:
Code:
root@pve:~# dd if=/dev/zero of=tempfile bs=1G count=1 conv=fdatasync,notrunc
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 1.28362 s, 836 MB/s

Code:
root@pve:~# dd if=tempfile of=/dev/null bs=1G count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 0.773135 s, 1.4 GB/s

Code:
root@pve:~# pveperf
CPU BOGOMIPS:      38400.40
REGEX/SECOND:      1012884
HD SIZE:           150.40 GB (rpool/ROOT/pve-1)
FSYNCS/SECOND:     684.98
DNS EXT:           144.05 ms

Code:
root@pve:~# zfs get all rpool
NAME   PROPERTY              VALUE                  SOURCE
rpool  type                  filesystem             -
rpool  creation              Mon Feb 20 21:38 2017  -
rpool  used                  69.4G                  -
rpool  available             146G                   -
rpool  referenced            96K                    -
rpool  compressratio         1.25x                  -
rpool  mounted               yes                    -
rpool  quota                 none                   default
rpool  reservation           none                   default
rpool  recordsize            128K                   default
rpool  mountpoint            /rpool                 default
rpool  sharenfs              off                    default
rpool  checksum              on                     default
rpool  compression           on                     local
rpool  atime                 off                    local
rpool  devices               on                     default
rpool  exec                  on                     default
rpool  setuid                on                     default
rpool  readonly              off                    default
rpool  zoned                 off                    default
rpool  snapdir               hidden                 default
rpool  aclinherit            restricted             default
rpool  canmount              on                     default
rpool  xattr                 on                     default
rpool  copies                1                      default
rpool  version               5                      -
rpool  utf8only              off                    -
rpool  normalization         none                   -
rpool  casesensitivity       sensitive              -
rpool  vscan                 off                    default
rpool  nbmand                off                    default
rpool  sharesmb              off                    default
rpool  refquota              none                   default
rpool  refreservation        none                   default
rpool  primarycache          all                    default
rpool  secondarycache        all                    default
rpool  usedbysnapshots       0                      -
rpool  usedbydataset         96K                    -
rpool  usedbychildren        69.4G                  -
rpool  usedbyrefreservation  0                      -
rpool  logbias               latency                default
rpool  dedup                 off                    default
rpool  mlslabel              none                   default
rpool  sync                  standard               local
rpool  refcompressratio      1.00x                  -
rpool  written               96K                    -
rpool  logicalused           77.1G                  -
rpool  logicalreferenced     40K                    -
rpool  filesystem_limit      none                   default
rpool  snapshot_limit        none                   default
rpool  filesystem_count      none                   default
rpool  snapshot_count        none                   default
rpool  snapdev               hidden                 default
rpool  acltype               off                    default
rpool  context               none                   default
rpool  fscontext             none                   default
rpool  defcontext            none                   default
rpool  rootcontext           none                   default
rpool  relatime              off                    default
rpool  redundant_metadata    all                    default
rpool  overlay               off                    default

Code:
root@pve:~# zfs get all tank
NAME  PROPERTY              VALUE                  SOURCE
tank  type                  filesystem             -
tank  creation              Thu Apr 20 21:21 2017  -
tank  used                  380G                   -
tank  available             1.38T                  -
tank  referenced            128K                   -
tank  compressratio         1.00x                  -
tank  mounted               yes                    -
tank  quota                 none                   default
tank  reservation           none                   default
tank  recordsize            128K                   default
tank  mountpoint            /tank                  default
tank  sharenfs              off                    default
tank  checksum              on                     default
tank  compression           lz4                    local
tank  atime                 on                     default
tank  devices               on                     default
tank  exec                  on                     default
tank  setuid                on                     default
tank  readonly              off                    default
tank  zoned                 off                    default
tank  snapdir               hidden                 default
tank  aclinherit            restricted             default
tank  canmount              on                     default
tank  xattr                 on                     default
tank  copies                1                      default
tank  version               5                      -
tank  utf8only              off                    -
tank  normalization         none                   -
tank  casesensitivity       sensitive              -
tank  vscan                 off                    default
tank  nbmand                off                    default
tank  sharesmb              off                    default
tank  refquota              none                   default
tank  refreservation        none                   default
tank  primarycache          all                    default
tank  secondarycache        all                    default
tank  usedbysnapshots       0                      -
tank  usedbydataset         128K                   -
tank  usedbychildren        380G                   -
tank  usedbyrefreservation  0                      -
tank  logbias               latency                default
tank  dedup                 off                    default
tank  mlslabel              none                   default
tank  sync                  standard               default
tank  refcompressratio      1.00x                  -
tank  written               128K                   -
tank  logicalused           285G                   -
tank  logicalreferenced     40K                    -
tank  filesystem_limit      none                   default
tank  snapshot_limit        none                   default
tank  filesystem_count      none                   default
tank  snapshot_count        none                   default
tank  snapdev               hidden                 default
tank  acltype               off                    default
tank  context               none                   default
tank  fscontext             none                   default
tank  defcontext            none                   default
tank  rootcontext           none                   default
tank  relatime              off                    default
tank  redundant_metadata    all                    default
tank  overlay               off                    default

Guest (SSD-ZFS Pool):
Code:
root@ubuntuVM:~$ dd if=/dev/zero of=tempfile bs=1G count=1 conv=fdatasync,notrunc
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 17.0659 s, 62.9 MB/s
root@ubuntuVM:~$ echo 3 | sudo tee /proc/sys/vm/drop_caches
3
root@ubuntuVM:~$ dd if=tempfile of=/dev/null bs=1G count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 1.693 s, 634 MB/s
root@ubuntuVM:~$ dd if=tempfile of=/dev/null bs=1G count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 2.48739 s, 432 MB/s

Guest (HDD-ZFS Pool; "no cache"):
Code:
root@ubuntuVM:~$ dd if=/dev/zero of=tempfile bs=1G count=1 conv=fdatasync,notrunc
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 17.0659 s, 72.1 MB/s
root@ubuntuVM:~$ dd if=tempfile of=/dev/null bs=1G count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.91693 s, 1.2 GB/s

Guest (HDD-ZFS Pool; "unsafe"):
Code:
root@ubuntuVM:~$ dd if=/dev/zero of=tempfile bs=1G count=1 conv=fdatasync,notrunc
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 2.0659 s, 371.1 MB/s
root@ubuntuVM:~$ dd if=tempfile of=/dev/null bs=1G count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.91693 s, 985 MB/s


Do you have any hints on how I can improve the HDD performance of the vDisks for file transfers?
 
Sorry, but your tests are totally bogus. Please test with fio and you'll see that you do not have good performance. Three disks are just not enough.

Testing with zeros on ZFS does not generate real I/O beyond storing empty files. With compression enabled, zeros are not actually stored on disk, so you are only benchmarking how efficiently you can write metadata to your pool. Use either a "real" dataset, like a big movie, or the flexible I/O tester (fio) to get "real" data.
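A minimal fio run along these lines (file name, size and runtime are placeholders) exercises mixed random I/O with buffers that compression cannot simply drop, unlike /dev/zero:
Code:
fio --name=randrw --filename=/tank/fio-testfile --size=4G --bs=4k \
    --rw=randrw --rwmixread=75 --ioengine=libaio --iodepth=32 \
    --runtime=60 --time_based --group_reporting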
 
Sorry, but I don't believe that. According to other tests, the performance of a 3-drive RAIDZ1 should still be OK and at least as fast as a single drive. If I enable writethrough, the read speed is as high as it should be.
If I copy an ISO from the SSD pool to the HDD pool, I get a transfer speed of about 250 MB/s in each direction, and I think that should be fine.
 
Hi,
a write speed of 250 MB/s to a 3-HDD RAIDZ1 sounds like you are measuring caching!
I don't expect that you will get good performance with a 2.4 GHz Opteron and three disks. My ZFS tests with AMD CPUs (FX-8350, about a year ago) were not really good.

Udo
 
I don't think that the cache is as big as, or bigger than, the 1.8 GB ISO file.
This Opteron has a turbo of 2.8 GHz and should have enough power for this, and three disks with 128 MB cache each and a transfer rate of about 180 MB/s should be fast enough, too. And more disks can't be added; the server is full. The only thing I could do is probably replace a single HDD (not ZFS; just for backups) and insert an SSD cache. Would that work? Would a 60 GB SSD be enough?
Instead of this, what should/can I do to get good parity and speed? I set up the server with Proxmox, ECC and ZFS to get a fast and "safe" system.
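If a spare SSD does become available, the usual approach (a sketch only; the device paths are hypothetical and should point to stable /dev/disk/by-id names) is to attach it to the existing HDD pool as a read cache (L2ARC) and/or a separate log device (SLOG), not as a second pool:
Code:
# add an L2ARC read cache to the raidz1 pool
zpool add tank cache /dev/disk/by-id/ata-EXAMPLE_SSD-part1
# add a SLOG; this only accelerates synchronous writes
zpool add tank log /dev/disk/by-id/ata-EXAMPLE_SSD-part2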

Code:
zpool iostat -v 30
                                                capacity     operations    bandwidth
pool                                         alloc   free   read  write   read  write
-------------------------------------------  -----  -----  -----  -----  -----  -----
rpool                                        70.2G   152G    256    271  2.20M  3.87M
  mirror                                     70.2G   152G    256    271  2.20M  3.87M
    sdb2                                         -      -     38     44  1.20M  3.88M
    sde2                                         -      -     33     45  1.05M  3.88M
-------------------------------------------  -----  -----  -----  -----  -----  -----
tank                                          590G  2.14T    200    238  1.56M  3.85M
  raidz1                                      590G  2.14T    200    238  1.56M  3.85M
    ata-HGST_HTS541010A9E680_JD10001VJW784M      -      -     18     19   993K  2.21M
    ata-ST1000LM048-2E7172_ZDE0EY6J              -      -     16     19  1001K  2.21M
    ata-ST1000LM048-2E7172_ZDE0EYE5              -      -     16     19  1001K  2.21M
-------------------------------------------  -----  -----  -----  -----  -----  -----

I don't think this is too slow.
 
Sorry, but disk transfer rates of 180 MB/s do not matter in real virtualization scenarios, because they are normally measured as sequential, single-I/O-thread performance and only apply if you have big files that are not fragmented across the disk. Those are ideal starting conditions, but as soon as the disk fills up and snapshots are made, you will asymptotically end up with mixed random reads/writes plus some bigger read blocks. Three disks will not be enough for that scenario. ZFS is always lightning fast in the beginning, and it will only stay lightning fast if you increase your memory; 16 GB is not very much for a pool that size.
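To check how much of the 16 GB the ARC actually gets, the standard ZFS-on-Linux tooling can be used (a sketch; arcstat may be named arcstat.py on older versions):
Code:
# live ARC hit-rate and size statistics, refreshed every 5 seconds
arcstat 5
# configured ARC size limit in bytes (0 means the ZoL default, roughly half of RAM)
cat /sys/module/zfs/parameters/zfs_arc_max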
 
Why does it work for other people with 3*3TB and 8 GB RAM?

What alternatives do I have?
RAM is too expensive at the moment, and if I spend more money the WAF will decrease very rapidly...

Should I switch to RAID 5?
Or RAID1 and a single HDD?
Or ZFS RAID1?

I have Nextcloud running in an Ubuntu 16 VM and I don't want to lose that data. I make a backup to another HDD every week and to another server every two weeks, but I would like it to be fast and "safe".

Or is it possible to use the SSD ZFS RAID1 as a cache for the HDD ZFS RAIDZ1?

Here are some CDM tests over Samba against a Linux guest:
SSD-ZFSRaid1:
Code:
----------------------------------------------------------------------
CrystalDiskMark 5.1.2 x64 (C) 2007-2016 hiyohiyo
                           Crystal Dew World : http://crystalmark.info/
-----------------------------------------------------------------------
* MB/s = 1,000,000 bytes/s [SATA/600 = 600,000,000 bytes/s]
* KB = 1000 bytes, KiB = 1024 bytes

   Sequential Read (Q= 32,T= 1) :   118.471 MB/s
  Sequential Write (Q= 32,T= 1) :    60.530 MB/s
  Random Read 4KiB (Q= 32,T= 1) :    83.087 MB/s [ 20284.9 IOPS]
 Random Write 4KiB (Q= 32,T= 1) :    66.507 MB/s [ 16237.1 IOPS]
         Sequential Read (T= 1) :   101.502 MB/s
        Sequential Write (T= 1) :    49.074 MB/s
   Random Read 4KiB (Q= 1,T= 1) :     5.663 MB/s [  1382.6 IOPS]
  Random Write 4KiB (Q= 1,T= 1) :     6.232 MB/s [  1521.5 IOPS]

  Test : 1024 MiB [] (x1)  [Interval=5 sec]

and HDD-ZFS-RAIDZ1:
Code:
----------------------------------------------------------------------
CrystalDiskMark 5.1.2 x64 (C) 2007-2016 hiyohiyo
                           Crystal Dew World : http://crystalmark.info/
-----------------------------------------------------------------------
* MB/s = 1,000,000 bytes/s [SATA/600 = 600,000,000 bytes/s]
* KB = 1000 bytes, KiB = 1024 bytes

   Sequential Read (Q= 32,T= 4) :   115.367 MB/s
  Sequential Write (Q= 32,T= 4) :    53.227 MB/s
  Random Read 4KiB (Q= 32,T= 4) :    67.689 MB/s [ 16525.6 IOPS]
 Random Write 4KiB (Q= 32,T= 4) :    55.710 MB/s [ 13601.1 IOPS]
         Sequential Read (T= 1) :    99.611 MB/s
        Sequential Write (T= 1) :    48.861 MB/s
   Random Read 4KiB (Q= 1,T= 1) :     5.586 MB/s [  1363.8 IOPS]
  Random Write 4KiB (Q= 1,T= 1) :     5.712 MB/s [  1394.5 IOPS]

  Test : 1024 MiB [] (x1)  [Interval=5 sec]

There's not much of a difference between HDD and SSD?

Edit:
And here is CDM over Samba against the host:

SSD:
Code:
   Sequential Read (Q= 32,T= 4) :   118.515 MB/s
  Sequential Write (Q= 32,T= 4) :   118.359 MB/s
  Random Read 4KiB (Q= 32,T= 4) :   115.762 MB/s [ 28262.2 IOPS]
 Random Write 4KiB (Q= 32,T= 4) :    16.686 MB/s [  4073.7 IOPS]
         Sequential Read (T= 1) :   106.734 MB/s
        Sequential Write (T= 1) :    98.983 MB/s
   Random Read 4KiB (Q= 1,T= 1) :     8.706 MB/s [  2125.5 IOPS]
  Random Write 4KiB (Q= 1,T= 1) :     7.511 MB/s [  1833.7 IOPS]

HDD:
Code:
   Sequential Read (Q= 32,T= 1) :   118.526 MB/s
  Sequential Write (Q= 32,T= 1) :   118.436 MB/s
  Random Read 4KiB (Q= 32,T= 1) :    82.035 MB/s [ 20028.1 IOPS]
 Random Write 4KiB (Q= 32,T= 1) :    20.709 MB/s [  5055.9 IOPS]
         Sequential Read (T= 1) :   104.006 MB/s
        Sequential Write (T= 1) :   102.773 MB/s
   Random Read 4KiB (Q= 1,T= 1) :     8.063 MB/s [  1968.5 IOPS]
  Random Write 4KiB (Q= 1,T= 1) :     7.474 MB/s [  1824.7 IOPS]

Copying files to the ZFS pools (SSD and HDD) over Gbit LAN is as fast as possible, so I don't think it's my system. There's something wrong with the caching for the guests.

File transfer to a vDisk on a single disk has the same issues, so no, the hardware is definitely NOT the problem.

Edit: Access to a Samba share in an LXC container gives me the full Gbit LAN speed. So the hardware is definitely NOT the problem.
 
How does the VM config look?
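(For reference, the config can be dumped on the host like this, with 100 as a placeholder VM ID:)
Code:
qm config 100
# or read the file directly
cat /etc/pve/qemu-server/100.conf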