omnios + napp-it for zfs questions

RobFantini · Sep 21, 2014

Hello,
I'm attempting to set up Comstar to use for kvm's .

omnios + napp-it are already working.

I've 4 1-TB disks set to raidz .

I'm looking for help on the correct way to set up Comstar to use with PVE.

My guess is to do the following from Comstar menu in napp-it :

1- create a volume .
Is just one large volume needed or one per kvm?
should ' thin provision ' be set ?

2- do logical units and targets need to be created?

mir · Sep 21, 2014

In Comstar all you need to do is to create a target and a target portal group. To this target portal group you add your target.
Volumes and logical units are created by Proxmox as needed when you create a disk for a VM.

When adding a ZFS storage in the 'Datacenter' storage tab choose (provided you use proxmox 3.3):
Add->ZFS

Following should be provide in the window:
ID: Unique name
Portal: IP of storage
Pool: Name of ZFS pool
Block Size: 8k gives better performance but slightly more space is wasted
Target: Target name
iSCSI provider: comstar
Thin Provision: Optional but allows to over provision your storage
Write Cache: If pools option 'sync' is standard or always it is safe to enable write cache for improved performance

RobFantini · Sep 22, 2014

Hello Mir,
1-st thanks for your reply...

I am having an issue using the storage , when I try to restore a vzdump this occurs:

Code:

  restore vma archive: lzop -d -c /tank/fbc-pve-bkup/dump/vzdump-qemu-88671-2014_09_16-22_03_39.vma.lzo|vma extract -v -r /var/tmp/vzdumptmp943471.fifo - /var/tmp/vzdumptmp943471
 CFG: size: 462 name: qemu-server.conf
 DEV: dev_id=1 size: 17188257792 devname: drive-virtio0
 CTIME: Tue Sep 16 22:03:40 2014
 new volume ID is 'h241:vm-4671-disk-1'
 map 'drive-virtio0' to 'iscsi://172.30.24.241/iqn.2010-09.org.napp-it:1411325555/0' (write zeros = 1)
 

 ** (process:943475): ERROR **: can't open file iscsi://172.30.24.241/iqn.2010-09.org.napp-it:1411325555/0 - iSCSI: Failed to connect to LUN : SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_OPERATION_CODE(0x2000)
 

 /bin/bash: line 1: 943474 Broken pipe lzop -d -c /tank/fbc-pve-bkup/dump/vzdump-qemu-88671-2014_09_16-22_03_39.vma.lzo
 943475 Trace/breakpoint trap | vma extract -v -r /var/tmp/vzdumptmp943471.fifo - /var/tmp/vzdumptmp943471
 temporary volume 'h241:vm-4671-disk-1' sucessfuly removed
 TASK ERROR: command 'lzop -d -c /tank/fbc-pve-bkup/dump/vzdump-qemu-88671-2014_09_16-22_03_39.vma.lzo|vma extract -v -r /var/tmp/vzdumptmp943471.fifo - /var/tmp/vzdumptmp943471' failed: exit code 133

this is our pve storage :

Code:

 /etc/pve/ storage.cfg  
   zfs: h241
         blocksize 8k
         target iqn.2010-09.org.napp-it:1411325555
         pool tank
         iscsiprovider comstar
         portal 172.30.24.241
         content images
         comstar_tg tg1
         nowritecache

and iscsi info from napp-it:

Code:

[B]itadm list-target -v[/B][HR][/HR]
TARGET NAME                                                  STATE    SESSIONS 
iqn.2010-09.org.napp-it:1411325555                           online   0        
    alias:                  09.21.2014
    auth:                   none (defaults)
    targetchapuser:         -
    targetchapsecret:       unset
    tpg-tags:               default

Do you have any suggestions to fix the issue?

thanks
Rob Fantini

mir · Sep 22, 2014

Try to delete the reference to the taget group. Eg. Tg1

RobFantini · Sep 23, 2014

Mir,
deleting the target group fixed the above issue, thank you.

Next I am restoring a vzdump backup of a kvm with a 16GB disk.
After 2+ hours that is just 1/2 done:

Code:

progress 50% (read 8594128896 bytes, duration 8649 sec)

I'll check the network set up on omnios .. In mean time if you've any suggestions please let me know.

mir · Sep 23, 2014

What do you get when running Pools->Benchmarks in napp-it?

What is your pool setting for sync? (zfs get all tank |grep sync)
What is your pool setting for atime? (zfs get all tank |grep atime)
What is your pool setting for compression? (zfs get all tank |grep compression)

What does the following show:
zpool status tank

From where to you read the vzdump backup and to where do you write?

RobFantini · Sep 23, 2014

What do you get when running Pools->Benchmarks in napp-it?
There are a few test , I just ran 'filebench:

Code:

[B]Results for fileserver.f, please wait 30 s until restarting tests..[/B][HR][/HR]start filebench..
Filebench Version 1.4.9.1
23327: 0.000: Allocated 126MB of shared memory
23327: 0.015: File-server Version 3.0 personality successfully loaded
23327: 0.015: Creating/pre-allocating files and filesets
23327: 0.026: Fileset bigfileset: 10000 files, 0 leafdirs, avg dir width = 20, avg dir depth = 3.1, 1254.784MB
23327: 0.028: Removed any existing fileset bigfileset in 1 seconds
23327: 0.028: making tree for filset /tank/filebench.tst/bigfileset
23327: 0.044: Creating fileset bigfileset...
23327: 0.901: Preallocated 8015 of 10000 of fileset bigfileset in 1 seconds
23327: 0.901: waiting for fileset pre-allocation to finish
23327: 0.901: Starting 1 filereader instances
23330: 0.928: Starting 50 filereaderthread threads
23327: 2.075: Running...
23327: 32.078: Run took 30 seconds...
23327: 32.082: Per-Operation Breakdown
statfile1            44218ops     1474ops/s   0.0mb/s      0.1ms/op       12us/op-cpu [0ms - 158ms]
deletefile1          44210ops     1474ops/s   0.0mb/s      7.0ms/op       67us/op-cpu [0ms - 232ms]
closefile3           44230ops     1474ops/s   0.0mb/s      0.0ms/op        2us/op-cpu [0ms - 17ms]
readfile1            44230ops     1474ops/s 193.1mb/s      0.1ms/op       45us/op-cpu [0ms - 90ms]
openfile2            44230ops     1474ops/s   0.0mb/s      0.2ms/op       18us/op-cpu [0ms - 229ms]
closefile2           44230ops     1474ops/s   0.0mb/s      0.0ms/op        4us/op-cpu [0ms - 42ms]
appendfilerand1      44230ops     1474ops/s  11.5mb/s      7.3ms/op       48us/op-cpu [0ms - 189ms]
openfile1            44242ops     1475ops/s   0.0mb/s      0.2ms/op       17us/op-cpu [0ms - 145ms]
closefile1           44243ops     1475ops/s   0.0mb/s      0.0ms/op        4us/op-cpu [0ms - 42ms]
wrtfile1             44244ops     1475ops/s 184.0mb/s     10.8ms/op       79us/op-cpu [0ms - 258ms]
createfile1          44257ops     1475ops/s   0.0mb/s      7.1ms/op       71us/op-cpu [0ms - 186ms]
23327: 32.082: 

IO Summary: [HR][/HR]486564 ops, 16217.365 ops/s, (1474/2949 r/w), 388.6mb/s,    255us cpu/op,  10.9ms latency
23327: 32.082: Shutting down processes

ok.

What is your pool setting for sync? (zfs get all tank |grep sync):

Code:

tank  sync                  standard               default

What is your pool setting for atime? (zfs get all tank |grep atime)

Code:

tank  atime                 on                     default

What is your pool setting for compression? (zfs get all tank |grep compression)

Code:

tank  compression           off                    default

What does the following show:
zpool status tank

Code:

  pool: tank
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            c6d0    ONLINE       0     0     0
            c7d0    ONLINE       0     0     0
            c8d1    ONLINE       0     0     0
            c9d0    ONLINE       0     0     0

errors: No known data errors

From where to you read the vzdump backup and to where do you write?

RobFantini · Sep 23, 2014

> From where to you read the vzdump backup and to where do you write?

both hosts are in same rack connected using 1GB speed ethernet.

iperf client at napp-it tested 627 Mbits/sec with restore in progress.

restore is still in progress:

Code:

progress 92% (read 15813246976 bytes, duration 25787 sec)

that is 7 hours...

mir · Sep 23, 2014

That looks descend. You could try zfs set atime=off tank.

mir · Sep 23, 2014

RobFantini said:
> From where to you read the vzdump backup and to where do you write?

both hosts are in same rack connected using 1GB speed ethernet.

iperf client at napp-it tested 627 Mbits/sec with restore in progress.

restore is still in progress:

Code:

progress 92% (read 15813246976 bytes, duration 25787 sec)

that is 7 hours...

What sort of storage are you reading from? What are the performance of that storage?

Iperf test indicates that your problems seems to be related to the storage you are reading from.

RobFantini · Sep 23, 2014

>What sort of storage are you reading from? What are the performance of that storage?

the data is read from a pve system using zfsonlinux . so that is slow but reliable. I do not thing it is the cause of a 7 hour vzrestore ..

> You could try zfs set atime=off tank
OK I set that.

on the napp-it system the 4 zfs drives are attached to sata on the motherboard. I have a LSI 9207-8e on order. when that comes in I'll add 4 more drives + z cache and log drive..

do you use log and cache drives?

mir · Sep 23, 2014

RobFantini said:
>What sort of storage are you reading from? What are the performance of that storage?

the data is read from a pve system using zfsonlinux . so that is slow but reliable. I do not thing it is the cause of a 7 hour vzrestore ..

It depends on how the restore is done. If the uncompressing is done on the slow disk system and files then are copied one by one this could cause a slow performance. Also network performance has a big impact.

You haven't mention the amount of RAM in the omnios box. Anything below 8 GB RAM will have a severe impact on performance and anything below 16 GB will never have decent performance.

RobFantini said:
do you use log and cache drives?

Yes, a SSD for log and cache. The SSD is a 128 GB which is partitioned in a 8 GB for log and 92 GB for cache.

And very basic read/write test done with napp-it dd-bench for 20 GB.

Code:

Memory size: 16329 Megabytes

write 20.48 GB via dd, please wait...
time dd if=/dev/zero of=/vMotion/dd.tst bs=2048000 count=10000

10000+0 records in
10000+0 records out
20480000000 bytes transferred in 15.202720 secs (1347127325 bytes/sec)

real       15.2
user        0.0
sys        14.0

20.48 GB in 15.2s = [B]1347.37 MB/s Write[/B]

wait 40 s
[HR][/HR]read 20.48 GB via dd, please wait...
time dd if=/vMotion/dd.tst of=/dev/null bs=2048000

10000+0 records in
10000+0 records out
20480000000 bytes transferred in 10.925937 secs (1874438825 bytes/sec)

real       10.9
user        0.0
sys        10.8

20.48 GB in 10.9s = [B]1878.90 MB/s Read
[/B]

RobFantini · Sep 25, 2014

Hello Mir,
I'm still waiting for new LSI controller card , I think that will solve the slowness issue... Fos instance in kvm cli typing ' ls -l / ' lags.. even 2-nd time.

In the mean time a couple of quesitons:

1- for pve datastorage you wrote above: " Write Cache: If pools option 'sync' is standard or always it is safe to enable write cache for improved performance"

- I looked but did not find in napp-it gui
could you tell me where that is set?

2- for kvm storage, what setting should be used for cache?

thanks in advance
Rob

mir · Sep 25, 2014

I am more and more thinking of network congestion. Also regarding your disks, are they what is referred to as "green" or "green line" disks? Green disks runs with reduced rotation speed and spins down quickly as such not suited for storage arrays for "hot data".

1) napp-it gui 'pools'. The list of pools has a column called 'SYNC' which contains clickable links to change sync.
2) it depends. You will need to test. nocache works best for my usecase.

RobFantini · Sep 25, 2014

mir said:
I am more and more thinking of network congestion. Also regarding your disks, are they what is referred to as "green" or "green line" disks? Green disks runs with reduced rotation speed and spins down quickly as such not suited for storage arrays for "hot data".

Mir - the disks are WD RE series .... The issue I think is that I'm using a desktop motherboard for testing and one of the sata connections goes to an odd type of connection.
So I think the LSI card will solve the issue , but we'll see..

RobFantini · Sep 25, 2014

>1) napp-it gui 'pools'. The list of pools has a column called 'SYNC' which contains clickable links to change sync.
OK finally found it.

the options are: standard, always and disabled. I set to 'always '.

RobFantini · Sep 26, 2014

Mir
the LSI 9207-8e card makes a huge difference. cli within kvm is very fast.

thanks for your help so far.

I want to add to the pve wiki page on how to set up an omnios system for kvm use.

did you write parts of the existing page?

also I found the using google chrome for napp-it works best. I can get at the top level menus . using iceweasel/firefox I could not.

mir · Sep 26, 2014

what existing page are you referring to?

You wrote you have configure sync always. I would recommend to use sync standard since sync always means that every write is waiting for a flush even if the guest have not requested a flush which means a big performance cost.

napp-it with iceweasel/firefox works exactly like chrome for me.

PS. I can recommend to buy the app_monitor extension for realtime monitoring of your pools. See http://www.napp-it.org/extensions/monitoring_en.html

RobFantini · Sep 27, 2014

>what existing page are you referring to?
http://pve.proxmox.com/wiki/Storage:_ZFS . there is mention of omnios , napp-it and cache settings.

I'll change the sync to standard.

Also today I added a log and cache drive. Napp-it has a lot of options and was easy for setting up log and cache, but I would never have figured out Comstar with out your help.

So adding how to use pve + comstar etc is something I'll try to get to.

I plan to continute to use ceph cluster for production and omnios / zfs for backups and backup/stand by systems. I've used zfs for 5+ years and know zfs is very reliable, but not very fast yet.

With my setup it is still taking too long to restore backups to omnios ... I saw another thread regarding backups and using /tmp or something. Maybe similar could be used for restore.

omnios + napp-it for zfs questions

Famous Member

Famous Member

Famous Member

Famous Member

Famous Member

Famous Member

Famous Member

Famous Member

Famous Member

Famous Member

Famous Member

Famous Member

Famous Member

Famous Member

Famous Member

Famous Member

Famous Member

Famous Member

Famous Member

We value your privacy