I've been chewing on this for months and getting hardware into place to test it. I won't bore you with the story. If it isn't wonderful I haven't wasted much money, but I'm hoping for wonderful.
Special VDEV is ... you can add a couple of SSDs to a ZFS pool to speed it up. This isn't an old-style hybrid drive with a cache out front. Like most ZFS things, it's sort of like that, but different. The SSD vdev holds all of the pool's metadata, and you can also tell it to accept data blocks up to a certain size. That gives you fast searches and ... this part is me interpreting ... lets you take the worst part of the performance curve for enterprise storage, the small random I/O, off the spinning disks.
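For anyone following along, here's roughly what that looks like on the command line. The pool and device names are made up, and the mirror layout is just the standard advice, since losing the special vdev means losing the whole pool.
Code:
# Hypothetical pool and device names. Mirror the special vdev: if it dies, the pool goes with it.
zpool add tank special mirror /dev/disk/by-id/nvme-SSD_A /dev/disk/by-id/nvme-SSD_B

# Confirm the special vdev shows up under the pool.
zpool status tank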
There are two config items at play.
The recordsize is the cutoff where a file gets broken into chunks: anything larger is cut up into blocks of recordsize, while smaller files are stored as a single, smaller block.
The special_small_blocks is the largest block size that will be written to the special vdev. Blocks larger than special_small_blocks get written to the main pool.
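Putting the two together, it looks something like this (the dataset name and values are just for illustration, and as I understand it special_small_blocks has to stay below recordsize, otherwise every block counts as "small" and the entire dataset lands on the SSDs):
Code:
# Hypothetical dataset; pick values to taste.
zfs set recordsize=1M tank/backups
zfs set special_small_blocks=256K tank/backups

# Check what's actually in effect and where it was inherited from.
zfs get recordsize,special_small_blocks tank/backups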
I need to figure out where the balance point between recordsize and special_small_blocks is for a PBS server.
Here are file-size histograms from two of them (code and output at the bottom of this post).
(Warning: if you run this code, do it in the datastore with the files you want to count, and be aware that it might run overnight.)
On b0x1, I think
recordsize=512k
special_small_blocks=256k
On b0x2, I think
recordsize=1M
special_small_blocks=256k
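One caveat, and this is me interpreting the docs again: special_small_blocks is compared against ZFS block sizes, not file sizes. A small file ends up as a single small block, so the file-size histograms below are a decent proxy, but a big file never contributes small data blocks no matter how it gets chopped up. If you want the actual on-disk block-size distribution, zdb can apparently print one, something like:
Code:
# Block statistics for the pool, including a block size histogram.
# -L skips leak checking so it finishes sooner; pool name is hypothetical.
# It's read-only, but run it off-peak anyway.
zdb -Lbbbs tank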
I'm fairly new to ZFS. This is an advanced topic. Any insight, or even just your own interpretation of these histograms would be welcome.
Thanks.
Code:
b0x1: /mnt/datastore/Backups]# find . -type f -print0 | xargs -0 ls -l | awk '{ n=int(log($5)/log(2)); if (n<10) { n=10; } size[n]++ } END { for (i in size) printf("%d %d\n", 2^i, size[i]) }' | sort -n | awk 'function human(x) { x[1]/=1024; if (x[1]>=1024) { x[2]++; human(x) } } { a[1]=$1; a[2]=0; human(a); printf("%3d%s: %6d\n", a[1],substr("kMGTEPYZ",a[2]+1,1),$2) }'
1k: 3484
2k: 2391
4k: 3367
8k: 6027
16k: 10471
32k: 16789
64k: 32644
128k: 74453
256k: 215039
512k: 383804
1M: 385362
2M: 405238
4M: 69865
8M: 7
Code:
b0x2:/rpool/BACKUP# find . -type f -print0 | xargs -0 ls -l | awk '{ n=int(log($5)/log(2)); if (n<10) { n=10; } size[n]++ } END { for (i in size) printf("%d %d\n", 2^i, size[i]) }' | sort -n | awk 'function human(x) { x[1]/=1024; if (x[1]>=1024) { x[2]++; human(x) } } { a[1]=$1; a[2]=0; human(a); printf("%3d%s: %6d\n", a[1],substr("kMGTEPYZ",a[2]+1,1),$2) }'
1k: 10372
2k: 7137
4k: 3602
8k: 6272
16k: 10094
32k: 20460
64k: 33754
128k: 100055
256k: 195302
512k: 453410
1M: 394942
2M: 530326
4M: 80923
1G: 1
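To turn a histogram into a number, I've been piping the output through a little awk helper that adds up the buckets strictly below a candidate cutoff (in KiB) and reports what share of the files that is. This is just a sketch: histogram.txt is a saved copy of the one-liner's output, the bucket labelled 256k actually spans 256k–511k so counting only the buckets below the cutoff is a rough lower bound, and file sizes are only a stand-in for ZFS block sizes anyway.
Code:
# cutoff is a candidate special_small_blocks value in KiB (256 here);
# histogram.txt is the saved output of the one-liner above.
awk -v cutoff=256 '
{
    split($1, p, ":")                                # "128k:" -> "128k"
    unit = substr(p[1], length(p[1]), 1)             # trailing k/M/G
    val  = substr(p[1], 1, length(p[1]) - 1) + 0     # leading number
    kib  = (unit == "k") ? val : (unit == "M") ? val * 1024 : val * 1024 * 1024
    total += $2
    if (kib < cutoff) small += $2
}
END {
    printf("%d of %d files (%.1f%%) fall in buckets below %dk\n",
           small, total, 100 * small / total, cutoff)
}' histogram.txt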