Taking advantage of Special vdev after data has been written

mgaudette112

Dec 21, 2023
Hi,

I need a little sanity check. I have added a special vdev to my 8TB datastore to make verifying a little faster. I do know that once data is written, it won't magically take advantage of this new addition.

Unfortunately, I do not have the 6TB+ of spare capacity lying around to do the zfs send/receive that is typically recommended. I am wondering if it makes any sense to take the datastore offline, go into the .chunks folder, and (with a script) copy each subfolder into a new one, delete the old one, and rename the newly copied one back to the old name.

Will this put the special vdev in play after this copy/rename script job?
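Something like this is what I have in mind (just a sketch; /mnt/datastore/backup is a placeholder for wherever the datastore is actually mounted, and the datastore would be stopped first):
Code:
# Sketch only - assumes the datastore is stopped and mounted at /mnt/datastore/backup (placeholder path).
cd /mnt/datastore/backup/.chunks || exit 1
for d in */; do
    d="${d%/}"
    # Copy the chunk subfolder (the rewritten files should get their metadata on the
    # special vdev), then swap the copy back in under the original name.
    cp -a "$d" "${d}.new" && rm -rf "$d" && mv "${d}.new" "$d"
done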
 
please show the output from: zpool list -v <pool-name>
Code:
NAME                         SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
backup                      7.70T  4.88T  2.82T        -         -    11%    63%  1.00x    ONLINE  -
  mirror-0                  7.27T  4.88T  2.38T        -         -    11%  67.2%      -    ONLINE
    wwn-0x5000c50091aef82c  7.28T      -      -        -         -      -      -      -    ONLINE
    wwn-0x5000c50091e5ec8d  7.28T      -      -        -         -      -      -      -    ONLINE
special                         -      -      -        -         -      -      -      -         -
  mirror-1                   444G   220M   444G        -         -     0%  0.04%      -    ONLINE
    wwn-0x50026b7686ad4066   447G      -      -        -         -      -      -      -    ONLINE
    wwn-0x50026b7686ad3f64   447G      -      -        -         -      -      -      -    ONLINE
 
You can set a quota on the zfs pool "backup" of 6.20 TByte with: zfs set quota=6.20T backup
The ZFS special device will now be used, and over time its usage will grow.
Which SSD drives do you use for wwn-0x50026b7686ad4066 and wwn-0x50026b7686ad3f64?

please run: lsblk -o name,uuid,fstype,mountpoint,label,size
 
I'm sorry, I think you are misunderstanding. I don't want to use the special vdev exclusively for new data; I want to rebalance my pool so that metadata and small files end up on the special vdev.
 
Ok, you can run this script:
Code:
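# Prints a histogram of file sizes in power-of-two buckets; run it from the directory you
# want to analyse (e.g. the datastore root) to help pick a special_small_blocks value.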
find . -type f -print0 | xargs -0 ls -l | \
awk '{ n=int(log($5)/log(2)); if (n<10) { n=10; } size[n]++ } END { for (i in size) printf("%d %d\n", 2^i, size[i]) }' | \
sort -n | \
awk 'function human(x) { x[1]/=1024; if (x[1]>=1024) { x[2]++; human(x) } } { a[1]=$1; a[2]=0; human(a); printf("%3d%s: %6d\n", a[1],substr("kMGTEPYZ",a[2]+1,1),$2) }'
# Source: https://forum.level1techs.com/t/zfs-metadata-special-device-z/159954

Then set the new special_small_blocks size on the relevant datasets.

I use zfs set special_small_blocks=128K <pool>/<dataset> on some datasets.

Check with: zfs get -r -t filesystem special_small_blocks backup
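For this pool that could look like the following (backup/pbs is only an example name for the dataset holding the datastore; adjust to your layout):
Code:
# Example only - route blocks up to 128K into the special vdev for this dataset.
zfs set special_small_blocks=128K backup/pbs
# Confirm where the property is set.
zfs get -r -t filesystem special_small_blocks backup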
 
CONCLUSION - I understand from another thread ( https://forum.proxmox.com/threads/zfs-metadata-special-device.129031/#post-564923 ) that the metadata can't be moved quite so easily to the special vdev, so I ended up splitting my mirror, using the second drive for a temporary zfs send/receive and resilvering my mirror.

Obviously I would not recommend this in general as the datastore is not redundant during this procedure, but I do have a remote backup holding what matters so I took the chance.
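For reference, roughly the sequence I mean - illustrative only, with backup/pbs standing in for whatever dataset actually holds the datastore, and the datastore stopped for the duration:
Code:
# Illustrative outline only - the pool has no redundancy until the final resilver completes.
zpool detach backup wwn-0x5000c50091e5ec8d                # free one half of mirror-0
zpool create -f temp wwn-0x5000c50091e5ec8d               # temporary single-disk pool

zfs snapshot -r backup/pbs@migrate
zfs send -R backup/pbs@migrate | zfs receive temp/pbs     # copy the datastore off

zfs destroy -r backup/pbs                                 # drop the old copy
zfs send -R temp/pbs@migrate | zfs receive backup/pbs     # rewrite it; metadata now lands on the special vdev

zpool destroy temp
zpool attach backup wwn-0x5000c50091aef82c wwn-0x5000c50091e5ec8d   # re-attach and resilver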
 
I have added a special vdev to my 8TB datastore to make verifying a little faster.
A special device does not help with verify, as verify reads all the data from your HDDs to checksum each chunk again and make sure it still has the same checksum it had when it was originally stored. It barely puts any load on the special vdev.

A special vdev does help with backup listing, garbage collection, and, to some degree, with concurrent backups/restores on a PBS server.

I ended up splitting my mirror, using the second drive for a temporary zfs send/receive and resilvering my mirror.
Good option. I would have created a second datastore on the split-off disk, pointed my PVE to the new datastore, and used PBS sync to move the backups from the old datastore to the new one. Then resilver the mirror from the new DS back to the old disk. Mainly to keep control over what is where and to be able to pause/resume the process if needed, as I usually deal with 100TB+ datastores. Sometimes it is also a good option to sync to another PBS, set up the special vdev, and sync back.
 