Mostly, sync=standard seems to have made a huge difference this time, even though it didn't seem to make much of one at all before.
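For anyone following along, checking and setting it is just this (twelve/backupz is my backups dataset, more on it below; use your own):
Code:
# see what a dataset is currently using (standard / always / disabled)
zfs get sync twelve/backupz

# back to the default behaviour of honouring sync writes as requested
zfs set sync=standard twelve/backupz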
I recreated my small pool, and used
zpool upgrade <poolname>
to upgrade the pool on my 12TB drive in place. If you upgrade an old pool to make use of fast dedup over old dedup (assuming you use dedup on any of your datasets), you need to put the deduped datasets on a different checksum algorithm, otherwise ZFS won't make a new table and won't use fast dedup. After some testing I ended up using
checksum=skein
dedup=skein,verify
on my backups dataset.
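For reference, the pool side of that looked roughly like this (I believe the new feature flag is named fast_dedup, and newdataset is just a placeholder name):
Code:
# enable the new feature flags on the existing pool
# (one-way: older OpenZFS versions won't be able to import it afterwards)
zpool upgrade twelve

# check the fast dedup feature now shows as enabled/active
zpool get feature@fast_dedup twelve

# a dataset deduping with a non-sha256 checksum gets its own, new-style table
zfs create -o checksum=skein -o dedup=skein,verify twelve/newdataset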
I also set these in /etc/modprobe.d/zfs.conf:
Code:
# cap the ARC at ~9.4 GiB
options zfs zfs_arc_max=10087301120
# per-vdev pending I/O queue depth (see the note below about drive types)
options zfs zfs_vdev_min_pending=1
options zfs zfs_vdev_max_pending=32
# sync a transaction group at least every 40 seconds (default is 5)
options zfs zfs_txg_timeout=40
# disable the old-style write throttle
options zfs zfs_no_write_throttle=1
# let dirty (not yet written) data grow to up to 50% of RAM
options zfs zfs_dirty_data_max_max_percent=50
options zfs zfs_dirty_data_max_percent=50
# only start delaying writes once dirty data hits 80% of that limit
options zfs zfs_delay_min_dirty_percent=80
For zfs_vdev_*_pending (got the idea from here:
ZFS Slow Performance Fix), it depends on the drive type: 1-8 for SATA drives and 32 for SAS drives, I believe.
But I'm not entirely sure about everything I set; I'm just experimenting here.
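A few of those tunables come from older ZFS write-ups and may not even exist on current OpenZFS (as far as I know, unknown module options just get ignored with a warning in dmesg), so it's worth checking what the running module actually recognizes, and values can be tested live before baking them into modprobe.d. Roughly:
Code:
# list the tunables this OpenZFS build actually has
ls /sys/module/zfs/parameters/ | grep -i -e vdev -e dirty -e txg

# read the value the module is really using right now
cat /sys/module/zfs/parameters/zfs_arc_max

# most of these can be changed live (as root) for testing, e.g.:
echo 10087301120 > /sys/module/zfs/parameters/zfs_arc_max

# if ZFS is loaded from the initramfs (root on ZFS), regenerate it so the
# modprobe.d settings actually apply at boot, e.g. on Debian/Proxmox:
# update-initramfs -u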
I ended up using these on my backups dataset (and on most of my other datasets, just with lower zstd compression levels and different record sizes depending on the content):
Code:
zfs set primarycache=metadata twelve/backupz
zfs set compression=zstd-19 twelve/backupz
zfs set recordsize=128k twelve/backupz
zfs set checksum=skein twelve/backupz
primarycache=metadata: for data that isn't frequently accessed, so it doesn't waste the ARC. This twelve pool is all backups and file storage, so basically nothing on it will be frequently re-read.
compression=zstd-19: I usually use zstd for data that mostly doesn't compress well but might have the odd file that does. Roughly: off for media, zstd-3 to zstd-7 for basic files I want fast access to, zstd-11 to zstd-13 for files that are almost never accessed but that I don't want to be painfully slow, and zstd-19 for the highest (and slowest) compression.
recordsize=128k: I usually use 128k for compressible / small files, backups, etc., 256k-512k for mixed files, 1M-2M for media, and 4M-8M for AI models.
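In practice each dataset just gets its own profile along those lines, something like this (the files / media / models dataset names are made-up examples, and depending on your OpenZFS version a recordsize above 1M may need the zfs_max_recordsize module parameter raised first):
Code:
# mixed general files: moderate compression, bigger records
zfs set compression=zstd-7 twelve/files
zfs set recordsize=512k twelve/files

# media that's already compressed: don't burn CPU on it
zfs set compression=off twelve/media
zfs set recordsize=1M twelve/media

# AI models: very large sequential files
zfs set recordsize=4M twelve/models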
For the dedup= / checksum= settings, see
Checksums and Their Use in ZFS. Skein / blake3 seem to be the best picks unless you want to be extremely sure there is absolutely no risk of hash collisions; for that (and for data integrity) I think sha512/sha256 are best. But sha256 is the default for dedup, so you need a different algorithm if you're upgrading an existing pool rather than creating a new one, otherwise the old table keeps getting used and you don't get fast dedup.
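Once the new table is in use, it's easy to keep an eye on whether dedup is actually buying anything:
Code:
# overall dedup ratio for the pool
zpool list -o name,size,alloc,free,dedupratio twelve

# dedup table (DDT) statistics and histogram
zpool status -D twelve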
There may be more experienced people who can offer more in-depth / better advice on this, though. Again, I'm just experimenting here, going off everything I've been reading about ZFS.