Post: Testing XFS vs ZFS speed hashing 700k files

waltar

Jul 29, 2024
There's an interesting post where a user compares XFS with the filesystem cache fully cleared against ZFS with a nearly empty ARC, capped at 4GB:
https://www.reddit.com/r/DataHoarder/comments/1f4ghzr/testing_xfs_vs_zfs_speed_hashing_700k_files/
If you compare the XFS filesystem cache against the ARC with:
"cd /usr ; tar cf /<xfs-mount>/<dir-anywhere>/os.tar * /etc ; time cat /<xfs-mount>/<dir-anywhere>/os.tar >/dev/null" and
"cd /usr ; tar cf /<zfs-pool>/<dir-anywhere>/os.tar * /etc ; time cat /<zfs-pool>/<dir-anywhere>/os.tar >/dev/null"
you will see nearly the same performance difference as in the nearly cache-less test from the user above. However clever the ARC may be, ZFS is slower.
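
The two one-liners can be wrapped in a small script so that both filesystems get exactly the same treatment. This is only a minimal sketch, assuming a Linux shell with GNU tar and GNU date (for `%N`). The helper names `build_archive` and `time_read` are made up for illustration and do not come from the linked post. It does not clear the page cache or cap the ARC, since that needs root and depends on the setup.

```shell
# Sketch: write the same tar archive to a target path, then time a sequential read.
# build_archive and time_read are hypothetical helper names.

build_archive() {
    # $1 = output tar file, remaining arguments = directories to archive
    out=$1
    shift
    tar cf "$out" "$@"
}

time_read() {
    # $1 = file to read; prints elapsed wall-clock seconds
    start=$(date +%s.%N)
    cat "$1" > /dev/null
    end=$(date +%s.%N)
    awk -v s="$start" -v e="$end" 'BEGIN { printf "%.3f\n", e - s }'
}

# Usage (paths are placeholders, as in the commands above):
#   build_archive /<xfs-mount>/<dir-anywhere>/os.tar /usr /etc
#   time_read /<xfs-mount>/<dir-anywhere>/os.tar
#   build_archive /<zfs-pool>/<dir-anywhere>/os.tar /usr /etc
#   time_read /<zfs-pool>/<dir-anywhere>/os.tar
```

Running `time_read` several times per filesystem and comparing the spread gives a fairer picture than a single run.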
So I wonder why the user is surprised by the difference, as if a system that has more to do because of checksums could ever be faster.
More work has its price, which adds latency in the I/O path. You could use a ZFS mirror, where the price is capacity; use raidz/draid, where the price is performance again;
or put more disks into more vdevs, where the price is the additional disks needed. But data checksums (even with CoW) are not free.
Checksums are certainly nice to have, and everyone can decide for themselves what they are worth.
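
To get a rough feel for the point that verifying data costs extra work, one can time a plain read against a read that also hashes the data. This is a sketch under loud assumptions: `elapsed` is a hypothetical helper, GNU date is assumed, and `sha256sum` in user space is only a stand-in. ZFS normally uses the much cheaper fletcher4 checksum inside the kernel, so the numbers illustrate the principle and say nothing about ZFS itself.

```shell
# Sketch: compare a plain read with a read that also checksums the data.
# elapsed is a hypothetical helper; sha256sum is only a stand-in for a filesystem checksum.

elapsed() {
    # Runs the given command, discards its output, prints wall-clock seconds
    start=$(date +%s.%N)
    "$@" > /dev/null
    end=$(date +%s.%N)
    awk -v s="$start" -v e="$end" 'BEGIN { printf "%.3f\n", e - s }'
}

# Usage (path is a placeholder); read the file once beforehand so both runs come from cache:
#   cat /<xfs-mount>/<dir-anywhere>/os.tar > /dev/null
#   elapsed cat /<xfs-mount>/<dir-anywhere>/os.tar
#   elapsed sha256sum /<xfs-mount>/<dir-anywhere>/os.tar
```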
 
I have been using ZFS for data storage for a long time. It lost its lustre around the time NVMe drives became widely accessible and BTRFS got through its crisis (I do not use anything other than RAID1). It used to be reliable, and it made sense for dead data on a pool full of spinning drives. I never ended up using it for a system drive and never saw the value in doing so. I never quite grasped why it would be good for any system drive, including those of VMs. I am more worried about any new features in ZFS, even reflinks, which took forever, than about using BTRFS for the same.

From the linked post:

XFS was 4x faster.

Think this was a fair test? Possible caveat that the ZFS pool is at 66% capacity and has a bunch of other datasets and snapshots, but I'm not sure if that matters.

I suspect it does not even matter. It would have ended up this way in any other setup.