[ANN] bzfs 1.18.0 near real-time ZFS replication tool is out

werwolf_

Feb 17, 2026
This release improves operational stability by default. It also runs nightly tests on AlmaLinux-10.1 and AlmaLinux-9.7, and ties up some loose ends in the docs.

If you missed 1.17, that release improved handling of snapshots that carry a `zfs hold`. It also improved monitoring of snapshots, especially the timely pruning of snapshots (not just the timely creation and replication of the latest snapshots), and added security hardening and support for running without ssh configuration files.

Details are in the changelog: https://github.com/whoschek/bzfs/blob/main/CHANGELOG.md
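
To make the monitoring idea above concrete, here is a minimal sketch of checking both the timely creation and the timely pruning of snapshots. It is not how bzfs implements monitoring; the function name `check_snapshot_ages` and its parameters are hypothetical. It assumes you already have the snapshot creation times, for example from `zfs list -H -p -t snapshot -o creation <dataset>`, which prints epoch seconds.

```python
from datetime import datetime, timedelta, timezone


def check_snapshot_ages(creation_times, now, max_latest_age, max_oldest_age):
    """Return a list of problems; an empty list means the dataset looks healthy.

    creation_times: datetimes of all snapshots of one dataset (hypothetical input).
    max_latest_age: the newest snapshot must not be older than this (creation check).
    max_oldest_age: the oldest snapshot must not be older than this (pruning check).
    """
    if not creation_times:
        return ["no snapshots found"]
    problems = []
    latest_age = now - max(creation_times)
    oldest_age = now - min(creation_times)
    if latest_age > max_latest_age:
        problems.append(f"latest snapshot is too old: {latest_age}")
    if oldest_age > max_oldest_age:
        problems.append(f"oldest snapshot was not pruned in time: {oldest_age}")
    return problems
```

The same check applied on the destination host would cover timely replication, since a stale latest snapshot there means replication has fallen behind.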
 
Nice, do you have practical experience with it? Can you compare it to other solutions like zrepl or syncoid? Thanks.

| Aspect | bzfs | zrepl | syncoid |
|---|---|---|---|
| Language | Shell script | Go | Perl |
| Config Style | Command-line/cron | YAML files, daemon | Command-line/cron |
| Encryption | SSH only | Native TLS + client certs | SSH only |
| Compression | Manual pipeline | Configurable | Built-in lzop |
| Complexity | Low (simple/fast) | High (robust) | Medium (safe defaults) |
| Migration Ease | N/A | From syncoid possible | From zrep works |
 
* FYI, bzfs is written in Python (not a "shell script"), compression is configurable via CLI options, and it is about as robust as zrepl and far more robust and reliable than syncoid.

* I think zrepl (and syncoid) have a different focus and angle, and serve different needs.
* In a nutshell, bzfs can operate at much larger scale than zrepl (and syncoid), at much lower latency, in a more flexible and straightforward way.
* Here are just a few points off the top of my head that bzfs does and zrepl (and syncoid) doesn't:
  * Manages periodic ZFS snapshot creation, replication, pruning, and monitoring across a fleet of N source hosts and M destination hosts, using the same single shared fleet-wide jobconfig script. Each of the M destination hosts receives replicas from (the same set of) N source hosts.
  * Monitors whether snapshots are successfully taken on schedule, successfully replicated on schedule, and successfully pruned on schedule, across the entire fleet.
  * More powerful include/exclude filters for selecting which datasets, snapshots, and properties to replicate.
  * Can be strict or told to be tolerant of runtime errors.
  * Has parametrizable retry logic.
  * Can be used not just for backup, but also for low-latency replication use cases.
  * Can be scripted.
  * Includes a script that uses Lima to locally create a guest VM, then runs the test suite or custom test scripts inside the guest VM.
  * Has an experimental AI agent skill that can generate Bash or Python scripts that use bzfs and bzfs_jobrunner for custom ZFS workflows in a sandboxed test VM.
  * It handles the many edge cases that you will eventually run into over the course of your deployment (and which make other tools get stuck or fail). https://youtu.be/6Kw901oqxI8?si=_4uoG_ADbXznvaeZ&t=2408
* Other aspects:
  * The zrepl codebase is vastly larger and more complex than bzfs; IMO, the designs and abstractions it introduces are much more complex than they need to be. For example, building a home-grown daemon and secure transport layer is more of a liability than an upside for a tool like this. Complexity has a price.
  * bzfs is easier to change, test, and maintain because Python is more readable to contemporary engineers than Go and Perl, and because the codebase is so much smaller and more straightforward than zrepl's.
* These are just some points. Maybe the most important one is that zrepl is more of a monolithic end-user app than a building block. I believe it's good to have an rsync-ish CLI for ZFS replication that keeps simple things simple, makes complex things possible, and enables higher-level infra and various UIs to be built on top of it.
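
As an illustration of what "parametrizable retry logic" from the list above can look like in general, here is a minimal sketch of retrying a flaky operation with exponential backoff and jitter. It is not bzfs code and does not mirror its options; `run_with_retries` and all of its parameters are hypothetical names chosen for this example.

```python
import random
import time


def run_with_retries(operation, *, max_retries=3, initial_delay=0.5,
                     max_delay=30.0, retry_on=(OSError,), sleep=time.sleep):
    """Call operation() and retry on the given exception types.

    The upper bound of the sleep doubles after each failure, capped at
    max_delay; the actual sleep is a random value below that bound (jitter),
    so that many clients do not retry in lockstep.
    """
    delay = initial_delay
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            sleep(random.uniform(0, delay))
            delay = min(delay * 2, max_delay)
```

A replication step could be passed in as `operation`, for instance a function that runs a subprocess and raises on a non-zero exit code. Passing `sleep` as a parameter keeps the helper easy to test without real waiting.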
 
@ucholak Please note that syncoid offers zstd / lz4 configurable compression options out of the box too.

Also, having used all three tools for quite some time, I found the following:

Syncoid:
- Is really easy to set up and fits smaller deployments
- Separate snapshot & sync tools are neat
- Clean, easy-to-adapt codebase if specific features are needed

Zrepl has some neat advantages:
- Doesn't necessarily rely on ssh, since it can communicate over TCP with TLS, which is an advantage when you don't want to grant ssh access
- Has a builtin console to see what's going on (progress per job), can also manually trigger a sync
- Has builtin prometheus support and a nice Grafana dashboard
- Development has slowed down over the last couple of years
- There is also a fork with newer features, but it's out of scope here
- The major issue for me is that a sync goes into a subdir, e.g. host1:tank/A becomes host2:tank/B/A

Bzfs:
- Has a ton of options
- Has some of the weirdest documentation I've ever read. Config files mix with code, and running `--help` will flood your screen with so much info you'd think you launched `man`
- Creates tons of log files (had gigabytes in just a couple of months)
- Tries to extract "progress" by logging `pv` output in separate log files, which it also keeps forever (no log rotation)
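
For the log growth described above, one workaround is a small cleanup job run from cron. The following is a minimal sketch under stated assumptions: the helper name `prune_old_logs`, the directory you pass in, and the `*.log` pattern are all hypothetical and need to be adapted to wherever your logs actually live. It defaults to a dry run so nothing is deleted by accident.

```python
import time
from pathlib import Path


def prune_old_logs(log_dir, max_age_days, pattern="*.log", now=None, dry_run=True):
    """Find (and optionally delete) files under log_dir older than max_age_days.

    Age is judged by modification time. Returns the list of matching paths.
    With dry_run=True (the default) files are only reported, not removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    matched = []
    for path in sorted(Path(log_dir).rglob(pattern)):
        if path.is_file() and path.stat().st_mtime < cutoff:
            if not dry_run:
                path.unlink()
            matched.append(path)
    return matched
```

Running it once with `dry_run=True` and reviewing the returned paths before switching to `dry_run=False` is the safer way to introduce it.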

All three tools have advantages and disadvantages. My journey started with zrepl, which I really enjoyed; I would have stayed if it wasn't for the subdir problem. I then tried bzfs for a couple of months, which I found painfully difficult to understand because of too much information (and probably AI-generated docs). I ended up using syncoid, which really does the job IMHO while keeping things simple and straightforward.