[ANN] bzfs 1.18.0 near real-time ZFS replication tool is out

werwolf_

New Member
Feb 17, 2026
This release improves operational stability by default. It also runs nightly tests on AlmaLinux-10.1 and AlmaLinux-9.7, and ties up some loose ends in the docs.

If you missed 1.17, that release improved handling of snapshots that carry a `zfs hold`. It also improved monitoring of snapshots, especially the timely pruning of snapshots (not just the timely creation and replication of the latest snapshots), and added security hardening and support for running without ssh configuration files.

Details are in the changelog: https://github.com/whoschek/bzfs/blob/main/CHANGELOG.md
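
To illustrate what "timely creation" versus "timely pruning" monitoring means, here is a minimal Python sketch. It is not how bzfs implements monitoring; the function names and thresholds are made up for illustration. It only assumes the tab-separated output of `zfs list -H -p -t snapshot -o name,creation`, where `-p` prints creation times as Unix seconds.

```python
from __future__ import annotations

import subprocess


def list_snapshots_via_zfs() -> str:
    """Return raw `zfs list` output: one 'dataset@snap<TAB>creation' line per snapshot."""
    cmd = ["zfs", "list", "-H", "-p", "-t", "snapshot", "-o", "name,creation"]
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def parse_snapshots(zfs_list_output: str) -> dict[str, list[int]]:
    """Map each dataset to the sorted creation times (Unix seconds) of its snapshots."""
    result: dict[str, list[int]] = {}
    for line in zfs_list_output.splitlines():
        if not line.strip():
            continue
        name, creation = line.split("\t")
        dataset, _, _snapshot = name.partition("@")
        result.setdefault(dataset, []).append(int(creation))
    for times in result.values():
        times.sort()
    return result


def check_snapshot_ages(
    snapshots: dict[str, list[int]],
    now: int,
    max_latest_age_secs: int,  # hypothetical threshold: newest snapshot must be younger than this
    max_oldest_age_secs: int,  # hypothetical threshold: oldest snapshot must be younger than this
) -> list[str]:
    """Return one alert per dataset whose newest snapshot is too old (creation is late)
    or whose oldest snapshot is too old (pruning is late)."""
    alerts: list[str] = []
    for dataset, times in sorted(snapshots.items()):
        latest_age = now - times[-1]
        oldest_age = now - times[0]
        if latest_age > max_latest_age_secs:
            alerts.append(f"{dataset}: latest snapshot is {latest_age}s old (creation is late)")
        if oldest_age > max_oldest_age_secs:
            alerts.append(f"{dataset}: oldest snapshot is {oldest_age}s old (pruning is late)")
    return alerts
```

On a host with ZFS you could feed `parse_snapshots(list_snapshots_via_zfs())` and `int(time.time())` into `check_snapshot_ages`. The second check is the part that is easy to forget: a stale oldest snapshot means pruning has silently stopped, even if new snapshots keep arriving.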
 
Nice, do you have practical experience with it? Can you compare it to other solutions like zrepl or syncoid? Thanks.

| Aspect | bzfs | zrepl | syncoid |
|---|---|---|---|
| Language | Shell script | Go | Perl |
| Config Style | Command-line/cron | YAML files, daemon | Command-line/cron |
| Encryption | SSH only | Native TLS + client certs | SSH only |
| Compression | Manual pipeline | Configurable | Built-in lzop |
| Complexity | Low (simple/fast) | High (robust) | Medium (safe defaults) |
| Migration Ease | N/A | From syncoid possible | From zrep works |
 
* FYI, bzfs is written in Python (not a "shell script"), its compression is configurable via CLI options, and it is about as robust as zrepl and far more robust and reliable than syncoid.

* I think zrepl (and syncoid) have a different focus and angle and serve different needs.
* In a nutshell, bzfs can operate at a much larger scale than zrepl (and syncoid), at much lower latency, in a more flexible and straightforward way.
* Here are just a few points off the top of my head that bzfs does and zrepl (and syncoid) don't:
  * Manages periodic ZFS snapshot creation, replication, pruning, and monitoring across a fleet of N source hosts and M destination hosts, using the same single shared fleet-wide jobconfig script. Each of the M destination hosts receives replicas from (the same set of) N source hosts.
  * Monitors whether snapshots are successfully taken on schedule, successfully replicated on schedule, and successfully pruned on schedule, across the entire fleet.
  * Has more powerful include/exclude filters for selecting which datasets, snapshots, and properties to replicate.
  * Can be strict or told to be tolerant of runtime errors.
  * Has parametrizable retry logic.
  * Can be used not just for backup, but also for low-latency replication use cases.
  * Can be scripted.
  * Includes a script that uses Lima to locally create a guest VM, then runs the test suite or custom test scripts inside the guest VM.
  * Has an experimental AI agent skill that can generate Bash or Python scripts that use bzfs and bzfs_jobrunner for custom ZFS workflows in a sandboxed test VM.
  * Handles the many edge cases that you will eventually run into over the course of your deployment (and which make other tools get stuck or fail). https://youtu.be/6Kw901oqxI8?si=_4uoG_ADbXznvaeZ&t=2408
* Other aspects:
  * The zrepl codebase is vastly larger and more complex than bzfs; IMO, the designs and abstractions it introduces are much more complex than they need to be. For example, building a homegrown daemon and secure transport layer is more of a liability than an upside for a tool like this. Complexity has a price.
  * bzfs is easier to change, test, and maintain because Python is more readable to contemporary engineers than Go and Perl, and because the codebase is so much smaller and more straightforward than zrepl's.
  * These are just some points. Maybe the most important point is that zrepl is more of a monolithic end-user app than a building block. I believe it's good to have an rsync'ish CLI for ZFS replication that keeps simple things simple, makes complex things possible, and enables higher-level infra and various UIs to be built on top of it.
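
To make the "parametrizable retry logic" and "strict or tolerant of runtime errors" points above a bit more concrete, here is a generic Python sketch of the idea. It is not bzfs code and does not use any bzfs API; `run_with_retries`, `run_all`, and every parameter name are hypothetical and exist only for illustration. It assumes exponential backoff with jitter as the retry policy, which is one common choice among several.

```python
from __future__ import annotations

import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retries(
    task: Callable[[], T],
    max_retries: int = 3,              # hypothetical knob: retries after the first attempt
    initial_delay_secs: float = 0.1,   # hypothetical knob: upper bound of the first sleep
    max_delay_secs: float = 10.0,      # hypothetical knob: cap for the backoff
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call task(); on failure, sleep a random time up to the current delay,
    double the delay (capped), and try again until retries are exhausted."""
    delay = initial_delay_secs
    for attempt in range(max_retries + 1):
        try:
            return task()
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: surface the last error to the caller
            sleep(random.uniform(0, delay))
            delay = min(delay * 2, max_delay_secs)
    raise AssertionError("unreachable")


def run_all(
    tasks: dict[str, Callable[[], T]],
    skip_on_error: bool,
    **retry_opts,
) -> tuple[dict[str, T], dict[str, Exception]]:
    """Run every task with retries. Strict mode (skip_on_error=False) aborts on the
    first task that still fails; tolerant mode records the failure and continues."""
    results: dict[str, T] = {}
    failures: dict[str, Exception] = {}
    for name, task in tasks.items():
        try:
            results[name] = run_with_retries(task, **retry_opts)
        except Exception as e:
            if not skip_on_error:
                raise
            failures[name] = e
    return results, failures
```

In this sketch each task would stand for replicating one dataset. Tolerant mode lets one broken dataset fail without blocking the rest, which matters more as the number of datasets and hosts grows.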
 