Backup tool for standalone ZFS datasets to Proxmox Backup Server

NickyDoes

PBS handles VM and LXC backups well, but it doesn't have a native answer for ZFS datasets that live outside VMs and containers.

If you run a NAS with ZFS datasets—photos, videos, media libraries, ISOs—exported over NFS or SMB and mounted into your VMs, that data lives on the hypervisor or a storage server, not inside the guest. PBS doesn't touch it.

I wrote zpbs-backup to fill this gap. Configuration lives entirely in ZFS properties. `zpbs-backup` manages those properties.

Bash:
zpbs-backup set zpbs:backup=true tank/media
zpbs-backup set zpbs:schedule=weekly tank/media/movies
zpbs-backup set zpbs:retention=30d,12w,12m,3y tank/photos
zpbs-backup set zpbs:priority=10 tank/documents
zpbs-backup set zpbs:backup=false tank/media/downloads
The tool discovers marked datasets and backs them up to PBS.
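
To make the discovery step concrete, here is a minimal sketch of how marked datasets could be found from ZFS properties alone. It assumes the listing comes from `zfs get -H -r -o name,value,source zpbs:backup tank`, which prints tab-separated lines without a header; the `enabled_datasets` helper and the sample text are hypothetical and are not taken from the zpbs-backup source.

```python
def enabled_datasets(zfs_get_output):
    """Return {dataset: source} for datasets whose zpbs:backup value is 'true'.

    Expects tab-separated name/value/source lines, the shape printed by
    `zfs get -H -o name,value,source zpbs:backup ...`.
    """
    enabled = {}
    for line in zfs_get_output.splitlines():
        if not line.strip():
            continue
        name, value, source = line.split("\t")
        if value == "true":
            enabled[name] = source
    return enabled


# Made-up sample: a parent marked locally, one child inheriting the flag,
# one child opting out, and the pool root with the property unset ("-").
sample = (
    "tank\t-\t-\n"
    "tank/media\ttrue\tlocal\n"
    "tank/media/movies\ttrue\tinherited from tank/media\n"
    "tank/media/downloads\tfalse\tlocal\n"
)

for dataset, source in enabled_datasets(sample).items():
    print(dataset, "<-", source)
```

The `source` column is what shows inheritance: a child that was never configured still reports the parent's value, so nothing beyond ZFS itself has to track the hierarchy.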

zpbs-backup

Features:
  • Works with any ZFS dataset (VM NAS, Samba, External NAS)
  • Auto-discovery via ZFS properties—no config files
  • Inheritance through the dataset hierarchy (enable on parent, children follow)
  • Per-dataset schedules (daily/weekly/monthly)
  • Per-dataset retention policies
  • Priority ordering for critical data
  • Dry-run and audit modes
  • Systemd timer for unattended operation
  • Email and syslog notification options
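
As an illustration of the per-dataset retention policies, a value such as `30d,12w,12m,3y` could be translated into prune options roughly like this. The sketch assumes each suffix maps onto the `--keep-daily`, `--keep-weekly`, `--keep-monthly` and `--keep-yearly` options of `proxmox-backup-client prune`; the `parse_retention` helper and that mapping are assumptions for illustration, not necessarily how zpbs-backup interprets the property.

```python
import re

# Assumed mapping from a retention suffix to a prune option.
SUFFIX_TO_OPTION = {
    "d": "--keep-daily",
    "w": "--keep-weekly",
    "m": "--keep-monthly",
    "y": "--keep-yearly",
}


def parse_retention(value):
    """Turn '30d,12w,12m,3y' into a flat list of prune arguments."""
    args = []
    for part in value.split(","):
        match = re.fullmatch(r"(\d+)([dwmy])", part.strip())
        if match is None:
            raise ValueError(f"unrecognised retention term: {part!r}")
        count, suffix = match.groups()
        args += [SUFFIX_TO_OPTION[suffix], count]
    return args


print(parse_retention("30d,12w,12m,3y"))
```

Rejecting unknown terms outright, rather than ignoring them, means a typo in the property cannot quietly weaken a retention policy.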
 
What does this do that you can't achieve with the native proxmox-backup-client?
 
I've already written what it does, and I'm not about to go fully on the defensive. To reiterate succinctly: this is a wrapper for proxmox-backup-client (PBC). It adds inheritable backup configuration stored in the dataset properties, so you don't have to remember to modify your PBC scripts. The other features are as previously described.

I'm happy to defend or defer if I've missed something, but you've got to ask a more cogent question first.
 
Dear NickyDoes, thank you for this. This was exactly what I was looking for :).

I had some trouble getting it to run (partly because I did not read the documentation well enough).
Some observations:
* `pip install .` didn't work on Proxmox VE. I had to create a virtualenv and point the systemd service at the executable in the virtualenv directory;
* even though I gave the token full permissions, it still could not create its automatically derived namespace. It worked when I specified a namespace myself, though;
* I'm still not having any luck pruning (`Error: permission check failed - missing Datastore.Modify|Datastore.Prune`, even though the token has these permissions). If I find the time I'll probably open an issue in your repo, if that's all right with you.

But other than that, this is really useful for us. Thanks again!

Best regards,

Victor
 
I have some simplifications to features and docs queued up for a push - nothing overly breaking, but it's worth reading/checking. I'm expecting to get time later today (US East Coast / GMT -5) to make this a properly installable package. Go ahead and either post or put up a PR in the meantime. I'm happy to have others go through my code, docs, etc.

Thanks for the help on this.
 
I've made many modifications and improvements to zpbs-backup. If you tried it already, and were disappointed or confused, I apologize. Please give v0.5.0 or later a try. Expect most changes to be breaking, aside from the dataset properties you already might have set.

Here's what's been improved:

  • The tool is fully installable. Download the appropriate package from GitHub releases and install it with something like `sudo apt install ./zpbs-backup_0.5.0_amd64.deb`.
  • PBS connection config can come from environment variables or from `/etc/zpbs-backup/pbs.conf`.
  • Added a PBS connectivity check. It also announces which config source zpbs-backup is using to connect to PBS.
  • Installs as a service that runs at about 2 am. You can also start a run immediately, or add `-bg` to start it in the background.
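
A rough sketch of how the two config sources could be resolved and reported follows. `PBS_REPOSITORY`, `PBS_PASSWORD` and `PBS_FINGERPRINT` are the variable names proxmox-backup-client itself reads; everything else here is an assumption for illustration, including the `KEY=VALUE` file format, the precedence of environment over file, and the `parse_conf` and `resolve_pbs_config` helper names.

```python
# Environment variable names understood by proxmox-backup-client.
PBS_KEYS = ("PBS_REPOSITORY", "PBS_PASSWORD", "PBS_FINGERPRINT")


def parse_conf(text):
    """Parse KEY=VALUE lines, skipping blanks and '#' comments (assumed format)."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        conf[key.strip()] = value.strip().strip('"')
    return conf


def resolve_pbs_config(environ, conf_text):
    """Return (settings, source). The environment wins over the file (assumed)."""
    from_env = {k: environ[k] for k in PBS_KEYS if k in environ}
    if "PBS_REPOSITORY" in from_env:
        return from_env, "environment"
    from_file = {k: v for k, v in parse_conf(conf_text).items() if k in PBS_KEYS}
    if "PBS_REPOSITORY" in from_file:
        return from_file, "/etc/zpbs-backup/pbs.conf"
    raise LookupError("no PBS repository configured")
```

In real use the first argument would be `os.environ` and the second the contents of the config file. Returning the source alongside the settings is what lets a connectivity check say which one it used.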
This tool solves a long overdue problem for me. I hope you find it useful as well.
 
So timely, thank you! I was using sanoid/syncoid, but was interested in setting up PBS.
Glad it's working for you.

A little preview: For Sanoid / Samba users, I started another project as a modern Sanoid replacement. It's still rough and not feature complete.

- policies live on the dataset (like zpbs-backup)
- independent - no reliance on cron or similar system services
- single binary, authored in Go
- (planned) Samba integration for near-native Windows Previous Versions compatibility

https://github.com/ndemarco/zvolta
 
This is incredible, I'm looking forward to trying it out. While change detection is valuable, I wonder if there's a way to trigger it selectively, based on the creation of a new snapshot, for example?

That is, I have TrueNAS taking snapshots only when the contents of the dataset have changed. It would be nice if I could run a backup (with change detection leveraged by PBC) only on the datasets where I already know, because of the new snapshot, that there is new data since the last snapshot/backup.

For reference I have hundreds of per-project datasets and tens of millions of files, and most of the projects are dormant on any given day/week. So saving the compute time would be a huge efficiency gain.
 
Would checking the ZFS `written` property work?

`zpbs-backup` queries `zfs get written <dataset>` during auto-discovery.
If `written` is 0, no data has changed since the last snapshot, and zpbs-backup skips calling PBC.

It's stateless, relying purely on native ZFS tracking. This is in line with the utility's overall approach.

Please check this ZFS property on two datasets that differ in post-snapshot writes:

`zfs get written tank/project-alpha`
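
If you want to compare many datasets at once, something along these lines would list only those with post-snapshot writes. It is a sketch that assumes the input comes from `zfs get -Hp -o name,value written <datasets...>`, where `-p` prints exact byte counts; the `changed_datasets` helper, the sample values and the `tank/project-beta` name are made up for illustration.

```python
def changed_datasets(zfs_get_output):
    """Return dataset names whose `written` byte count is non-zero.

    Expects tab-separated name/value lines, the shape printed by
    `zfs get -Hp -o name,value written ...`.
    """
    changed = []
    for line in zfs_get_output.splitlines():
        if not line.strip():
            continue
        name, value = line.split("\t")
        if int(value) > 0:
            changed.append(name)
    return changed


# Made-up sample: one dormant project, one with writes since its last snapshot.
sample = "tank/project-alpha\t0\ntank/project-beta\t1835008\n"
print(changed_datasets(sample))
```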
 
Updated to v0.8.0 — heads up: includes default behavior changes

This release is a breaking change: behavior on a fresh run is different from v0.7.x, even though nothing about the CLI or config changed. If you upgrade and see datasets being skipped where they previously ran, that's the new default at work.

`zpbs-backup` now consults two native ZFS properties before backing up each due dataset: `written` (bytes written since the most recent snapshot) and the latest snapshot's creation time. If `written == 0` and that snapshot is older than the last successful backup (within a 60s safety margin), the dataset is skipped entirely: no PBS round-trip, no chunk walk, nothing.

Previously, with hundreds of datasets and most of them dormant on any given day, every run did a full PBS handshake per dataset just to confirm nothing had changed. Now the tool inspects ZFS properties directly, then trusts the answer.
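
The skip rule can be sketched as a small pure function. This is an illustration only: the `should_skip` name, the epoch-seconds inputs and the reading of "within a 60s safety margin" as "the snapshot must be at least 60s older than the last successful backup" are assumptions, not the tool's actual code.

```python
SAFETY_MARGIN_S = 60  # margin stated in the release notes


def should_skip(written_bytes, latest_snapshot_epoch, last_backup_epoch,
                margin=SAFETY_MARGIN_S):
    """Return True only when ZFS state shows nothing new since the last backup."""
    if written_bytes != 0:
        return False  # data written since the latest snapshot
    if latest_snapshot_epoch is None or last_backup_epoch is None:
        return False  # no snapshot or no prior backup: always back up
    # The snapshot must predate the last successful backup by at least the margin.
    return latest_snapshot_epoch + margin <= last_backup_epoch
```

Every uncertain case falls through to "back up", so the filter can only save work and should never be the reason a backup is missed.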

There's a pre-flight clock skew check against PBS (one HTTPS HEAD request that reads the `Date:` header). If the two hosts disagree by more than 60s, the tool disables the skip filter for that run and logs a loud warning, so a misconfigured NTP can't cause a silent missed backup.
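
For the curious, the comparison itself is simple. The sketch below covers only the arithmetic on an already-fetched `Date:` header value and leaves out the HTTPS request; the `clock_skew_seconds` and `skip_filter_allowed` names are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

MAX_SKEW_S = 60  # threshold stated in the release notes


def clock_skew_seconds(date_header, local_now):
    """Absolute difference between the server's Date header and local time."""
    remote = parsedate_to_datetime(date_header)
    return abs((local_now - remote).total_seconds())


def skip_filter_allowed(date_header, local_now, max_skew=MAX_SKEW_S):
    """Keep the skip filter only when both clocks agree within max_skew seconds."""
    return clock_skew_seconds(date_header, local_now) <= max_skew


header = "Tue, 03 Mar 2026 02:00:00 GMT"
now = datetime(2026, 3, 3, 2, 0, 30, tzinfo=timezone.utc)
print(clock_skew_seconds(header, now), skip_filter_allowed(header, now))
```

Passing `local_now` in explicitly, instead of calling `datetime.now()` inside the function, keeps the check easy to test.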

If you want the old behavior back, `--force` bypasses both checks. There's no flag to disable them permanently.

Tag: v0.8.0 · https://github.com/ndemarco/zpbs-backup
 
Dumb question: I have my ZFS data pool on my Proxmox VE host, and this tool looks like exactly what I am looking for. I had been rsyncing to an older Unraid server with an XFS array. If I were to get rid of Unraid and put PBS on that server, would it then need to be ZFS? I ask because I have a mishmash of drives on Unraid and would rather not run ZFS on it, since it's only backed up to nightly.
 
I'm pretty sure PBS only backs up to ZFS datastores. Edit: PBS is tolerant of any "normal" filesystem (per UdoB below).

But I don't understand your concern. ZFS is, I think, the best way to make leftover drives into one or more pools, even without redundancy.

PBS includes dedup chunking and other cool benefits. Once you define your retention strategy, it's hands off.
 
Thanks. My only concern was that I have over 50TB of ZFS pool data on my PVE host. On my backup server, I don't know if I have enough drives to build a ZFS pool that size, which is why I originally went with Unraid (it can use different drive sizes). I'll see if I can get a smaller ZFS pool going, because this tool looks like exactly what I wanted.
 