Actual ZFS send/receive backups?

So the first Google result already got me what I wanted: https://blog.guillaumematheron.fr/2023/261/taking-advantage-of-zfs-for-smarter-proxmox-backups/

The key part there is "the full volume must still be read from disk for each backup". Now that's quite funny with NVMes and a 20 Gbps migration network when the dataset is almost 1 TB.

I did try zfs send and receive (with mbuffer) and it worked like a charm. But given how ZFS-focused PVE is (clearly favouring it, with features supported for ZFS that are not for BTRFS), why can't one simply back up using zfs send/receive in a streamlined way?
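
For reference, the pipeline I used looked roughly like this (pool, dataset, host and port are made-up examples; the plain-TCP mbuffer transport is unencrypted, which is fine on a dedicated migration network):

# on the receiving host, start a listener first
mbuffer -s 128k -m 1G -I 9090 | zfs receive -F backup/vm-100-disk-0
# on the sending host, snapshot and stream
zfs snapshot tank/vm-100-disk-0@backup1
zfs send tank/vm-100-disk-0@backup1 | mbuffer -s 128k -m 1G -O backup-host:9090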
 
The reason is fairly simple: ZFS is favored over BTRFS, because BTRFS is still considered a technology preview. BTRFS also still has a couple of showstopper issues, which is why we don't support it at the same level (see e.g., [1]). However, backups are a fundamental feature, and they need to work consistently across file systems that are supported. ZFS send/receive does not work for XFS or ext4 and so on. Also, a backup solution around ZFS send/receive comes with its own problems that we'd need to resolve and maintain. So at the moment we don't think this is the ideal approach.

[1]: https://wiki.debian.org/Btrfs#Other_Warnings
 

While I am thankful for the lightning-quick answer, with the directness I value, I was really assuming PVE wants to be tightly integrated with ZFS. It is the only other choice out of the box on the ISO install and it actually does end up with root on ZFS, which even Ubuntu tried, then abandoned (zsys is now dropped from the installer, I think). LVM feels anachronistic (do not kill me, other contributors) compared to what ZFS can provide, and I took for granted that PVE was heading in that direction.

With that said, I would love to see BTRFS supported, and I am not sure about the tech preview label; it feels like the aftermath of the publicity fiasco from some years back, with Red Hat then dropping it completely while Fedora stubbornly kept it (also quoted in your link). The issues linked there are all from before that period, if I did not miss anything. But basically I got answers to two questions: one that I had asked, and another that was on the back of my mind but that I did not expect much of an answer on (because I thought PVE was ZFS-focused).
 
The issues linked there are all from before that period if I did not miss anything.
Still no production-ready RAID5 with btrfs ...

why can't one simply backup using the zfs send/receive in a streamlined way?
You can, but that has nothing to do with PVE itself; it's just a one-liner. We have used it in production for years and it works great. Retention is done separately, locally and on the receiving end.
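
A minimal sketch of such a one-liner, assuming a common snapshot @prev already exists on both sides from an earlier run (pool, dataset and host names are examples):

# only blocks changed since @prev are read and transferred
zfs snapshot tank/vm-100-disk-0@today
zfs send -i @prev tank/vm-100-disk-0@today | ssh backup-host zfs receive backup/vm-100-disk-0
# retention is then a separate zfs destroy of old snapshots on each end
zfs destroy tank/vm-100-disk-0@prev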
 

That's public knowledge, but I was under the impression that friends do not let friends use RAID5. I am risking starting something here, so I will just add that it takes only one instance of seeing a drive fail in RAID5, and then another fail during the rebuild due to the extra stress suddenly put on it, to reach that conclusion. Of course, opinions may vary.

You can, but that has nothing to do with PVE itself; it's just a one-liner. We have used it in production for years and it works great. Retention is done separately, locally and on the receiving end.

This is a PVE forum; everything PVE does can be accomplished with qemu, lxd, corosync, pacemaker, cron jobs, etc. There's a fine line between value being added or taken away by a GUI that is opinionated to the point that, on the one hand, it lets the user deploy ZFS out of the box, but on the other shadows its best features with half-baked solutions in the interest of, in this case, a simpler code base.

To avoid any flame thread on this: I am not saying they have to provide it, but I think an average user will ABSOLUTELY assume he is getting ZFS (when chosen at ISO install time) with all that it has to offer. Why else support it as an install option? It is a confusing user experience.
 
That's public knowledge, but I was under the impression that friends do not let friends use RAID5. I am risking starting something here, so I will just add that it takes only one instance of seeing a drive fail in RAID5, and then another fail during the rebuild due to the extra stress suddenly put on it, to reach that conclusion. Of course, opinions may vary.
I don't think that this is still the case. A ZFS scrub will read just the same amount of data, so there is no "sudden stress", only regular stress.

To avoid any flame thread on this: I am not saying they have to provide it, but I think an average user will ABSOLUTELY assume he is getting ZFS (when chosen at ISO install time) with all that it has to offer. Why else support it as an install option? It is a confusing user experience.
There is also no dump-based backup for ext4 or rbd export for Ceph, so I don't see why this should be a problem for ZFS.
 
I don't think that this is still the case. A ZFS scrub will read just the same amount of data, so there is no "sudden stress", only regular stress.

For me, with ZFS, everything other than (striped) mirrors has always carried a performance penalty, though that observation might no longer apply to NVMes. Somehow I dropped RAID5 from my toolset long ago; there was also the issue of the likelihood of an unrecoverable read error during a rebuild, but then one should have backups.

There is also no dump-based backup for ext4 or rbd export for Ceph, so I don't see why this should be a problem for ZFS.

Maybe it's just me, having been with PVE for two weeks total, but I totally assumed ZFS was a special case for PVE ever since I saw it in the install options. Even though it made no sense to me to have root on ZFS, and I would rather run the system disk separately, they went all in; that is why I had to create swap on a ZVOL in that case. And I did not see much rationale in it, as OpenZFS goes on to create a GPT table anyway and the EFI partition needs to be separate FAT. In that case I would rather go ext4/XFS for root and then make the rest all ZFS, but PVE made a full ZFS install. Anyhow, my assumptions were wrong, and I don't think I would be a rarity.
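
For reference, the swap zvol I ended up with looked roughly like this (the size and pool name are examples; the property set is the one the OpenZFS documentation suggests for swap):

# a zvol with volblocksize equal to the page size, no ARC caching, always sync
zfs create -V 8G -b $(getconf PAGESIZE) \
  -o compression=zle -o logbias=throughput -o sync=always \
  -o primarycache=metadata -o secondarycache=none rpool/swap
mkswap /dev/zvol/rpool/swap
swapon /dev/zvol/rpool/swap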
 
Maybe it's just me, having been with PVE for two weeks total, but I totally assumed ZFS was a special case for PVE ever since I saw it in the install options. Even though it made no sense to me to have root on ZFS, and I would rather run the system disk separately, they went all in; that is why I had to create swap on a ZVOL in that case. And I did not see much rationale in it, as OpenZFS goes on to create a GPT table anyway and the EFI partition needs to be separate FAT. In that case I would rather go ext4/XFS for root and then make the rest all ZFS, but PVE made a full ZFS install. Anyhow, my assumptions were wrong, and I don't think I would be a rarity.
Why would you want to have root externally? For me, the big advantage is exactly the ZFS part, e.g. snapshots before updates, smaller files (due to compression), a lot of space (from the pool), but still have quota and refreservation. Swap is another matter; it yielded a lot of crashes in the past when it was on a zvol and the system was low on memory. I switched to zram years ago as primary swap and put only secondary swap on the Optane drives where the SLOG lives.
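
A sketch of the zram part, assuming a manual setup (sizes, the compression algorithm and the disk swap partition are examples; on Debian, zram-tools wraps the same steps):

# create a compressed swap device in RAM and prefer it over disk swap
modprobe zram
echo zstd > /sys/block/zram0/comp_algorithm
echo 4G > /sys/block/zram0/disksize
mkswap /dev/zram0
swapon -p 100 /dev/zram0
# the on-disk swap gets a lower priority, so it is only used as overflow
swapon -p 10 /dev/nvme0n1p3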
 
Why would you want to have root externally?
Partially PTSD from how Ubuntu decided to support it in their first ZFS-on-root experience, combined with running it on top of LUKS, etc. The zsys auto-snapshots, with no pruning, no nothing, would easily fill up everything just before the initramfs was refreshed and the bootloader entry updated. Then there is the general pain in the neck of looking for a live boot of something that has ZFS support (the correct version, at that), importing a previously unexported pool or faking the device id, etc., etc.

But most importantly, a preference to keep things simple where complexity is counterproductive, because ...

For me, the big advantage is exactly the ZFS part, e.g. snapshots before updates,

For a hypervisor specifically, I really do not see value in being able to roll back anything, not even after a botched apt upgrade. It is much simpler/safer to rsync the last known good state back and overwrite the bootloader; that state can even sit in an extra copy on a spare partition, ready to go. Also, the install is very much identical across nodes, and should remain so after upgrades, so for consistency just keep the base plus the node-specific configs, to be able to recycle any node at a moment's notice (when a total hardware failure strikes). I have yet to find my inner zen with how PVE likes to do things with pmxcfs and symlinks from everywhere to everywhere.
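
A sketch of what I mean, with illustrative paths and device names (bash brace expansion for the excludes):

# keep a copy of the known-good root on a spare partition
rsync -aAXH --delete \
  --exclude={"/dev/*","/proc/*","/sys/*","/run/*","/tmp/*"} \
  / /mnt/spare-root/
# after a botched upgrade: boot anything, copy it back, restore the bootloader
rsync -aAXH --delete /mnt/spare-root/ /mnt/root/
grub-install --boot-directory=/mnt/root/boot /dev/sda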

smaller files (due to compression),

Strictly talking about root, this is not important in the grand scheme of things.

a lot of space (from the pool),

Yep, and I still have that; if need be, the ZFS pool is all there. But actually needing it would be a red flag, as again I would like to keep the hypervisor tiny (if a Debian install can still be called that).

but still have quota and refreservation.

Yes, but e.g. 16G at the beginning of the drive allows for anything: the disk can be taken out of the node and used with a different hypervisor or a regular OS install without having to migrate even a large pool back and forth. And if the quotas on the pool were set all wrong, there is no impact on the hypervisor. There is also contingency space there. No point even for LVM; the 16G can be repartitioned, copied out and back in, anytime. Those in favour of LUKS can have a separate boot partition feeding the passphrase from e.g. the network, though nowadays one would probably use SED drives, saving the CPU cycles with SSDs.

Swap is another matter; it yielded a lot of crashes in the past when it was on a zvol and the system was low on memory. I switched to zram years ago as primary swap and put only secondary swap on the Optane drives where the SLOG lives.

This is a good tip. I would normally not have the luxury of anything SLOG-worthy, but L2ARC worked beautifully even with HDDs and a fast NVMe. I started with ZFS very long ago, for data storage only, for which it is still my favourite; then, with LXD, also quite some time ago, it was very convenient as the container storage pool, though in turn not as nice as BTRFS in how it refers to a parent dataset, but that's another topic. So basically I somehow prefer to use ZFS for what it is best at and leave the simple things simple wherever possible.
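
For the record, adding the L2ARC was a one-liner (pool and device names are examples):

# attach a fast NVMe partition as cache device to an HDD pool
zpool add tank cache /dev/disk/by-id/nvme-example-part4
# it then shows up under "cache" in the pool layout
zpool iostat -v tank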

That all said, I am still figuring out how best to PXE boot the whole hypervisor and keep it in RAM. It's very strange to imagine installing 50 nodes (I read that somewhere mentioned as a realistic deployment size) from ISOs, fully attended, when I already ran into a problem with a couple of nodes because of how the SSH keys are (not) cleaned up after a dead node. This would all be a non-issue, even with a bad implementation, if one had an ephemeral take on the hypervisor upon every boot.

EDIT: For other systems, i.e. not hypervisors, the ZFS snapshots on root felt good, but then there's also the ostree way of doing things. It's really to each their own; happy to learn new things, pick up what suits, leave the rest behind, and let everyone else do the same for themselves...
 
It is the only other choice out of the box on the ISO install and it actually does end up with root on ZFS, which even Ubuntu tried, then abandoned (zsys is now dropped from the installer, I think).
OK, I realised today I was wrong: one can equally choose a BTRFS install there as well. Not sure why I have not seen it before.
 
The reason is fairly simple: ZFS is favored over BTRFS, because BTRFS is still considered a technology preview. BTRFS also still has a couple of showstopper issues, which is why we don't support it at the same level (see e.g., [1]). However, backups are a fundamental feature, and they need to work consistently across file systems that are supported. ZFS send/receive does not work for XFS or ext4 and so on. Also, a backup solution around ZFS send/receive comes with its own problems that we'd need to resolve and maintain. So at the moment we don't think this is the ideal approach.

[1]: https://wiki.debian.org/Btrfs#Other_Warnings
Concur.

"Also, a backup solution around ZFS send/receive comes with its own problems that we'd need to resolve and maintain."

Q: If your environment is 100% Proxmox with only ZFS in use, and you wish to perform only a ZFS send/receive from one pool to another host remotely, or even onto a separate pool on the same host, would it be possible to enable this in the GUI, perhaps under some "Advanced" ZFS-only option? On the CLI, we all agree it'd be dead simple, i.e.:

# zfs snap data/vm-114-disk-0@pve7-8
# zfs send data/vm-114-disk-0@pve7-8 | zfs recv archive/vm-114-disk-0@pve7-8
# zfs send data/vm-114-disk-0@pve7-8 | ssh Proxmox-host2 zfs recv archive/vm-114-disk-0@pve7-8
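
And, if I am not mistaken, subsequent backups would then only send the blocks changed since the previous snapshot (the @pve7-9 name simply continues the example):

# zfs snap data/vm-114-disk-0@pve7-9
# zfs send -i data/vm-114-disk-0@pve7-8 data/vm-114-disk-0@pve7-9 | ssh Proxmox-host2 zfs recv archive/vm-114-disk-0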

A point to raise: the language used in the GUI around backups gave me a different impression. Since I only use ZFS, on reading the backup Mode option "snapshot" my first instinct was that this is how it would be done, so I was in for a minor surprise seeing a full vzdump being performed.
Maybe that could be made a bit clearer by some means.
 
