REPLICATION is ZFS only? Why? And what is PVE-zsync?

esi_y

Nov 29, 2023
When I look at pvesr [1] source [2] it uses PVE::API2::Replication [3] which gets to use PVE::Replication [4] which in the end just calls PVE::Storage::storage_migrate [5] which essentially does pvesm import [6] on the other end and feeds it with pvesm export [7].

Unless I looked wrong, what's the problem supporting replication on any filesystem that has snapshots? There's no zfs send|receive at play.

Also, what is the point of pve-zsync (alongside the said replication API)? I read the docs [8] but could not quite get the rationale for both co-existing.

[1] https://pve.proxmox.com/pve-docs/pvesr.1.html
[2] https://github.com/proxmox/pve-manager/blob/master/PVE/CLI/pvesr.pm
[3] https://github.com/proxmox/pve-mana...f0590ddeb58fab1ad/PVE/API2/Replication.pm#L39
[4] https://github.com/proxmox/pve-gues...9b3172296c38058d0/src/PVE/Replication.pm#L220
[5] https://github.com/proxmox/pve-stor...3e84d0a18f95de0322168/src/PVE/Storage.pm#L702
[6] https://github.com/proxmox/pve-stor...3e84d0a18f95de0322168/src/PVE/Storage.pm#L820
[7] https://github.com/proxmox/pve-stor...3e84d0a18f95de0322168/src/PVE/Storage.pm#L743
[8] https://pve.proxmox.com/wiki/PVE-zsync#PVE_Storage_Replication_and_PVE-zsync
 
pvesm import | export do in fact use zfs send | recv [1] [2]

Unless I looked wrong, what's the problem supporting replication on any filesystem that has snapshots?
In theory, there is nothing that would prevent supporting replication for other storage types that support similar commands. It simply has not been implemented for other storage types (yet).

[1] https://git.proxmox.com/?p=pve-stor...a2864a2368bf3c3ba436ddca8047a4bf;hb=HEAD#l474
[2] https://git.proxmox.com/?p=pve-stor...2b62df7ebeda572cdf56ef19be872d56;hb=HEAD#l777
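
For anyone following along, here is a minimal Python sketch of the kind of pipeline a snapshot-based export/import boils down to on ZFS. It only builds the command line and runs nothing. The function name, dataset, snapshot names, and target host are made up for illustration, and the exact flags PVE passes may differ.

```python
import shlex
from typing import Optional


def zfs_replication_pipeline(
    dataset: str,
    snapshot: str,
    target_host: str,
    base_snapshot: Optional[str] = None,
) -> str:
    """Build (but do not run) a `zfs send | ssh ... zfs recv` pipeline."""
    send = ["zfs", "send"]
    if base_snapshot is not None:
        # Incremental stream: only the changes made since base_snapshot.
        send += ["-i", f"{dataset}@{base_snapshot}"]
    send.append(f"{dataset}@{snapshot}")

    # -F lets the receiving side roll back to its latest snapshot first.
    recv = ["zfs", "recv", "-F", dataset]
    remote = ["ssh", target_host, shlex.join(recv)]

    return f"{shlex.join(send)} | {shlex.join(remote)}"


# Hypothetical names, for illustration only.
print(zfs_replication_pipeline("rpool/data/vm-100-disk-0", "rep_2",
                               "root@node2", base_snapshot="rep_1"))
```

The first run would omit base_snapshot and send the full dataset; later runs only send the delta between two snapshots, which is what makes frequent replication cheap.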
 
Except that the volume_import [1] and volume_export [2] functions do use zfs send/receive.

If you look at the other storage plugins, for example LVM Thin (which does support snapshots), there is no volume_export function. As for BTRFS, which would support something similar, it is still considered a technology preview.

Also, pve-zsync, that's a utility that does some things differently from replication. For example, you can sync snapshots to a storage outside your cluster, and its main purpose is closer to backing up data than replication, which itself is useful for HA purposes. It also can't handle migrating VMs properly for example.

Also note that such functionality doesn't really make sense for distributed storage. RBD supports snapshots, but replicating within the same cluster wouldn't really make sense there. Ceph has RBD Mirroring [3], though, if you need to mirror an RBD image across several Ceph clusters.

[1]: https://git.proxmox.com/?p=pve-stor...PVE/Storage/ZFSPoolPlugin.pm;h=3669fe152#l777
[2]: https://git.proxmox.com/?p=pve-stor...PVE/Storage/ZFSPoolPlugin.pm;h=3669fe152#l736
[3]: https://pve.proxmox.com/wiki/Ceph_RBD_Mirroring
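
The difference between "has snapshots" and "can be replicated" can be sketched roughly like this. It is a toy Python model, not the actual Perl plugin API: the class names, the method signature, and the supports_replication helper are all hypothetical.

```python
class StoragePlugin:
    """Hypothetical base class: no export support unless a plugin adds it."""

    name = "base"

    def volume_export(self, volume: str, snapshot: str) -> bytes:
        raise NotImplementedError(f"{self.name}: volume_export not implemented")


class ZfsPlugin(StoragePlugin):
    name = "zfspool"

    def volume_export(self, volume: str, snapshot: str) -> bytes:
        # Stand-in for a real `zfs send` stream.
        return f"stream:{volume}@{snapshot}".encode()


class LvmThinPlugin(StoragePlugin):
    # Supports snapshots in practice, but does not override volume_export.
    name = "lvmthin"


def supports_replication(plugin: StoragePlugin) -> bool:
    """True only if the plugin class overrides the base volume_export."""
    return type(plugin).volume_export is not StoragePlugin.volume_export


print(supports_replication(ZfsPlugin()))      # True
print(supports_replication(LvmThinPlugin()))  # False
```

In other words, snapshot support alone is not enough; the storage type also needs a way to serialize a snapshot (or the delta between two) into a stream that the other side can import.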
 
pvesm import | export do in fact use zfs send | recv [1] [2]

Sloppy me! I did not look there (anymore), wrongly assuming it was doing it its own way. There was another thread I was following yesterday where the OP was manually using zfs send|receive across nodes but complained he could not start the resulting VM. I guess that's down to the dataset/snapshot naming scheme then.

In theory, there is nothing that would prevent supporting replication for other storage types that support similar commands. It simply has not been implemented for other storage types (yet).

Thanks!
 
Except that the volume_import [1] and volume_export [2] functions do use zfs send/receive.

If you look at the other storage plugins, for example LVM Thin (which does support snapshots), there is no volume_export function. As for BTRFS, which would support something similar, it is still considered a technology preview.

You got me, I asked because I was eyeing adding it for BTRFS! :)

Also, pve-zsync, that's a utility that does some things differently from replication. For example, you can sync snapshots to a storage outside your cluster,

I figured, unless it migrated away. It feels like pve-zsync was a preview of what then became Replication, but not a complete replacement.

and its main purpose is closer to backing up data than replication, which itself is useful for HA purposes. It also can't handle migrating VMs properly for example.

This is something I would like to know actually, why. I mean I can go read what it's doing but since you already replied ... :D

Also note that such functionality doesn't really make sense for distributed storage. RBD supports snapshots, but replicating within the same cluster wouldn't really make sense there. Ceph has RBD Mirroring [3], though, if you need to mirror an RBD image across several Ceph clusters.

No worries, just for small clusters replication is much better than even eyeing CEPH.
 
This is something I would like to know actually, why. I mean I can go read what it's doing but since you already replied ... :D
Well the short answer: Historical reasons, probably.

The long answer: Replication was introduced with a no longer supported storage backend called DRBD (not to be confused with RBD) [1]. pve-zsync was first released as a tech-preview for PVE 3.4, which was also the first release to support ZFS [2]. It was, and is, basically a standalone package that can handle syncing ZFS snapshots; in a way, it's just a Perl script [3] that wraps zfs send|recv, ssh, and cron, and not much else.

As it was intended to be separate from Proxmox VE, it couldn't deeply integrate with it and, hence, it wasn't suitable for HA purposes. As an aside, note that replication still isn't ideal, as you can lose data in every replication scenario: anything written since the last completed replication run is missing on the target. Proxmox VE 4.4 removed DRBD support and replaced it with an external plugin provided by Linbit [4]. Finally, with Proxmox VE 5 we get to ZFS replication [5], which more deeply integrates with Proxmox VE, and, thus, could take the place of previous replication implementations.

Hope that makes sense.

No worries, just for small clusters replication is much better than even eyeing CEPH.
Yeah, in a way replication is what you do when you can't use a shared storage.

[1]: https://pve.proxmox.com/wiki/Roadmap#Proxmox_VE_1.4_beta1
[2]: https://pve.proxmox.com/wiki/Roadmap#Proxmox_VE_3.4
[3]: https://git.proxmox.com/?p=pve-zsync.git;a=blob;f=pve-zsync;h=98190b208
[4]: https://pve.proxmox.com/wiki/Roadmap#Proxmox_VE_4.4
[5]: https://pve.proxmox.com/wiki/Roadmap#Proxmox_VE_5.0
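
As a rough illustration of what such a wrapper has to do besides sending snapshots: it mostly needs to rotate old ones. Below is a minimal Python sketch of such a rotation, assuming snapshot names end in a sortable timestamp. The snapshot names and the plan_rotation helper are hypothetical, and this is not pve-zsync's own code.

```python
from typing import List, Tuple


def plan_rotation(snapshots: List[str], max_snap: int) -> Tuple[List[str], List[str]]:
    """Split snapshot names into (keep, destroy), keeping the newest max_snap."""
    if max_snap < 1:
        raise ValueError("max_snap must be at least 1")
    # Assumption: names share a prefix and end in an ISO-like timestamp,
    # so a plain lexicographic sort is also a chronological sort.
    ordered = sorted(snapshots)
    return ordered[-max_snap:], ordered[:-max_snap]


snaps = [
    "rep_default_2024-01-03_00:00:00",
    "rep_default_2024-01-01_00:00:00",
    "rep_default_2024-01-02_00:00:00",
]
keep, destroy = plan_rotation(snaps, max_snap=2)
print(keep)     # the two newest snapshots
print(destroy)  # the oldest snapshot
```

The newest kept snapshot doubles as the base for the next incremental send, which is why at least one snapshot always has to survive on both sides.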
 
While having a PVE integrated tool such as pve-zsync is nice, there is nothing stopping you from using the other tools at hand to do it with any other snapshot capable source (namely tar and ssh).

I believe most people that ask these questions are after having it neatly in the GUI and most importantly automated (e.g. during HA events). Over time I figured there's not much effort going to be spent on anything replication-related [1]; this was probably decided ever since CEPH was introduced. Simply replicating across many nodes does not scale well.

On the other hand, ZFS is somehow given special treatment by PVE (e.g. installer) but then it's not really that much of a better choice after all. So people want to use something other than an anachronistic filesystem (and some even care about the controversial license) that was never meant for NVMe, etc. And then they discover that some features depend on it. On the other hand, other features that were meant to be filesystem agnostic (e.g. PBS backups) make it a complete waste to have e.g. ZFS snapshots readily available because it goes on to deduplicate on its own every single time [2] - and this would be true for all filesystems that support snapshots natively.

So at least communication-wise, it is a fail.

[1] https://forum.proxmox.com/threads/what-is-wrong-with-high-availability.139056/#post-620923
[2] https://blog.guillaumematheron.fr/2023/261/taking-advantage-of-zfs-for-smarter-proxmox-backups/
 
I believe most people that ask these questions are after having it neatly in the GUI and most importantly automated (e.g. during HA events)
That is a possibility. It's also possible that people ask because they have a task they wish to perform. Not all tasks are going to have tools provided by a particular mechanism; it's up to the person asking to make the determination of what to do with this information. They can either throw up their hands and complain that they want it THIS way, or not.

Over time I figured there's not much effort going to be spent on anything replication-related [1]; this was probably decided ever since CEPH was introduced. Simply replicating across many nodes does not scale well.
Too true. If Proxmox didn't cater to the home/hobby user, it's likely they wouldn't have bothered with this feature in the first place.

On the other hand, ZFS is somehow given special treatment by PVE
Not "somehow": ZFS is designed to deal with these kinds of functions based on its inherent CoW architecture and built-in toolset (e.g. send/receive). Writing a wrapper around that toolset is simple enough, making it low-hanging fruit for the devs; it's also the reason ZFS is the go-to filesystem for non-clustered PVE deployments.

So at least communication-wise, it is a fail.
not really sure what you're trying to say. communication between who/what?
 
On the other hand, other features that were meant to be filesystem agnostic (e.g. PBS backups) make it a complete waste to have e.g. ZFS snapshots readily available because it goes on to deduplicate on its own every single time [2] - and this would be true for all filesystems that support snapshots natively.
PBS doesn't use snapshots. The price you pay for this is the need for delta caching of I/O that happens after the backup has started; this can and does cause severe performance issues on a busy system, and you'd need backup fleecing to mitigate it, which is only supported on specific target filesystems, which themselves have to support snapshots.

And if it isn't clear: snapshots are not backups, regardless of whether PBS dedups after the fact or not. Snapshots serve their own purposes, notwithstanding PBS.

Ironically, most backup solutions DO use snapshots, as doing so effectively removes the delta impact mentioned above.
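
To illustrate the trade-off being discussed, here is a toy Python sketch of content-addressed chunk deduplication in general. It is not PBS's implementation: the 4-byte chunk size, the in-memory dict used as a chunk store, and the backup function are assumptions chosen to keep the example small. In this toy version, every chunk is hashed on every run, even though only changed chunks end up being stored.

```python
import hashlib
from typing import Dict, Tuple

CHUNK_SIZE = 4  # toy value; real systems use far larger chunks


def backup(image: bytes, store: Dict[str, bytes],
           chunk_size: int = CHUNK_SIZE) -> Tuple[int, int]:
    """Deduplicate image into store; return (chunks_hashed, chunks_written)."""
    hashed = written = 0
    for offset in range(0, len(image), chunk_size):
        chunk = image[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        hashed += 1
        if digest not in store:
            # Only content not seen before takes up space in the store.
            store[digest] = chunk
            written += 1
    return hashed, written


store: Dict[str, bytes] = {}
print(backup(b"AAAABBBBCCCC", store))  # (3, 3): first run stores everything
print(backup(b"AAAAXXXXCCCC", store))  # (3, 1): all hashed, one chunk stored
```

A snapshot-aware approach would instead ask the filesystem which blocks differ between two snapshots and skip the rest, which is the behaviour the thread starter was expecting.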
 
not really sure what you're trying to say. communication between who/what?

Looking back to when I first started looking at PVE (and brought up these questions, I was the OP in this thread), I remember seeing ZFS in the ISO installer and thinking to myself "this is it, great, it must be taking advantage of it wherever possible". And then, not really (e.g. the PBS backups) even though it could. I did like ZFS, but I do not like the mixed bag one gets and it's not clearly communicated. Getting ZFS install through PVE ISO gets you no benefits. It puts ZFS on root, which makes it harder to live boot with e.g. regular Debian (PVE has no live boot image), it uses the ESP to shove the initramfs images there as if it was using systemd-boot even when it does not, and it also creates a pool for VM zvols, which I am better off creating myself after a regular (non-ZFS) install.

There's no benefit to a disaster recovery situation, there's no benefit to PBS (very counterintuitive) and then there's the replication topic (arguably relevant to SOHO users only). So what I meant to say ... marketing-wise, judging by how often ZFS is discussed here, I would not have understood that I do not really get all the benefits, but have to use it to e.g. get replication. And on the other end, when I can install BTRFS (I have not checked the ISO since v8 but I think it is there as the only other option besides LVM and ZFS), I would expect at least the same feature set as with ZFS since it's, well, COW as well.

Finally, I understand it's not possible to cater for everything; when I asked a question on the forum regarding f2fs or zoned storage, there was no response whatsoever. So what I was trying to say: when something is pushed forward marketing-wise (such as ZFS), it should support it all; if BTRFS is put alongside it, it should support it all. And if it's filesystem agnostic, everything should be. As of today, it's inconsistent (probably due to history).
 
snapshots are not backups

I have seen this response on this forum too many times - it is wrong to put it here when that's not the topic. The topic is: I need that snapshot to start backing up a live system from, and because I was doing it the same way last time, of course I just want to back up the deltas between two snapshots.
 
I did like ZFS, but I do not like the mixed bag one gets and it's not clearly communicated.
I see. So, in your view, The Proxmox PVE product is deficient because they (the developers) did not actively educate you? since they do a relatively good job documenting, what would you have liked them to do differently/additionally? not criticizing, genuinely curious. More to the point, EVERY file system is a mixed bag; That's the reason there are options exposed. I'm not really sure what you're after.

Getting ZFS install through PVE ISO gets you no benefits.
uhh.... it is BY FAR the best option for installation for systems without a raid controller. Since most PVE deployments use an HBA for zfs/ceph purposes, that leaves a zfs root mirror as the obvious choice for root installation. I would avoid blanket statements that present your opinion as if it's a universal fact.

There's no benefit to a disaster recovery situation
You're doing it wrong ;) didn't you say you make backups?

So what I meant to say ... marketing-wise, judging by how often ZFS is discussed here, I would not have understood that I do not really get all the benefits
I guess it goes back to the initial point above. whose responsibility is it to educate you?

There's no benefit to a disaster recovery situation, there's no benefit to PBS (very counterintuitive)
I have seen this response on this forum too many times - it is wrong to put it here when that's not the topic
I hope you can see how I'm confused by your statement. You brought up PBS. I merely mentioned that there is no connection between your choice of filesystem (and its features) to your backup choice. Yes, I'd prefer Veeam's approach (which DOES use snapshots) but PBS doesn't. The devs have their reasons, probably so they can support every source type.
 
I know I am free to go at lengths into anything as this was all my own thread here, but ... :)

I see. So, in your view, The Proxmox PVE product is deficient because they (the developers) did not actively educate you? since they do a relatively good job documenting, what would you have liked them to do differently/additionally? not criticizing, genuinely curious.

There seems to be no marketing strategy behind the whole PVE/PBS and e.g. ZFS. I suppose that's not an individual developer's job, nor is it my job to read all the docs to discover counter-intuitive behavior.

If you make ZFS your centrepoint (not my favourite in 2024, but it's good enough), you better make the best out of it. If you claim something is filesystem agnostic, then why only that?

This is a topic of labelling things: everyone who reaches for ZFS does it for what comes with it. When I see e.g. that PVE supports replication for HA, of course I assume it uses zfs send/receive; if I see PBS is agnostic, of course I assume that's the whole approach, so PVE's replication should be agnostic too. PBS only exists because PVE exists.

it is BY FAR the best option for installation for systems without a raid controller.

No, it's rubbish for that use case; you are better off with mdadm for e.g. two NVMes in 2024, but there are issues with that (and PVE) as well, so naturally PVE pulled that option from the installer altogether. Come to think of it, that's another inconsistency (time to pull BTRFS as an option too).

Since most PVE deployments use an HBA for zfs/ceph purposes, that leaves a zfs root mirror as the obvious choice for root installation. I would avoid blanket statements that present your opinion as if it's a universal fact.

See above - these "obvious choices" are only obvious because of artificially created constraints. They are not universally obvious. Also, what is the point of RAID for a system drive in an HA cluster to begin with ... it's already HA and you too ...

you say you make backups?

Now if multiple people come here asking e.g. why something does not work and it's written somewhere deep in the docs ...

I guess it goes back to the initial point above. whose responsibility is it to educate you?

... it's not really the reader's fault. This is not a legal discussion; this is about why we are getting so many questions, e.g. why replication does not work on BTRFS. Simple answer: you put BTRFS alongside ZFS into the installer. In fact you should not have put LVM there either. Or not advertised replication as a feature for a reasonable HA setup. These are marketing decisions. Now I know there's no marketing team at Proxmox, but that's not my fault.

no connection between your choice of filesystem (and its features) to your backup choice

Of course there is. I choose a filesystem that makes the whole thing work better, not worse. E.g. I do not choose disk encryption based on Serpent when the system has AES-NI. One does not choose a COW filesystem for a system drive with snapshots and send/receive functionality only to later run on the same system an "agnostic" backup solution that puts load on the hypervisor CPU to determine dirty blocks. It's a no-brainer. That's not me reading it wrong: I read it, it's bad, I came to tell them. If multiple people come here, they are likely not the ridiculous ones themselves.

Yes, I'd prefer Veeam's approach (which DOES use snapshots) but PBS doesn't. The devs have their reasons, probably so they can support every source type.

It was never well formulated. I accept any reasonable explanation. I can disagree that the reasons are there, but at least they should be given. Those are not in the "reasonably good" docs. Unless something had been added since.

NB I do not like closed source backup solutions, but if you have your reasons (that you could formulate) and it all makes sense to you, that's good. If you provide a solution like PVE, you should formulate them publicly.
 
