Latest pve-zsync 2.0-4 conflicting with manual snapshots on destination / PVE 6.3

onlime

Hi there

We have been using pve-zsync for ZFS snapshot backups, pulling from remote hosts via SSH. This worked wonderfully until the latest commit for pve-zsync 2.0-4 on the master branch, namely this change:

pve-zsync: Flip Source and Dest in functions to so jobs can share Dest
https://git.proxmox.com/?p=pve-zsync.git;a=commitdiff;h=5d3ff0f6e4989a8cd3e22c7c72fa751b33c76291

I know we are probably not using pve-zsync quite as intended, as we have always had this small patch in place:

Diff:
--- pve-zsync.DIST-OLD    2020-03-23 18:37:20.000000000 +0100
+++ pve-zsync    2020-11-27 15:27:06.810249041 +0100
@@ -980,7 +980,7 @@
 
     push @$cmd, \'|';
     push @$cmd, 'ssh', '-o', 'BatchMode=yes', "$param->{dest_user}\@$dest->{ip}", '--' if $dest->{ip};
-    push @$cmd, 'zfs', 'recv', '-F', '--';
+    push @$cmd, 'zfs', 'recv', '--'; # patched by Onlime
     push @$cmd, "$target";
 
     eval {

man zfs says:

zfs receive [-Fhnsuv] [-d|-e] [-o origin=snapshot] [-o property=value] [-x property] filesystem

-F Force a rollback of the file system to the most recent snapshot before performing the receive operation. If receiving an incremental replication stream (for example, one generated by zfs send -R [-i|-I]), destroy snapshots and file systems that do not exist on the sending side.

We had to turn off the -F switch because we rotate our backups (frequent/daily/weekly/monthly) with zfs-auto-snapshot, which creates intermediary snapshots with the @zfs-auto-snap_* prefix. Those snapshots would get destroyed by zfs recv -F on each pve-zsync run.
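To make the concern concrete, here is a minimal sketch that lists which destination snapshots have no counterpart on the source, i.e. the ones that would be at risk in the scenario described above. The function name dest_only_snaps and the sample snapshot names are hypothetical and not part of pve-zsync; the sketch only compares two name lists and does not touch ZFS itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of pve-zsync): print the snapshot names that
# appear in the destination list but not in the source list.
#   $1 = file with one source snapshot name per line (part after '@')
#   $2 = file with one destination snapshot name per line
dest_only_snaps() {
    # -v invert match, -x match whole lines, -F fixed strings,
    # -f read the patterns (source names) from a file
    grep -vxFf "$1" "$2"
}

# Assumed way to build the lists, following the 'zfs list' call seen in the
# pve-zsync output in this thread:
#   zfs list -rt snapshot -Ho name dpool/zfsdisks/subvol-198-disk-1 | sed 's/.*@//'
```

With only a rep_backup_* snapshot on the source, any @zfs-auto-snap_* or manual snapshot on the destination shows up in the output.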

So how come you decided to suddenly make such a breaking change to pve-zsync in the latest PVE 6.3? Why does the script suddenly need to search for the latest snapshot on the destination and verify that the same snapshot exists on the source? We get the following error; steps to reproduce:

Bash:
# first run works fine
$ ./pve-zsync sync --source x.x.x.x:rpool/zfsdisks/subvol-198-disk-1 --dest dpool/zfsdisks --maxsnap 6 --name backup

# we create a snapshot on destination dataset (which in my eyes should always be allowed...)
$ zfs snapshot dpool/zfsdisks/subvol-198-disk-1@testsnap

# second run of pve-zsync then fails (worked fine in previous pve-zsync 2.0-3)
$ pve-zsync sync --source x.x.x.x:rpool/zfsdisks/subvol-198-disk-1 --dest dpool/zfsdisks --maxsnap 6 --name backup
WARN: COMMAND:
    ssh root@x.x.x.x -- zfs list -rt snapshot -Ho name rpool/zfsdisks/subvol-198-disk-1@testsnap
GET ERROR:
    cannot open 'rpool/zfsdisks/subvol-198-disk-1@testsnap': dataset does not exist
Job --source x.x.x.x:rpool/zfsdisks/subvol-198-disk-1 --name backup got an ERROR!!!
ERROR Message:
COMMAND:
    ssh -o 'BatchMode=yes' root@x.x.x.x -- zfs send -- rpool/zfsdisks/subvol-198-disk-1@rep_backup_2020-11-27_15:18:21 | zfs recv -- dpool/zfsdisks/subvol-198-disk-1
GET ERROR:
    cannot receive new filesystem stream: destination 'dpool/zfsdisks/subvol-198-disk-1' exists
must specify -F to overwrite it


Best regards,
Philip
 
Hi,
I don't quite understand why you need to create additional snapshots on the destination side. Why would the datasets even be modified there?
It might be a breaking change for the way you used it, but how should we have known about that? The reason for the change is described here; it was to allow sharing the same source for different destinations.
 
Thanks @Fabian_E for the explanation. We need daily/weekly/monthly backup rotations (for up to 6 months) and use zfs-auto-snapshot for that, which makes it very painless. AFAIK, pve-zsync currently does not offer such a thing (apart from --maxsnap, which is not an option: we cannot keep all pve-zsync snapshots for 6 months, as that would be far too many, summing up to ~200'000 snapshots for all datasets in our case).

Is there something like snapshot rotation planned for pve-zsync or do you know another workaround for our use case?
 
Ok, I think I understand your use case now. I don't think there's an optimal solution, but what you can do is either:
  1. Keep the package downgraded.
  2. Extend the patch (I included what I think should be sufficient changes in the attachment, but please verify that it works before using it on production data).
In the former case you won't get any updates though and in the latter you need to check that the changes still make sense when new versions come out. But development has been rather slow on the tool, so maybe that's manageable. Also, feel free to add a feature request for rotation on our bugtracker.
 


Thanks a lot @Fabian_E for this simple patch, which does exactly what we needed! It is now tested and already runs in production. We can easily maintain this in the future, since I have implemented a safety check anyway that aborts a pve-zsync backup run if the script is unpatched.
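The safety check itself is not shown in the thread. A minimal sketch of one way to do it, assuming it is enough to look for the unpatched 'zfs', 'recv', '-F' line from the first diff above, could look like this. The function name is_patched is hypothetical, and the script path is left as a parameter.

```shell
#!/bin/sh
# Hypothetical safety check (not the poster's actual implementation):
# succeed only if the pve-zsync script at $1 no longer passes -F to 'zfs recv'.
# Returns 0 if patched, 1 if the unpatched line is found, 2 if unreadable.
is_patched() {
    [ -r "$1" ] || return 2
    # Fixed-string search for the unpatched line from the diff in this thread
    if grep -qF -- "'zfs', 'recv', '-F'" "$1"; then
        return 1
    fi
    return 0
}

# Assumed usage in a backup wrapper, with the script path as first argument:
#   is_patched "$1" || { echo "pve-zsync is unpatched, aborting" >&2; exit 1; }
```

This only covers the -F change; a real check would also have to verify whatever else the extended patch modifies.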

But what keeps you from selecting the last snapshot only from the @rep_*-prefixed snapshots and ignoring the others? You could still keep the -F flag on zfs recv, if that's needed. Could the first part of your patch make it into the official pve-zsync?

Diff:
--- a/pve-zsync
+++ b/pve-zsync
@@ -717,10 +717,8 @@ sub snapshot_get{
 
     while ($raw && $raw =~ s/^(.*?)(\n|$)//) {
     $line = $1;
-    if ($line =~ m/@(.*)$/) {
-        $last_snap = $1 if (!$last_snap);
-    }
     if ($line =~ m/(rep_\Q${name}\E_\d{4}-\d{2}-\d{2}_\d{2}:\d{2}:\d{2})$/) {
+        $last_snap = $1 if (!$last_snap);
         $old_snap = $1;
         $index++;
         if ($index == $max_snap) {

(I might still not have fully understood this.)
 
The first part of the patch would break having the same destination and same source with different jobs (e.g. weekly and daily). I found the reason in an earlier message on the mailing list. But I think you are right that matching for rep_.* instead of arbitrary snapshot names should work.
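As an illustration of that idea, here is a minimal sketch outside of pve-zsync that picks the newest snapshot belonging to one job and ignores everything else. The function name last_rep_snap is hypothetical. It assumes the job name contains only letters and digits (the Perl code quotes the name with \Q...\E, which this sketch does not do) and relies on the rep_<name>_YYYY-MM-DD_HH:MM:SS timestamp sorting chronologically as plain text. It also uses grep -o, which GNU and BSD grep support.

```shell
#!/bin/sh
# Hypothetical helper (not part of pve-zsync): read snapshot names on stdin
# and print the newest snapshot of the pve-zsync job named $1.
# Snapshots from other tools or other jobs are ignored.
last_rep_snap() {
    name=$1
    # Same shape as the pattern in the diff above: rep_<name>_<date>_<time>
    grep -oE "rep_${name}_[0-9]{4}-[0-9]{2}-[0-9]{2}_[0-9]{2}:[0-9]{2}:[0-9]{2}\$" \
        | LC_ALL=C sort | tail -n 1
}

# Assumed usage, following the 'zfs list' call seen earlier in this thread:
#   zfs list -rt snapshot -Ho name dpool/zfsdisks/subvol-198-disk-1 | last_rep_snap backup
```

A manual snapshot such as @testsnap or a @zfs-auto-snap_* snapshot never matches the pattern, so it cannot be picked as the incremental base.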
 
The -F flag ensures that the next sync will still work if the destination has been (accidentally) modified (and would ensure that replication streams work if we ever want to implement that), so I'd argue against dropping that.

You can of course send patches with the suggested changes to the pve-devel mailing list so they can be discussed in detail there.
 
I personally use proxmox-autosnap (https://github.com/apprell/proxmox-autosnap) to create daily, weekly and monthly local snapshots.
Then I create full backups of each CT using vzdump every day which are sent to S3.

I'm considering (and currently testing) rsync.net (ZFS-enabled account) to create remote ZFS snapshots using pve-zsync, replacing vzdump/S3, which is quite I/O-intensive.

My goal is to create what @onlime is describing (snapshot retention rotation) on my remote ZFS server, similar to what is done locally.
To do that, I have to create 3 cron jobs for each CT (one for the daily rotation, one for the weekly rotation and one for the monthly rotation), which is painful and will create at least 3 full snapshots for each CT.

What would be great is to find (or create?) a tool that works like proxmox-autosnap (for the rotation) and pve-zsync (for the zfs send) combined.
As far as I can see, "pct snapshot" does not return the name of the snapshot, so it can't be used with pve-zsync.

My ultimate dream is that pve-zsync evolves to include the features of proxmox-autosnap (retention rotation and an argument to say "all vmid except").

That's my $0.02, but if you have any advice I would be happy to read it.

Enjoy your day
 
Hi,
I've thought about this again and implementing this feature should make it possible to have backup rotation with pve-zsync alone.
 
Amazing @Fabian_E, it's certainly a major improvement (for me at least :-) ). It's still 2 cron jobs per CT, but it would be really useful!

Another improvement would be to allow specifying multiple VMIDs and/or "all CTs except these ones" (like vzdump with the --all and --exclude parameters). But I know, Christmas is in December ;-)

Thanks again for your help, and I'll follow the pve-zsync improvements!
 
Sadly, I don't have time to work on this right now, but I'll keep it on my TODO list and hope to get around to it eventually (or maybe somebody else grabs the bug and implements the feature). Feel free to open another enhancement request for the --all and --exclude feature.
 
This is exactly what I was looking for. I tried to use zrepl to replicate the ZFS pool for backup. Unfortunately, zrepl picked one of my short-lived snapshots and fell over when I tried to do an incremental backup. I use various tools that create snapshots, e.g. when moving a dataset between SSD and HDD pools.

Thus I think it is essential that a tool only matches its own snapshots and ignores everything else.
 
@Fabian_E We have needed this ( https://bugzilla.proxmox.com/show_bug.cgi?id=3351 ) for years, and with it we could finally stop using znapzend. @onlime, you should consider using znapzend as well. It allows just that: keeping a different number of snapshots at different locations.

Well, to be frank, with the new PBS, which is a friendlier option and works almost as well (unless you reboot or migrate the VM), we stopped using znapzend.
 
Not sure I'll be able to test the patch in the next few days, but thanks for the work anyway (and I'll keep an eye on it).
 
