Backup solution with ZFS - not really possible with pvesr and snapshots? [Workaround found]

mailinglists

Hi,

I really wanted to use pvesr (GUI synchronization) together with ZFS snapshots for backups, but I have one big problem which I cannot overcome.

My backup plan is to sync all VMs to one node and then take snapshots of all volumes on that node.
It works nicely in principle: I set up synchronization in the GUI, the VMs replicate to the backup node, and zfSnap does its thing there.
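For reference, the snapshot side is just zfSnap on a timer; roughly this kind of cron setup on the backup node (the dataset rpool/data and the zfSnap path are from my setup and may differ on yours):
Code:
# hourly recursive snapshots of the replicated volumes, 1-week TTL
0 * * * * root /usr/local/sbin/zfSnap -a 1w -r rpool/data
# daily cleanup of snapshots whose TTL has expired
15 0 * * * root /usr/local/sbin/zfSnap -d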

The problem comes when I do a (live) migration of a VM, even if I only migrate it so I can reboot the PVE host.
After I migrate it back, the synchronization fails. Somehow it forgets that the same disk already exists on the target and tries to start from scratch. To get syncing working again, I have to remove the pvesr replication job and re-add it.
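For the record, the manual fix is roughly this (job id 100-0 and the node name backupnode are just examples):
Code:
# find the broken replication job, remove it, then recreate it
pvesr list
pvesr delete 100-0
# NOTE: deleting the job also deletes the replicated volume on the target
pvesr create-local-job 100-0 backupnode --schedule '*/15'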

I don't mind the additional manual work, but the actual problem is that when the job is removed, the ZVOL on the target is removed with _all_ of its snapshots, including the ones zfSnap created, so I actually lose all of my backups for that VM.

I think I will have to go to pve-zsync, set it up in pull mode on the target and see if it works. :-)
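For reference, the pull setup I have in mind would run on the backup node roughly like this (IP, VMID and the sync name are examples):
Code:
# pull VM 100 from the production node to the local rpool/data,
# keeping the last 7 sync snapshots
pve-zsync create --source 10.0.0.1:100 --dest rpool/data --name dailypull --maxsnap 7 --verbose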

How do you guys do it?


Obviously the official vzdump backup solution, with its approach of copying whole VMs on every run, is neither really acceptable in 2018 nor feasible with big data sets.
 
Hi,
The problem comes when I do a (live) migration for a VM, even if I migrate it just to reboot the PM host.
Live storage migration will not migrate your snapshots.
This means that after the migration your ZFS volume has no snapshots anymore, so the replication can no longer work.
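You can see this directly on the volume after such a migration (the volume name is an example):
Code:
# after a live storage migration the volume has no snapshots left
zfs list -t snapshot -o name,creation rpool/data/vm-100-disk-1
# -> no datasets available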

I don't mind the additional manual work,
You can only do an offline migration.

I think I will have to go to pve-zsync, set it in pull mode on target and see if it works. :)
Same problem.
 
Do your sync to the backup node, and have it create backup dumps the Proxmox way once a day to another storage that supports dedup out of the box (btrfs/ZFS); this could also be a local storage within the backup host. Then you have a daily and/or weekly backup for the worst-case scenario.
This keeps the productive host fast and the backup time low.
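As a sketch, the daily dump on the backup node could look like this (the storage id dumpstore is just an example that would have to be defined in /etc/pve/storage.cfg):
Code:
# nightly vzdump of VM 100 to a dedup-friendly storage
vzdump 100 --storage dumpstore --mode snapshot --compress lzo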
 

Thank you for your answer, Wolfgang.

The whole point of migration is that I do not have to reboot the VM. If I have to shut it down, there is no point in migrating it to another node. It is even worse, because then I have to shut it down twice instead of just once for the host reboot.

<public complaints>
I'm trying so hard to replace our big proprietary clusters with Proxmox, but I keep hitting a brick wall year after year.
At least live migrations with local storage now mostly work, and differential backups too, but when you try them together ... they don't. :-(
I feel we are so close to fulfilling the above tasks, so I'm going to put some more thought into this. There should be a solution, and probably is; DerDanilo has an idea described below. That's a start.
</public complaints>
 
The proposed solution is something I'll set up in our new cluster to test how it performs.
Setting up a backup node is quite easy. ZFS is fast, dedup-capable on the fly, and also supported by Proxmox VE itself. Perfect as a backup target.

If one gets really paranoid, another offsite backup can take care of the backup files once a week or month.
 
I just do not see how the "Proxmox way" of backups would work on a backup node where you only have the ZVOL disks of the VMs.
Do you think vzdump will work anyway? Can you please elaborate on how you plan to make the backups on the backup node "the Proxmox way"?
 
With "the proxmox way" I mean that I'll backup the instances using vzdump on the backup node. PVE syncs the instances to the selected node.
Why wouldn't vzdump work if the instances are on the backup node?

@tom
If vzdump doesn't work because of missing configuration files for the instances (since prod is running elsewhere), this is something Proxmox should adjust: allow syncing instances to other nodes in a "backup mode", so that the configuration also gets synchronised to the other node.
 
DerDanilo,
I do not see the synced VMs in the GUI, nor their configs in /etc/pve/qemu-server.
I can see the ZVOLs, and the configs are in /etc/pve/nodes/originalnodename/qemu-server.
I guess to recover, one needs to copy the config file and data over manually.
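So a manual recovery would presumably look something like this (node names are examples; only do this if the VM is really no longer running on the original node):
Code:
# pmxcfs treats the per-node directory as ownership, so moving the
# config file makes VM 100 appear on the backup node
mv /etc/pve/nodes/originalnodename/qemu-server/100.conf \
   /etc/pve/nodes/backupnode/qemu-server/100.conf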

While you think about your vzdump option, I will think about other options.
There must be a way to have backups with ZFS as well as live migration at the same time.
I will take into account that live migrating removes all snapshots of a ZVOL, which both pvesr and pve-zsync rely on.
 
I think I found a simple solution. I have not tried it yet, but I see no problems in theory.

Before doing anything to the VM, just rename its ZVOL (the zfSnap-made snapshots follow along) on the backup server to another name.
Then, when you set up the sync again, it should not remove the renamed ZVOL, because it does not know about it.
The only issue is that the first sync will copy the whole ZVOL again and keep snapshots from then on, so you hold the data for that VM twice, and it is up to you to remove the renamed ZVOL once you do not need it anymore.

I will test it out shortly. Wolfgang and others, please comment if you see a problem.
 
Works as expected!

@wolfgang, just food for thought: :-)
This could be an awesome solution that PM could implement for future stable and differential backups that work even with live migration!
While it is not perfect, having just one full copy after a live migration is much better than having one copy per backup.
 
Step by step for dummies:

1. Live migrate the VM with synchronization enabled to wherever.
2. Once you have it (back) on your desired node, just rename the ZVOL on the backup server:
Code:
zfs rename rpool/data/vm-100-disk-1 rpool/data/backup-100-disk-1
(Make sure the ZVOL's name prefix changes!)
3. Remove rpool/data/backup-100-disk-1 once the backup gets too old, probably via a script. :-)
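For step 3, a sketch of the cleanup script I have in mind (untested; assumes the backup-* prefix from step 2 and a 30-day retention):
Code:
#!/bin/sh
# destroy renamed backup ZVOLs older than 30 days
NOW=$(date +%s)
zfs list -H -p -o name,creation -t volume -r rpool/data |
while read -r name creation; do
    # only touch volumes renamed with the backup- prefix
    case "$name" in rpool/data/backup-*) ;; *) continue ;; esac
    if [ $((NOW - creation)) -gt $((30 * 86400)) ]; then
        echo "destroying $name and its snapshots"
        zfs destroy -r "$name"
    fi
done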
 
ZFS dedup is OK if you know what you are doing. If, and only if, the dedup table (DDT) fits in RAM, it is fine. Otherwise you can run into big problems.
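If in doubt, you can estimate the DDT before enabling dedup (the pool name rpool is an example; a commonly cited rule of thumb is roughly 320 bytes of RAM per DDT entry):
Code:
# simulate dedup on the existing data and print the would-be DDT histogram
zdb -S rpool
# on a pool that already has dedup enabled, show the DDT statistics
zpool status -D rpool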
 
My personal opinion is that the cost of disks is way lower than the cost of the RAM required for stable deduplication.
 
LOL, I just found out that running a pvesr sync removes all other snapshots from the disk on the destination server.
I see no reason why, but OK, I will have to work around that as well. :-)
 
