Questions about pve-zsync

Alessandro 123

May 22, 2016
I was reading the docs about pve-zsync.
Currently, four things are not clear to me:

1. Can I schedule a single cron job to sync all VMs on the server automatically?

2. https://pve.proxmox.com/wiki/PVE-zsync#Recovering_an_VM says that restoring a VM means using zfs send/receive. But what if the source node is totally unavailable? How can I start the backed-up VM on the backup host?

3. Are pve-zsync backups incremental, or is everything sent every time?

4. Can I use a PVE installation as the backup host? If so, would the PVE GUI on the backup host be able to see and manage the VMs backed up via pve-zsync? In other words, can I start a backed-up VM from the GUI?
 
Hi,
1) Don't know
2) If it was synced once you can start the VM on the other host (with the state of the last sync)
3) pve-zsync is incremental
4) Good question - don't know.

Regards, Jonas
 
1. No; there is one pve-zsync job per VM (VMs are ZFS datasets).

2. You can also use pve-zsync to restore (a rough example follows below), but afterwards you must delete all snapshots; this should not be a problem. When you restore, the whole backup line of snapshots is gone, since they no longer fit together. And yes, you can start the VM on the backup host without any problems: copy the VM config from the pve-zsync directory to the PVE config directory. If the backup storage is not named exactly the same (for example: rpool/data), you have to adjust the virtual hard drive entries in the config.

4. See point two.
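For illustration only, a reverse sync that pushes a replicated disk back to a rebuilt node1 might look roughly like this; the host name and dataset names are placeholders, so check pve-zsync's help output for the exact syntax:

pve-zsync sync --source rpool/backup/vm-100-disk-1 --dest node1:rpool/data --verbose   # assumption: run on the backup host, sending the replica back over SSH

Afterwards, as noted above, the old snapshot chain no longer lines up, so clean up the leftover snapshots before resuming the regular pve-zsync jobs.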
 
How can I choose which snapshot to restore the VM from, in case there are multiple snapshots for each VM?

Additionally, is there any suggested way to back up all VMs, or should I create a wrapper script that looks up all VM IDs and runs pve-zsync for each of them?
 
Hi,

ZFS snapshots are read-only, so you can use send/receive like this:

zfs send rpool/backup/vm-100-disk-1@<snapshot> | ssh root@192.168.15.1 zfs receive vm/vm-200-disk-1

All sync jobs are added to cron.d, and there you can handle the backups.
Create a job and see /etc/cron.d/pve-zsync.
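For reference, an entry generated in /etc/cron.d/pve-zsync looks roughly like the line below; the schedule, VM ID, target address, dataset and option values are just placeholders:

*/15 * * * * root pve-zsync sync --source 100 --dest 192.168.15.1:rpool/backup --name default --maxsnap 7 --method ssh

Editing this file lets you adjust the schedule or temporarily comment out a job.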
 
I don't want to send snapshots over the network. What I'm asking is how to restore an existing snapshot on the backup PVE node.

For example, we have two PVE nodes: node1 and backupNode.

node1 sends snapshots to backupNode through pve-zsync on an hourly basis.
node1 crashes and is unusable. On backupNode, how can I start VM 100 from the latest received snapshot, or from four snapshots ago?
 
The same way without ssh

full copy

zfs send rpool/backup/vm-100-disk-1@<snapshot> | zfs receive rpool/vm/vm-200-disk-1

or

linked clone

zfs clone rpool/backup/vm-100-disk-1@<snapshot> rpool/vm/vm-200-disk-1

or create a ZFS storage with the same name as the one on node1 and copy the config
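To illustrate that last option: assuming node1 used a ZFS storage called local-zfs on rpool/data and the pve-zsync job replicated the disks into rpool/backup on the backup node (all names here are examples), you could expose the replicas under the same storage ID and reuse the config unchanged:

pvesm add zfspool local-zfs --pool rpool/backup --content images   # same storage ID the config already references, pointing at the replicated datasets
cp <synced-config>/100.conf /etc/pve/local/qemu-server/100.conf    # <synced-config> stands for wherever the pve-zsync job stored the config copy
qm start 100

Because the storage ID and dataset names match what the config expects, no disk entries need to be edited.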
 
So, if I want to start up the VM from the latest snapshot, this is already possible without zfs send/receive.
If I want to start from an older snapshot, I have to send that snapshot and receive it into the "current" VM disk?

Why not add this to the GUI? Right-click on a snapshot to start a new VM from it, like in XenCenter.
 
What I'm trying to achieve is a backup node (with lots of RAM and CPU) that ALL my servers are backed up to, with the ability to start VMs directly from the PVE GUI.

If a node fails, I'll be able to start up all its VMs on the backup node. When the original node is replaced, I'll move the running VMs back to it with live migration.

This would avoid hours of downtime restoring from backups: right-click on each snapshot and I'm ready to go (obviously with reduced performance, as the backup servers would have SATA disks).

I hope what I'm trying to do is clear...
 
There are plans to integrate it into the GUI, but pve-zsync is a standalone package for WAN replication, not an async cluster replica.
 
I'm not talking about integrating pve-zsync into the GUI, but about the ability to start up a VM (that was replicated by pve-zsync) directly from the GUI, without manual intervention. The GUI is useful and can be used remotely, even on a mobile phone. When everything goes bad, restoring a VM from a specific snapshot via the GUI could save the day.
 
short answer: no, it's not. at least not currently - it is on our todo list.

the problem is that PVE does not track the backup copies of the disks at the moment (pve-zsync just needs any SSH-accessible host with ZFS as target!). reverting to a snapshot is not a problem in PVE, but reverting to a snapshot of a disk that PVE knows nothing about is not possible ;)

doing this automatically has certain risks which are at the moment left to the admin to care about, e.g.:
  • what about the old disks on the failed nodes? those have the same IDs, but are now invalid?
  • what about (re)moving the guest configuration from the failed node? this is normally only allowed for the HA manager, which has its own special locking mechanism for this..
  • ...
 
Probably it's not clear what I'm saying.
Forget about the failed node: it's failed, nobody cares about it. It's broken and thrown away, so there is no configuration to remove and no shared IDs.

The only surviving node is the backup node, where I would like to start the failed VM from a snapshot.

Why is pve-zsync not syncing the VM configuration to the backup node? When starting the sync process, you could also copy the VM configuration to the backup node. That way, the PVE on the backup node would be aware of every VM's configuration/disks.
 
Probably it's not clear what I'm saying.
Forget about the failed node: it's failed, nobody cares about it. It's broken and thrown away, so there is no configuration to remove and no shared IDs.

The only surviving node is the backup node, where I would like to start the failed VM from a snapshot.

this is something that you need to decide manually, and PVE cannot decide for you (especially in a non-quorate state!). it is also not true for all use cases of pve-zsync, and not true for all failure states where you might want to do disaster recovery.

Why is pve-zsync not syncing the VM configuration to the backup node? When starting the sync process, you could also copy the VM configuration to the backup node. That way, the PVE on the backup node would be aware of every VM's configuration/disks.

we do sync the config (if you give a guest ID as source and not a ZFS dataset).

if you do know that all the prerequisites are fulfilled (most importantly, failed node not running anymore, cluster still quorate), you only need to do two things to get your guests running in the last replicated state:
  • make the config available to PVE (this might be a simple cp of the backed-up configuration into /etc/pve/local/(qemu-server|lxc)/)
  • adapt the disk/volume paths in the config to point to the backup copies or make the backup copies available under the old paths
making this safe enough to do in a one-click operation on the GUI is not as trivial as you think (otherwise it would already be implemented ;)). handling all the prerequisites and the potential cleanup when the failed node returns to the cluster makes it even harder.
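as an illustration of those two steps, here is a minimal sketch on the backup node; the VM ID, snapshot name, dataset names and config location are all assumptions, not the exact paths pve-zsync uses:

cp <pve-zsync-config-dir>/100.conf /etc/pve/local/qemu-server/100.conf   # step 1: make the backed-up config visible to PVE
zfs clone rpool/backup/vm-100-disk-1@rep_default_2016-05-22_10:00:00 rpool/data/vm-100-disk-1   # step 2: expose the replica under the dataset path the config already expects
qm start 100

the clone leaves the replicated dataset untouched; editing the disk entries in 100.conf to point at a storage defined on rpool/backup works just as well.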
 
this is something that you need to decide manually, and PVE cannot decide for you (especially in a non-quorate state!). it is also not true for all use cases of pve-zsync, and not true for all failure states where you might want to do disaster recovery.

I still don't understand the problem.
PVE doesn't have to decide anything (or automate anything).
It's the admin who wants to start a VM from the backup node. That is already a decision. The admin has clicked the "Start from this snapshot" button and has made the decision.

we do sync the config (if you give a guest ID as source and not a ZFS dataset).

if you do know that all the prerequisites are fulfilled (most importantly, failed node not running anymore, cluster still quorate), you only need to do two things to get your guests running in the last replicated state:
  • make the config available to PVE (this might be a simple cp of the backed-up configuration into /etc/pve/local/(qemu-server|lxc)/)
  • adapt the disk/volume paths in the config to point to the backup copies or make the backup copies available under the old paths
making this safe enough to do in a one-click operation on the GUI is not as trivial as you think (otherwise it would already be implemented ;)). handling all the prerequisites and the potential cleanup when the failed node returns to the cluster makes it even harder.

Why are you talking about the cluster? Does pve-zsync work only in a clustered configuration?
What I'm talking about is having a totally isolated backup node (as it should be) where you can start up failed VMs as disaster recovery.
You should totally ignore clusters, quorum, old nodes and so on. Just start the VM on the backup node. It's the administrator who should check whether this operation could cause other issues. You just offer the possibility.
 
Then this would be enough:
  • make the config available to PVE (this might be a simple cp of the backed-up configuration into /etc/pve/local/(qemu-server|lxc)/)
  • adapt the disk/volume paths in the config to point to the backup copies or make the backup copies available under the old paths
-> Start the VM.
Maybe you could also create symlinks so you can just click "start" in the event of a downed host.

Jonas
 
