Backups between clusters

NdK73

Hi all.

I currently have a two-node cluster using a Dell MD3200 as LVM shared storage (two LUNs, one for OS and one for data LVs).

For disaster recovery, I need to automatically send backups of some VMs to another Proxmox cluster that uses iSCSI-backed LVM.

But so far I haven't found a "good enough" way to automate the copy.

What I have now is a bash script that:
0) checks that source VM is running and dst VM isn't
1) suspends source VM
2) takes a snapshot
3) resumes source VM
4) copies (using dd) from snapshot to dst LV
5) destroys snapshot
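
The steps above can be sketched roughly as follows. This is not the original script (it was not posted): the VM ID, LV paths, snapshot name and size, and the DRY_RUN switch are all assumptions for illustration. The check on the destination VM is left out because it depends on how the other cluster is reached.

```shell
#!/bin/bash
# Sketch of steps 0-5. All values below are placeholders, not the real setup.
VMID="${VMID:-102}"
SRC_LV="${SRC_LV:-/dev/OS/vm-102-disk-1}"     # hypothetical source LV
DST_LV="${DST_LV:-/dev/DR/vm-102-disk-1}"     # hypothetical destination LV
SNAP_NAME="${SNAP_NAME:-vm-${VMID}-drsnap}"   # hypothetical snapshot name
SNAP_SIZE="${SNAP_SIZE:-10G}"                 # room for writes made during the copy

# With DRY_RUN=1 (the default here) commands are only printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Succeeds when the text printed by `qm status <vmid>` reports a running VM.
is_running() {
    case "$1" in
        *"status: running"*) return 0 ;;
        *) return 1 ;;
    esac
}

backup_vm() {
    local snap_dev rc=0
    snap_dev="$(dirname "$SRC_LV")/$SNAP_NAME"

    # 0) source VM must be running (check skipped in dry-run mode)
    if [ "${DRY_RUN:-1}" != "1" ]; then
        is_running "$(qm status "$VMID")" || { echo "VM $VMID not running" >&2; return 1; }
    fi

    # 1-3) suspend, snapshot, resume; resume even if the snapshot fails
    run qm suspend "$VMID" || return 1
    if ! run lvcreate -s -L "$SNAP_SIZE" -n "$SNAP_NAME" "$SRC_LV"; then
        run qm resume "$VMID"
        return 1
    fi
    run qm resume "$VMID" || return 1

    # 4) copy the frozen image, 5) drop the snapshot whatever dd returned
    run dd if="$snap_dev" of="$DST_LV" bs=4M conv=fsync || rc=$?
    run lvremove -f "$snap_dev"
    return "$rc"
}
```

Calling `backup_vm` with the default DRY_RUN=1 only prints the commands it would run; setting DRY_RUN=0 executes them, so the paths should be double-checked first since dd overwrites the destination LV.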

But I think it's quite fragile:
- if source VM is moved to another node, backup can't run
- if dest VM is moved to another node, I risk overwriting its data with old data (that should never happen: the disaster-recovery VM should never be turned on while the main one is active, but...)
- dest LV is accessed from outside its CLVM cluster (I just added the iSCSI target and its LVM in the GUI), and I fear that could lead to data corruption (since metadata is not synced)
- the backup overwrites the only copy of the data: if something goes wrong, I lose it (maybe I could fix this by taking a snapshot of the dest LV on the dest before starting writes and discarding it after the copy, but if the backup fails do I have to copy back all the data before being able to restart the machine?)
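
The first point (VM moved to another node) could be softened by looking up which node currently holds the VM before doing anything. A minimal sketch, assuming the usual Proxmox layout where each guest's config lives at /etc/pve/nodes/<node>/qemu-server/<vmid>.conf and the node name equals the short hostname. The function names and the PVE_ROOT override are made up for illustration.

```shell
#!/bin/bash
# Print the node that currently owns a VM, based on where its config file lives.
# PVE_ROOT is overridable only so the lookup can be tried outside a cluster.
PVE_ROOT="${PVE_ROOT:-/etc/pve}"

vm_node() {
    local vmid="$1" conf
    for conf in "$PVE_ROOT"/nodes/*/qemu-server/"$vmid".conf; do
        [ -e "$conf" ] || continue
        # path is .../nodes/<node>/qemu-server/<vmid>.conf
        basename "$(dirname "$(dirname "$conf")")"
        return 0
    done
    return 1
}

# Succeeds only when the VM is on the node running this script.
vm_is_local() {
    local me
    me="$(uname -n)"
    me="${me%%.*}"          # strip a domain part, if any
    [ "$(vm_node "$1")" = "$me" ]
}
```

A cron job installed on every node could call `vm_is_local 102` first and exit quietly where it fails, so the backup follows the VM wherever it runs.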

Suggestions?

Tks.
 
I'm not sure, but wouldn't taking regular scheduled backups (with enough copies kept) and rsyncing those to the other cluster's storage work?

Marco
 
Seems I'm starting to understand what you mean.

Trying
vzdump 102 -compress gzip -mode snapshot
that IIUC uses the first storage that allows backups.

But that machine has a 100GB LV for the OS and a 1TB LV for data. Data is backed up from inside the VM, but vzdump backs it up too.
Can't find a switch to turn off backup of /dev/Data/vm-102-disk-1 :( Tried -exclude-path without result :( IIUC it's only useful for containers, not KVM.

BTW it seems I was reinventing vzdump :)
 
Is the main goal to back up only the data from inside the VM, or to save the entire VM for full recovery on a different cluster?

For a full recovery on another cluster after a disaster, a full KVM backup is of course the way to go. If you are just trying to save the data, then the backup of this VM does not need to be copied regularly unless the VM's OS changes frequently.

I think backing up the entire VM regularly on the first cluster and then simply copying the backups to the second cluster is the hassle-free, logical option. You can use rsync to keep the backup directories on the local and remote nodes in sync. The following command should be a good start (-a already implies -r; note that --delete also removes files on the remote side that no longer exist locally):
# rsync -av --delete -e "ssh -l <user>" /mnt/pve/<store>/dump/ <remote_node>:/mnt/pve/<store>/dump/
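
Because --delete mirrors deletions, running the sync against an empty or unmounted source directory would wipe the remote copies too. A small guard around the command, as a sketch only: the paths, the remote host and the RSYNC override are placeholders, not values from this thread.

```shell
#!/bin/bash
# Sync the local dump directory to a remote node, refusing to run on an empty source.
SRC_DIR="${SRC_DIR:-/mnt/pve/store/dump}"        # placeholder path
REMOTE="${REMOTE:-root@remote-node}"             # placeholder host
REMOTE_DIR="${REMOTE_DIR:-/mnt/pve/store/dump}"  # placeholder path

# Succeeds when the directory holds at least one vzdump archive or log
# (their file names start with "vzdump-").
has_backups() {
    local f
    for f in "$1"/vzdump-*; do
        [ -e "$f" ] && return 0
    done
    return 1
}

sync_dumps() {
    if ! has_backups "$SRC_DIR"; then
        echo "no vzdump archives in $SRC_DIR, not syncing" >&2
        return 1
    fi
    # RSYNC can be set to "echo rsync" to preview the command line
    ${RSYNC:-rsync} -av --delete "$SRC_DIR"/ "$REMOTE:$REMOTE_DIR"/
}
```

Running `RSYNC="echo rsync" sync_dumps` shows the exact command before anything is transferred.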
 
btw, each VM disk can be set as "no backup": see the VM config in the web GUI.
Edit the VM disk (double click) and check "no backup".
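
The checkbox ends up as a backup option on the disk line of the VM's config file. A small sketch that lists which disks of a VM are excluded, assuming the flag is written as backup=0 or backup=no (the spelling may differ between versions). The function name is made up for illustration.

```shell
#!/bin/bash
# List disks that carry a "backup=0" / "backup=no" flag in a qemu-server config.
# Note: this also scans snapshot sections of the file, if there are any.
excluded_disks() {
    local conf="$1"
    grep -E '^(ide|sata|scsi|virtio)[0-9]+:' "$conf" \
        | grep -E '[,[:space:]]backup=(0|no)(,|$)' \
        | cut -d: -f1
}
```

For example, `excluded_disks /etc/pve/qemu-server/102.conf` would print the bus names (such as virtio1) of the disks vzdump is told to skip.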

Marco
 
Woa! Tks!
This did it.

Since the VM is running Windows as domain member, I need to back it up quite often to reduce the risk of having a wrong machine key (that requires a domain admin to bring it up again).

Now I have to study the scripting hooks of vzdump to automate copying the data offsite.
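
As a starting point, vzdump can call a hook script (given with its --script option) at each phase of a job, passing the phase name as the first argument and, for per-guest phases, the backup mode and the VM ID. A minimal sketch that copies each finished archive, assuming the archive path is exported in the TARGET or TARFILE environment variable (which one is set depends on the version). The remote host, directory and the COPY override are placeholders.

```shell
#!/bin/bash
# vzdump hook sketch: copy each finished archive to a remote host.
REMOTE="${REMOTE:-root@remote-node}"             # placeholder host
REMOTE_DIR="${REMOTE_DIR:-/mnt/pve/store/dump}"  # placeholder path

handle_phase() {
    local phase="$1" vmid="${3:-}" archive
    case "$phase" in
        backup-end)
            # archive path as exported by vzdump (variable name depends on version)
            archive="${TARGET:-${TARFILE:-}}"
            if [ -z "$archive" ]; then
                echo "hook: no archive path for VM $vmid" >&2
                return 1
            fi
            # COPY can be set to "echo" to preview the transfer
            ${COPY:-rsync -a} "$archive" "$REMOTE:$REMOTE_DIR/"
            ;;
        *)
            # other phases (job-start, backup-start, job-end, ...) are ignored
            ;;
    esac
}

# When installed as the hook, vzdump passes: <phase> [<mode> <vmid>]
if [ "${BASH_SOURCE[0]}" = "$0" ] && [ "$#" -gt 0 ]; then
    handle_phase "$@"
fi
```

Copying only at backup-end means a failed or aborted backup never reaches the offsite storage.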
 
Isn't it much easier and safer to just back up the entire VM intact to ensure a total restore? Is the size of the VM a concern?
 
Yep. Or, better, the time it takes for a full backup (and restore) is a concern. Whoever created it used virtual disks "a little" bigger than needed (by a factor of 5!). So it uses 100GB for the OS and 1TB for data.
The test I did yesterday (backing up only the OS LV) ran for about 40 minutes (I thought it would take way less...). Since data changes slowly, I prefer to be able to restore a "working server with possibly old data" asap, then (while the server is working) bring the data up to date.
If restoring a machine takes about 1h, it's inconvenient but acceptable. If it takes 6 or 7 hours it means the whole work day is gone.
I know, it's a worst-case scenario implying the MD3200 becomes completely unusable (else the secondary node could take the extra load), but it already happened and it took 2 days to have it back online!
Maybe a glusterfs on at least three backblaze bricks could prove more resilient (and less expensive), even if maybe "a bit" less performant (network is the bottleneck)...
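
A rough check on the time estimate above, assuming throughput stays the same as in the 40-minute test on the 100GB OS disk (compression and empty space in the disks can change that a lot) and taking 1TB as 1000GB. The function name is made up for illustration.

```shell
#!/bin/bash
# Scale a measured run linearly with size.
# usage: estimate_minutes <measured_gb> <measured_minutes> <target_gb>
estimate_minutes() {
    echo $(( $3 * $2 / $1 ))
}

estimate_minutes 100 40 1100   # OS + data disks together; prints 440
```

440 minutes is a bit over 7 hours, which is in line with the "whole work day" figure for a full restore.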
 