Disaster recovery and backups

Alessandro 123 · Well-Known Member · May 22, 2016
As I wrote in another thread, I'm designing a new virtual environment based on Proxmox (I'm coming from XenServer).

One of the most important things is backups and disaster recovery, so let me try to lay out some cases that could happen:

1) Proxmox node totally lost (RAID failed or corrupted, and so on):

I plan to reinstall a new Proxmox node from scratch and then restore all previously running VMs from an earlier vzdump export. Is this OK? Any better workflow?

2) Proxmox node available but some VMs lost:
manual restore from vzdump.

3) Proxmox available but LVM configuration corrupted/lost:
I don't know how to recover from this.

4) Proxmox failing (unrecoverable read errors in a degraded RAID):
immediately back up with vzdump and then restore on a different host. Any better workflow?

Any suggestions, or other cases I haven't taken into consideration? I would like to sleep well and be prepared for when everything goes bad.

EDIT: I'm thinking of scheduling the nightly backup from our backup server (nothing to do with Proxmox) by using the Proxmox API. Any docs about this? How can I schedule a remote backup with snapshot using the API? To keep the file system consistent (where I'm using databases) I'm thinking about scheduling a mysqldump from inside the VM. For example, I'll schedule a mysqldump to disk at 23:00 and then schedule the snapshot+backup at 00:00. Is it possible to execute some commands directly from vzdump?
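Something along these lines, against the standard PVE HTTP API, is what I have in mind (host, node, VMID, storage and credentials below are placeholders; treat it as a sketch):

```python
#!/usr/bin/env python3
# Sketch: trigger a snapshot-mode vzdump from an external backup server
# via the Proxmox VE HTTP API. All names/credentials are placeholders.
import requests

HOST = "https://pve1.example.com:8006"
NODE = "pve1"

# 1) Get an authentication ticket and CSRF token.
r = requests.post(f"{HOST}/api2/json/access/ticket",
                  data={"username": "root@pam", "password": "secret"},
                  verify=False)  # or point 'verify' at the cluster CA cert
r.raise_for_status()
auth = r.json()["data"]

# 2) Start the backup job; the API returns a task UPID you can poll.
r = requests.post(f"{HOST}/api2/json/nodes/{NODE}/vzdump",
                  cookies={"PVEAuthCookie": auth["ticket"]},
                  headers={"CSRFPreventionToken": auth["CSRFPreventionToken"]},
                  data={"vmid": "100", "mode": "snapshot",
                        "storage": "backupstore", "compress": "lzo"},
                  verify=False)
r.raise_for_status()
print("task:", r.json()["data"])
```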

In addition to vzdump, I'll also use rsync to sync all files to a remote location, so that if I need to restore a couple of files I can do it directly. Using only the vzdump image, to restore 3 files I'd first have to restore the whole VM, and that could take hours.

Any suggestion is welcome.
 
Hi,
I plan to reinstall a new Proxmox node from scratch and then restore all previously running VMs from an earlier vzdump export. Is this OK? Any better workflow?
If you have a single-node installation and all VM images sit on the corrupted disk, then yes, this is the way.
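If you have many guests, a loop like the following does the restore; qmrestore ships with PVE for KVM guests (dump path and storage name here are just examples):

```python
#!/usr/bin/env python3
# Sketch: after reinstalling the node, restore every KVM guest from its
# vzdump archive. Dump directory and target storage are assumptions.
import glob, re, subprocess

for dump in sorted(glob.glob("/mnt/backups/dump/vzdump-qemu-*.vma.lzo")):
    vmid = re.search(r"vzdump-qemu-(\d+)-", dump).group(1)
    subprocess.run(["qmrestore", dump, vmid, "--storage", "local-lvm"],
                   check=True)
```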
2) Proxmox node available but some VMs lost:
manual restore from vzdump.
Yes, correct.

3) Proxmox available but LVM configuration corrupted/lost:
I don't know how to recover from this.
Same as the first one, or you can try to repair it with a live Linux. We use LVM for root.
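LVM keeps automatic metadata backups under /etc/lvm/archive; if those are still reachable from the live system, an outline of the repair looks like this (VG name "pve" is the PVE default; treat it as a sketch, not a recipe):

```python
#!/usr/bin/env python3
# Sketch of case 3: restore LVM volume group metadata from the automatic
# backups LVM keeps, run from a live/rescue system with the disk attached.
import subprocess

VG = "pve"  # default volume group name on a PVE install
# Show which archived metadata versions exist for the volume group.
subprocess.run(["vgcfgrestore", "--list", VG], check=True)
# Restore the most recent archived version (add -f <file> for an older one).
subprocess.run(["vgcfgrestore", VG], check=True)
# Reactivate the logical volumes.
subprocess.run(["vgchange", "-ay", VG], check=True)
```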

4) Proxmox failing (unrecoverable read errors in a degraded RAID):
immediately back up with vzdump and then restore on a different host. Any better workflow?
That would be OK.

We have our own cluster-wide backup scheduler, so there is no need to start it remotely.
It is located in Datacenter -> Backup.
If you also make internal backups, be sure they do not run at the same time.

Is it possible to execute some commands directly from vzdump?
Yes, see man vzdump.
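For reference, vzdump supports a hook script (the --script option, or a script: line in /etc/vzdump.conf), and PVE ships a Perl example at /usr/share/doc/pve-manager/examples/vzdump-hook-script.pl. A hypothetical Python equivalent of that interface:

```python
#!/usr/bin/env python3
# Sketch of a vzdump hook script. vzdump calls it with the phase as first
# argument; per-guest phases also pass mode and vmid, and details arrive
# in environment variables such as TARFILE (see the shipped Perl example).
import os, sys

phase = sys.argv[1]                               # e.g. job-start, backup-start
mode = sys.argv[2] if len(sys.argv) > 2 else None # snapshot/suspend/stop
vmid = sys.argv[3] if len(sys.argv) > 3 else None

if phase == "backup-start":
    # e.g. notify the guest or log the event before the dump begins
    print(f"starting {mode} backup of VM {vmid}")
elif phase == "backup-end":
    tarfile = os.environ.get("TARFILE")           # path of the finished archive
    print(f"VM {vmid} dumped to {tarfile}")
```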
 
1-4 should be straightforward, as on any Linux host. We back up the OS files themselves and only need to restore them and rebuild GRUB in a disaster-recovery case. VM backups are done regularly with the onboard Proxmox backup scheduling.

To your EDIT: please use the qemu-guest-agent and MySQL hooks (I hope you're using KVM). Then you have a crash-consistent database, which is even better than the shady dump. You can extract the vzdump files and access the files directly, so the rsync step is not necessary.
 
@LnxBil could you point me to some docs about the qemu-guest agent and MySQL hooks?
Rsync is necessary to create incremental backups through rsnapshot. Without it, vzdump only creates full dumps, and it would be impossible to keep 7 days of retention of full backups with vzdump. Too much wasted space.
 

I was referring to Proxmox docs, not Google.
I thought there were some docs in Proxmox on how to do a consistent backup even with databases or any other "in-memory" data.

If you unpack the vzdumps onto ZFS you can create differential backups and not waste much space.

Currently, our backup servers are not on ZFS but on EXT4/XFS, and I don't know ZFS at all, so I'll avoid putting software I don't know into production.
 
Hi,
We have our own cluster-wide backup scheduler, so there is no need to start it remotely.
It is located in Datacenter -> Backup.
If you also make internal backups, be sure they do not run at the same time.

I need to export to a backup server that does not expose an NFS share.
Currently, I'm backing up my XenServer hosts through the API directly from the backup servers.
With Proxmox and the backup scheduler, how can I tell Proxmox to save to the backup server if the backup server doesn't expose any shared filesystem (NFS and so on)?

Why are you suggesting not running internal backups and Proxmox backups at the same time? For performance, or for some other reason?
 
I was referring to Proxmox docs, not Google.
I thought there were some docs in Proxmox on how to do a consistent backup even with databases or any other "in-memory" data.

So, you want to do a live snapshot and back that one up; anything else is not "in-memory"-consistent. Such a "backup" is still bad: all open non-local files and connections are lost, etc.

A consistent backup is always application-specific, so Proxmox cannot solve this problem for you (neither can the Xen or VMware stuff, or even SAN replication). You have to tell the application that it is going to be backed up soon, and that is done via the aforementioned hook. It is also possible with other databases, but MySQL is statistically the most often requested one; that is why the guest-agent hook exists. mysqldump is also not really safe in "default" mode, please read http://dba.stackexchange.com/questi...on-a-live-system-with-active-reads-and-writes

Be advised: the hooks have to complete very quickly in order to work. No long-running jobs are allowed, or you'll get a timeout from the backup when it "freezes" the VM.
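QEMU's sources ship a sample MySQL hook for exactly this (fsfreeze-hook.d/mysql-flush.sh.sample). The sketch below shows the idea in Python, assuming the agent calls scripts under /etc/qemu/fsfreeze-hook.d with freeze/thaw arguments: take the global read lock, park it in a background helper so the hook itself returns before the timeout, and release it on thaw. It is an illustration, not production code:

```python
#!/usr/bin/env python3
# Hypothetical /etc/qemu/fsfreeze-hook.d/mysql-flush.py - a sketch of the
# idea behind QEMU's sample mysql-flush hook, not a drop-in replacement.
# The guest agent calls it with "freeze" before fsfreeze and "thaw" after.
import os, signal, subprocess, sys, time

PIDFILE = "/run/mysql-fsfreeze.pid"

def freeze():
    pid = os.fork()
    if pid:                      # parent: remember helper PID, return fast
        with open(PIDFILE, "w") as f:
            f.write(str(pid))
        time.sleep(1)            # crude: give the helper time to take the lock
        return
    # Helper child: the global read lock lives as long as this mysql client's
    # connection stays open, so hold it until thaw() kills us.
    mysql = subprocess.Popen(["mysql", "-u", "root"], stdin=subprocess.PIPE)
    mysql.stdin.write(b"FLUSH TABLES WITH READ LOCK;\n")
    mysql.stdin.flush()
    signal.pause()               # on SIGTERM our pipe closes and the lock drops

def thaw():
    if os.path.exists(PIDFILE):
        with open(PIDFILE) as f:
            os.kill(int(f.read()), signal.SIGTERM)
        os.remove(PIDFILE)

if __name__ == "__main__":
    {"freeze": freeze, "thaw": thaw}[sys.argv[1]]()
```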
 
I know that plain snapshots used as backups are not good; that's why I'll use rsync from inside. Plain snapshots are used for disaster recovery, allowing me to restore the whole VM in less time than recreating it from scratch and then restoring the whole backup.

What do you mean by "All open non-local files and connections are lost"? I don't need to preserve the connection status.
 
I know that plain snapshots used as backups are not good; that's why I'll use rsync from inside. Plain snapshots are used for disaster recovery, allowing me to restore the whole VM in less time than recreating it from scratch and then restoring the whole backup.

rsync is even worse than a qemu-based backup, because the files can change on disk while they're being copied. qemu freezes the disk state and backs up the frozen image; all write operations after the freeze are stored in memory or in a separate area on disk (depending on the disk type).

What do you mean by "All open non-local files and connections are lost"? I don't need to preserve the connection status.

But the file status matters. If a file is being written to at the moment of the backup, it is in an unknown state. The same goes for read or write locks in the database: they have to wait for a timeout and roll back.

Long story short: if you do not run a 24/7 operation, just back up your way with vzdump and rsync and you'll be fine. You could run into problems if the database is up 24/7.
 
Every database has a specific utility to make a consistent backup, and if you care about your data you should not use any backup method which does not include the specific backup utility for the database in question.
 
The database dump is not an issue, as I'm backing up the "mysqldump" SQL file, not the /var/lib/mysql directory.
 
Currently, my rsync/rsnapshot backup procedure executes a mysqldump BEFORE starting the copy. /var/lib/mysql is excluded from the backup because I'm backing up the dump. This is consistent.

For 'standard' files it is not a big issue.
 
Currently, my rsync/rsnapshot backup procedure executes a mysqldump BEFORE starting the copy. /var/lib/mysql is excluded from the backup because I'm backing up the dump. This is consistent.
This greatly depends on the options given to the mysqldump job. Without the --single-transaction option you are not guaranteed a consistent backup, and even with this option it is not 100% certain. The only way to get a consistent database with mysqldump is to use the --lock-all-tables option, at the expense of a read-only database while the dump is running.
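To make the trade-off concrete, here is a sketch of the 23:00 dump job (paths and credentials are placeholders; the two options are mutually exclusive, so pick the line matching your storage engine):

```python
#!/usr/bin/env python3
# Sketch of the nightly dump job discussed above. Run it from cron inside
# the guest before the snapshot. Output path is a placeholder.
import subprocess

with open("/var/backups/all-databases.sql", "wb") as out:
    subprocess.run(
        ["mysqldump", "--all-databases",
         "--single-transaction",  # InnoDB: consistent snapshot, writes keep flowing
         # "--lock-all-tables",   # MyISAM/mixed: consistent, but read-only during dump
         ],
        stdout=out, check=True)
```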
 
There is no need to lock ALL tables; "--opt" (the default) automatically adds "--lock-tables" when exporting.
You only need to lock the tables you are exporting, and mysqldump does this by default.
 
There is no need to lock ALL tables; "--opt" (the default) automatically adds "--lock-tables" when exporting.
You only need to lock the tables you are exporting, and mysqldump does this by default.
--lock-tables only covers the tables you explicitly select for backup. The tables in the mysql database, which holds all the metadata, are not locked by this option and can, and often will, change while a mysqldump is running. Not to mention every other table to which the tables you back up hold foreign keys.
 
