Disaster recovery and backups

Alessandro 123 · Well-Known Member · May 22, 2016
As I wrote in another thread, I'm designing a new virtual environment based on Proxmox (I'm coming from XenServer).

One of the most important things is backups and disaster recovery, so let me try to lay out some cases that could happen:

1) Proxmox node totally lost (RAID failed or corrupted, and so on):

I plan to reinstall a new Proxmox node from scratch and then restore all previously running VMs from an earlier vzdump export. Is this OK? Any better workflow?

2) Proxmox node available but some VMs lost:
manual restore from vzdump.

3) Proxmox available but LVM configuration corrupted/lost:
I don't know how to recover from this.

4) Proxmox failing (unrecoverable read errors in a degraded RAID):
immediately back up with vzdump and then restore on a different host. Any better workflow?

Any suggestions, or other cases I haven't taken into consideration? I would like to sleep well and be prepared for when everything goes bad.

EDIT: I'm thinking of scheduling the nightly backup from our backup server (nothing to do with Proxmox) by using the Proxmox API. Any docs about this? How can I schedule a remote backup with snapshot using the API? To keep the file system consistent (where I'm using databases) I'm thinking about scheduling a mysqldump from inside the VM. For example, I'll schedule a mysqldump to disk at 23:00 and then schedule the snapshot+backup at 00:00. Is it possible to execute some commands directly from vzdump?
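Something along these lines, against the standard PVE HTTP API, is what I have in mind (host, node, VMID, storage and credentials below are placeholders; treat it as a sketch):

```python
#!/usr/bin/env python3
# Sketch: trigger a snapshot-mode vzdump from an external backup server
# via the Proxmox VE HTTP API. All names/credentials are placeholders.
import requests

HOST = "https://pve1.example.com:8006"
NODE = "pve1"

# 1) Get an authentication ticket and CSRF token.
r = requests.post(f"{HOST}/api2/json/access/ticket",
                  data={"username": "root@pam", "password": "secret"},
                  verify=False)  # or point 'verify' at the cluster CA cert
r.raise_for_status()
auth = r.json()["data"]

# 2) Start the backup job; the API returns a task UPID you can poll.
r = requests.post(f"{HOST}/api2/json/nodes/{NODE}/vzdump",
                  cookies={"PVEAuthCookie": auth["ticket"]},
                  headers={"CSRFPreventionToken": auth["CSRFPreventionToken"]},
                  data={"vmid": "100", "mode": "snapshot",
                        "storage": "backupstore", "compress": "lzo"},
                  verify=False)
r.raise_for_status()
print("task:", r.json()["data"])
```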

In addition to vzdump, I'll also use rsync to sync all files to a remote location, so that if I need to restore a couple of files I can do it directly. Using only the vzdump image, to restore 3 files I'd first have to restore the whole VM, and that could take hours.

Any suggestion is welcome.
 
Hi,
I plan to reinstall a new Proxmox node from scratch and then restore all previously running VMs from an earlier vzdump export. Is this OK? Any better workflow?
If you have a single-node installation and all VM images sit on the corrupted disk, then yes, this is the way.
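If you have many guests, a loop like the following does the restore; qmrestore ships with PVE for KVM guests (dump path and storage name here are just examples):

```python
#!/usr/bin/env python3
# Sketch: after reinstalling the node, restore every KVM guest from its
# vzdump archive. Dump directory and target storage are assumptions.
import glob, re, subprocess

for dump in sorted(glob.glob("/mnt/backups/dump/vzdump-qemu-*.vma.lzo")):
    vmid = re.search(r"vzdump-qemu-(\d+)-", dump).group(1)
    subprocess.run(["qmrestore", dump, vmid, "--storage", "local-lvm"],
                   check=True)
```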
2) Proxmox node available but some VMs lost:
manual restore from vzdump.
Yes, correct.

3) Proxmox available but LVM configuration corrupted/lost:
I don't know how to recover from this.
Same as the first one, or you can try to repair it with a live Linux. We use LVM for root.
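LVM keeps automatic metadata backups under /etc/lvm/archive; if those are still reachable from the live system, an outline of the repair looks like this (VG name "pve" is the PVE default; treat it as a sketch, not a recipe):

```python
#!/usr/bin/env python3
# Sketch of case 3: restore LVM volume group metadata from the automatic
# backups LVM keeps, run from a live/rescue system with the disk attached.
import subprocess

VG = "pve"  # default volume group name on a PVE install
# Show which archived metadata versions exist for the volume group.
subprocess.run(["vgcfgrestore", "--list", VG], check=True)
# Restore the most recent archived version (add -f <file> for an older one).
subprocess.run(["vgcfgrestore", VG], check=True)
# Reactivate the logical volumes.
subprocess.run(["vgchange", "-ay", VG], check=True)
```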

4) Proxmox failing (unrecoverable read errors in a degraded RAID):
immediately back up with vzdump and then restore on a different host. Any better workflow?
That would be OK.

We have our own cluster-wide backup scheduler, so there is no need to start it remotely.
It is located in Datacenter -> Backup.
If you also make internal backups, be sure they do not run at the same time.

Is it possible to execute some commands directly from vzdump?
Yes, see man vzdump.
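For reference, vzdump supports a hook script (the --script option, or a script: line in /etc/vzdump.conf), and PVE ships a Perl example at /usr/share/doc/pve-manager/examples/vzdump-hook-script.pl. A hypothetical Python equivalent of that interface:

```python
#!/usr/bin/env python3
# Sketch of a vzdump hook script. vzdump calls it with the phase as first
# argument; per-guest phases also pass mode and vmid, and details arrive
# in environment variables such as TARFILE (see the shipped Perl example).
import os, sys

phase = sys.argv[1]                               # e.g. job-start, backup-start
mode = sys.argv[2] if len(sys.argv) > 2 else None # snapshot/suspend/stop
vmid = sys.argv[3] if len(sys.argv) > 3 else None

if phase == "backup-start":
    # e.g. notify the guest or log the event before the dump begins
    print(f"starting {mode} backup of VM {vmid}")
elif phase == "backup-end":
    tarfile = os.environ.get("TARFILE")           # path of the finished archive
    print(f"VM {vmid} dumped to {tarfile}")
```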
 
1-4 should be straightforward, as on any Linux host. We back up the OS files themselves and only need to restore them and rebuild GRUB in a disaster-recovery case. VM backups are done regularly with the onboard Proxmox backup scheduling.

To your EDIT: please use the qemu-guest-agent and MySQL hooks (I hope you're using KVM). Then you have a crash-consistent database, which is even better than the shady dump. You can extract the vzdump files and access the files directly, so the rsync step is not necessary.
 
@LnxBil could you point me to some docs about the qemu-guest agent and MySQL hooks?
Rsync is necessary to create incremental backups through rsnapshot. Without it, vzdump only creates full dumps, and it would be impossible to keep 7 days of retention of full backups with vzdump. Too much wasted space.
 

I was referring to Proxmox docs, not Google.
I thought there were some docs in Proxmox on how to do a consistent backup even with databases or any other "in-memory" data.

If you unpack the vzdumps onto ZFS you can create differential backups and not waste much space.

Currently, our backup servers are not on ZFS but on EXT4/XFS, and I don't know ZFS at all, so I'll avoid putting software I don't know into production.
 
Hi,
We have our own cluster-wide backup scheduler, so there is no need to start it remotely.
It is located in Datacenter -> Backup.
If you also make internal backups, be sure they do not run at the same time.

I need to export to a backup server that does not expose an NFS share.
Currently, I'm backing up my XenServer hosts through the API directly from the backup servers.
With Proxmox and the backup scheduler, how can I tell Proxmox to save to the backup server if the backup server doesn't expose any shared filesystem (NFS and so on)?

Why are you suggesting not running internal backups and Proxmox backups at the same time? For performance, or for some other reason?
 
I was referring to Proxmox docs, not Google.
I thought there were some docs in Proxmox on how to do a consistent backup even with databases or any other "in-memory" data.

So, you want to do a live snapshot and back that one up; anything else is not "in-memory"-consistent. Such a "backup" is still bad: all open non-local files and connections are lost, etc.

A consistent backup is always application-specific, so Proxmox cannot solve this problem for you (neither can the Xen or VMware stuff, or even SAN replication). You have to tell the application that it is going to be backed up soon, and that is done via the aforementioned hook. It is also possible with other databases, but MySQL is statistically the most often requested one; that is why the guest-agent hook exists. mysqldump is also not really safe in "default" mode, please read http://dba.stackexchange.com/questi...on-a-live-system-with-active-reads-and-writes

Be advised: the hooks have to complete very quickly in order to work. No long-running jobs are allowed, or you'll get a timeout from the backup when it "freezes" the VM.
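QEMU's sources ship a sample MySQL hook for exactly this (fsfreeze-hook.d/mysql-flush.sh.sample). The sketch below shows the idea in Python, assuming the agent calls scripts under /etc/qemu/fsfreeze-hook.d with freeze/thaw arguments: take the global read lock, park it in a background helper so the hook itself returns before the timeout, and release it on thaw. It is an illustration, not production code:

```python
#!/usr/bin/env python3
# Hypothetical /etc/qemu/fsfreeze-hook.d/mysql-flush.py - a sketch of the
# idea behind QEMU's sample mysql-flush hook, not a drop-in replacement.
# The guest agent calls it with "freeze" before fsfreeze and "thaw" after.
import os, signal, subprocess, sys, time

PIDFILE = "/run/mysql-fsfreeze.pid"

def freeze():
    pid = os.fork()
    if pid:                      # parent: remember helper PID, return fast
        with open(PIDFILE, "w") as f:
            f.write(str(pid))
        time.sleep(1)            # crude: give the helper time to take the lock
        return
    # Helper child: the global read lock lives as long as this mysql client's
    # connection stays open, so hold it until thaw() kills us.
    mysql = subprocess.Popen(["mysql", "-u", "root"], stdin=subprocess.PIPE)
    mysql.stdin.write(b"FLUSH TABLES WITH READ LOCK;\n")
    mysql.stdin.flush()
    signal.pause()               # on SIGTERM our pipe closes and the lock drops

def thaw():
    if os.path.exists(PIDFILE):
        with open(PIDFILE) as f:
            os.kill(int(f.read()), signal.SIGTERM)
        os.remove(PIDFILE)

if __name__ == "__main__":
    {"freeze": freeze, "thaw": thaw}[sys.argv[1]]()
```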
 
I know that plain snapshots used as backups are not good; that's why I'll use rsync from inside. Plain snapshots are used for disaster recovery, allowing me to restore the whole VM in less time than recreating it from scratch and then restoring the whole backup.

What do you mean by "All open non-local files and connections are lost"? I don't need to preserve the connection status.
 
I know that plain snapshots used as backups are not good; that's why I'll use rsync from inside. Plain snapshots are used for disaster recovery, allowing me to restore the whole VM in less time than recreating it from scratch and then restoring the whole backup.

rsync is even worse than a qemu-based backup, because the files can change on disk while they're being copied. qemu freezes the disk state and backs up the frozen image; all write operations after the freeze are stored in memory or in a separate area on disk (depending on the disk type).

What do you mean by "All open non-local files and connections are lost"? I don't need to preserve the connection status.

But the file status matters. If a file is being written to at the moment of the backup, it is in an unknown state. The same goes for read or write locks in the database: they have to wait for a timeout and roll back.

Long story short: if you do not run a 24/7 operation, just back up your way with vzdump and rsync and you'll be fine. You could run into problems if the database is up 24/7.
 
Every database has a specific utility to make a consistent backup, and if you care about your data you should not use any backup method which does not include the specific backup utility for the database in question.
 
The database dump is not an issue, as I'm backing up the "mysqldump" SQL file, not the /var/lib/mysql directory.
 
Currently, my rsync/rsnapshot backup procedure executes a mysqldump BEFORE starting the copy. /var/lib/mysql is excluded from the backup because I'm backing up the dump. This is consistent.

For 'standard' files it is not a big issue.
 
Currently, my rsync/rsnapshot backup procedure executes a mysqldump BEFORE starting the copy. /var/lib/mysql is excluded from the backup because I'm backing up the dump. This is consistent.
This greatly depends on the options given to the mysqldump job. Without the --single-transaction option you are not guaranteed a consistent backup, and even with this option it is not 100% certain. The only way to get a consistent database with mysqldump is to use the --lock-all-tables option, at the expense of a read-only database while the dump is running.
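To make the trade-off concrete, here is a sketch of the 23:00 dump job (paths and credentials are placeholders; the two options are mutually exclusive, so pick the line matching your storage engine):

```python
#!/usr/bin/env python3
# Sketch of the nightly dump job discussed above. Run it from cron inside
# the guest before the snapshot. Output path is a placeholder.
import subprocess

with open("/var/backups/all-databases.sql", "wb") as out:
    subprocess.run(
        ["mysqldump", "--all-databases",
         "--single-transaction",  # InnoDB: consistent snapshot, writes keep flowing
         # "--lock-all-tables",   # MyISAM/mixed: consistent, but read-only during dump
         ],
        stdout=out, check=True)
```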
 
There is no need to lock ALL tables; "--opt" (the default) automatically adds "--lock-tables" when exporting.
You only need to lock the tables you are exporting, and mysqldump does this by default.
 
There is no need to lock ALL tables; "--opt" (the default) automatically adds "--lock-tables" when exporting.
You only need to lock the tables you are exporting, and mysqldump does this by default.
--lock-tables only covers the tables you explicitly select for backup. The tables in the mysql database, which holds all the metadata, are not locked by this option and can, and often will, change while a mysqldump is running. Not to mention every other table to which the tables you back up hold foreign keys.
 
