How do YOU do Disaster Recovery?

Q-wulf

Disaster Recovery: how do you handle that at your org?

Background:
  • Planning on replacing an SBS/Terminal Server solution (Windows based) at an SMB as a favor
  • They currently use ShadowProtect SPX on their Windows servers, backing up to a NAS (incremental and full) and to USB HDDs (full backup) every night. The USB HDD gets swapped out once per day and put away until the following week.
  • I fully intend to do a >= 3-2-1 system. (3 copies, 2 separate systems, 1 remote)


Setup:
  1. 2x Standalone Proxmox Nodes (All flash; ZFS)
  2. 1x local NAS (possibly TrueNAS/FreeNAS)
    1. Have it replicate to a USB device that is swapped daily
  3. 1x remote NAS (sync from local NAS)
  4. Virtual Machines
    1. Server ONE
      1. VM for DC, DNS, AD
      2. VM for Exchange (Server one)
      3. 3x APP VM's
    2. Server TWO
      1. VM for File Server
      2. VM for a specific business software
      3. 2x APP VM's




My Questions are:
1) How do you handle host backups?
2) For the VM backups, would you back up the container, or would you back up the virtual server from the inside using e.g. ShadowProtect, or both, or something more elegant?
3) Does it make sense to use pve-zsync? Both for speeding up disaster recovery (VMs can be started manually on the other server if one server is down) and for syncing VM data to the NAS?


Basically, I am looking for no-nonsense best-practice solutions that are stable and reliable.
Any help appreciated,
Q-Wulf
 
1) How do you handle host backups?
I don't. A new host is set up in a few minutes. There are scripts to back up the relevant host data. (Somewhere here in the forums)
2) For the VM backups, would you back up the container, or would you back up the virtual server from the inside using e.g. ShadowProtect, or both, or something more elegant?
The VM. It's easy and built-in. You could just save the images as well. I would not work with backup software in the VM if I can help it.
3) Does it make sense to use pve-zsync? Both for speeding up disaster recovery (VMs can be started manually on the other server if one server is down) and for syncing VM data to the NAS?
Yes. It will speed up the backup a lot.

Regards, J.
 
Agree, host backups should be continuity scripts. We still put our OS on mirrored SSDs, just because it's a pain to have downtime to re-OS a system. But when you have to, it should be as fast as possible.

For protecting data, we've got a couple of layers of redundancy, and then the backup. First, there's the ZFS pool of mirrors that the data sits on. The mirrors ensure that if a drive dies, the data isn't lost. ZFS ensures that if data on disk is corrupted, it is fixed.

Then there are ZFS snapshots for if data is accidentally deleted or a crypto virus encrypts all of the data. Snapshots are sent off site in case the building burns down. Snapshots that are sent to the off site location are also backed up to a second backup server.
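
For anyone who wants to see the "snapshot, then send off site" step spelled out, here is a minimal sketch that only builds the shell commands and does not run them. The dataset names, the remote address and the `auto-` snapshot prefix are made up for illustration; the poster above did not say which tooling or naming scheme they use.

```python
import shlex
from datetime import datetime

def snapshot_name(dataset, when):
    """Build a timestamped snapshot name (the 'auto-' prefix is an assumption)."""
    return f"{dataset}@auto-{when:%Y%m%d-%H%M}"

def snapshot_cmd(snap):
    """Command that creates the snapshot on the local pool."""
    return shlex.join(["zfs", "snapshot", snap])

def incremental_send_pipeline(prev_snap, new_snap, remote_host, remote_dataset):
    """Pipeline that sends only the changes between two snapshots over SSH.

    'zfs receive -F' rolls the target back to its latest snapshot first,
    so the target dataset should not be modified on the remote side.
    """
    send = ["zfs", "send", "-i", prev_snap, new_snap]
    recv = ["ssh", remote_host, "zfs", "receive", "-F", remote_dataset]
    return f"{shlex.join(send)} | {shlex.join(recv)}"

# Hypothetical dataset and off-site host.
prev_snap = snapshot_name("tank/data", datetime(2024, 1, 1, 3, 0))
new_snap = snapshot_name("tank/data", datetime(2024, 1, 2, 3, 0))

print(snapshot_cmd(new_snap))
print(incremental_send_pipeline(prev_snap, new_snap, "192.0.2.50", "backup/data"))
```

The incremental send needs the previous snapshot to exist on both sides, so old snapshots should only be pruned after the next one has arrived at the remote location.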

And finally we use the Proxmox backup feature to back up the VMs whole hog to another pool.

I've not used PVE-ZSync, but it sounds like it's a good way to move VM storage between Proxmox machines. This would be a great way to keep a business running in the event that one Proxmox machine went down for some reason.
 
Hi Q-wulf,

You ask a simple question, but the answer is very complicated. I will detail what I would do if I had your environment!

As you may know, ZFS is not 100% compatible between Linux and non-Linux (*nix/BSD) NAS systems. I think that in your case any non-Proxmox server (i.e. the NAS) is a bad decision. And you have only 2 PMX nodes, so not a real cluster ;) For these reasons, I would wipe the *nix NAS, install PMX on it and use it as a 3rd node. Now you have a real PMX cluster. And you get many other advantages: 100% compatibility regarding ZFS, the same software everywhere, and even HA. If node 1 or 2 breaks, you can split the load across the 2 remaining nodes (much better than everything on a single node). You can also test new PMX updates on the 3rd node... and so on.

Backup? I like the 3-2-1 idea, but I like having 3 different backup software systems even more. With PMX I use: vzdump (local and remote), pve-zsync, and any other tool based on rsync (rock solid). Backup cookbook:

- pve-zsync (hourly?) for node1 VMs -> node2, and node2 -> node3 (n1 + n2 = total VMs)
- the same for node2
- daily vzdump backup of every VM on node1 and node2
- rsync of the vzdump files from node1 and node2 to node3 (once every 3 days)
- daily rsync of the vzdump files from node1 and node2 to the remote location (a NAS, or better, PMX)
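
The cookbook above can be written down as a small plan generator. This is only a sketch: it prints the commands instead of running them, and the VM IDs, addresses, dataset and storage names are placeholders. It also assumes one reading of the first two bullets (node1 VMs replicate to node2, node2 VMs replicate to node3). The actual scheduling (cron or similar) is left out.

```python
import shlex

# Hypothetical inventory: adjust VM IDs, addresses and names to your setup.
HOSTS = {"node1": "192.0.2.11", "node2": "192.0.2.12", "node3": "192.0.2.13"}
REMOTE = "192.0.2.50"
VMS = {"node1": [100, 101], "node2": [200, 201]}
REPLICA_TARGET = {"node1": "node2", "node2": "node3"}
DUMP_DIR = "/var/lib/vz/dump"  # default dump directory of the 'local' storage

def pve_zsync_cmd(vmid, dest_ip, dest_dataset, job_name, keep=24):
    """One-shot ZFS replication of a VM's disks to another host."""
    return ["pve-zsync", "sync", "--source", str(vmid),
            "--dest", f"{dest_ip}:{dest_dataset}",
            "--name", job_name, "--maxsnap", str(keep)]

def vzdump_cmd(vmid, storage):
    """Full backup of one VM to a Proxmox storage, taken in snapshot mode."""
    return ["vzdump", str(vmid), "--mode", "snapshot",
            "--compress", "zstd", "--storage", storage]

def rsync_cmd(src_dir, dest):
    """Copy the contents of src_dir (trailing slash) to dest."""
    return ["rsync", "-a", "--partial", src_dir.rstrip("/") + "/", dest]

def build_plan():
    """Return (interval, node, argv) tuples for every step of the cookbook."""
    plan = []
    for node, vmids in VMS.items():
        target_ip = HOSTS[REPLICA_TARGET[node]]
        for vmid in vmids:
            plan.append(("hourly", node,
                         pve_zsync_cmd(vmid, target_ip, "rpool/zsync", "hourly")))
            plan.append(("daily", node, vzdump_cmd(vmid, "local")))
        plan.append(("every 3 days", node,
                     rsync_cmd(DUMP_DIR, f"{HOSTS['node3']}:/backup/{node}/")))
        plan.append(("daily", node,
                     rsync_cmd(DUMP_DIR, f"{REMOTE}:/backup/{node}/")))
    return plan

for interval, node, argv in build_plan():
    print(f"{interval:>12}  {node}: {shlex.join(argv)}")
```

Keeping the plan as data makes it easy to review what runs where before anything is put into cron.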

My English is not so good, but as I understand it, you use 7 USB HDDs? And each of them is rotated (1 day in use, then at rest in a closet)?
Very nice... but how can you be confident that your USB HDDs are healthy? Why do you rotate? In your case, at worst, I would use, let's say, 2 Raspberry Pis (4 x USB HDD). Each of them is powered on at the scheduled time on alternating days (online one day, offline the next), makes the backup to only 1 USB HDD, and at the same time checks the rest of the USB HDDs with smartctl/badblocks. After the backup and the smartctl checks are finished, the Raspberry Pi can power off! And these 2 Raspberry Pis can be in different buildings/rooms!
Also, if one night Raspberry Pi no. 1 cannot power on, or whatever, then the second Raspberry Pi can start and make the backup. A USB HDD that sits in a closet is not much use if the current USB HDD breaks during the backup task!

The backup task itself is simple. The most complicated part is checking/verifying your backups and monitoring backup-related tasks for errors and failures. For example: restore VM xxx from the last backup on node Z1, test the services that are supposed to run on that VM, and so on.
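
As a small first step towards verification, here is a sketch that compares the backup files in one directory against their copies in another by SHA-256 checksum. It only proves that the copy matches the original; it is not a substitute for an actual test restore. The file names in the demo are made up.

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large backup files do not need to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def find_mismatches(source_dir, copy_dir):
    """Return (file name, problem) pairs for files missing or different in copy_dir."""
    problems = []
    for src in sorted(Path(source_dir).iterdir()):
        if not src.is_file():
            continue
        dst = Path(copy_dir) / src.name
        if not dst.is_file():
            problems.append((src.name, "missing"))
        elif sha256_of(src) != sha256_of(dst):
            problems.append((src.name, "checksum mismatch"))
    return problems

# Demo with throwaway directories standing in for the dump dir and its copy.
with tempfile.TemporaryDirectory() as source, tempfile.TemporaryDirectory() as copy:
    Path(source, "vm-100.vma").write_bytes(b"good backup")
    Path(copy, "vm-100.vma").write_bytes(b"good backup")
    Path(source, "vm-101.vma").write_bytes(b"original")
    Path(copy, "vm-101.vma").write_bytes(b"corrupt!")
    Path(source, "vm-102.vma").write_bytes(b"never copied")
    for name, problem in find_mismatches(source, copy):
        print(f"{name}: {problem}")
```

A non-empty result is what should trigger a mail or a monitoring alert, so that errors do not go unnoticed for days.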


Good luck!
 
I should have prefaced the post with the following disclaimer:
My day job is at a Storage-as-a-Service company. I use (multiple) Proxmox clusters on a daily basis; we just don't do Proxmox host backups, since no client-side data lives on these machines. As such I was asking a very specific question, looking specifically for the most KISS solution :)

I am doing a favour for a company (in my spare time) that can best be described as a small business that has been lucky not to have had any serious issues with data loss yet.

Let me make some clarifications:

Hi Q-wulf,
You ask a simple question, but the answer is very complicated.
[...]
As you may know, ZFS is not 100% compatible between Linux and non-Linux (*nix/BSD) NAS systems.
I think that in your case any non-Proxmox server (i.e. the NAS) is a bad decision. And you have only 2 PMX nodes, so not a real cluster ;) For these reasons, I would wipe the *nix NAS, install PMX on it and use it as a 3rd node. Now you have a real PMX cluster.
[...]
You can also test new PMX updates on the 3rd node... and so on.

1) We will NOT be syncing from Proxmox to the NAS using ZFS-based sync. I would use pve-zsync for host-to-host replication. We will, however, back up the user data (VMs) (and now also the host configs) to these NASes.
2) It does not always make sense to run a fully fledged cluster. For the use case of the company above, 2 standalone systems are less complicated. There is no need for a more complicated setup, as the features provided by a cluster will never be utilised.
3) NEVER do your testing on a live system (i.e. your 3rd Proxmox node). That is just bad form. This is what you have a (virtual) test environment for.



Backup? I like the 3-2-1 idea, but I like having 3 different backup software systems even more. With PMX I use: vzdump (local and remote), pve-zsync, and any other tool based on rsync (rock solid). Backup cookbook:
[...]

The 3-2-1 strategy is about minimising the chance of data loss. It is all about the media you store your data on. It basically says that EVERY backup you do shall be saved as 3 separate copies, that you shall employ at least 2 different media (e.g. NAS, external drive), and that you shall store one of these copies remotely (as in NOT on location).
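
To make the rule concrete, here is a minimal sketch that checks a list of backup copies against the three conditions above. The `BackupCopy` type and the example plan are made up for illustration and only loosely follow the setup from the first post.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BackupCopy:
    name: str      # hypothetical label, e.g. "local-nas"
    medium: str    # kind of storage, e.g. "nas" or "usb-hdd"
    offsite: bool  # True if the copy is NOT stored on location

def satisfies_3_2_1(copies):
    """Check: at least 3 copies, at least 2 different media, at least 1 off site."""
    enough_copies = len(copies) >= 3
    enough_media = len({copy.medium for copy in copies}) >= 2
    has_offsite = any(copy.offsite for copy in copies)
    return enough_copies and enough_media and has_offsite

plan = [
    BackupCopy("local-nas", "nas", offsite=False),
    BackupCopy("usb-rotation", "usb-hdd", offsite=False),
    BackupCopy("remote-nas", "nas", offsite=True),
]

print(satisfies_3_2_1(plan))
print(satisfies_3_2_1(plan[:2]))  # without the remote NAS
```

Note that replication between the two Proxmox hosts is not counted here; whether it qualifies as a backup copy is a separate discussion.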

My English is not so good, but as I understand it, you use 7 USB HDDs? And each of them is rotated (1 day in use, then at rest in a closet)?
Very nice... but how can you be confident that your USB HDDs are healthy? Why do you rotate? In your case, at worst, I would use, let's say, 2 Raspberry Pis (4 x USB HDD). Each of them is powered on at the scheduled time on alternating days (online one day, offline the next), makes the backup to only 1 USB HDD, and at the same time checks the rest of the USB HDDs with smartctl/badblocks. After the backup and the smartctl checks are finished, the Raspberry Pi can power off! And these 2 Raspberry Pis can be in different buildings/rooms!
Also, if one night Raspberry Pi no. 1 cannot power on, or whatever, then the second Raspberry Pi can start and make the backup. A USB HDD that sits in a closet is not much use if the current USB HDD breaks during the backup task!

I never said their current system is perfect. There is a reason why I am changing it. I guess their main focus was to reduce the likelihood of a total loss of data in case of a fire or a break-in. Using a 5-work-day USB-HDD cycle was their way of ensuring that the swap could be done by a regular office worker, while making sure you only lose a couple of days of data in case their admin does not immediately see the backup solution throwing errors. The way they do backups does not render a backup 100% useless in case of a bad block at the hardware level; it just renders a couple of files unusable. Since they have 5 copies of said files in their vault, the likelihood of all of them being destroyed is somewhat smaller. Again, not my design.
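
For illustration, the 5-work-day cycle boils down to something like the sketch below, which tells the office worker which disk belongs in the dock on a given day. The disk labels are made up; their actual labelling is not known to me.

```python
from datetime import date

# Hypothetical labels for the five disks in the weekly cycle.
DISKS = ["usb-mon", "usb-tue", "usb-wed", "usb-thu", "usb-fri"]

def disk_for(day):
    """Return the disk label for a work day, or None on weekends."""
    weekday = day.weekday()  # Monday is 0, Sunday is 6
    if weekday >= len(DISKS):
        return None
    return DISKS[weekday]

print(disk_for(date(2024, 1, 1)))  # a Monday
print(disk_for(date(2024, 1, 6)))  # a Saturday
```

Each disk therefore holds a full backup that is at most one week old when it comes back into the dock.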
 
