Quick recovery / "Poor man's HA" advice sought

captn

Nov 26, 2010
We have 2x Proxmox VE (1.5) boxes on which we are running a handful (say 15 each) of lightly loaded VMs (all using OpenVZ, no KVM).

We are very happy with how this is running and now want to consider how we achieve quick recovery of VMs should one box fail.

Currently the boxes are in a cluster for ease of management.

Note that:
- There is enough grunt (and disk) available on each box to handle all the VMs being run from just 1 box
- There is only local storage on each box (no shared storage)
- We are happy for the VMs that boot in an emergency to be somewhat out of date
- We would like starting the VMs on the 2nd box to take no more than, say, 5 minutes should the other one fail

So obviously it is easy for us to script a nightly rsync of /var/lib/vz/private/XXX over to the other box, and I guess the same goes for /etc/vz/conf/XXX.
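
A minimal sketch of such a nightly sync, assuming passwordless root SSH to the other box. The hostname pve2, the staging directory /root/vz-conf-from-peer and the script path in the cron comment are hypothetical. Copying a running container this way gives a crash-consistent copy at best, which matches the "old is fine in an emergency" requirement above.

```shell
#!/bin/sh
# Nightly one-way copy of OpenVZ container data to the standby box.
# PEER is a hypothetical hostname; override it from the environment.
PEER="${PEER:-pve2}"

sync_to_peer() {
    # -a keeps permissions, ownership and timestamps; --numeric-ids keeps the
    # containers' UIDs/GIDs as numbers instead of mapping them by name;
    # --delete removes files on the standby that are gone from the source.
    rsync -a --numeric-ids --delete \
        /var/lib/vz/private/ "root@${PEER}:/var/lib/vz/private/" || return 1
    # Configs go to a staging directory (assumed path), not the live /etc/vz/conf.
    rsync -a /etc/vz/conf/ "root@${PEER}:/root/vz-conf-from-peer/"
}

# Example cron entry (assumed install path):
#   30 2 * * * /usr/local/sbin/vz-sync.sh run
# Only act when called with "run", so the file can also be sourced safely.
if [ "${1:-}" = "run" ]; then
    sync_to_peer
fi
```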

Because we're not overly familiar with Proxmox yet, the questions are:

- What else would need copying (if anything)?
- Are there any gotchas we need to look out for doing it this way?
- What would be the best way to start the VMs on the second box?


In an ideal world, we'd receive notification of one of the boxes being offline, and simply log on to the second box and run a script which would start all the copies of the other VMs (that were rsync'd the night before).

Thanks for any guidance.

Brent
 
There aren't any gotchas for Proxmox because it is just looking at the OpenVZ config files and folders etc.

1. You should rsync /var/lib/vz/private/. and /etc/vz/conf/. to the second box.
NOTE: you should only copy /etc/vz/conf once, and then manually copy CTID.conf files over if you add new servers or change the IPs of existing ones.
2. On the 2nd box, rename the CTID.conf files under /etc/vz/conf, e.g. from 101.conf to 1101.conf, and so on.
3. Have a script replace the following 2 lines in each /etc/vz/conf/CTID.conf file, e.g. for 1101.conf:

VE_ROOT="/var/lib/vz/root/$VEID"
VE_PRIVATE="/var/lib/vz/private/$VEID"

with
VE_ROOT="/var/lib/vz/root/101"
VE_PRIVATE="/var/lib/vz/private/101"

4. That way, if your 1st server dies, you simply start the 1101, 1102, 1103 etc. containers from the web interface on your 2nd server and they will all be up within seconds.
5. If your first server is up but you want to perform maintenance on it, shut down the containers, either one at a time or all at once. Then run another rsync to the second server of only /var/lib/vz/private/CTID, or of the whole folder if you shut them all down at once, and bring up the containers on the 2nd server. You will then have a copy of the containers exactly as they were on the 1st server.
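
Steps 2 and 3 could be scripted roughly as follows. This is only a sketch: the function name, the staging directory in the usage line and the +1000 offset are assumptions taken from the 101 to 1101 example, and it only rewrites VE_ROOT and VE_PRIVATE lines that are already present in the file.

```shell
#!/bin/sh
# Turn copied container configs into standby configs with offset CTIDs.
# Usage (paths are assumptions):
#   make_standby_confs /root/vz-conf-from-peer /etc/vz/conf 1000
make_standby_confs() {
    src_dir=$1
    dst_dir=$2
    offset=$3
    for conf in "$src_dir"/*.conf; do
        [ -e "$conf" ] || continue
        ctid=$(basename "$conf" .conf)
        # Skip anything that is not a plain numeric CTID, and skip 0.conf.
        case "$ctid" in ''|*[!0-9]*) continue ;; esac
        [ "$ctid" -gt 0 ] || continue
        new_ctid=$((ctid + offset))
        # Point the renamed config at the directories that keep the original CTID.
        sed -e "s|^VE_ROOT=.*|VE_ROOT=\"/var/lib/vz/root/${ctid}\"|" \
            -e "s|^VE_PRIVATE=.*|VE_PRIVATE=\"/var/lib/vz/private/${ctid}\"|" \
            "$conf" > "$dst_dir/${new_ctid}.conf"
    done
}
```

Writing to a separate destination directory keeps the original 101.conf untouched, so the conversion can be re-run whenever a config changes on the 1st server.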

It's actually a reasonable model to have for DR, and you could also use it for load balancing. Instead of having all your machines run on the 1st server, with the 2nd only used in the event of failure, you can have half of them running on the 2nd server at the same time and rsyncing back to server 1 each night:

e.g. (say 6 containers in total, for simplicity)

Server 1         sync flow    Server 2

101 - running    rsync =>     1101 - stopped
102 - running    rsync =>     1102 - stopped
103 - running    rsync =>     1103 - stopped
104 - stopped    <= rsync     1104 - running
105 - stopped    <= rsync     1105 - running
106 - stopped    <= rsync     1106 - running

I basically do this too, and with scripts you can migrate a machine in seconds.
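
A rough sketch of such a migration script, following step 5 above: stop the container, do a final rsync, then start the standby copy on the other server. The hostname pve2, the function name and the +1000 CTID offset are assumptions based on the example numbering, and passwordless root SSH is assumed.

```shell
#!/bin/sh
# Planned move of one container from this box to the peer.
# PEER and OFFSET are assumptions; override them from the environment.
PEER="${PEER:-pve2}"
OFFSET="${OFFSET:-1000}"

migrate_to_peer() {
    ctid=$1
    # Stop first so the final rsync copies a filesystem that is no longer changing.
    vzctl stop "$ctid" || return 1
    rsync -a --numeric-ids --delete \
        "/var/lib/vz/private/${ctid}/" \
        "root@${PEER}:/var/lib/vz/private/${ctid}/" || return 1
    # Start the standby copy, registered under the offset CTID on the peer.
    ssh "root@${PEER}" vzctl start "$((ctid + OFFSET))"
}
```

Calling migrate_to_peer 101 would stop 101 here and start 1101 on the peer. If a nightly rsync has already run, the final pass only has to transfer what changed since then.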
 
You can also run an NFS server on each machine, declare each one as backup storage, and make cross backups: each container on machine A is backed up to the backup storage on machine B, and vice versa.

It's cleaner than rsync, but perhaps slower.

PS:
With the rsync method, I would not copy the /etc/vz/conf files directly into the same folder on the other machine; keep them in a separate folder instead.
That way they will not show up in the interface, so there is no risk of starting them by mistake.
In case of DR, just move the conf files into /etc/vz/conf; the containers will show up and you can start them.
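
The DR step could look something like this. It is a sketch only: the staging path /root/vz-conf-from-peer and the function name are made up for illustration, and it assumes the container data already sits under /var/lib/vz/private with the same CTIDs.

```shell
#!/bin/sh
# Failover on the surviving box: activate the staged configs, then start them.
# STAGING is an assumed path; CONF_DIR is the OpenVZ config directory.
STAGING="${STAGING:-/root/vz-conf-from-peer}"
CONF_DIR="${CONF_DIR:-/etc/vz/conf}"

activate_and_start() {
    for conf in "$STAGING"/*.conf; do
        [ -e "$conf" ] || continue
        ctid=$(basename "$conf" .conf)
        # Only handle plain numeric CTIDs, and leave 0.conf alone.
        case "$ctid" in ''|*[!0-9]*) continue ;; esac
        [ "$ctid" -gt 0 ] || continue
        # Never overwrite a config that already exists on this box.
        if [ -e "$CONF_DIR/$ctid.conf" ]; then
            echo "skipping $ctid: already defined in $CONF_DIR" >&2
            continue
        fi
        mv "$conf" "$CONF_DIR/$ctid.conf" && vzctl start "$ctid"
    done
}
```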
 
