Hot Standby Server?

comcanada

May 14, 2012
Hi,

We have a two-server Proxmox install: server1 runs the VMs, and server2 receives the backups from server1.

The idea is that if server1 fails, we restore the latest backup set on server2 and start the VMs there.

Two questions:
1 - How can we automate the restore on server2?
2 - Is there a better way to do this?

-Rick
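
For question 1, here is a minimal sketch of what an automated restore on server2 could look like. It assumes the backups are vzdump archives of KVM guests named vzdump-qemu-<vmid>-<timestamp>.<ext> in one directory; BACKUP_DIR, DRY_RUN and both function names are made up for illustration. By default it only prints the qmrestore command instead of running it.

```shell
#!/bin/bash
# Sketch only: pick the newest vzdump archive for one VM and restore it.
# BACKUP_DIR, the function names and the file-name pattern are assumptions.

BACKUP_DIR="${BACKUP_DIR:-/var/lib/vz/dump}"
DRY_RUN="${DRY_RUN:-1}"

# Print the newest archive for a VMID; return non-zero if there is none.
latest_backup() {
    local vmid="$1" newest="" f
    # Globs expand in sorted order and the timestamp in the name sorts
    # chronologically, so the last match is the newest backup.
    for f in "$BACKUP_DIR"/vzdump-qemu-"$vmid"-*; do
        case "$f" in *.log) continue ;; esac   # skip vzdump log files
        [ -f "$f" ] && newest="$f"
    done
    [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Restore the newest archive of a VMID; only prints the command if DRY_RUN=1.
restore_latest() {
    local vmid="$1" archive
    archive="$(latest_backup "$vmid")" || {
        echo "no backup found for VM $vmid" >&2
        return 1
    }
    if [ "$DRY_RUN" = "1" ]; then
        echo "qmrestore $archive $vmid"
    else
        qmrestore "$archive" "$vmid"
    fi
}
```

Calling `restore_latest 100` would then handle VM 100. OpenVZ containers are restored with vzrestore rather than qmrestore, so they would need their own variant. Deciding when to run this is the harder part and is not covered by the sketch.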
 
Hi,

Though I'm also a newbie with Proxmox, my suggestion would be:

If you have 2 servers and are using server2 just for backups, then:
- Install Proxmox on server2 as well and configure HA (High Availability) so that you are still running if server1 fails (I have not done this myself, so I hope I'm right about HA)
- Add a second drive to both server1 and server2 for backups

Hope it helps :)
 

How is the second server going to know that the first server is down? It could be the second server that has lost communication with the first. Do you really want two copies of the same VMs automatically running at the same time?

Failover needs to be either as thorough as Proxmox HA or done manually by someone who knows what's going on.
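
To illustrate the point about not trusting a single failed check, here is a rough sketch that asks a second host (a "witness", e.g. the gateway) before concluding anything. The witness idea, the host names and the function names are assumptions for illustration, and the result is meant to inform a person, not to start VMs on its own.

```shell
#!/bin/bash
# Sketch only: classify a failed heartbeat before anyone acts on it.
# The witness host and the function names are assumptions.

# Return 0 if the host answers ping. Replace this with any other check.
probe() {
    ping -c 3 -W 2 "$1" >/dev/null 2>&1
}

# Print one of: peer-up, peer-down, isolated.
# "isolated" means this node cannot reach the witness either, so the fault
# is probably on our side and starting the VMs here risks duplicates.
peer_state() {
    local peer="$1" witness="$2"
    if probe "$peer"; then
        echo "peer-up"
    elif probe "$witness"; then
        echo "peer-down"
    else
        echo "isolated"
    fi
}
```

Even "peer-down" is not proof: server1 could still be running with only its link to server2 broken, which is why real HA setups add fencing.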

I don't like many of the proposed solutions to the problem, as most of them rely on the VMs being on some form of shared storage (iSCSI, DRBD, NFS). I had DRBD break on me big time, and iSCSI and NFS are a single point of failure.

The Proxmox HA cluster is a great idea (http://pve.proxmox.com/wiki/High_Availability_Cluster), but it needs shared storage, which rules out having a DR site unless I can get DRBD to behave and work faster. (DRBD doesn't know which space on the drives is used or unused, so it's really inefficient.)

Currently I am looking at ZFS replication between sites instead of shared storage, as ZFS will only copy changes and not waste time copying garbage blocks on a volume. I am unclear on how to suspend and resume machines at the other site. For example, I want to take a ZFS snapshot of the running VMs and storage every minute or so and transfer it to the DR site as a read-only copy; if something goes wrong, I will manually make the DR site the master.
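
A minimal sketch of one such replication step, assuming a dataset that already exists on both sides with a common previous snapshot. The dataset, snapshot and host names are placeholders, and the run and replicate helpers are made up for illustration. With DRY_RUN=1 (the default here) the commands are only printed.

```shell
#!/bin/bash
# Sketch only: one incremental replication step for a ZFS dataset.
# Dataset, snapshot and host names are placeholders, not from the thread.

DRY_RUN="${DRY_RUN:-1}"

# Run a command line, or just print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$1"
    else
        sh -c "$1"
    fi
}

# replicate DATASET PREV_SNAP NEW_SNAP DR_HOST
replicate() {
    local ds="$1" prev="$2" new="$3" host="$4"
    # Take the new snapshot on the primary.
    run "zfs snapshot $ds@$new" || return 1
    # Send only the blocks changed since the previous snapshot.
    run "zfs send -i $ds@$prev $ds@$new | ssh $host zfs receive -F $ds"
}
```

The very first transfer has to be a full `zfs send` without `-i`. Setting `readonly=on` on the DR dataset keeps it from being modified between receives. A snapshot of a running VM's disk is only crash-consistent, so this does not answer the suspend/resume question.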
 
We are in a similar situation here with a SOHO plus mild development environment. For now I've decided to just jump in and move everything from KVM to OpenVZ and treat edge cases (e.g. Windows) specially, with the aim of eliminating them. For us Windows is for clients and Linux is for servers. So far I have gotten promising results from the vzmigrate script with the few tweaks below, which change some of the default values. It is a work in progress. I'm looking into the business logic, but I'm worried that the preferred migrate command is in /usr/sbin/qm (a little more difficult to read).

Code:
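  # EXPERIMENTAL ONLY, no edge-case testing!
  # Work on a copy; the settings below are edited inside the copy.
  cp /usr/sbin/vzmigrate /root/scripts/vzmigrate-custom

  # arcfour for performance (newer OpenSSH releases no longer offer it);
  # look into dropping encryption entirely on a dedicated link.
  SSH_OPTIONS="-c arcfour"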

  # -x don't cross FS boundaries
  # -t preserve MTIMEs
  # -p preserve permissions
  # -S handle sparse files efficiently
  # -H preserve hard-links
  # --numeric-ids (guaranteed on both hosts?)
  RSYNC_OPTIONS="-axtpSH --delete"

  # keep the node private filesystem intact
  remove_area=0

  # do not wipe the destination
  keep_dst=1

The data stays intact (mirrored). To re-enable the container manually, mv /etc/pve/nodes/hostname/openvz/###.conf.migrated to ###.conf in the same directory.
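
The rename above could be wrapped in a small helper like this. It is a sketch only: PVE_NODES_DIR and reenable_ct are made-up names, and the node name and container ID are passed in as arguments.

```shell
#!/bin/bash
# Sketch only: rename NNN.conf.migrated back to NNN.conf for one container.
# PVE_NODES_DIR is overridable so the function can be tried outside /etc/pve.

PVE_NODES_DIR="${PVE_NODES_DIR:-/etc/pve/nodes}"

# reenable_ct NODE CTID
reenable_ct() {
    local conf="$PVE_NODES_DIR/$1/openvz/$2.conf"
    if [ ! -f "$conf.migrated" ]; then
        echo "no $conf.migrated" >&2
        return 1
    fi
    # Refuse to overwrite a config that is already active.
    if [ -e "$conf" ]; then
        echo "$conf already exists" >&2
        return 1
    fi
    mv "$conf.migrated" "$conf"
}
```

The check for an existing config is there so that a container that is already active on this node does not get its config replaced.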

Hope this is more useful than dangerous!
 
If you have 2 servers... I have 2 as well. This is my experience; I am new to virtualization, so someone with deeper experience may have a totally different perspective.

This is what I have learned from a lot of time messing with Proxmox. You can't easily do HA with 2 servers; you need at least 3, plus shared storage and fencing devices, and a lot of other stuff too: gigabit switches, and probably additional NICs to do it right. If you try to do HA with 2 servers, then when one is down, the other is effectively down too: 1 server up in a 2-node cluster does not have quorum, so it is useless. You need shared storage with a quorum disk on it to let the 1 remaining server be useful. But nowhere have I found helpful information on how to set up a quorum disk on shared storage, so you have to figure that out yourself, and good luck with that. (If you can get it to work, I would love to know how it's done.)
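
The quorum arithmetic behind this can be sketched as follows, assuming a simple majority rule with one vote per node and one vote for a quorum disk. The function names are made up for illustration.

```shell
#!/bin/bash
# Sketch only: majority-quorum arithmetic, one vote per node assumed.

# Votes needed for quorum when the cluster has $1 votes in total.
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

# has_quorum TOTAL_VOTES VOTES_PRESENT -> exit status 0 if quorate
has_quorum() {
    [ "$2" -ge "$(quorum_needed "$1")" ]
}
```

With 2 votes in total, quorum needs 2, so `has_quorum 2 1` fails when one node is down. Adding a quorum disk as a third vote makes `has_quorum 3 2` succeed for one surviving node plus the disk, and a 3-node cluster survives one node failure the same way.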

So if you have 2 servers, I would not cluster them.
Set one up to run your VMs, and install LVM on the standby (so the prod server backs up to the standby over LVM). If prod fails, restore the VMs manually on the standby. See this thread: http://forum.proxmox.com/threads/9294-add-remote-directory-of-cluster-as-local-storage.

If you have shared storage, back up to it and restore from that.

With 2 servers you can do this or try DRBD, but I think DRBD has its issues too. So in my opinion, keep it simple.
 
Consider using DRBD. There is a very good wiki page here: http://pve.proxmox.com/wiki/Two-Node_High_Availability_Cluster

I followed those instructions. I had a two-node cluster with a 1TB drive as /dev/sdb in each.

/dev/sdb1 DRBD 400GB
/dev/sdb2 ext4 600GB local VM backups.

The initial sync of the 400GB DRBD partition between the servers took ages, since it copies all the garbage blocks too. It ran fine until the cron backup. On both machines, /dev/sdb2 was corrupted during the local vzdump backup run. I would love to know why. I have since given up on DRBD.
 
I cannot think of any way DRBD could corrupt an unrelated partition.

DRBD is the best solution for two-node HA. I have 14 Proxmox servers running DRBD, and they work great.

There are a few of us who run DRBD and try to help others with problems. Every DRBD issue I am aware of has been solved, with the answers posted in these forums.

If you do try DRBD again, you need to set up two DRBD volumes to avoid issues when DRBD split-brains:
http://pve.proxmox.com/wiki/DRBD#Recovery_from_communication_failure
 
