High Availability - Hardware Idea

Nov 3, 2011
Hi Guys,

You are doing a great job with Proxmox 2.0. I just have a few questions, as I would like to test High Availability and hopefully use it in a production environment once I'm happy with it.

1. I have watched the video on setting up an HA cluster, but I do not understand why there are three nodes. Is the third node storage?

2. What are your views on using a QNAP TS-412U as the storage for the HA cluster, shared over NFS?

3. This is the hardware I have in mind. What do you think?

2x HP DL360 G5 with 2x 72 GB SAS, iLO (for fencing), and an extra 2x gigabit bonded Ethernet (more ports can be added in the future when needed; the second link is for failover)
1x QNAP TS-412U (dual gigabit, bonded) with 4x 2 TB in RAID 10 (for ultimate HA I imagine you would set up two of these in failover/real-time replication)
1x Netgear GS748T 48-port gigabit switch (supports VLANs, good for a public/private split) (ideally there would eventually be two of these, with one leg of each Ethernet bond going into each)

If you can see anything wrong with this, or have any pointers, it would be much appreciated.
 
You need to eliminate all single points of failure if you want full HA; the NAS and the single switch are both SPOFs.

I do not have an opinion on any particular model of server, but iLO should work well for fencing.
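
For reference, in Proxmox 2.0 fencing is configured in /etc/pve/cluster.conf. An iLO fence device would look roughly like the sketch below, using the fence_ilo agent from the fence-agents package (the node name, address, and credentials here are placeholders I made up):

    <clusternodes>
      <clusternode name="nodeA" nodeid="1" votes="1">
        <fence>
          <method name="power">
            <device name="ilo-nodeA"/>
          </method>
        </fence>
      </clusternode>
    </clusternodes>
    <fencedevices>
      <fencedevice agent="fence_ilo" name="ilo-nodeA" ipaddr="10.0.0.11" login="fenceuser" passwd="secret"/>
    </fencedevices>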

I suggest using a good RAID card in each server, with plenty of battery-backed cache RAM.
IO always seems to be the bottleneck, but performance unfortunately costs lots of money.
Four 2 TB disks is a lot of storage; if you have IO-intensive apps you might want to consider more disks, since more spindles means more IOPS.

If you are using KVM, then I suggest DRBD for real-time replication of your VMs.
http://pve.proxmox.com/wiki/DRBD
Set up two volumes as the wiki suggests and save yourself headaches and frustration.
Remember that the IO capacity will now be shared by both nodes, so you really need to consider the performance you need and build appropriately.
All writes happen on both nodes; reads happen only on the node doing the read.
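
As a rough illustration of that layout (not the wiki's exact text; resource names, disks, and addresses are made-up placeholders), each resource in a DRBD 8.3 primary/primary setup looks something like this:

    resource r0 {
        protocol C;
        startup { become-primary-on both; }
        net {
            allow-two-primaries;
            after-sb-0pri discard-zero-changes;
            after-sb-1pri discard-secondary;
            after-sb-2pri disconnect;
        }
        on nodeA {
            device    /dev/drbd0;
            disk      /dev/sdb1;
            address   10.0.7.1:7788;
            meta-disk internal;
        }
        on nodeB {
            device    /dev/drbd0;
            disk      /dev/sdb1;
            address   10.0.7.2:7788;
            meta-disk internal;
        }
    }

A second resource r1 would be defined the same way on another partition and port. The point of two volumes is that each node normally runs its VMs on "its own" volume, which is what makes split-brain recovery straightforward.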

You can use two dedicated bonded gig ports for the DRBD replication; that will limit your writes to about 200 MB/s. I have never tried more than two gig ports for DRBD, but I have read that three and four bonded ports do not seem to help with throughput.
If you need better performance there is 10G Ethernet, 10G (or faster) InfiniBand (look on eBay), and Dolphin to choose from.
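
A dedicated replication bond on Debian would look something like the sketch below (interface names and addresses are placeholders). balance-rr matters here because it is the one bonding mode that spreads a single TCP stream, such as DRBD's, across both links:

    # /etc/network/interfaces on nodeA
    auto bond1
    iface bond1 inet static
        address 10.0.7.1
        netmask 255.255.255.0
        slaves eth2 eth3
        bond_mode balance-rr
        bond_miimon 100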

The third machine is there to provide quorum; it could be some cheap, low-powered Atom board that simply participates in the cluster.
With three machines a quorum can still be reached when one fails, because the two working nodes can communicate with each other:
NodeA says: Hi Nodes B and C!
NodeB says: Well hello back, Node A. Node C, are you there?
NodeA says: Seems like Node C is missing.
NodeB says: Yep, I agree. Node C is not talking to me either, but we can talk just fine, so I'll turn him off and take over his HA services.
NodeA says: Great!
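
The arithmetic behind that conversation, assuming the default of one vote per node:

    expected_votes = 3
    quorum         = floor(3/2) + 1 = 2
    A and B alive  -> 2 votes: quorate, C can be fenced and its services recovered
    A alone        -> 1 vote:  not quorate, HA stands down rather than risk a split brain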

You want the network that the cluster talks on to be redundant too.
Imagine a switch goes bad, a power cable gets bumped, etc.: you lose quorum because none of the nodes can talk.

For the record, I have not tested HA with DRBD yet, but I have been using DRBD with 1.x and 2.0 for a long time.
We plan to dig an old machine out of the storage closet this week so we have a third cluster member and can start testing HA with our DRBD Proxmox 2.0 nodes.
 
We have used DRBD + Heartbeat for two years on Proxmox 1.9, with DRBD in primary/secondary. Overall it has worked great.

e100: are you thinking of using a primary/primary setup?
 
I have 12 nodes (6 pairs) in production that use dual DRBD resources in primary/primary.
The first went into production around March 2010 with Proxmox 1.5.
We have upgraded 6 nodes to 10G InfiniBand for DRBD replication.

Two Proxmox 2.0 nodes are set up the same way, also with 10G InfiniBand.

No Heartbeat; it is not needed.
We do monitor the state of DRBD and send an SMS if it ever split-brains, so we can fix it quickly.
I bet a script could be written to recover automatically, but when it comes to DRBD I am more comfortable making such important decisions myself.
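
A minimal sketch of that kind of monitoring, assuming resources named r0 and r1 and a mail-to-SMS gateway address (both placeholders):

    #!/bin/sh
    # Run from cron on each node: alert when a DRBD resource loses its
    # connection. After a split brain the resource sits in StandAlone.
    for res in r0 r1; do
        state=$(drbdadm cstate $res)
        if [ "$state" != "Connected" ]; then
            echo "DRBD $res on $(hostname) is $state" | \
                mail -s "DRBD alert: $res $state" sms-gateway@example.com
        fi
    done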

I've been editing the wiki and just added this: http://pve.proxmox.com/wiki/DRBD#Recovery_from_split_brain_when_using_two_DRBD_volumes
Primary/primary with two DRBD volumes works very well and is very simple to recover from split-brain, as the new section makes clear.
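
The gist of that recovery, sketched with standard DRBD 8.3 commands and a placeholder resource r0 (see the wiki section for the authoritative steps): because each node's VMs live on their own volume, you know which side of each volume holds the good data and simply discard the other.

    # On the node whose copy of r0 is to be thrown away:
    drbdadm secondary r0
    drbdadm -- --discard-my-data connect r0

    # On the node with the good data (if it dropped to StandAlone):
    drbdadm connect r0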

I will use DRBD for HA needs until something like Ceph or Sheepdog is stable enough to trust.
Gluster works, but performance sucks; it works well for storing ISOs in 2.0, though!
 
Hi e100,
I would like to ask you for a short explanation of how InfiniBand is configured on PVE for use with DRBD (if you find time for this). Perhaps as a small howto in the DRBD wiki?
Since InfiniBand isn't home-user gear, there is not much information to be found on the net (or I found the wrong sites).
I did a short test with two borrowed InfiniBand HBAs and was not successful (what on earth must I do to use IP over InfiniBand?).
Second try: to make a better test I have ordered two HBAs on eBay (you are right, some are very cheap).

Udo
 
http://pve.proxmox.com/wiki/Infiniband

Do you have an Infiniband subnet manager running?

I am using MHEA28-XTC cards (dual 4x SDR 10 Gb, PCIe) and TopSpin 120 switches (integrated subnet manager, 24 ports, dual power supplies, 1U).
Each server is connected to two switches, and the switches are connected to each other.
A switch or a cable can fail and communications between all nodes continue.
Perfect for corosync on Proxmox 2.0.
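
To answer the IP-over-InfiniBand question: you load the ib_ipoib module and configure the resulting ib0 interface like any other NIC, and a subnet manager must be running somewhere on the fabric (on a managed switch, as above, or opensm on one of the nodes). Roughly like this (the address is a placeholder; the wiki has the full details):

    modprobe ib_ipoib
    echo ib_ipoib >> /etc/modules   # load it at boot too

    # /etc/network/interfaces
    auto ib0
    iface ib0 inet static
        address 10.0.8.1
        netmask 255.255.255.0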
 
