HA configuration questions

NStorm

Hello.

I have two Proxmox 3.2 nodes and I want to set up an HA cluster with them. Both nodes have dedicated 10 Gbps NICs for this purpose. Both nodes run KVM VMs and OpenVZ CTs, mostly CTs. Currently local storage is used and no SAN is planned. I want to set up HA for these. I've read the wiki articles, but most of them are for VE 2.0.
So what should I go with now? DRBD? Ceph? GlusterFS?
I assume DRBD is still the best choice here for 2 nodes, and I'd better set up a pair of primary/secondary volumes across the two nodes to make split-brain recovery easier. But will that work for HA, or do I have to resolve things manually when a node fails?
 
I would stick with DRBD. Setup on 3.x shouldn't be much different, if different at all. It's definitely a good idea to have two DRBD resources to prevent split-brain confusion. For me the longest part of the setup was the initial DRBD sync. We have since moved away from DRBD and only use central storage now. Good luck!
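In case it is useful, here is a minimal sketch of such a two-resource layout in the DRBD 8.3 style used with PVE 3.x (host names, IPs, ports, the shared secret and the backing partitions are placeholders you would replace with your own). r0 would normally hold the VMs of the first node and r1 those of the second; the dual-primary options are what an active/active setup with PVE live migration needs:

  # /etc/drbd.d/r0.res -- sketch; adjust names, disks and addresses
  resource r0 {
      protocol C;
      startup {
          wfc-timeout 0;
          degr-wfc-timeout 60;
          become-primary-on both;
      }
      net {
          cram-hmac-alg sha1;
          shared-secret "my-secret";
          allow-two-primaries;                 # needed for PVE live migration
          after-sb-0pri discard-zero-changes;
          after-sb-1pri discard-secondary;
          after-sb-2pri disconnect;
      }
      on nodeA {
          device /dev/drbd0;
          disk /dev/sdb1;
          address 10.10.10.1:7788;
          meta-disk internal;
      }
      on nodeB {
          device /dev/drbd0;
          disk /dev/sdb1;
          address 10.10.10.2:7788;
          meta-disk internal;
      }
  }

  # r1.res is the same except: resource r1, device /dev/drbd1,
  # a second backing partition (e.g. /dev/sdb2), and port 7789.

Each DRBD device then typically becomes an LVM physical volume with its own volume group, added in PVE as two separate LVM storages, so each node's VMs stay on "their" resource.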
 
Thanks for your reply.
In the case of two DRBD partitions in a primary/secondary configuration, will that work with Proxmox HA management? I mean, if NodeA with its primary DRBD partition goes down, will the Proxmox cluster be able to recover on NodeB (i.e. switch DRBD to secondary/primary and restart the CTs)?
 
Hi NStorm

I have been using DRBD for many years without problems.

These are my recommendations:

- NIC(s) dedicated to DRBD, connected NIC-to-NIC. If you have 2 NICs dedicated to DRBD, use bonding in balance-rr mode to get double the network speed, and set the MTU of these NICs to the maximum (see your NICs' hardware manual to find out how much is supported). A sketch of such a bonding configuration follows after this list.

- For quick maintenance (in case of split-brain or the like), you should have two DRBD partitions. The VMs of the first PVE node write to the first DRBD partition and the VMs of the second PVE node write to the second one; this way you can resolve many split-brain problems quickly and fully online, i.e. without powering anything off.

- If you want more speed, it is certainly better to have a different disk for each DRBD partition (so the reads and writes of one DRBD partition don't compete with those of the other on the same disk).

- To get more performance out of DRBD, see the DRBD tuning documentation on the Linbit website.

- You can also run an online verify of your replicated DRBD volumes without powering anything off (I run it automatically from crontab with a personal bash script that reports the start and end times, among other things); see the sketch after this list.

- In my mini test lab I have only two PVE nodes, and to get "HA" I use manual fencing, which requires a minimum of human intervention to get the VMs started on the other PVE node, all without losing any VM data.

- In this scenario you can have HA for your VMs, but not for your CTs.
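
For the bonding item above, a minimal sketch of a balance-rr bond for the DRBD link, in Debian/PVE /etc/network/interfaces style (interface names, addresses and the MTU value are assumptions; check what your hardware supports):

  # /etc/network/interfaces (fragment) on the first node
  auto bond0
  iface bond0 inet static
      address 10.10.10.1
      netmask 255.255.255.0
      slaves eth2 eth3
      bond_miimon 100
      bond_mode balance-rr
      mtu 9000

And for the online verify item, a hypothetical cron entry (it assumes a verify algorithm, e.g. "verify-alg md5;", is set in each resource's net section; the weekly schedule is just an example):

  # /etc/cron.d/drbd-verify
  0 3 * * 0  root  /usr/sbin/drbdadm verify all

The results appear in the kernel log; if out-of-sync blocks are reported, a disconnect/connect of the affected resource resynchronizes them.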

Good luck with your project

Best regards
Cesar

Edited to add:
Quoting NStorm: "I assume DRBD is still the best choice here for 2 nodes, and I'd better set up a pair of primary/secondary volumes across the two nodes to make split-brain recovery easier."

With PVE it is better to have Active/Active, because:
1- You can do live migration of VMs.
2- "HA" will also work well, and better if you have one DRBD volume per PVE node.
3- Also, in the global configuration file of DRBD, you can configure it to send an email message immediately if the replication of a DRBD volume fails; see the sketch below.
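
A minimal sketch of that email notification, assuming the notify scripts shipped with drbd8-utils (the recipient is a placeholder):

  # /etc/drbd.d/global_common.conf (fragment)
  common {
      handlers {
          # mail root when replication breaks; scripts ship with drbd8-utils
          split-brain "/usr/lib/drbd/notify-split-brain.sh root";
          out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
      }
  }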

Quoting NStorm: "But will that work for HA, or do I have to resolve things manually when a node fails?"
With only two PVE nodes, the best option for "HA" is the manual one: configure manual fencing in the PVE cluster configuration, and then, when one PVE node has a problem, you can trigger "HA" by running a single command: /usr/sbin/fence_ack_manual <the_name_of_PVE_node_with_problems> (but before that, to be safe, you should power off the failed PVE node by disconnecting its electric power).
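
A sketch of that manual failover, as run on the surviving node (node and resource names are hypothetical; clustat is the rgmanager status tool used by PVE 3.x HA):

  # 1. Be sure the failed node is really off (disconnect its power).
  # 2. Acknowledge the fence so the cluster can proceed:
  /usr/sbin/fence_ack_manual nodeA
  # 3. With primary/secondary DRBD, promote the resource here:
  drbdadm primary r0
  # 4. HA-managed VMs are then restarted on this node; watch progress with:
  clustat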

Notes:
1- DRBD gives the best performance you can get with only two PVE nodes.
2- With only two PVE nodes, automatic fencing is a bad idea, because in this scenario neither PVE node can get a majority of quorum votes (so the quicker PVE node would fence the other, which is not ideal), while with manual fencing you can think, analyze the situation, and only once you are sure apply the fencing to the PVE node you choose.
 
Thanks for the replies!
But regarding "you can have HA for your VMs, but not for your CTs": does that mean it won't work for CTs at all? Because I have mostly OpenVZ CTs and only a few KVM VMs.

Yes, but maybe you can apply some trick. I don't have experience with CTs, because the major storage and virtualization technologies (Ceph, GlusterFS, IBM, Red Hat, etc.) have always had great support for KVM, so I prefer to use KVM.
 
