High Availability 2

bread-baker

We have a couple of containers where data changes a lot. One is a mail system.

Currently, for high availability, we use DRBD and Heartbeat, so disk writes always occur on two physical servers. If the primary system fails, the other takes over and has all the emails. But Heartbeat does not work with Proxmox 2.0.

According to http://pve.proxmox.com/wiki/High_Availability_Cluster#System_requirements shared storage is needed to make a Proxmox VE HA Cluster.

With a 3-node cluster, I do not want to set up a fourth system for shared storage, as that would be a single point of failure.

We have plenty of disk space on each node.

Is it a good idea to set up shared storage at each node?

If so what method should be considered?

thanks.

 
We have only two 2.0 servers right now.
To have proper quorum I set up a third node using some old 1U we had lying around.
That third node does not do anything other than help provide quorum to the cluster.
Nothing HA can run on it, but if it were more powerful it could run non-HA VMs using local storage.

We use DRBD too, no Heartbeat, and run it in Primary/Primary with two volumes on each node like the wiki suggests.
The LVM on DRBD is marked as being available only on the two nodes that run DRBD.
For HA VMs, we use Failover Domains so we can specify the two nodes that the HA VM can run on and which one is preferred.
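To illustrate that second point, a restricted LVM entry in /etc/pve/storage.cfg could look roughly like this - the storage/VG names, node names and exact key format here are invented for the sketch, so check your own PVE version's syntax:
Code:
# hypothetical entry: LVM volume group on top of DRBD, visible only to the two DRBD nodes
lvm: drbd0-vg
        vgname drbd0vg
        content images
        shared 1
        nodes nodeA,nodeB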

If you had three nodes capable of running DRBD you could do this:
Three DRBD volumes: A, B, C
Each node contains two DRBD volumes.

Node1 runs the VMs for DRBD A, which is replicated to Node2
Node2 runs the VMs for DRBD B, which is replicated to Node3
Node3 runs the VMs for DRBD C, which is replicated to Node1

With this you could have any one machine fail and still have all your VMs running.
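For reference, one of those three resources might look roughly like this in /etc/drbd.d/ (DRBD 8.3-era syntax as used around Proxmox 2.0; the hostnames, disks, IPs and ports are invented, and the Primary/Primary and after-sb options follow the Proxmox DRBD wiki rather than this exact setup):
Code:
# hypothetical resource "A", shared between node1 and node2
resource drbd-a {
        protocol C;
        startup {
                become-primary-on both;
        }
        net {
                allow-two-primaries;
                after-sb-0pri discard-zero-changes;
                after-sb-1pri discard-secondary;
                after-sb-2pri disconnect;
        }
        on node1 {
                device /dev/drbd0;
                disk /dev/sdb1;
                address 10.0.7.1:7788;
                meta-disk internal;
        }
        on node2 {
                device /dev/drbd0;
                disk /dev/sdb1;
                address 10.0.7.2:7788;
                meta-disk internal;
        }
}
Resources "B" (node2/node3) and "C" (node3/node1) would be analogous, each with its own device, backing disk and port.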
 
e100: thanks. I think we'll try using a pair of nodes for DRBD with 'the LVM on DRBD marked as being available only on the two nodes that run DRBD'.

Then the third node can be used for development and kept ready to run the VMs from backup.

I've never used DRBD in Primary/Primary. Any tips for someone who is used to Primary/Secondary and thinks that Primary/Primary is not a great idea? ;-)

 
The main issue with Primary/Primary is dealing with split brain.

Basically, treat it like Primary/Secondary, but take advantage of Primary/Primary when needed.
With Primary/Secondary you did not run VMs stored on one DRBD volume on both nodes; do not do it with Primary/Primary either.
Well, unless you are live-migrating VMs to perform maintenance on one node - just try to get them all moved ASAP.

This section of the wiki should explain the split-brain risks/recovery so it all makes sense:
http://pve.proxmox.com/wiki/DRBD#Recovery_from_communication_failure
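For a quick idea of what that recovery looks like in practice, the usual DRBD 8.x manual procedure is along these lines (the resource name r0 is just a placeholder; the wiki page above is the authoritative reference):
Code:
# on the node whose changes you are willing to throw away (the split-brain "victim")
drbdadm secondary r0
drbdadm -- --discard-my-data connect r0
# on the node whose data you want to keep (the survivor), if it is standing alone
drbdadm connect r0
# DRBD then resynchronizes the victim from the survivor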
 
OK, got it.

One more thing - can you tell me a little about 'Failover Domains'? I googled it and will check further, but a short user-to-user answer would be appreciated.
 
e100: thanks. I think we'll try using a pair of nodes for DRBD with 'the LVM on DRBD marked as being available only on the two nodes that run DRBD'.

Then the third node can be used for development and kept ready to run the VMs from backup.

I've never used DRBD in Primary/Primary. Any tips for someone who is used to Primary/Secondary and thinks that Primary/Primary is not a great idea? ;-)

Hi,
DRBD runs well - also with Primary/Primary. But is your mail host a CT (container) or not? In that case it's not so easy with DRBD (you need a cluster FS on top). This is one reason why my new mail server is a KVM VM...

Udo
 
Hi,
DRBD runs well - also with Primary/Primary. But is your mail host a CT (container) or not? In that case it's not so easy with DRBD (you need a cluster FS on top). This is one reason why my new mail server is a KVM VM...

Udo

Our mail system is in an OpenVZ container. Moving to KVM is something I'd do if it is better for high availability.

I have a rough idea, but how is KVM better for high availability than using OpenVZ?

PS: which mail system do you use? We use Postfix + Dovecot + Amavis.
 
An example config is here:
http://forum.proxmox.com/threads/8693-is-there-an-alternative-to-fencing-for-HA?p=50230#post50230

The HA in 2.0 tries to run the VM on any node in the cluster.
Failover Domains allow you to specify which VMs run on which nodes and assign a priority to the Proxmox servers in the failover domain.

You will want two failover domains: one for VMs you want to run on NodeA and one for VMs you want to run on NodeB.
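As a rough sketch (not copied from the linked post), those two failover domains and the HA-managed VMs would end up in /etc/pve/cluster.conf looking something like this - node names, VMIDs and priorities are made up, rgmanager treats a lower priority number as more preferred, and the exact attribute handling is described in the rgmanager/cluster.conf documentation:
Code:
<rm>
        <failoverdomains>
                <failoverdomain name="prefer_nodeA" ordered="1" restricted="1">
                        <failoverdomainnode name="nodeA" priority="1"/>
                        <failoverdomainnode name="nodeB" priority="100"/>
                </failoverdomain>
                <failoverdomain name="prefer_nodeB" ordered="1" restricted="1">
                        <failoverdomainnode name="nodeA" priority="100"/>
                        <failoverdomainnode name="nodeB" priority="1"/>
                </failoverdomain>
        </failoverdomains>
        <pvevm autostart="1" vmid="101" domain="prefer_nodeA"/>
        <pvevm autostart="1" vmid="102" domain="prefer_nodeB"/>
</rm>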

The only issue I see in relation to DRBD and HA is dealing with split brain.
The HA stack has no idea if DRBD is split-brain; imagine if it were split-brain and HA kicked in...
I think there might still be some work needed to make HA robust with DRBD.

Having HA VMs running on both NodeA and NodeB might be a recipe for an issue.
I've not had time to really sit down and think through all the logic and possible failure scenarios yet.
For the time being I would suggest only having HA VMs run on one DRBD volume on one node (preferred node set by failover domain priority).

I have tested this in great detail, and it worked well.
When a node gets fenced, no split brain happens, which is good!
The VMs are started on the other node just fine too.
My concerns are about what happens if the system split-brains (for some unknown reason) and then a node fails,
or when both nodes are down and split-brained - what happens on startup?

Imagine this:
VM running on NodeA
Split-brain happens at 4AM
At 4PM NodeA fails and is fenced.
The VM is started on NodeB, using a disk image that is 12 hours old.

We need some way to stop that from happening.

Imagine an issue where both nodes are down (so no VMs are running).
When started, they are split-brained.
Which node has the most recent data? That is the one you want HA to start the VM on, but HA has no clue.
 
Our mail system is in an OpenVZ container. Moving to KVM is something I'd do if it is better for high availability.

I have a rough idea, but how is KVM better for high availability than using OpenVZ?

PS: which mail system do you use? We use Postfix + Dovecot + Amavis.
Hi,
With DRBD and LVM storage you have two independent nodes with all the data - very safe. If you try this with OpenVZ you can use an NFS server, but that is again a SPOF (or you start to use a cluster as the NFS server... much more resources and not very simple).

Does anybody have a better idea for OpenVZ HA?

Udo
 
Does anybody have a better idea for OpenVZ HA?

Other than being slowish, and not tested by me, NFS-mounted GlusterFS seems like it would work well for HA OpenVZ machines.
I have used NFS-mounted GlusterFS on 2.0 for storing ISO images; it works well for that.

I tested it for storing KVM disk files; performance was worse compared to DRBD, but it did work and was not too slow for some uses.
If I remember correctly, DRBD write speeds were 3x faster than on Gluster from within the VM.
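For anyone wanting to try that, the mount itself is just a plain NFSv3 mount against Gluster's built-in NFS server - something like the line below, with the host, volume and mount point being made-up examples (you would then add the path as NFS/directory storage in Proxmox for ISO images or container data):
Code:
# hypothetical: mount Gluster volume "gv0" via its built-in NFS server (NFSv3 only)
mount -t nfs -o vers=3,mountproto=tcp,nolock glusterhost:/gv0 /mnt/gluster-nfs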
 
...
PS: which mail system do you use? We use Postfix + Dovecot + Amavis.
Hi,
Ditto - extended with Roundcube as the web interface and with an additional commercial anti-virus program plus clamav-milter/amavis for both checks (but ClamAV is good enough). Most of it gets blocked by postscreen (a very nice feature of Postfix). All of this runs on Devil-Linux.

Udo
 
Hi,
With DRBD and LVM storage you have two independent nodes with all the data - very safe. If you try this with OpenVZ you can use an NFS server, but that is again a SPOF (or you start to use a cluster as the NFS server... much more resources and not very simple).

Does anybody have a better idea for OpenVZ HA?

Udo

hm, perhaps "sort of"; we run a test platform still based on proxmox 1.9, 2 nodes, 2 DRBD devices (Pri/Pri). On each DRBD device, apart from the LVM storages for VMs, theres also one "static" LV, partitioned and formatted for openvz ("/var/lib/vz"). Of course, each node only mounts its own openvz volume. In case of a node failure, the remaining node mounts the "other" openvz volume additionally.
This is not done automatically, but with some scripts. The failed-over openvz instances are linked into the existing ones.
For us, this works so far; however, manual intervention still is needed...
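The post above only describes the procedure, so the following is a hypothetical sketch of what such a manual failover script could do - the LV name, mount point, container IDs and config paths are all invented, and a real script would need sanity checks and fencing of the failed node first:
Code:
#!/bin/sh
# mount the failed peer's OpenVZ LV that sits on the (now locally Primary) DRBD device
mount /dev/vg-drbd1/vz-nodeA /srv/vz-nodeA
# link the failed-over containers into the local OpenVZ tree and start them
for ct in 101 102; do
        ln -s /srv/vz-nodeA/private/$ct /var/lib/vz/private/$ct
        ln -s /srv/vz-nodeA/root/$ct    /var/lib/vz/root/$ct
        cp /srv/vz-nodeA/conf/$ct.conf /etc/vz/conf/$ct.conf
        vzctl start $ct
done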
 
Hi Udo, which mail server do you use for your enterprise - Citadel, SOGo or Zimbra, or is it another one?
Thanks in advance,
Hector.
 
Hi,
Ditto - extended with Roundcube as the web interface and with an additional commercial anti-virus program plus clamav-milter/amavis for both checks (but ClamAV is good enough). Most of it gets blocked by postscreen (a very nice feature of Postfix). All of this runs on Devil-Linux.

Udo

Udo - I've tried to search for information about postscreen. I realize that this is not a Proxmox question and I do not want to go too far in putting non-PVE topics here.

However, could you post how you use it in your Postfix config? Or send an email please. Thanks.
 
Udo - I've tried to search for information about postscreen. I realize that this is not a Proxmox question and I do not want to go too far in putting non-PVE topics here.
Hi,
take a look here: http://www.postfix.org/POSTSCREEN_README.html
However, could you post how you use it in your Postfix config? Or send an email please. Thanks.
Unfortunately the email option is not available on this forum - so I'll post the config here (from main.cf):
Code:
# postscreen settings from main.cf
postscreen_cache_retention_time = 28d
postscreen_use_tls = yes
# block a client once its combined DNSBL score reaches 4
postscreen_dnsbl_threshold = 4
postscreen_dnsbl_sites = zen.spamhaus.org*2, bl.spamcop.net*2
postscreen_dnsbl_action = enforce
postscreen_greet_banner = $smtpd_banner
postscreen_greet_wait = 4
# reject clients that start talking before the greeting is finished ("pregreet")
postscreen_greet_action = enforce
postscreen_non_smtp_command_action = drop
Udo
 
We have only two 2.0 servers right now.

We use DRBD too, no Heartbeat, and run it in Primary/Primary with two volumes on each node like the wiki suggests.
The LVM on DRBD is marked as being available only on the two nodes that run DRBD.
For HA VMs, we use Failover Domains so we can specify the two nodes that the HA VM can run on and which one is preferred.

I've got a question when adding the volume group created on Primary/Primary DRBD to PVE storage:

under Nodes, should I choose both of the DRBD nodes?

thanks.
 