How many nodes are needed for HA with a cluster over two datacenters?

hec

Renowned Member
Jan 8, 2009
Wien
www.vector-its.at
Hello,

we would like to build a new cluster distributed over two datacenters. We use a NetApp MetroCluster and connect the storage via NFS, so storage is not a problem.

Normally a cluster should have at least 3 nodes. How many nodes are needed for a cluster distributed over two datacenters?

best regards
Gregor
 
Any ideas?

I found nothing in the documentation.

Let's say we have the following situation:

DC1: 2 Proxmox nodes, NetApp Metrocluster Site A
DC2: 2 Proxmox nodes, NetApp Metrocluster Site B

So we have a cluster with 4 nodes and everything should be fine. But then DC1 goes down: no connectivity to the hosts. The NetApp MetroCluster will do a switchover, so we have all the storage resources on one site. The Proxmox cluster will have 2 nodes down and 2 up.

Can the cluster decide what to do?
 
Building a cluster over 2 datacenters is not recommended (corosync needs a latency of < 2 ms to work reliably, so you should be fine as long as this holds).

To have quorum you need more than half of the votes (each node provides one vote). With 4 nodes, quorum is 3 votes, so after losing a datacenter each site is left with only 2 votes and neither side is quorate (HA-enabled nodes would fence themselves, you cannot start any VMs, etc.).

A solution could be a corosync qdevice (an external service providing an extra vote for quorum); sadly this is not properly documented at this time.
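
For illustration, a rough sketch of what the resulting quorum section in /etc/pve/corosync.conf could look like once a qdevice is configured (the host IP is a placeholder, and ffsplit is the tie-breaking algorithm meant for even node counts such as a 4-node cluster; treat this as an assumption, not verified documentation):

Code:
quorum {
  provider: corosync_votequorum
  device {
    model: net
    votes: 1
    net {
      tls: on
      host: 192.0.2.10    # placeholder: IP of the external qnetd host
      algorithm: ffsplit  # fifty-fifty split, for clusters with an even node count
    }
  }
}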
 
I tackled this with two separate clusters. Failover obviously isn't automated, but I can manually fail over to our other datacenter in a matter of 20-30 minutes.
 
Latency is no problem; we have ultra-low-latency switches.

Here you can see the latency to the local and the remote datacenter. We currently have 20 Gbit between the DCs and will add 2 more links to get to 40 Gbit. This should be enough.

Code:
PING raptor3.dmz.cubit.at (192.168.61.217) 56(84) bytes of data.
64 bytes from raptor3.dmz.cubit.at (192.168.61.217): icmp_seq=1 ttl=255 time=0.095 ms
64 bytes from raptor3.dmz.cubit.at (192.168.61.217): icmp_seq=2 ttl=255 time=0.084 ms
--- raptor3.dmz.cubit.at ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1006ms
rtt min/avg/max/mdev = 0.084/0.089/0.095/0.010 ms

PING raptor4.dmz.cubit.at (192.168.61.218) 56(84) bytes of data.
64 bytes from raptor4.dmz.cubit.at (192.168.61.218): icmp_seq=1 ttl=255 time=0.214 ms
64 bytes from raptor4.dmz.cubit.at (192.168.61.218): icmp_seq=2 ttl=255 time=0.198 ms
--- raptor4.dmz.cubit.at ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
rtt min/avg/max/mdev = 0.198/0.206/0.214/0.008 ms
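
As a side note, ICMP round-trip times only approximate what corosync sees on the wire. One way to exercise the actual cluster network between all nodes at once is omping (run simultaneously on every node; the hostnames below are placeholders):

Code:
omping -c 600 -i 1 -q pve-dc1-a pve-dc1-b pve-dc2-a pve-dc2-b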

OK, so I need a small Debian VM which works as a corosync qdevice and everything should be fine. I think it's best to put this VM on our VMware cluster.
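
A minimal sketch of that setup, assuming current packages (the IP is a placeholder, and older pvecm versions may not ship the qdevice subcommand yet):

Code:
# on the small Debian VM (the external tiebreaker):
apt install corosync-qnetd

# on every Proxmox node:
apt install corosync-qdevice

# on one cluster node, pointing at the VM's IP (placeholder):
pvecm qdevice setup 192.0.2.10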

The two-cluster approach is not possible for us; I need to migrate VMs between the two datacenters.
 
A cluster must have a minimum of 3 members to assure quorum. If the link between your two datacenters is severed or interrupted, which site becomes master? There is also the matter of storage synchronization: what is your replication mechanism, and will it be able to maintain quorum on link interruption?
 
As I said, we have a NetApp MetroCluster, so all writes are committed synchronously to both sides.

So storage is no problem; the switchover takes less than 60 s. What about a tiebreaker reachable via LTE or something like that? Then we could check whether one site is down because of a power problem, or whether just the connection between the DCs is broken.

I'm open to all solutions; storage fencing would also be fine. But there should be a way to stretch a cluster over two or more datacenters. Maybe everything would work with 3 DCs, since there would still be a majority if one DC goes offline.
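
For completeness, a manual fallback also exists: without a qdevice, once one site is confirmed to be completely down, quorum can be forced on the surviving nodes by lowering the expected vote count. This overrides the split-brain protection, so it is only safe with HA disarmed and the other site verified off:

Code:
# on a surviving node, after verifying the other site is really down:
pvecm expected 2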

I think Proxmox should work on a solution; I'm surely not the only one who needs this.
 
Hi there,

Did you make any progress on this, please?

Cheers,