Medium to Large-Scale Deployment Questions

fstrankowski

Renowned Member
Nov 28, 2016
Hamburg
Dear Members,

We're planning to integrate a Proxmox/Ceph instance into our production environment. After testing Proxmox for several weeks, we think it could be a good platform for some of our services to rely on.

We have achieved the following so far:
  • Set up 2 BladeCenters in two independent datacenters to start with (interconnected through 2x40 Gbit)
  • Reserved 6 nodes in each BladeCenter (each has 2x E5-2620 v3, 128 GB RAM, HW-RAID 1 TB Intel DC SSDs, 10 Gbit LACP)
  • Nodes are interconnected using 4 VLANs for management, Corosync, VM migration and Ceph storage
  • Set up the Ceph CRUSH map to meet our storage and redundancy needs
  • We CANNOT use another Datacenter for now, so we have to stick with 2 locations.
We'd like to achieve the following with our setup:
  1. HA across the complete setup (if one datacenter dies, the other 6 nodes still have quorum)
  2. If we encounter a split brain, both BladeCenters with 6 nodes each get fenced
  3. We can run on just one side of our 12 nodes (so only 6 nodes active).
We've thought of doing it the following way:
  • Give each node in DC 1 2 quorum votes
  • Give each node in DC 2 1 quorum vote
This way, if one DC dies, we can still run production. When the failed DC comes back, we reintegrate its machines one by one. If we encounter a split brain, we run all VMs from one DC, power off the nodes in the second DC, and power them back on one by one after the failure has been fixed.
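
To make the vote arithmetic concrete, here is a minimal sketch of how the weighting plays out. It assumes quorum requires a strict majority of all configured votes (floor(total / 2) + 1), which is how Corosync's votequorum behaves by default as far as we understand it. The function and variable names are purely illustrative and not part of any Proxmox or Corosync API.

```python
# Sketch of the proposed vote weighting, assuming quorum means a strict
# majority of all configured votes: floor(total / 2) + 1.
# All names here are illustrative; none come from Proxmox or Corosync.

NODES_PER_DC = 6
DC1_VOTES_PER_NODE = 2  # proposed weight for nodes in DC 1
DC2_VOTES_PER_NODE = 1  # proposed weight for nodes in DC 2


def quorum_threshold(total_votes: int) -> int:
    """Smallest vote count that is a strict majority of total_votes."""
    return total_votes // 2 + 1


def is_quorate(partition_votes: int, total_votes: int) -> bool:
    """True if a partition holding partition_votes can reach quorum."""
    return partition_votes >= quorum_threshold(total_votes)


dc1_votes = NODES_PER_DC * DC1_VOTES_PER_NODE
dc2_votes = NODES_PER_DC * DC2_VOTES_PER_NODE
total_votes = dc1_votes + dc2_votes

# Each DC on its own, i.e. after the other DC died or the link was cut.
dc1_alone = is_quorate(dc1_votes, total_votes)
dc2_alone = is_quorate(dc2_votes, total_votes)

print(f"total votes: {total_votes}, quorum at: {quorum_threshold(total_votes)}")
print(f"DC 1 alone ({dc1_votes} votes) quorate: {dc1_alone}")
print(f"DC 2 alone ({dc2_votes} votes) quorate: {dc2_alone}")
```

Under this assumption, DC 1 alone stays quorate (12 of 18 votes, threshold 10), while DC 2 alone does not (6 of 18). The weighting would therefore cover the loss of DC 2 or a cut link, but not the loss of DC 1.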

Questions:
  • Is it possible to have more quorum votes than the number of nodes in our setup?
  • Is this a decent tactic / practice?
  • Is there anyone else who would like to share their experience with a large scale setup?

Appreciated

img_20170123_150149.jpg
 
Could anyone, instead of answering my questions directly, provide a sample deployment with 2 DCs / locations? Appreciated.