Multi-cluster architecture with Ceph

rwanito

New Member
May 25, 2021
Hello,

I'm looking for specific advice about the following architecture.
[Attached architecture diagram: Untitled-2022-04-29-0848.png]

I would like to build a cluster of 6 Proxmox servers with Ceph. But it is a bit tricky, since the two groups of 3 servers are separated by a slower link.
This is why I'm looking for similar configurations or suggestions to achieve that.

To sum up, within servers 1-3 and within servers 4-6 a 10 Gbit/s link is guaranteed, but between the two groups I don't even know if the link can consistently reach 1 Gbit/s.
This link is not maintained by my team, so the traffic needs to be encrypted.

I saw the "multi-site" page of the Ceph documentation, but it does not seem to be what I'm looking for.

I would appreciate any help!

Thank you
Erwan
 
Please note that any such setups are unsupported and untested by us!

I'm not sure, but maybe a `Stretch Cluster` [0] might be the right tool for this.
And you'd have to make sure that the Corosync network is reliable between both locations, as otherwise neither of them will be quorate and you won't be able to start/stop/change any VMs.

And don't use HA in this kind of setup, especially if the network is not reliable!


[0] https://docs.ceph.com/en/latest/rados/operations/stretch-mode/
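As a rough, hedged sketch of what enabling stretch mode involves (monitor names a-e, the `stretch_rule` CRUSH rule, and the site names are placeholders, not taken from this thread; see the documentation [0] for the full procedure, including creating the CRUSH rule first):

```bash
# Sketch only: assumes monitors a,b at site1, c,d at site2, and e as tie-breaker.
# Assign each monitor a CRUSH location.
ceph mon set_location a datacenter=site1
ceph mon set_location b datacenter=site1
ceph mon set_location c datacenter=site2
ceph mon set_location d datacenter=site2
ceph mon set_location e datacenter=site3    # tie-breaker at a third location

# Use the connectivity election strategy so the monitors cope better with the WAN link.
ceph mon set election_strategy connectivity

# Enable stretch mode with monitor "e" as tie-breaker and a previously created
# CRUSH rule "stretch_rule" that places two copies in each datacenter.
ceph mon enable_stretch_mode e stretch_rule datacenter
```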
 
Hello,

Thank you for this answer.
I'll take a look, because it seems very interesting!
 
While a stretch cluster is a valuable concept, verifying the virtual machines at the Disaster Recovery (DR) site can be challenging. This challenge motivated me to explore alternative methods, such as using the "rbd export-diff" and "rbd import-diff" commands, combined with "rbd snap protect", "unprotect", and "clone" operations at the DR location to test the DR images. A rough sketch of this approach follows below. You can review the rudimentary scripts I've developed for this testing on my GitHub repository: https://github.com/deependhulla/ceph-dr-sync-tool-for-proxmox.
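Roughly, the approach looks like this shell sketch; the pool name "rbd", image "vm-100-disk-0", snapshot names, and the "dr-node" host are placeholders, not the exact names my scripts use:

```bash
# On the primary: take a new snapshot and ship only the delta since the last one.
# (The DR image must already exist and contain snap-first from an initial full copy.)
rbd snap create rbd/vm-100-disk-0@snap-2nd
rbd export-diff --from-snap snap-first rbd/vm-100-disk-0@snap-2nd - \
  | ssh dr-node 'rbd import-diff - rbd/vm-100-disk-0'

# On the DR site: clone the replicated snapshot to test the DR image
# without touching the replication target itself.
rbd snap protect rbd/vm-100-disk-0@snap-2nd
rbd clone rbd/vm-100-disk-0@snap-2nd rbd/vm-100-disk-0-drtest
# ...boot a test VM from the clone, then clean up...
rbd rm rbd/vm-100-disk-0-drtest
rbd snap unprotect rbd/vm-100-disk-0@snap-2nd
```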

Nonetheless, there is clearly a need for an add-on that streamlines site-wide cluster storage replication with Ceph. That would provide a more robust and user-friendly solution for managing DR scenarios in stretched clusters.
 
Why can verifying the DR site be a challenging aspect? And what happens with your script when the primary cluster goes down and later comes back up?
 
  • Using snapshots to send differences between the Primary and DR clusters.
  • Maintaining three copies of data on both Primary and DR.
  • Example: VM 100, disk-0, with snapshots vm-disk-0-snap-first, vm-disk-0-snap-2nd, vm-disk-0-snap-3rd.
  • The Primary DC goes down before vm-disk-0-snap-3rd is copied.
  • On DR, vm-disk-0-snap-2nd is protected, and a new VM is created from it for production.
  • The Primary becomes a backup.
  • vm-disk-0-snap-3rd is destroyed on the Primary, and a snapshot of the DR VM is taken as vm-disk-0-snap-dr-1st, based on vm-disk-0-snap-2nd.
  • Pass vm-disk-0-snap-dr-1st back to DC-Primary as a differential update.
  • Get the VM back up on DC-Primary.
  • Challenges: limited bandwidth, many VMs, and rebuilding the entire VM in case the DC has to be rebuilt.
  • Goal: create scripts or automation for smooth differential data transfer between DC and DR in both directions (a rough sketch of the failback transfer follows this list).
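A rough sketch of what the failback transfer could look like, simplified to operate on the image itself rather than on the production clone; the "rbd" pool, "vm-100-disk-0", "primary-node", and snapshot names are placeholders:

```bash
# On the primary (now acting as backup): drop the orphaned snap-3rd and roll
# the image back to the common base snap-2nd so the DR diff applies cleanly.
rbd snap rm rbd/vm-100-disk-0@snap-3rd
rbd snap rollback rbd/vm-100-disk-0@snap-2nd

# On DR (currently production): snapshot the promoted image and send only the
# changes made since snap-2nd back to the primary.
rbd snap create rbd/vm-100-disk-0@snap-dr-1st
rbd export-diff --from-snap snap-2nd rbd/vm-100-disk-0@snap-dr-1st - \
  | ssh primary-node 'rbd import-diff - rbd/vm-100-disk-0'
```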
 
