Multi-cluster architecture with Ceph

rwanito

New Member
May 25, 2021
Hello,

I'm looking for specific advice about the following architecture.
[Attached architecture diagram: Untitled-2022-04-29-0848.png]

I would like to build a cluster of 6 Proxmox servers with Ceph. But it is a bit tricky, since the two groups of 3 servers are separated by a slower link.
This is why I'm looking for similar configurations or suggestions to achieve that.

To sum up, within servers 1-3 and within servers 4-6 a 10 Gbit/s link is guaranteed, but between the two groups I don't even know if the link can consistently reach 1 Gbit/s.
This link is not maintained by my team, so the traffic needs to be encrypted.

I saw the "multi-site" page of the Ceph documentation, but it does not seem to be what I'm looking for.

I would appreciate any help!

Thank you
Erwan
 
Please note that any such setups are unsupported and untested by us!

I'm not sure, but maybe a `Stretch Cluster` [0] might be the right tool for this.
And you'd have to make sure that the Corosync network is reliable between both locations, as otherwise neither of them will be quorate and you won't be able to start/stop/change any VMs.

And don't use HA in this kind of setup, especially if the network is not reliable!


[0] https://docs.ceph.com/en/latest/rados/operations/stretch-mode/
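As a rough, hedged sketch of what enabling stretch mode involves (monitor names a-e, the `stretch_rule` CRUSH rule, and the site names are placeholders, not taken from this thread; see the documentation [0] for the full procedure, including creating the CRUSH rule first):

```bash
# Sketch only: assumes monitors a,b at site1, c,d at site2, and e as tie-breaker.
# Assign each monitor a CRUSH location.
ceph mon set_location a datacenter=site1
ceph mon set_location b datacenter=site1
ceph mon set_location c datacenter=site2
ceph mon set_location d datacenter=site2
ceph mon set_location e datacenter=site3    # tie-breaker at a third location

# Use the connectivity election strategy so the monitors cope better with the WAN link.
ceph mon set election_strategy connectivity

# Enable stretch mode with monitor "e" as tie-breaker and a previously created
# CRUSH rule "stretch_rule" that places two copies in each datacenter.
ceph mon enable_stretch_mode e stretch_rule datacenter
```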
 
Hello,

Thank you for this answer.
I'll take a look, because it seems very interesting!
 
While a stretch cluster is a valuable concept, verifying the virtual machines at the Disaster Recovery (DR) site can be challenging. This challenge motivated me to explore alternative methods, such as using the "rbd export-diff" and "rbd import-diff" commands, combined with "rbd snap protect", "unprotect", and "clone" operations at the DR location to test the DR images. A rough sketch of this approach follows below. You can review the rudimentary scripts I've developed for this testing on my GitHub repository: https://github.com/deependhulla/ceph-dr-sync-tool-for-proxmox.
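Roughly, the approach looks like this shell sketch; the pool name "rbd", image "vm-100-disk-0", snapshot names, and the "dr-node" host are placeholders, not the exact names my scripts use:

```bash
# On the primary: take a new snapshot and ship only the delta since the last one.
# (The DR image must already exist and contain snap-first from an initial full copy.)
rbd snap create rbd/vm-100-disk-0@snap-2nd
rbd export-diff --from-snap snap-first rbd/vm-100-disk-0@snap-2nd - \
  | ssh dr-node 'rbd import-diff - rbd/vm-100-disk-0'

# On the DR site: clone the replicated snapshot to test the DR image
# without touching the replication target itself.
rbd snap protect rbd/vm-100-disk-0@snap-2nd
rbd clone rbd/vm-100-disk-0@snap-2nd rbd/vm-100-disk-0-drtest
# ...boot a test VM from the clone, then clean up...
rbd rm rbd/vm-100-disk-0-drtest
rbd snap unprotect rbd/vm-100-disk-0@snap-2nd
```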

Nonetheless, there is clearly a need for an add-on that streamlines site-wide cluster storage replication with Ceph. That would provide a more robust and user-friendly solution for managing DR scenarios in stretched clusters.
 
Why can verifying the DR site be a challenging aspect? And what happens with your script when the primary cluster goes down and later comes back up?
 
  • Using snapshots to send differences between the Primary and DR clusters.
  • Maintaining three copies of data on both Primary and DR.
  • Example: VM 100, disk-0, with snapshots vm-disk-0-snap-first, vm-disk-0-snap-2nd, vm-disk-0-snap-3rd.
  • The Primary DC goes down before vm-disk-0-snap-3rd is copied.
  • On DR, vm-disk-0-snap-2nd is protected, and a new VM is created from it for production.
  • The Primary becomes a backup.
  • vm-disk-0-snap-3rd is destroyed on the Primary, and a snapshot of the DR VM is taken as vm-disk-0-snap-dr-1st, based on vm-disk-0-snap-2nd.
  • Pass vm-disk-0-snap-dr-1st back to DC-Primary as a differential update.
  • Get the VM back up on DC-Primary.
  • Challenges: limited bandwidth, many VMs, and rebuilding the entire VM in case the DC has to be rebuilt.
  • Goal: create scripts or automation for smooth differential data transfer between DC and DR in both directions (a rough sketch of the failback transfer follows this list).
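A rough sketch of what the failback transfer could look like, simplified to operate on the image itself rather than on the production clone; the "rbd" pool, "vm-100-disk-0", "primary-node", and snapshot names are placeholders:

```bash
# On the primary (now acting as backup): drop the orphaned snap-3rd and roll
# the image back to the common base snap-2nd so the DR diff applies cleanly.
rbd snap rm rbd/vm-100-disk-0@snap-3rd
rbd snap rollback rbd/vm-100-disk-0@snap-2nd

# On DR (currently production): snapshot the promoted image and send only the
# changes made since snap-2nd back to the primary.
rbd snap create rbd/vm-100-disk-0@snap-dr-1st
rbd export-diff --from-snap snap-2nd rbd/vm-100-disk-0@snap-dr-1st - \
  | ssh primary-node 'rbd import-diff - rbd/vm-100-disk-0'
```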
 
