Pablo1732

Member
Dec 25, 2021
Hello,
I would like to set up a Proxmox HA cluster with 2 nodes. I can't use Ceph with only 2 nodes, so I wanted to try replication.
With replication, you can set the container to be replicated every 15 minutes, for example.
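Something like this is what I have in mind on the CLI (the guest ID 100 and the target node name node2 are just examples for my setup):

```
# Create a replication job for guest 100 to node2, running every 15 minutes
pvesr create-local-job 100-0 node2 --schedule "*/15"

# List the configured replication jobs
pvesr list
```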

But then I asked myself a question:

If my container is running on Node1 and it fails, it is automatically started on Node2. But what happens to the data on Node1 that was written before the crash? In other words, the data that has not yet been replicated.
Assuming I fix Node1 and migrate the container from Node2 back to Node1, will all the data that I wrote shortly before the crash of Node1 be gone, i.e. the data that could not be replicated before the crash? Or will the data somehow be merged?

As an example to illustrate this:

I have Node1 and Node2, and on Node1 a cloud service runs in a container. The container is replicated every 15 minutes. I upload a document to the cloud; shortly afterwards, Node1 crashes before the document could be replicated. The cloud then starts on Node2, without the document of course, which is not so bad at first. But if I get Node1 running again and then migrate the container from Node2 back to Node1, is this document simply gone and overwritten?

Hopefully someone can answer this question.
 
If my container is running on Node1 and it fails, it is automatically started on Node2. But what happens to the data on Node1 that was written before the crash? In other words, the data that has not yet been replicated.
As with any asynchronous replication, the RPO (recovery point objective) will be T - X. In your case X is 15 minutes (plus extra time if the crash happened mid-replication). The other data is lost.
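If you want to know your exposure at any given moment, you can check when the last sync completed; a quick sketch (the job ID 100-0 is an example):

```
# Show state and last sync time of all replication jobs on this node
pvesr status

# Inspect the configuration of a single job
pvesr read 100-0
```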

Assuming I fix Node1 and migrate the container from Node2 back to Node1, will all the data that I wrote shortly before the crash of Node1 be gone, i.e. the data that could not be replicated before the crash? Or will the data somehow be merged?
The data will not be merged. You would have to reverse the replication direction: in the worst case that means a full sync, and in the best case syncing only the differences from the last known-good checkpoint that both sides agree on. Note that the best case is theoretical - I don't know whether ZFS/PVE is capable of it.
You may also elect to throw away any data written after the failover and resync from the failed node to the 2nd node.
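To illustrate the manual recovery idea on plain ZFS (a sketch only; the dataset name rpool/data/subvol-100-disk-0 and the snapshot placeholders are assumptions, and keep in mind PVE manages its own __replicate_* snapshots, so doing this by hand can confuse the replication job):

```
# On both nodes: find the last replication snapshot they still share
zfs list -t snapshot -o name,creation rpool/data/subvol-100-disk-0

# On the resync target: roll back to that common snapshot
zfs rollback -r rpool/data/subvol-100-disk-0@<common-snapshot>

# On the node with the good data: send everything after the common snapshot
zfs send -I rpool/data/subvol-100-disk-0@<common-snapshot> \
  rpool/data/subvol-100-disk-0@<latest-snapshot> \
  | ssh <target-node> zfs recv rpool/data/subvol-100-disk-0
```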



Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
Thanks for your answer!
So is Ceph the only variant that actually works if I don't want to lose data?
 
So is Ceph the only variant that actually works if I don't want to lose data?
Ceph and replication serve two different purposes. In the enterprise world they are not interchangeable.

For the limited scope of your example in a home/lab environment, Ceph might be a better solution, if implemented with a supported configuration.
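For context, a minimal supported 3-node setup would look roughly like this (a sketch; /dev/sdb and the pool name are assumptions, and every node needs its own dedicated OSD disk):

```
# On each of the three nodes: install Ceph and create a monitor
pveceph install
pveceph mon create

# On each node: create an OSD on a dedicated empty disk
pveceph osd create /dev/sdb

# Once: create a pool with 3 replicas, 2 required for writes
pveceph pool create vmpool --size 3 --min_size 2 --add_storages
```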



 
So should I buy a 3rd node with hardly any performance, which runs no VMs or containers and exists only for Ceph?
 
So should I buy a 3rd node with hardly any performance, which runs no VMs or containers and exists only for Ceph?
You are confusing a 3rd node for PVE, which can act as a vote only and does not need any performance, with a 3rd node for Ceph, which would need to be as powerful as the other two, with similar capacity.
You should spend more time reading about Ceph if you plan to implement it, especially if you plan to run it in serious production.
On the other hand, if it's a home lab - just do it and learn the best way: through your own mistakes.
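If the vote-only route is all you need for a 2-node PVE cluster (without Ceph), even a very small box can provide it as a QDevice; a rough sketch, with the helper machine's IP as an example:

```
# On both cluster nodes
apt install corosync-qdevice

# On the small helper machine
apt install corosync-qnetd

# From one cluster node: register the external vote
pvecm qdevice setup 192.168.1.50
```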

good luck



 
