DR Strategy

kellogs

Member
May 14, 2024
MAIN Site (Cluster A)
10 compute nodes
8 storage ceph nodes
1x PBS (source) (100Gbps link 2ms latency)

DR Site (Cluster DR)
10 compute nodes
1x synology (NFS)
1x PBS (remote) (100Gbps link 2ms latency)

Hello Guys,

I would like to pick some brains. For Cluster DR, would it be better if I ran Ceph as well? I picked Synology NFS so that we can recover quickly at the DR site while the MAIN site is being fixed.

What do you think?
 
Something seems way off balance here. Why do you have 8 storage nodes and 10 compute nodes? Just set up 10 compute nodes with storage and eliminate the 8 dedicated storage nodes. Are you going to fit all 8 storage nodes' worth of data on a single Synology in the DR cluster and expect it to perform fast enough in a failover situation? Where is your primary PBS going to store its data?
 
I concur that the setup as described seems pretty wild to me.
But let's address the whole DR question in a broader sense.

If you want guaranteed full recovery, the only GUI-native method offered by Proxmox is CEPH cluster to CEPH cluster replication.
That's not a backup. That's continual data protection.
It's as good as you can get, but it has a heavy performance impact on the VMs.

If you want to run backups and then replicate the backup copy to another datacenter, PBS does that.
You should read about PBS Remotes and Sync Jobs.

Start here.
https://pbs.proxmox.com/docs/managing-remotes.html
 
We do not want to put compute together with storage. Even though it sounds very efficient, a disaster on the compute side would also cause problems for the storage, even if the storage itself was working well.

We want to isolate compute to compute nodes and storage to storage nodes.
 
Your initial question was: "Cluster DR would it be better if i run ceph"?
YES. Of course running CEPH on the DR cluster is better than using a single Synology.

You should really write out your RTO and RPO goals. Then design your DR plan around those goals.
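To make that concrete, here is a rough Python sketch of the arithmetic behind those two numbers. The function names are made up for illustration, and the backup duration, sync delay, dataset size, and restore throughput in the usage lines are assumptions, not measurements from this thread. Plug in your own values.

```python
def worst_case_rpo_hours(backup_interval_h, backup_duration_h, sync_delay_h=0.0):
    """Worst-case data loss window if the main site is lost.

    If the site fails just before the newest backup has finished and been
    synced to the DR site, the latest usable copy is the previous backup,
    so the loss is one full interval plus the backup and sync time.
    """
    return backup_interval_h + backup_duration_h + sync_delay_h


def estimated_restore_hours(data_tib, throughput_mib_s):
    """Time to restore data_tib TiB at a sustained throughput in MiB/s."""
    seconds = data_tib * 1024 * 1024 / throughput_mib_s
    return seconds / 3600


def meets_goals(rpo_h, rto_h, rpo_goal_h, rto_goal_h):
    """True only if both the estimated RPO and RTO are within their goals."""
    return rpo_h <= rpo_goal_h and rto_h <= rto_goal_h


# Assumed inputs: nightly backup, 1 h backup run, 1 h until the sync job finishes,
# 10 TiB of critical VM data restored at 1000 MiB/s.
rpo = worst_case_rpo_hours(backup_interval_h=24, backup_duration_h=1, sync_delay_h=1)
rto = estimated_restore_hours(data_tib=10, throughput_mib_s=1000)
print(f"worst-case RPO: {rpo:.1f} h, estimated restore time: {rto:.1f} h")
print("meets goals:", meets_goals(rpo, rto, rpo_goal_h=24, rto_goal_h=3))
```

Note that the restore estimate only covers copying data. Boot time, network cutover, and application checks come on top, so the real RTO will be longer.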
 
I have been thinking about running a separate Ceph instance at the DR site, but the initial investment, both capex and opex, is much higher than for a single Synology that is only used when there is a disaster.

Currently we use only PBS to back up our VMs every night at 3am, so our RPO is around 24 hours. I have tried backing up every 15 minutes, but I noticed that while PBS is running a backup the VM is almost unusable, as if it were frozen.

RTO around 3 hours.

Thanks!
 
"I have been thinking about running a separate Ceph instance at the DR site, but the initial investment, both capex and opex, is much higher than for a single Synology that is only used when there is a disaster."
I love Synology units for what they are. Just make sure you actually test that functionality in the DR site.

"Currently we use only PBS to back up our VMs every night at 3am, so our RPO is around 24 hours. I have tried backing up every 15 minutes, but I noticed that while PBS is running a backup the VM is almost unusable, as if it were frozen."
Might I suggest you set a backup speed limit in the Proxmox Datacenter / Options / Bandwidth settings, or in PBS under Traffic Control? You may be maxing out IOPS or possibly network bandwidth. Are you using fast enterprise-grade NVMe SSDs for the CEPH cluster? See my forum reply here for a quick OSD benchmark and some comparable numbers to see where you stand.

"RTO around 3 hours."
Make sure you are only doing snapshot-mode backups to PBS so that the dirty bitmap can function correctly and give you fast incremental backups.
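As a rough illustration of a rate-limited, snapshot-mode backup, here is a small Python sketch that only builds and prints a vzdump command line. It does not run anything. The VM ID, the storage name "pbs-main", and the 200 MiB/s limit are placeholders I made up. The `--bwlimit` value is given in KiB/s, which is why the helper converts.

```python
import shlex


def mib_to_kib(mib_per_s):
    """Convert a rate in MiB/s to the KiB/s unit used by vzdump's --bwlimit."""
    return int(mib_per_s * 1024)


def build_vzdump_cmd(vmid, storage, limit_mib_s):
    """Build a snapshot-mode vzdump command with a bandwidth limit.

    vmid, storage, and limit_mib_s are placeholders supplied by the caller.
    """
    return [
        "vzdump", str(vmid),
        "--mode", "snapshot",        # live backup, VM keeps running
        "--storage", storage,        # PBS-backed storage as named in PVE
        "--bwlimit", str(mib_to_kib(limit_mib_s)),
    ]


# Hypothetical values: VM 100, storage "pbs-main", limit of 200 MiB/s.
cmd = build_vzdump_cmd(100, "pbs-main", 200)
print(shlex.join(cmd))
```

Start with a conservative limit, watch how the VM behaves during a backup run, and raise it from there.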

You might want to look into automated restores at the DR site so the critical VMs are already restored and ready to go.
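A minimal sketch of what such an automated restore could look like, written as a dry run that only prints the commands. Everything specific here is an assumption for illustration: the VM IDs, the backup volume IDs, and the storage names "pbs-dr" and "synology-nfs". In practice you would look up the latest snapshot for each VM on the DR-side PBS storage instead of hard-coding it, and check the qmrestore options against your PVE version.

```python
import shlex

# Hypothetical priority list: (vmid, backup volume ID on the PBS-backed storage).
CRITICAL_VMS = [
    (100, "pbs-dr:backup/vm/100/2024-05-14T03:00:00Z"),
    (101, "pbs-dr:backup/vm/101/2024-05-14T03:00:00Z"),
]


def build_restore_cmds(vms, target_storage):
    """Return one qmrestore command per VM, in the order given.

    Each command restores the given backup volume to the same VM ID and
    places the disks on target_storage.
    """
    cmds = []
    for vmid, volid in vms:
        cmds.append(["qmrestore", volid, str(vmid), "--storage", target_storage])
    return cmds


# Dry run: print the commands instead of executing them.
for cmd in build_restore_cmds(CRITICAL_VMS, "synology-nfs"):
    print(shlex.join(cmd))
```

Running this on a schedule right after the PBS sync job finishes would keep the critical VMs pre-restored at the DR site, which takes restore time out of the RTO for those VMs.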
 