Architecture multi site

olange

Jan 31, 2025
Hello,

Quick question regarding architecture. I have used Proxmox for a long time, but my architecture is a specific one. I have one main datacenter (at a cloud provider) and several sites. As a manufacturer, I need to keep some servers on site; I can't move everything to the cloud. And my needs are specific. Here is what I have:
- A full ADVPN solution based on FortiGate, between all sites and the cloud
- Each site MUST have 2 servers, in 2 distinct server rooms, with 40 Gbps fiber for the interconnection
- The cloud site will have 5 servers

My question is, can I have (and has someone already done it):
- One main cluster with Ceph storage in the cloud
- 1 cluster of 2 nodes on each site, with a Ceph configuration between those 2 nodes
- Use the main cluster as the "validation node" (I know that we never run a 2-node cluster in production and that 3 is mandatory, but can I use my main cluster in the cloud as the "validation node", without giving it any capability for VM migration or anything else, just in terms of management)?

The objective, in the end, is to have 6 groups of nodes (1 with 5 nodes for the cloud and 5 with 2 nodes on the sites), managed from the same interface, with Ceph and HA limited to each group of nodes, while respecting each group's capacity?

Thanks.
Olivier
 
Hi Olivier,

I'm not an expert in Ceph design, but a 2-node Ceph cluster is what we can call a very bad idea.
A 2-node cluster in Proxmox is not ideal, but you can use a QDevice for quorum (see https://pve.proxmox.com/pve-docs/pve-admin-guide.html#_supported_setups).
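
To illustrate why the extra QDevice vote matters, here is a minimal sketch of plain majority voting. It is a simplified model, not corosync's actual code; the function names `quorum_threshold` and `is_quorate` are made up for this example, and it assumes every node and the QDevice carry exactly one vote.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Smallest number of votes that forms a strict majority."""
    return expected_votes // 2 + 1


def is_quorate(votes_present: int, expected_votes: int) -> bool:
    """True if the surviving votes still form a strict majority."""
    return votes_present >= quorum_threshold(expected_votes)


# Two nodes, no QDevice: losing one node leaves 1 of 2 votes.
print(is_quorate(1, 2))  # False

# Two nodes plus one QDevice vote: losing one node leaves 2 of 3 votes.
print(is_quorate(2, 3))  # True
```

With only two votes, any single failure drops the survivor below the majority; the third vote is what lets one node keep running.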

From what I know and have seen, if you have only one cluster with groups of servers spread over many physical sites, Ceph, in its current Proxmox integration, will not see the separation and will try to configure itself without taking it into account. As a result, you would have one Ceph cluster whose nodes are separated by more milliseconds of latency than is recommended.

What is essential to understand is that Ceph doesn't need a "validation node" for the data; it works with X/Y replication (3/2 by default, i.e. 3 copies with a minimum of 2) of every piece of data you give it.
So with 3/2 replication, Ceph keeps three copies of each block and keeps serving I/O as long as at least two of them are available. When one of these 3 copies is gone, Ceph still lets you use the data, but if the 2 remaining copies are separated by 10, 25, 50 ms ... it will be catastrophic for performance. As soon as Ceph sees only one copy, it no longer accepts modifications to that data.
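
The replication rule above can be sketched as a small state function. This is only a simplified model of the idea (three copies, a minimum of two), not Ceph's real placement-group logic; `pool_state` and its state labels are invented for this example.

```python
def pool_state(copies_available: int, size: int = 3, min_size: int = 2) -> str:
    """Classify a piece of data by how many of its copies are reachable.

    size     -- number of copies Ceph tries to keep (assumed default: 3)
    min_size -- copies that must be reachable for writes to continue (assumed default: 2)
    """
    if not 1 <= min_size <= size:
        raise ValueError("min_size must be between 1 and size")
    if copies_available >= size:
        return "full"
    if copies_available >= min_size:
        return "degraded"
    return "writes refused"


for copies in (3, 2, 1):
    print(copies, pool_state(copies))
```

With only two nodes on a site, losing one node already takes you to a single reachable copy, which is the case where modifications stop.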

So, from what I see, there are 2 topics for you:
- which storage you can use on each site (if Ceph, think about a 3-node cluster and about configuring it independently, not through the Proxmox integration)
- centralized management:
* If one Proxmox cluster, be aware that corosync asks for 5 ms or better between nodes
* If one Proxmox cluster per site, look at Proxmox Datacenter Manager, which is in alpha but has an active roadmap: https://pve.proxmox.com/wiki/Proxmox_Datacenter_Manager_Roadmap
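
As a rough way to apply the 5 ms figure, the sketch below flags links whose round-trip time exceeds a latency budget. The link names and RTT values are hypothetical placeholders, not measurements from this thread, and `links_over_budget` is a helper invented for this example; you would feed it your own ping results.

```python
def links_over_budget(rtt_ms: dict[str, float], budget_ms: float = 5.0) -> list[str]:
    """Return the names of links whose round-trip time exceeds the budget."""
    return sorted(name for name, rtt in rtt_ms.items() if rtt > budget_ms)


# Hypothetical values: a short fiber run between two rooms vs. a VPN link to the cloud.
measured = {
    "room-a <-> room-b": 0.3,
    "site-1 <-> cloud": 18.0,
}
print(links_over_budget(measured))  # ['site-1 <-> cloud']
```

Any link that shows up in the result is a poor candidate for carrying cluster traffic within a single Proxmox cluster.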

regards,
Damien
 
Hello. My point of view:
- Using remote Ceph is a bad idea to me, as it will lead to catastrophic performance.
- Using the remote 5-node cluster for quorum votes is good only if:
* your internet link is always up (if not, you won't be able to operate the local nodes)
* latency between all the nodes is 5 ms or less
Do you need central storage at the cloud site?
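
The point about the internet link can be illustrated with a small sketch of what happens when a single cluster is split by a WAN outage. It assumes one vote per node and plain majority voting; `quorate_partitions` is a name made up for this example and is not part of any Proxmox or corosync tool.

```python
def quorate_partitions(partitions: dict[str, int]) -> list[str]:
    """Given the votes on each side of a network split, return the sides
    that still hold a strict majority of all votes."""
    total_votes = sum(partitions.values())
    threshold = total_votes // 2 + 1
    return [name for name, votes in partitions.items() if votes >= threshold]


# One cluster made of 5 cloud nodes and 2 on-site nodes; the WAN link goes down.
print(quorate_partitions({"cloud": 5, "site": 2}))  # ['cloud']
```

The on-site side keeps only 2 of 7 votes, so the local nodes lose quorum exactly when they are cut off from the cloud.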