To separate ceph cluster or not?

CodeBreaker · Nov 29, 2023

I have a 6-node hyper-converged cluster: 3 nodes handle compute and hold the SSD OSDs, and the other 3 nodes hold the HDD OSDs (they handle no compute).

In the years since the first deployment, a few patterns have emerged:

HDD pool is used exclusively for CephFS
SSD pool is used exclusively for RBD (HA VM storage)
Local (to node) NVMe storage is used for databases, k8s and other clusters that don't rely on HA storage

My goal is to eliminate any effect that the SSD nodes have on the HDD nodes and vice versa, so that if I, for example, shut down all 3 SSD nodes, the HDD nodes should still serve all the clients.
The metadata servers are already running on the HDD nodes. I think the only remaining question is the Monitor and Manager services.

Should I just create 6 monitor and 6 manager services?
Should I go the extreme route and split those nodes off into a separate Proxmox cluster?
Or is there something in between?

sb-jw · Nov 29, 2023

I would distribute the disks across all 6 nodes and simply create the pools according to the device classes (HDD, SSD and NVMe). Two Ceph clusters on one PVE installation probably won't work either. At your size, Ceph should also be run with exactly 3 Mons / Mgrs, no fewer and no more. Anything else is unnecessary for you and is not recommended by Ceph itself.
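To illustrate why simply adding monitors does not help with the stated goal: Ceph monitors need a strict majority of the monitor set to form quorum. The sketch below is only that arithmetic, not anything Ceph-specific; the function names and the monitor layouts are hypothetical, chosen to match the 3 SSD + 3 HDD nodes described in this thread.

```python
def quorum_size(total_mons: int) -> int:
    """Smallest strict majority of the monitor set."""
    return total_mons // 2 + 1


def has_quorum(total_mons: int, mons_up: int) -> bool:
    """True if the surviving monitors still form a majority."""
    return mons_up >= quorum_size(total_mons)


# Hypothetical layouts: (monitors on SSD nodes, monitors on HDD nodes)
layouts = {
    "3 mons, all on HDD nodes": (0, 3),
    "3 mons, 1 SSD + 2 HDD": (1, 2),
    "6 mons, 3 SSD + 3 HDD": (3, 3),
}

for name, (ssd_mons, hdd_mons) in layouts.items():
    total = ssd_mons + hdd_mons
    # Shutting down one group of nodes leaves only the other group's monitors.
    ssd_group_down = has_quorum(total, hdd_mons)
    hdd_group_down = has_quorum(total, ssd_mons)
    print(f"{name}: quorum with SSD nodes down={ssd_group_down}, "
          f"with HDD nodes down={hdd_group_down}")
```

With 6 monitors the majority is 4, so losing either group of 3 nodes loses quorum. In a single cluster, at most one of the two groups can keep quorum on its own, whichever way the monitors are placed.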

CodeBreaker · Nov 29, 2023

The HDD nodes have 3.5" bays and the SSD nodes have 2.5" bays, so distributing the disks across all 6 nodes is a no-go.
Taking that into consideration, how should I distribute Monitors and Managers?

sb-jw · Nov 29, 2023

CodeBreaker said:
My goal is to eliminate any effect that SSD nodes have on HDD nodes and vice versa so if I, for example, shut down all 3 SSD nodes, the HDD nodes should still serve all the clients.

If that's actually your biggest goal, you can't avoid two separate clusters. Instead of two PVE clusters, you could also simply deploy the HDD storage with Croit and save yourself having to update PVE etc.

Then you can do an HCI with your SSD nodes.

Personally, I would use identical systems and no longer use HDDs. I would also try to keep all resources in one HCI and distribute it evenly. This would give me the best reliability and flexibility.

But maybe you can say more about your hardware and your use cases, maybe I'll have another or better idea.

CodeBreaker · Nov 29, 2023

Thanks for the recommendation. Hardware is 3x Dell R620 (4x 800GB SAS SSD) & 3x Dell R320 (4x 6TB SAS HDD). The HDDs are used for documents, media files and database dumps, and are shared with the computers around the office (SMB).
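For reference, a rough capacity sketch for that hardware. It assumes replicated pools with 3 copies (an assumption, the pool size is not stated in this thread) and ignores full ratios and other overhead; the function name is hypothetical.

```python
def pool_capacity_gb(nodes: int, disks_per_node: int, disk_gb: int, replicas: int = 3):
    """Return (raw, usable) capacity in GB for a replicated pool.

    replicas=3 is an assumed pool size, not a value taken from the thread.
    """
    raw = nodes * disks_per_node * disk_gb
    return raw, raw / replicas


ssd_raw, ssd_usable = pool_capacity_gb(nodes=3, disks_per_node=4, disk_gb=800)
hdd_raw, hdd_usable = pool_capacity_gb(nodes=3, disks_per_node=4, disk_gb=6000)

print(f"SSD: {ssd_raw} GB raw, about {ssd_usable:.0f} GB usable")
print(f"HDD: {hdd_raw} GB raw, about {hdd_usable:.0f} GB usable")
```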
