[SOLVED] Optimal number of Ceph monitor/manager/MDS

godzilla

Hi all,

I'm currently running a cluster with 15 nodes and I plan to add more in the near future. As for Ceph, I have 5 monitors, 5 managers and 5 metadata servers, which currently manage 60+ OSDs.

Would you advise adding more monitors/managers/MDS? Should I stick with odd numbers because of quorum?

Thank you
 
To answer that question you need to understand what your services are doing.
I'll try to give a simple overview:

Monitors have 2 tasks:
1) Vote on the condition of the cluster: is an OSD in/out and up/down, and do we have the quorum to decide that?
2) Tell clients what the CRUSH map looks like.
So you need 3 or 5 monitors for those tasks. More only increase the overhead with no gain (I would say 5 is already a multi-room setup).
You need to have an odd number of monitors.
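A quick way to check which monitors exist and whether they currently have quorum is the standard Ceph CLI, run from any node with the admin keyring:

Code:
# show the monmap and which monitors are currently in quorum
ceph mon stat
# more detail: quorum members, the current leader and the full monmap
ceph quorum_status -f json-pretty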

Managers: They provide the dashboard, collect and distribute statistics, handle rebalancing-related tasks etc.
They are needed for a well-running Ceph system, but they are not essential. If all your managers went down, there would be no immediate problem; you just start another manager on a node that is still up.
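For example, to see which manager is currently active and how many standbys there are, and (assuming this is a Proxmox VE cluster) to start a fresh manager on a surviving node:

Code:
# show the active mgr and the number of standby managers
ceph mgr stat
# on a Proxmox VE node that has no manager yet, create one locally
pveceph mgr create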

MDS: The number of metadata servers depends entirely on your data and your workload. To start with, you need 1 active and 1 standby as backup.
Beyond that it depends entirely on your data. If you have only very few files in your CephFS, but all of them are very big, then you will not need another MDS for a long time.
So it's more of a performance question:
- How many files do you have?
- How many clients do you have?
- What workload do you have?
How many of your 5 MDS are actually active in your configuration? The default would be only one active MDS, with the others as standby.
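You can see how many MDS daemons are active versus standby per filesystem, and raise the number of active ranks if metadata load ever requires it (the filesystem name below is just a placeholder):

Code:
# overview of active and standby MDS daemons per CephFS
ceph fs status
# allow 2 active MDS ranks instead of the default 1 (only if metadata load needs it)
ceph fs set <fsname> max_mds 2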
 
Thank you very much, @BenediktS!

So:
  1. I'm gonna stick with 5 monitors, maybe I'll distribute them better to have a more resilient configuration
  2. I understand managers are fine
  3. I'm not using CephFS, virtual disks only. So I guess I don't need MDS at all?
 
yes, 3 monitors are fine for small to medium clusters (and 15 OSD nodes is definitely still that category for Ceph ;)). 5 is plenty, adding more is too much and much more likely to cause issues than help in any fashion. the managers are only there for the dashboard and collecting status/metrics info, and are not as important as the monitors. you likely still want to have them running in almost all cases, as a prolonged outage can affect certain parts of cluster behavior (e.g. autoscaling). if you don't have any cephfs, you also don't need any MDS instances.
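if you want to double-check that there really is no CephFS (and therefore no need for MDS daemons), something like this should do; the MDS name for the destroy command is whatever you called the daemon (usually the node name), and the pveceph command assumes a Proxmox VE node:

Code:
# list CephFS filesystems; empty output means no MDS is needed
ceph fs ls
# remove an unused metadata server from a Proxmox VE node
pveceph mds destroy <mds-name>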