Proxmox-Ceph cluster advice.

Strigi

New Member
Mar 8, 2018
Hello there, I've been experimenting with a cluster and I would like some advice.

- 5 hypervisors running the latest Proxmox version (not running Ceph at the moment)
- 2 hypervisors have 256 GB RAM, the other 3 have 182 GB
- 2 hypervisors have 4 x 1.2 TB disks, the other 3 have a mix of disks totalling 1.6 TB per hypervisor
- all 5 hypervisors have 2 CPUs with 15 cores each
- 3 NICs per hypervisor (all 1 GbE, which will hopefully be upgraded to 10 GbE soon)

These run about 70-100 VMs, but I haven't been given any information about the workload (which is weird, I know).

So after some research, I feel the following config would be the most useful:

- 3 OSDs, 1 MON, and 1 MGR per hypervisor
- an SSD per hypervisor for the journal (or maybe not, since I would advise BlueStore, which is the default anyway)
- size 2 / min_size 2 (or is 2/3 better?) (rough command sketch below)
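
For what it's worth, this is roughly how I picture bootstrapping that layout on each node with the pveceph tooling (a sketch only; the device names and the pool name "vmpool" are placeholders, and I haven't tested this exact sequence):

    pveceph install                  # install the Ceph packages on the node
    pveceph createmon                # one MON on this node
    pveceph createmgr                # one MGR on this node
    pveceph createosd /dev/sdb       # one OSD per data disk
    pveceph createosd /dev/sdc
    pveceph createosd /dev/sdd
    # once, from any node: a replicated pool for the VM disks
    pveceph createpool vmpool --size 2 --min_size 2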

What are your thoughts?
I'm not sure if this is the way to go, since the amount of storage space isn't that big, and I haven't been given information about how much is already used or will be available in the near future.

EDIT: I know there are 2 ZFS SANs around which hold about 35 TB each; these disks could be made available for the Ceph cluster if their purpose can be taken over by the Proxmox-Ceph cluster.
 
Hm... as you already stated, without the performance and storage data, the planning is pretty useless. Judging from the number of 70-100 VMs alone, and assuming diverse server hardware, the hyper-converged approach will not fulfill the expectations.

That said, for Ceph alone you should plan:

- 3x MONs; a high CPU frequency and fast disks are important (sync writes)
- 3x MGRs, to make it easier, as only one will be active anyway
- Yx OSDs, the same count per host; it is best to use the same type, size, and speed of disk, as this makes the placement of data more even
- depending on the hardware, you may well use the SSDs as a standalone fast pool (compared to spinners)
- use at least 10 GbE and separate the Ceph traffic from any other (corosync should also be on its own physical network!); see the config sketch below
- take size 3, min_size 2; in small clusters a recovery takes far longer, which increases the likelihood of another disk failing on top (commands below)
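
As a rough illustration of the network split (the subnets are made up, adjust them to your environment), the relevant part of /etc/pve/ceph.conf would look something like this:

    [global]
        # hypothetical subnets: client/VM traffic vs. OSD replication traffic
        public network  = 10.10.10.0/24
        cluster network = 10.10.20.0/24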

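And for the replication settings on an existing pool, something like this (the pool name "vmpool" is only an example):

    ceph osd pool set vmpool size 3      # keep 3 copies of each object
    ceph osd pool set vmpool min_size 2  # still serve I/O with 2 copies left
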
And don't use those SANs for Ceph; they are built for a different purpose, and the backups also need a place to go. ;)
 
