Shared Ceph storage on a 5-node setup

Sep 2, 2019
Hi everyone,

Currently we are using SolusVM with 5 nodes each with:

* 64GB of RAM
* 4x 1TB SSD RAID-10
* 2x CPUs

We're evaluating a move to Proxmox with Ceph, since in the long run it's more future-proof, more scalable and easier to maintain than SolusVM. We're also having performance problems. We're considering moving to the following scenario:

* storage consisting of an R720xd with 24x 1TB SSDs running Ceph, with 10GbE NICs
* nodes with 128GB of RAM, 2x CPUs and small local storage just for the OS
* SATA storage for ISOs, Backups, etc.

Does anyone have a similar scenario to share their experience with us?

Much appreciated!
 
So you're looking at having only one Ceph storage node?

If so, that's really not what Ceph is designed for. Yes, it can run on one node, but the benefits of Ceph are lost.

If you want a large single storage server, you're better off running RAID-10 on the R720 and then using iSCSI or some other network-based sharing system to share the disks to the VMs.
 

Thanks for the comment!

However, the main reason we're choosing Ceph is to avoid iSCSI (as in ZFS over iSCSI or LVM over iSCSI): since this will be all-flash storage, the 80%/50% utilisation rules are a big cost hit, as 12TB of flash storage is pretty expensive.
 

When you say an 80/50% hit, you do realise Ceph defaults to 3-way replication?

Meaning for every bit of data, you're saving it 3 times, so technically an overhead of 300%.
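
To make that concrete for the proposed box, here is a rough back-of-the-envelope sketch (illustrative only; real Ceph sizing also keeps free-space headroom, typically staying well under a full pool):

```python
# Rough usable-capacity estimate for a 24x 1TB all-flash Ceph pool.
# Illustrative numbers only; real deployments keep extra free-space headroom.
raw_tb = 24 * 1.0     # 24x 1TB SSDs in the proposed R720xd
replicas = 3          # Ceph's default replicated pool size

usable_tb = raw_tb / replicas
print(f"raw: {raw_tb:.0f} TB, usable at size={replicas}: {usable_tb:.1f} TB")
# -> raw: 24 TB, usable at size=3: 8.0 TB (every object stored three times)
```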
 

Hi rodrigobaldasso

Just to clarify: if you are only looking to deploy a single storage server, then just use local storage in a RAID config; otherwise it's a messy setup and more to troubleshoot.

Ceph is replicated object storage designed to be used across multiple nodes for both file and block storage.
It creates a storage pool that's shared between all hosts in the cluster.

I would highly recommend a little googling about Ceph for more accurate background information.

If you are thinking of more than one host, Ceph requires a minimum of 3 if I remember correctly.

If you are thinking of 2 or more hosts, you can use LINSTOR/DRBD for clean 1:1 host-to-host replication.

Maybe I've incorrectly understood the proposed design; happy to be corrected.

Cheers
G
 

Hi there,

We're thinking of using 2 nodes as Ceph storage shared across multiple hosts. Basically, I have a requirement to support snapshots and to have storage that can be shared across multiple hosts. So I'm down to three options supported in Proxmox, which are:

* CephFS
* Ceph/RBD
* ZFS over iSCSI

My problem with ZFS is the overhead of a stack where ZFS recommends keeping the pool under 80% of total storage, and then iSCSI on top recommends keeping usage under 50% of total storage.

So we were considering running Ceph on a 2-node setup with 2 replicas, which would basically cost the same as having RAID-10 on these storage servers, which is an acceptable cost.
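
For comparison, here is a quick sketch of how usable capacity works out on 24 TB of raw flash under each of these options. The 80% and 50% factors are the rules of thumb quoted in this thread (not vendor guidance), and the ZFS line assumes mirrored vdevs; treat all numbers as ballpark figures:

```python
# Rough usable capacity on 24 TB raw flash under the options discussed above.
# The 80%/50% factors are the rules of thumb quoted in this thread; the ZFS
# line assumes mirrored vdevs. Ballpark figures only, not sizing guidance.
raw_tb = 24.0

scenarios = {
    "Local RAID-10":                      raw_tb / 2,
    "ZFS over iSCSI (mirror, 80% * 50%)": raw_tb / 2 * 0.8 * 0.5,
    "Ceph replicated, size=2":            raw_tb / 2,
    "Ceph replicated, size=3":            raw_tb / 3,
}

for name, usable in scenarios.items():
    print(f"{name:36s} ~{usable:4.1f} TB usable")
```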
 

Running a replica count of 2 is never suggested; you're almost guaranteeing yourself some data loss in the near future.

You can do a replica count of 3 across 2 nodes, however you'll have to accept the extra storage overhead.
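
As a toy illustration of that trade-off (a simplified stand-in for CRUSH placement, not real Ceph code): with the default host-level failure domain, a size=3 pool on two nodes can never place its third replica, while dropping the failure domain to the OSD level does place all three copies but can put two of them on the same physical box.

```python
# Toy model of replica placement on a 2-node cluster; a stand-in for CRUSH,
# not real Ceph code. Hostnames and OSD ids are made up for illustration.
hosts = {"node1": ["osd.0", "osd.1"], "node2": ["osd.2", "osd.3"]}
size = 3  # desired replica count

def place(failure_domain):
    """Pick up to `size` distinct buckets at the chosen failure-domain level."""
    if failure_domain == "host":
        buckets = list(hosts)                                       # at most one replica per host
    else:
        buckets = [osd for osds in hosts.values() for osd in osds]  # one replica per OSD
    return buckets[:size]

print(place("host"))  # ['node1', 'node2'] -> only 2 of 3 replicas fit; PGs stay undersized
print(place("osd"))   # ['osd.0', 'osd.1', 'osd.2'] -> 3 replicas, but two share node1
```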
 

Hi rodrigobaldasso

I wouldn't recommend running ZFS over iSCSI: there are a lot more memory overheads to consider, and managing ZFS can be a beast if you don't have the experience and don't configure the SSD write and read caches properly.

I believe DRBD is supported by Proxmox; they have their own module.

https://www.linbit.com/en/linstor-setup-proxmox-ve-volumes/
https://www.linbit.com/en/linstor-controller-proxmox/

DRBD/LINSTOR vs Ceph – a technical comparison
https://www.linbit.com/en/drbd-linstor-vs-ceph/

Having said that, the Proxmox team do a fantastic job of providing enterprise-level support for their Ceph deployments.

I highly recommend that if you are thinking of using Ceph in production, you engage with the Proxmox team on a paid subscription and get the best of their years of experience working with, managing and developing an environment around Ceph.

It would be prudent to get their feedback on the correct hardware deployment and minimums needed to run Ceph, as I've done in the past; better to pay for experience than pay twice for a setup that isn't 100% suited to your use case.

Personally I wouldn't do a 2-node Ceph cluster, based on all the information I've gathered so far (nothing in production with Ceph unfortunately, only what I've read on the internet).

Hope the above info helps with your decision.
Good luck!

Cheers
G
 

Hi, Ceph on Proxmox is a great way to do VM storage, as you are already aware. There are three main reasons: really safe data, shared between hosts, and very, very flexible. The cost for these three important advantages is 3 replicas (pricey if going all SSD) and needing a bunch of storage nodes; I recommend 5. If you already have 5 nodes anyway, why limit Ceph to only two? Ceph easily lives on the VM hosts, or you can put up a few extra small nodes only for Ceph and not run VMs there. Spread the SSDs across the nodes; there is no need for one fat 24-SSD box or two. ISOs are fantastic to have on CephFS, since that is shared storage always available to the VMs; the guest VMs of course go on a Ceph block (RBD) pool.

Backups go elsewhere, of course.

The reason for me to say 5 nodes: you need, as you know, at least two nodes. Three spreads the 3 replicas across the 3 nodes more nicely (it's hard to spread 3 replicas across two nodes), but you have no good failover in a production cluster. 4 nodes gives you good failover, but with one node down you are no longer redundant, since you lose quorum if 2 of 4 nodes are down. Only 5 nodes gives you room to do maintenance on one node and STILL be redundant if another goes down during the maintenance reboot.

However, even if you do get downtime in any of the scenarios above, Ceph WILL recover as soon as enough nodes are up, replicating to good storage every time.

My 2 cents.
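
The quorum arithmetic behind the 3/4/5-node reasoning above can be sketched like this (a simple strict-majority model of how Proxmox and the Ceph monitors vote; illustrative only):

```python
# Strict-majority quorum model: the cluster stays up while more than half
# of the voting nodes are reachable. Illustrative sketch only.
def tolerated_failures(total_nodes):
    majority = total_nodes // 2 + 1
    return total_nodes - majority

for n in (3, 4, 5):
    print(f"{n} nodes: quorum needs {n // 2 + 1}, tolerates {tolerated_failures(n)} down")
# 3 nodes: tolerates 1 down
# 4 nodes: tolerates 1 down (losing 2 of 4 breaks quorum, as noted above)
# 5 nodes: tolerates 2 down (one in maintenance plus one unexpected failure)
```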
 
