[SOLVED] Ceph (stretched cluster) performance troubleshooting

I’ve gotten quotes for Hammerspace, and it’s definitely not free. It’s great for small environments if you don’t want to, or don’t have the people to, run your own Ceph.

Once you get to a few hundred TB, all these solutions start costing more annually than a pair of seasoned sysadmins and it becomes more about whether you need the features.

If you’re a wannabe hyperscaler with a budget, some of these solutions are great. I’ve looked into many solutions recently. Proxmox is the cheapest as far as supported Ceph goes, then comes Ubuntu, followed by Red Hat; all of these will set you back less than $50/TB/y at scale, including the hardware. VAST, Hammerspace, Qumulo and many other proprietary SDS products are in the $50-150/TB/y range, including recommended hardware. Things like HPE GreenLake, Dell PowerStore and other scale-out drop-in local cloud offerings (they give you the hardware upfront, you then pay as you go) are easily $150-250/TB/y, at which point you get to Azure, AWS etc. at $250-400/TB/y.

Now if you need 10PB, you can see those cost differences are humongous: we’re talking millions per year. For a 20TB cluster, does it matter?
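To put rough numbers on that, here is a quick back-of-the-envelope sketch using the $/TB/y ranges above. The tier labels, the `annual_cost_range` helper and the $0 lower bound for supported Ceph are my own assumptions for illustration (I only said "less than $50"), and 1 PB is taken as 1000 TB.

```python
# $/TB/year ranges from the post. The lower bound of 0 for supported Ceph
# is an assumption, since the post only gives an upper bound.
TIERS = {
    "Supported Ceph (Proxmox, Ubuntu, Red Hat)": (0, 50),
    "Proprietary SDS (VAST, Hammerspace, Qumulo)": (50, 150),
    "Pay-as-you-go on-prem (GreenLake, PowerStore)": (150, 250),
    "Public cloud (Azure, AWS)": (250, 400),
}


def annual_cost_range(capacity_tb, tier):
    """Return (low, high) annual cost in dollars for a capacity in TB."""
    low, high = TIERS[tier]
    return capacity_tb * low, capacity_tb * high


if __name__ == "__main__":
    # 20 TB versus 10 PB (decimal units: 1 PB = 1000 TB)
    for capacity_tb in (20, 10_000):
        print(f"{capacity_tb} TB:")
        for tier in TIERS:
            low, high = annual_cost_range(capacity_tb, tier)
            print(f"  {tier}: ${low:,.0f} to ${high:,.0f} per year")
```

At 10PB, supported Ceph tops out around $500,000 per year while public cloud lands at $2.5M to $4M. At 20TB, even the most expensive tier tops out at $8,000 per year.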
 