Ceph and TBW with consumer grade SSD

logui · Member · Feb 22, 2024

Looking for feedback from experienced Proxmox/Ceph users. I have a 3-node cluster; each node has a dedicated SSD (Samsung 870 EVO 500GB SATA) for Ceph, plus a separate SSD for the Proxmox boot/OS. The Ceph SSD is rated for 300 TBW. This is a brand-new home-lab type of cluster, and at this point I only have 2 VMs (OpenWrt and AdGuard Home) with very light disk usage; most of their work happens in RAM. Both VMs have their disks in Ceph, replicated across all 3 nodes.

With the whole cluster almost idle (just the 2 VMs running with very light usage), I measured write activity with iotop -ao. Aggregating only the Ceph processes, I get around 500 MB/hour of writes just for Ceph. I have to assume it is mostly logs and some replication across nodes, because I did not migrate the VMs between nodes or do anything else during the measurement.
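
For anyone who wants to cross-check such a measurement without leaving iotop running, here is a rough sketch that sums the same per-process write counters iotop reads, straight from /proc (needs root; the counters are cumulative since each daemon started, so sample twice and subtract to get a rate):

Code:
# sum write_bytes across all Ceph daemons (ceph-osd/mon/mgr) from /proc/<pid>/io
for pid in $(pgrep ceph); do
    cat /proc/$pid/io
done | awk '/^write_bytes/ {t+=$2} END {printf "%.1f MB written so far\n", t/1000000}'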

At this write rate, that is around 4.5 TB/year while idle. Once I start adding more VMs and performing migrations and other activities, this could easily jump to 20-30 TB/year. Does that sound reasonable to you? I am worried that the consumer SSDs (I know, I should have bought enterprise-grade, but it's too late now) will not last very long at this rate. Thanks for the feedback.
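
For what it's worth, here is the back-of-envelope endurance math, assuming the iotop numbers translate 1:1 into NAND writes (optimistic, since Ceph journaling and SSD write amplification push real wear higher):

Code:
echo "scale=2; 500*24*365/1000/1000" | bc   # ~4.38 TB/year at idle (500 MB/h)
echo "scale=0; 300/4.38" | bc               # ~68 years to reach 300 TBW at idle
echo "scale=0; 300/25" | bc                 # ~12 years at a projected 25 TB/year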
 
You already know the recommendation regarding enterprise-class devices. There are reasons for it. (Plural.)

Looking at my personal homelab, I would say this:

You only have three nodes. That is the absolute minimum for a cluster. Presumably your Ceph pools run with size=3/min_size=2, right? (Never go below that!)
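
A quick way to verify (the pool name vm-pool below is just a placeholder; list yours first):

Code:
ceph osd pool ls                     # list your pools
ceph osd pool get vm-pool size       # should report: size: 3
ceph osd pool get vm-pool min_size   # should report: min_size: 2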

When (not: if) one device or one node fails, Ceph is immediately degraded. There is no room for Ceph to heal itself, so "degraded" is permanent. For a stable situation you really want nodes that can jump in and return the cluster to a healthy condition automatically. For Ceph in this picture, that means at least four nodes. (In this specific case; in other regards you want five or more...)

Note that Ceph is more critical for the cluster than a local SSD: when Ceph goes read-only, all VMs in the whole cluster will stop immediately. They cannot write any data (including log messages, which practically every OS does constantly) and will stall.
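
This is also why it pays to glance at the cluster state regularly; the standard commands show whether you are already running degraded:

Code:
ceph -s              # overall status: HEALTH_OK vs HEALTH_WARN/HEALTH_ERR
ceph health detail   # lists undersized/degraded PGs after a device or node failure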

If I had six nodes with four OSDs per node and a slow network, I would tolerate (some? all?) cheap consumer SSDs: the larger number of slow SSDs is fast enough for a bandwidth-limited network, and a dying OSD (or a full node) is immediately compensated for by the others.

YMMV! And while I do experiment with Ceph, I am not an expert...

----
Added: regarding your initial question: OSDs, and SSDs/NVMe in general, are consumables. They are expected to die some day and have a limited lifetime. You have to live with this and compensate for it with redundancy. With cheap devices you get two things: low performance and a shorter interval between replacements. Choose your poison...
 
I have a 3-node Proxmox/Ceph cluster with consumer-grade NVMe SSDs and it works fine; I run a dozen or so different VMs.
I just checked my 1TB Intel 660p and Crucial P1, which I started using in 2019: one has 108 TB written, the other 126 TB. That is less than 2/3 of their rated 200 TBW endurance. Good enough in my opinion...
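In case anyone wants to check their own drives, smartctl reports lifetime writes (the device paths below are just examples):

Code:
# NVMe: "Data Units Written" counts units of 512,000 bytes
smartctl -a /dev/nvme0 | grep "Data Units Written"
# SATA (e.g. Samsung 870 EVO): attribute 241 is Total_LBAs_Written, 512 bytes each
smartctl -A /dev/sda | awk '/Total_LBAs_Written/ {printf "%.1f TB written\n", $10*512/1e12}'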
I do use a 2/1 replica pool for some of my less critical lab VMs...
 
I do use a 2/1 replica pool

As long as it works for you it is fine!

Edit: this might be wrong: ((Just to repeat: when one node dies, the whole cluster comes to a standstill; it is very probable that some chunks were stored on that one.)) See also https://42on.com/how-to-break-your-ceph-cluster/

Personally, I would not run this. Never. Except possibly for the short window of a disaster recovery.

:)
 
As long as it works for you it is fine!
Yep. I have been running my cluster in this configuration for more than 7 years and am still waiting for something to happen... It's a lab anyway; I actually wanted to see what could go wrong as a learning exercise, but it just works...
 