Setting up a 4TB Erasure Coded CephFS for Multiple Applications

luckyMoon

New Member
Apr 29, 2025
Hello Proxmox Community,

I'm looking for some guidance on the best way to set up a Ceph File System (CephFS) within my 3-node Proxmox cluster to provide storage for multiple applications: FileCloud, Paperless-ngx, and Jellyfin.
FileCloud in particular needs a file system with a mountable path, similar to a local physical disk, which is why I want to use CephFS.
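For FileCloud I'm picturing something like an ordinary kernel mount of the CephFS (or simply the mount Proxmox itself creates under /mnt/pve/<storage-id> when the CephFS storage is added). The monitor address and secret file below are just placeholders for my setup:

Code:
# kernel CephFS mount, exposing the file system as a normal path
# (monitor address and secret file are placeholders)
mount -t ceph 192.168.1.10:6789:/ /mnt/cephfs \
    -o name=admin,secretfile=/etc/ceph/admin.secret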

My goal is to achieve a usable storage capacity of around 4TB while also having some level of redundancy, using my existing hardware. I have 3 nodes in my cluster, and each node has a single 2TB SATA SSD dedicated to Ceph, giving me a raw capacity of 6TB.

From my understanding, simply creating a CephFS pool with replication on this setup would likely only yield about 2TB of usable space (with a replication factor of 3 for good redundancy).

Therefore, I'm exploring the possibility of using an Erasure Coded (EC) pool as the underlying storage for my CephFS. With a 3-OSD setup (one SSD per node), I believe a k=2, m=1 configuration might be suitable. This should theoretically give me approximately 4TB of usable space while still allowing for the failure of one SSD without data loss.
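For reference, the rough capacity math I'm basing this on (ignoring Ceph's nearfull/full headroom and metadata overhead):

Code:
# 3 nodes x 2TB SSD = 6TB raw
# replicated, size=3: usable ~ 6TB / 3        = 2TB
# EC k=2, m=1:        usable ~ 6TB * 2/(2+1)  = 4TB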

My questions are:
  1. Is it feasible and recommended to run a CephFS on top of an Erasure Coded pool with 3 OSDs (one per node) in a Proxmox environment? Are there any significant performance or stability implications I should be aware of?
  2. What is the correct procedure within Proxmox (using shell commands like pveceph or the Web GUI) to create a CephFS that utilizes an existing Erasure Coded pool? I have already created a pool using a k=2, m=1 erasure coding profile, along with a metadata pool (rough command sketch below).
  3. How can I ensure that the CephFS I create leverages the capacity and redundancy provided by the Erasure Coded pool to achieve the desired 4TB usable space?
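For context, this is roughly the sequence I used (and plan to continue with) on the shell; the profile and pool names are just my own choices, so please correct me if any step here is wrong:

Code:
# EC profile: k=2 data + m=1 coding chunks, one chunk per host
ceph osd erasure-code-profile set ec-2-1 k=2 m=1 crush-failure-domain=host

# EC data pool; CephFS needs overwrites enabled on EC pools
ceph osd pool create cephfs_data_ec erasure ec-2-1
ceph osd pool set cephfs_data_ec allow_ec_overwrites true

# replicated pools for metadata and the default data pool
# (metadata cannot live on an EC pool; the EC pool is attached
#  afterwards as an additional data pool)
ceph osd pool create cephfs_metadata replicated
ceph osd pool create cephfs_data replicated

ceph fs new cephfs cephfs_metadata cephfs_data
ceph fs add_data_pool cephfs cephfs_data_ec

# then point a directory at the EC pool via a file layout, e.g.:
# setfattr -n ceph.dir.layout.pool -v cephfs_data_ec /mnt/pve/cephfs/data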
Any insights, best practices, or step-by-step instructions on how to achieve this setup would be greatly appreciated. I want to make sure I'm configuring this correctly for both capacity and data safety.

Thank you in advance for your help!
 
Is it feasible and recommended to run a CephFS on top of an Erasure Coded pool with 3 OSDs (one per node) in a Proxmox environment?
No,

Actually, let me restate.

HELL no. And if you need to understand why not, search for @UdoB's post on Ceph.

If I were you, I'd put the three disks in one of my nodes, add one more and make a zpool striped mirror.
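Something along these lines (device names are placeholders, adjust for your disks):

Code:
# four-disk striped mirror (RAID10-style): 2x usable capacity,
# survives one disk failure per mirror pair
zpool create -o ashift=12 tank mirror /dev/sda /dev/sdb mirror /dev/sdc /dev/sdd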
 
Hi alexskysilk,

Thanks for your initial response. I understand your point about distributing the SSDs differently, but unfortunately my hardware doesn't allow it: I'm using three individual mini PCs, each with only a single available SATA connection, so each 2TB SSD has to stay in its own node.

Given this hardware limitation, I'm still looking for a way to create a shared storage pool from these three SSDs that can provide a mountable path (like a local disk) for applications like FileCloud, while also providing some level of redundancy.

So, my follow-up question is: Given my hardware limitation of one SSD per node, is there an alternative approach to CephFS that would allow me to pool these three SSDs into a single storage volume accessible via a path, ideally with some fault tolerance comparable to an erasure coding setup?

Perhaps there's a different distributed file system or storage solution that might be better suited for this specific hardware configuration?

Any other suggestions or insights would be greatly appreciated.

Thanks again!
 
Given this hardware limitation, I'm still looking for a way to create a shared storage pool from these three SSDs that can provide a mountable path (like a local disk) for applications like FileCloud, while also providing some level of redundancy.
This is somewhat akin to saying "I want to move a ton of sand, but I want to do it with a bicycle."

Your rational options, given your stated limitations, are either a replicated Ceph setup (with all the admonitions about poor performance, potential pitfalls, etc. notwithstanding) OR something using a ZFS replication pair (see https://pve.proxmox.com/wiki/Storage_Replication). That will provide nearline HA for your VMs. For the payload, you can set the two nodes up as active/passive NFS: the simplest way would be to have both servers present separate shares mounted via pvesm and manually switch usage as needed, although more automated methods can be deployed (pacemaker, etc.).
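Rough sketch of both pieces (the VM ID, node name and share paths here are just examples):

Code:
# nearline HA: replicate VM 100's disks to node pve2 every 15 minutes
pvesr create-local-job 100-0 pve2 --schedule "*/15"

# payload: mount each node's NFS export as a PVE storage, then switch
# which share the apps use manually (or automate with pacemaker etc.)
pvesm add nfs share-pve1 --server pve1 --export /tank/share --path /mnt/pve/share-pve1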

I can already hear you saying that's a terrible use of your capacity, to which I respond: what gave you the idea that fault tolerance is free? It's ONLY costing you 3x capacity for Ceph or 2x for ZFS replication. Good deal in my book.
 