Proxmox server setup for a business - 3-node cluster - suggested storage type

Sarlis Dimitris

Well-Known Member
Oct 19, 2018
Good day to all,

I am setting up a 3-node cluster in my company for HA and redundancy. My experience with Proxmox is quite OK, but the major issue now is which storage type to use.

My requirements are as simple as it gets: I need the storage to be expandable, with the best possible setup for migration and restoring (in case of failure).

In a previous project, I built the 3-node setup with Ceph/OSD to be able to use HA and migration, but it did not work as I wanted.
The machines were set up correctly, the VM disks were located in the Ceph pool, and the monitor was OK, but then one server (HP DL360 G8) had a fan failure that was flagged as critical, so the server never came up (it kept rebooting). The VMs were transferred automatically to the 2nd node, but they could not load properly. Even after rebooting the VMs, they were not operational.
I had to restore my backups to local LVM to get working again.

Anyhow, what I am trying to say is that I would like your opinion on the best-case scenario regarding storage and the ability to expand easily.

My setup uses hardware RAID: the first array is RAID1 for the Proxmox OS, and the second array is RAID5 with 4x 1.92 TB Kingston DC600 drives.
Now to the point:
If I wish to add 2 more disks and expand the RAID5 in my PVE, how is this possible? What type of data storage should I use? Should I go with OSDs?

Maybe it is better to use network storage from the very beginning, without Ceph/OSD? Like iSCSI or NFS. Will this reduce overall speed?

Any other ideas?

I will be really happy to answer any questions you might have about this post, to help me make the best possible decision.

Thank you all
 
Ceph is the way to go, but it has to be done "right". Get professional support. The behavior you describe is not normal for Ceph.
 
Hi @Sarlis Dimitris ,
It’s reassuring to hear “this is what you should do,” but reality is rarely that simple. There are companies running multi-petabyte Ceph clusters without issues, while others have had their "weekends ruined" by Ceph problems. Some successfully use iSCSI or NFS, while even high-end vendor SANs have caused billion-dollar companies to lose millions due to downtime.

You shouldn't use RAID with Ceph, as its primary feature is built-in data protection.

Consider your budget, rack space, power capacity, and support capabilities. Assess your business’s tolerance for data unavailability and whether management is willing to invest in proper support.

All the technologies you mentioned are valid choices - given the right investment of time and money.

Cheers.


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
@bbgeek17 thanks for your answer, as well as @itNGO.

So regarding the point you mentioned about not using RAID with Ceph, what should I do? Add individual disks to the system?
If I understand correctly, Ceph with OSDs is like RAID but with servers instead of disks, am I right?

Can you please describe a possible setup with Ceph and 3 nodes?
In case we need to stick with only 3 nodes without adding servers, is OSD data expansion a possible solution? Because I presume that in this case I must already have my disks added, sized and calculated for the upcoming years...

Lastly, is the fault tolerance of Ceph/OSD higher than RAID's? Do I just need to power off the server, add the disk, and then build it into Ceph?

@itNGO, by professional support do you mean a good subscription for the Proxmox servers? Or handing my setup over to a company to help me out?
Because in Greece there are not many companies with Proxmox knowledge.
 
These links may be helpful:

https://www.ibm.com/docs/en/storage-ceph/7?topic=hardware-avoid-using-raid-san-solutions
https://www.youtube.com/watch?v=7BcSnUz_2zQ
https://docs.ceph.com/en/reef/rados/operations/add-or-rm-osds/
https://www.reddit.com/r/Proxmox/comments/187h33f/how_many_nodes_can_fail_in_a_ceph_cluster/


 
@itNGO, by professional support do you mean a good subscription for the Proxmox servers? Or handing my setup over to a company to help me out?
Because in Greece there are not many companies with Proxmox knowledge.
Ceph needs the disks directly attached, without any RAID ("JBOD"). The redundancy level is then configured inside Ceph (3 copies, with at least 2 online, in a 3-node cluster).
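To put rough numbers on that: with size=3 replication, Ceph keeps one copy per node, so usable capacity is about one third of raw. A quick sketch using the 4x 1.92 TB drives per node mentioned in this thread (real-world numbers will be lower, since Ceph has overhead and pools should stay well below full):

```shell
# Rough capacity estimate: 3 nodes, 4x 1.92 TB OSDs each, replicated size=3.
# Ceph warns at "nearfull" (85% by default), so plan to use even less.
raw_tb=$(awk 'BEGIN{printf "%.2f", 3 * 4 * 1.92}')
usable_tb=$(awk 'BEGIN{printf "%.2f", (3 * 4 * 1.92) / 3}')
echo "raw: ${raw_tb} TB, usable (size=3): ${usable_tb} TB"
```

So roughly 23 TB raw gives under 8 TB of usable space before overhead, which is worth knowing before comparing against a RAID5 layout.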

About support... there are companies outside Greece that do this, and everything is possible online and remotely. Having a Proxmox subscription is always a good add-on for when you need support in case of a failure.
 
Good morning to all,
Thank you all for your posts.

Still, I have this question:
How easy will it be if I need to expand our storage using Ceph? Do I need to "build" another node with extra storage and add it to Ceph?

I do understand and endorse the Ceph model as far as reliability, tolerance & redundancy are concerned, but what about storage expansion?

There was a previous post from bbgeek17 mentioning not to use RAID with Ceph. Why is that? Because we already have fault tolerance built into Ceph?

If I decide to go with SAN storage instead, what is the preferred setup?
 
There are two ways:
1. adding more disks to the current nodes.
2. adding more nodes with more disks.

The second one is always better.
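For option 1, Proxmox wraps the Ceph tooling, so adding a disk is short. A sketch, assuming the new disk shows up as /dev/sdd (the device name is a placeholder) and is presented raw, not behind the RAID controller:

```shell
# On the node that received the new disk (run as root, needs a live cluster):
pveceph osd create /dev/sdd   # create a new OSD on the raw, empty disk
ceph osd df tree              # watch placement groups rebalance onto it
ceph -s                       # overall cluster health while backfill runs
```

Ceph starts rebalancing automatically once the OSD comes up; no downtime or reboot is needed for hot-swap bays.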
 
How easy will it be if I need to expand our storage using Ceph? Do I need to "build" another node with extra storage and add it to Ceph?
I do understand and endorse the Ceph model as far as reliability, tolerance & redundancy are concerned, but what about storage expansion?

Basically, Ceph works like a software RAID controller spanning several nodes. That means no extra layer (hardware RAID) should sit between the drives and the OS.

Ceph automatically uses all OSDs for data and redundancy. As soon as you add more OSDs, Ceph will rebalance the data onto them for optimal redundancy, depending on the rules you have set up (the minimum number of OSDs/nodes needed to operate). For example, you can have 5 nodes but need only 3 to operate. If the number of online nodes falls to 2, the system switches to read-only mode. The minimum is 3 nodes with 2 nodes alive, similar to a traditional RAID5 setup.
 
I built the 3-node setup with Ceph/OSD to be able to use HA and migration, but it did not work as I wanted.
Probably you hit some of the problematic pitfalls?

 
So if I understand correctly, overall and for my build, to keep all data safe and have a healthy system I need to:

- Build a cluster with at least (min) 4 servers
- Add them to the cluster & Ceph, of course
- Use 10 Gbit network cards
- Have at least 38 GB of RAM in each server for Ceph
- Have the initial disks I will use as OSDs, and then add extra disks to each server to expand capacity (any size, but preferably the same type, i.e. SSDs), and of course without adding them to hardware RAID.

Any additional points for this setup?
 
So if I understand correctly, overall and for my build, to keep all data safe and have a healthy system I need to:
One advantage of Ceph is its flexibility. The goal of my "FabU" post was to mention some aspects and pitfalls, nothing more.
- Have at least 38 GB of RAM in each server for Ceph
That "38" is the sum of the RAM of my example cluster. My point was that each and every daemon - be it OSD/MON/MGR or MDS - needs RAM for its own use.
- Have the initial disks I will use as OSDs, and then add extra disks to each server to expand capacity (any size, but preferably the same type, i.e. SSDs), and of course without adding them to hardware RAID.
Yes. The operating system itself is independent, and I prefer ZFS in a mirrored setup. How many OSDs are installed on each node is completely arbitrary - starting from zero :-)

Any additional points for this setup?
Well... I would recommend treating the first install intentionally as a test. Perhaps you'll find some aspects suboptimal for your use case. Plan time to tear everything down and start from scratch.

Both Ceph and ZFS are really, really great tools. Both are complex as soon as you look under the hood...
 
So if I understand correctly, overall and for my build, to keep all data safe and have a healthy system I need to:

- Build a cluster with at least (min) 4 servers
- Add them to the cluster & Ceph, of course
- Use 10 Gbit network cards
- Have at least 38 GB of RAM in each server for Ceph
- Have the initial disks I will use as OSDs, and then add extra disks to each server to expand capacity (any size, but preferably the same type, i.e. SSDs), and of course without adding them to hardware RAID.

Any additional points for this setup?

Carefully choose your SSDs. We had a case where non-enterprise SSDs had to be replaced to guarantee a stable setup.

In addition: give Ceph its own separate network to avoid problems during backups or other high-load situations.
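For reference, that split is configured in /etc/pve/ceph.conf; the subnets below are example placeholders, not values from this thread:

```ini
# /etc/pve/ceph.conf (fragment) -- example subnets, adjust to your network
[global]
    public_network  = 10.10.10.0/24   # client/monitor traffic
    cluster_network = 10.10.20.0/24   # OSD replication and heartbeat traffic
```

Keeping the cluster_network on its own physical links (or at least its own VLAN) means a backup saturating the public side does not starve OSD replication.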
 
Carefully choose your SSDs. We had a case where non-enterprise SSDs had to be replaced to guarantee a stable setup.

In addition: give Ceph its own separate network to avoid problems during backups or other high-load situations.
Yeap, great point about the "independent" Ceph network.

Regarding the SSDs, I am working with the Kingston DC600M series.

@UdoB yes, I totally understand; the numbers are theoretical, based on the setup. I am just using them as a reference...
 
You need at least 3 nodes, but you can attach as many as you like.

Enterprise SSDs with *real* Power Loss Protection are a must-have! For example Micron 7400 or Samsung PM9A3. PLP is important for latency - non-PLP drives will have 10-30x higher latency. You can use U.2-to-PCIe adapter cards.

Important: bind interfaces via MAC to a static name. Otherwise you will mess up the network config, as interface names are generated dynamically in the order the devices are found. That means if a network interface is named "enp36s0" or similar and you add a PCI(e) device, it may come up as "enp40s0" on the next reboot, while the config in /etc/network/interfaces still has the old name.
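One way to do that binding, as a sketch: a systemd .link file matches the NIC by MAC and fixes its name. The MAC address, interface name, and filename below are placeholders; the file belongs in /etc/systemd/network/ and takes effect on the next reboot (rebuilding the initramfs may also be needed if it includes network config):

```shell
# Sketch: write a systemd .link file that pins the NIC name to its MAC.
# MAC address and interface name are placeholders -- substitute your own.
linkfile="${LINKFILE:-10-ceph-nic.link}"  # install as /etc/systemd/network/10-ceph-nic.link
cat > "$linkfile" <<'EOF'
[Match]
MACAddress=aa:bb:cc:dd:ee:ff

[Link]
Name=enp36s0
EOF
echo "wrote $linkfile"
```

With this in place, /etc/network/interfaces can keep referring to the pinned name even after PCIe devices are added.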

Furthermore, it is smart to create separate VLANs/bridges for cluster communication; that makes it easier to keep cluster-only traffic inside the cluster.
 