3-node Ceph concept consideration

Hi all,

I am currently considering building a 3-node Ceph cluster out of three hosts with the following specs:

- Xeon E5-2690 v3
- 128 GB 2133 MHz DDR4 ECC memory
- 1.6 TB P3600 NVMe (Ceph storage)
- 4x 10GbE networking
- optionally 6x 2.5” SSDs (one as boot drive)

I can expand the memory to 256 GB and add a second CPU in each host. However, it’s a C6320 system, so I am limited to one NVMe drive per host.

Would this configuration be fast enough? I’ve heard that 3-node Ceph clusters are not that fast, but would this hardware change that?

Hoping to hear some thoughts and ideas!

Thanks,
Chris
 
Hi,
Would this configuration be fast enough?
For how many VMs, and what do they do?
There is a huge difference between a static web server and an SAP instance.
 
Hi Wolfgang,

About 10 VDI machines currently, MySQL clusters, web server clusters, pfSense firewalls, VPSes for some friends who rent them for a community, and some other stuff... about 40 VMs in total at the moment, but it may grow of course.
 
It is hard to say, because the load of these VMs is hard to estimate.
What does the network look like?
Are the four 10 Gbit ports exclusively for Ceph? And what kind of links do they use (RJ45, DAC, fiber)?
Is this a dual-socket server or a single-socket one?
My personal estimate: if you have a single-socket system with fiber or DAC cables that are directly attached to each other (full mesh), you will probably be fast enough with this setup. You may get about 80% of the speed a single NVMe is capable of.

But in the end, you have to try.
 
Hi Wolfgang,

I am splitting the ports up into pairs: 2x 10 Gbit for VM traffic and 2x 10 Gbit for Ceph on each node. They all connect to a stacked pair of 10GbE switches with DAC.

It’s a dual-socket system, but at the moment I have only needed to install one CPU.
 
They all connect to a stacked pair of 10GbE switches with DAC.
That will roughly double the latency.

It’s a dual-socket system, but at the moment I have only needed to install one CPU.
NUMA architectures have higher latency.
 
That will roughly double the latency.
Wait, is that really true? Would it be wiser to use 10G SFPs with optical fiber?
NUMA architectures have higher latency.
This I am aware of, but I believe what you are saying is that it's better to stay with one CPU for as long as possible? :)

EDIT: I decided to google a bit, and Arista has a test document that says DACs actually have lower latency than fiber? https://www.arista.com/assets/data/pdf/Copper-Faster-Than-Fiber-Brief.pdf
 
Wait, is that really true? Would it be wiser to use 10G SFPs with optical fiber?
The problem is the (stacked) switch.
A switch has to process the packets, and that costs time. Time is latency.
 
This I am aware of, but I believe what you are saying is that it's better to stay with one CPU for as long as possible? :)
You can pin the processes to the cores of one socket, but you must make sure you do not use more memory than that socket can provide locally.
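A rough sketch of what I mean (node 0 is just an example, check the actual layout with numactl --hardware first):

# show how many NUMA nodes there are and how the memory is split
numactl --hardware
# run a process with its threads and allocations kept on socket 0
numactl --cpunodebind=0 --membind=0 <command>

With --membind the process fails to allocate rather than silently spilling over to the other socket's memory.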
EDIT: I decided to google a bit, and Arista has a test document that says DACs actually have lower latency than fiber?
This is true.
 
The problem is the (stacked) switch.
A switch has to process the packets, and that costs time. Time is latency.
You are right, let me elaborate. They are not stacked the traditional way but use MLAG (Arista switches). To avoid the traffic traversing the stack, would it perhaps be better to create two separate Ceph networks, each using one port connected to a different switch, so the traffic does not travel over the MLAG peer link? Or can you only use a single Ceph network?
 
With three nodes and two 10 Gbit ports per node, you can easily build a full mesh.
This is the fastest and least costly option.
The other two 10 Gbit ports you can connect over the MLAG switches.
 
OK, how would I achieve this? Would I simply bridge the two interfaces together on each node?

EDIT: never mind my question, I found the answer on the Proxmox wiki. :)
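For reference, the routed variant I found looks roughly like this in /etc/network/interfaces on each node (just a sketch; the interface names and the 10.15.15.x addresses are placeholders for whatever the Ceph NICs and network will actually be):

auto enp5s0f0
iface enp5s0f0 inet static
        address 10.15.15.1/24
        # node 2 hangs directly off this port
        up ip route add 10.15.15.2/32 dev enp5s0f0
        down ip route del 10.15.15.2/32

auto enp5s0f1
iface enp5s0f1 inet static
        address 10.15.15.1/24
        # node 3 hangs directly off this port
        up ip route add 10.15.15.3/32 dev enp5s0f1
        down ip route del 10.15.15.3/32

The other two nodes get the same pattern with their own address and /32 routes to their two peers, and 10.15.15.0/24 is then used as the Ceph network.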

But anyway, do you believe this would be a good setup?
 
Thank you for your answers so far! Without getting too deep into a SATA vs. NVMe debate, would SATA SSDs also do well in Ceph? Here I am thinking of S4510 480 GB SSDs, about four of them in each node.
 
The general rule is: the faster, the better.
NVMe vs. SATA is mainly a question of latency, and here again: the lower, the better.
The extra parallelism of NVMe does not matter in this case.
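If you want to compare drives yourself, a single-job, queue-depth-1 sync write test gives a good feel for that latency. A sketch with fio (note: /dev/sdX is a placeholder and the test overwrites the device, so only run it against an empty disk):

fio --name=writelat --filename=/dev/sdX --ioengine=libaio --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based

The drive with the lower average completion latency is the better OSD, regardless of how much parallel bandwidth it has on paper.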
 
