Ceph networking question

dfcn

New Member
Jan 26, 2016
Just a quick question on Ceph setup:

We're building a small 5-node cluster: 2 of the nodes are for VMs, the other 3 are for Ceph, and all of them run Proxmox as part of the same cluster.

Each of the compute and Ceph nodes has 2x 10Gb ports for the storage network.

For such a small setup, would it be better to

A) have a "public" Ceph network of 1x10Gb from each of the compute and Ceph nodes, plus a "private" Ceph network of 1x10Gb for just the 3 Ceph nodes, or

B) go the simpler route of having a single Ceph network for both kinds of traffic, but bond all the ports on each node, so compute and storage each have a 2x10Gb bond going to the same network?

For added context: I don't expect the cluster to get much larger. I could see adding a couple of storage nodes in the future at most; for any massive change in size we'd want to re-engineer anyway. I don't expect it to more than double in size, and if we do grow it, it would likely be by adding more Ceph storage nodes.
 
(A) is the "normal" approach for Ceph. You didn't describe your disk configuration, but unless you have multiple SSDs per host on the OSD hosts you are not likely to saturate the 10GbE links (or if you do, it will only be for short bursts).

You could do the bond/LAG approach, but in practice you won't get much benefit. First, you don't really need the performance unless you are saturating your links (which you probably aren't). Second, in most cases you don't really get 20GbE performance from a bond/LAG - the short version is that LACP hashes each flow onto a single member link, so any single connection still tops out at 10GbE (the long explanation is for another day). You do get limited resiliency against certain failures - but since both links probably originate from the same dual-port NIC and terminate on a single switch, that benefit is really limited to protection against SFP+ module failures (and such a small deployment is probably using passive DACs rather than optics, and those just don't really fail).

I'd go with (A). Keep it simple.
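To illustrate, the split in (A) is just two subnets declared in ceph.conf; a minimal sketch with made-up example subnets (the addresses below are placeholders, not anything from your setup):

    # /etc/pve/ceph.conf (excerpt) - example subnets only
    [global]
        # subnet the compute nodes use to reach the MONs/OSDs ("public")
        public network = 10.10.10.0/24
        # subnet the 3 OSD hosts use among themselves for replication/recovery
        cluster network = 10.10.20.0/24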
 
Just a quick question on Ceph setup:

We're building a small 5-node cluster: 2 of the nodes are for VMs, the other 3 are for Ceph, and all of them run Proxmox as part of the same cluster.
Hi,
it depends on your storage, but I would run Ceph on all 5 nodes, because Ceph gets faster with every node you add.
Each of the compute and Ceph nodes has 2x 10Gb ports for the storage network.

For such a small setup, would it be better to

A) have a "public" Ceph network of 1x10Gb from each of the compute and Ceph nodes, plus a "private" Ceph network of 1x10Gb for just the 3 Ceph nodes, or

B) go the simpler route of having a single Ceph network for both kinds of traffic, but bond all the ports on each node, so compute and storage each have a 2x10Gb bond going to the same network?

For added context: I don't expect the cluster to get much larger. I could see adding a couple of storage nodes in the future at most; for any massive change in size we'd want to re-engineer anyway. I don't expect it to more than double in size, and if we do grow it, it would likely be by adding more Ceph storage nodes.
I would use [C]: bonded 10Gb with different VLANs. That way you can change your configuration later if you need to adjust something during production.

Udo
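For what it's worth, a rough sketch of what [C] could look like in /etc/network/interfaces on one node - interface names, VLAN IDs and addresses are only placeholders, and the bond mode/hash policy has to match what your switch supports (needs the ifenslave and vlan packages):

    auto bond0
    iface bond0 inet manual
        bond-slaves eth2 eth3            # the two 10Gb ports (example names)
        bond-mode 802.3ad                # LACP; switch side must be configured to match
        bond-miimon 100
        bond-xmit-hash-policy layer3+4   # lets different flows use different links

    # VLAN carrying the Ceph public network
    auto bond0.100
    iface bond0.100 inet static
        address 10.10.10.11
        netmask 255.255.255.0

    # VLAN carrying the Ceph cluster (replication) network
    auto bond0.200
    iface bond0.200 inet static
        address 10.10.20.11
        netmask 255.255.255.0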
 
Some added info for the replies below:

The storage nodes have 2x SSD for journals and 8x 7200 rpm SATA drives for the actual OSD data - 1 SSD per 4 HDDs. The host OS is on an 11th drive.

The compute nodes have 1 SSD for the OS, but it is small.
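In case it matters, when the OSDs get created each one's journal will point at one of the two SSDs, roughly like this (all device names are hypothetical, and the exact pveceph syntax may differ between versions):

    # /dev/sdb and /dev/sdc are the journal SSDs, /dev/sdd-/dev/sdk the 8 SATA disks
    pveceph createosd /dev/sdd -journal_dev /dev/sdb
    pveceph createosd /dev/sde -journal_dev /dev/sdb
    pveceph createosd /dev/sdf -journal_dev /dev/sdb
    pveceph createosd /dev/sdg -journal_dev /dev/sdb
    pveceph createosd /dev/sdh -journal_dev /dev/sdc
    pveceph createosd /dev/si -journal_dev /dev/sdc
    pveceph createosd /dev/sdj -journal_dev /dev/sdc
    pveceph createosd /dev/sdk -journal_dev /dev/sdc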

(A) is the "normal" approach for Ceph. You didn't describe your disk configuration, but unless you have multiple SSDs per host on the OSD hosts you are not likely to saturate the 10GbE links (or if you do, it will only be for short bursts).
...

I basically wasn't sure if there was a plus/minus trade-off here. For example, in my head, separating the networks obviously means guaranteed bandwidth for each. But merging them into one (again, in my head - I don't have tons of real-world experience with Ceph) could potentially mean that during normal use I might see a bit more than 10Gb, provided a node was reading from multiple nodes over multiple links, and maybe the same would be true during a rebuild (obviously at the cost of normal access speeds, but I'd be OK with that temporary burden during a failure event). It's not likely an issue now, but if we added 1-2 storage nodes I thought it might matter.
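One way I figure I could sanity-check that on a test bond is to push several parallel streams between two nodes and watch the per-port counters - iperf3 and the names below are just an example:

    # on one Ceph node: start a listener
    iperf3 -s

    # on a compute node: 4 parallel TCP streams to that host
    # (with layer3+4 hashing the streams may spread over both bond members;
    #  with layer2 hashing they usually all stay on one 10Gb link)
    iperf3 -c 10.10.10.21 -P 4

    # meanwhile, check the byte counters on each physical port
    ip -s link show eth2
    ip -s link show eth3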

Hi,
it depends on your storage, but I would run Ceph on all 5 nodes, because Ceph gets faster with every node you add.

I would use [C]: bonded 10Gb with different VLANs. That way you can change your configuration later if you need to adjust something during production.

Udo

The info is above, but the 3 storage nodes are significantly different in hardware from the compute nodes.

I never thought of doing C, but I really like the idea.

Thanks for the input, guys.
 
