Ceph PVE hyperconverged networking

davids01

New Member
May 13, 2025
Hi,

The Ceph docs suggest that a single network is sufficient.

https://docs.ceph.com/en/squid/rados/configuration/network-config-ref/

Whilst the PVE docs suggest separating the two, which makes sense if in doubt. I'm assuming that's suggested for situations where hardware/NIC capability varies, e.g. where the public and cluster networks can't go on equally capable interfaces.

https://pve.proxmox.com/pve-docs/chapter-pveceph.html


I just wanted to double check this makes sense for my setup:

3x Dell servers with 4x 1.92TB SSDs for Ceph, a 4-port 25GbE NIC and a 2-port 1GbE NIC:

- separate dedicated corosync ring0 on a 1GbE copper OOB switch
- 2x 25GbE ports dedicated to Ceph* private, a dedicated network on a 100GbE switch with breakouts
- ports 3 and 4 on these 4-port server NICs will be used for other networks required for VM access (not Ceph)
- corosync ring1 on either port 3 or 4, TBC


*these two will be LACP bonded and carry a single Ceph network

Just trying to establish if this is the right move: going with a single Ceph network on the 2x 25GbE ports. Because the ports are identical in capability, I was thinking there's no need to separate the Ceph public/cluster networks. Since corosync is separated from Ceph, and I believe LACP has minimal latency impact, I was leaning towards this approach for simplicity.
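For illustration, a rough sketch of what I'm thinking the bond would look like in /etc/network/interfaces on each node (the interface names, bond name and 10.10.10.0/24 addressing below are placeholders, not final details):

    auto bond0
    iface bond0 inet static
            address 10.10.10.11/24
            bond-slaves enp65s0f0np0 enp65s0f1np1
            bond-miimon 100
            bond-mode 802.3ad
            bond-xmit-hash-policy layer3+4
            # single Ceph network (public only) on the bonded 2x 25GbE ports;
            # the two switch ports need to be configured as one LACP port-channel

My understanding is that the layer3+4 hash policy spreads Ceph's many OSD/monitor TCP connections across both links, although any single connection is still limited to one 25GbE member.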

thanks
 
You explicitly named the private Ceph network, but where will your Ceph public network be?

FYI: depending on the SSDs, 25 Gb could be the bottleneck. A single PCIe 3.0 NVMe will outperform your 25 Gb link.
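As a rough back-of-the-envelope check (typical datasheet figures, not measurements from your hardware):

    25 GbE line rate:        25 Gbit/s ÷ 8 ≈ 3.1 GB/s
    PCIe 3.0 x4 NVMe (read): ~3.5 GB/s sequential

So a single fast PCIe 3.0 NVMe can indeed exceed what one 25 Gb link carries; SATA SSDs sit well below that.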
 
Ah, apologies ... I meant a single Ceph public network (no separate Ceph cluster network), using private IPs on an isolated layer 2 VLAN.

I think the PVE install will default to something like 10.10.10.0 with a single Ceph network.
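For reference, as far as I can tell a single-network setup just means /etc/pve/ceph.conf ends up with only a public network defined, roughly like this (subnet is a placeholder; fsid, mon entries etc. omitted):

    [global]
            public_network = 10.10.10.0/24
            # no cluster_network line: OSD replication traffic
            # also uses the public network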

The SSDs are SATA.
 
Not sure whether I understood you correctly:
- Two 1GBit ports for corosync redundancy
- Two 25GBit ports for the Ceph private or public network?
- Two 25GBit ports for VM networking

  • 3x Dell servers with 4x1.92TB SSD for Ceph RBD (VM image storage)
  • switch 1 - 1GbE OOB switch (for corosync and BMC/iDRACs)
  • switch 2 - 100GbE switch with breakouts (for Ceph internally; will also carry traffic for 1x VM client access network)
  • no switch redundancy (no MLAG etc), risk accepted

1x dedicated corosync network on the 1GbE OOB switch (corosync ring0)
2x 25GBit for Ceph (currently proposing all Ceph lives here); I'm proposing these two NIC ports are bonded. The question is: should these bonded ports carry a Ceph public + cluster network, or just a single Ceph public network (carrying all Ceph traffic internally)?
------
2x 25GBit ports are for the networks that the VMs require for access. These aren't for Ceph; they are for client access and will be distinct physical connections. 1x port goes to the 100GbE switch and will be carried via the switch uplink. The other port will be fibre+SFP from the NIC going to a completely different switch.
corosync ring1 will be placed on one of the two VM access network NIC ports
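For clarity, a rough sketch of how I expect the two corosync links to end up in /etc/pve/corosync.conf (node names and the 192.168.x.x addresses are placeholders I'm assuming):

    nodelist {
      node {
        name: pve1
        nodeid: 1
        quorum_votes: 1
        # link 0: dedicated 1GbE OOB corosync network
        ring0_addr: 192.168.10.11
        # link 1: one of the VM access NIC ports
        ring1_addr: 192.168.20.11
      }
      # ... pve2 / pve3 entries in the same form
    }
    totem {
      interface {
        linknumber: 0
      }
      interface {
        linknumber: 1
      }
    }

With kronosnet (the PVE default) corosync should fail over between the links automatically, and the preferred link can be steered with knet_link_priority if needed.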


With 2x 25GbE and only 4x SATA SSDs per host, I think it will be OK for the Ceph traffic, for the initial deployment at least.
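Rough numbers behind that assumption (taking ~550 MB/s sequential per SATA SSD as the interface ceiling):

    4x SATA SSD per node:   4 × ~0.55 GB/s ≈ 2.2 GB/s
    bonded 2x 25GbE:        up to ~6.2 GB/s aggregate (≈3.1 GB/s per single flow)

So the SATA OSDs, not the network, should be the ceiling in this layout.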
 
Hopefully that clarifies things sufficiently ...

I saw this explanation elsewhere:

The Ceph public network is mandatory and a lot of traffic goes over it. The optional Ceph cluster network can be used to move the inter-OSD replication traffic to a different network to distribute the load.

In my case I'm seeking to combine them and just have the Ceph public network, unless you think there is a good reason not to with my setup above.
- The reason for having a single network is to reduce admin complexity (for others who may step in); no other reason.
- I was thinking the bonded 2x 25GbE ports would handle it fine, but any feedback gratefully received.
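For comparison, as I understand it the only Ceph-side difference if I did split later would be an extra cluster_network entry in ceph.conf (both subnets are placeholders), e.g.:

    [global]
            # MON/MGR and client (VM/RBD) traffic
            public_network  = 10.10.10.0/24
            # OSD replication and recovery traffic
            cluster_network = 10.10.20.0/24

Whereas with the single bonded pair I'd just leave cluster_network unset and everything runs over the public network.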

thanks
 
2x 25GbE ports dedicated to Ceph* private, a dedicated network on a 100GbE switch with breakouts
Slightly off-topic (and I might be missing something): if you use a single switch, your cluster will have very reduced availability, as the switch is a SPOF. Same with that breakout cable (SFP cables can fail too).
 
I'm also interested in thoughts here, as the advice in the Ceph docs vs the Proxmox docs can seem contradictory regarding separating the two Ceph networks. The term "cluster network" is also confusing, as this usually refers to corosync, but in the context of Ceph it means something else.
 
The Ceph docs' recommendations are based on simplicity of deployment and the fact that in a pure Ceph cluster you will have dozens or more servers contributing to the overall cluster network capacity. In a typical PVE+Ceph cluster you usually have only a few nodes, so less overall network capacity: adding a couple of bonded links dedicated to replication traffic has a higher chance of improving Ceph performance, as Ceph can use more links.

The term "cluster network" is also confusing as this usually this is referencing corosync, but in the context of Ceph it means something else.
Because it's the "Ceph cluster network" and not the "PVE cluster network" ;) In fact, even if both run on the same hosts, they are completely independent clusters with their own quorum management (corosync vs Ceph monitors), hence they each use their own "cluster network".
 
That’s very useful and makes sense, thank you.
I think on my “cluster network” point I just meant I’d like the PVE documentation and UI to be more explicit when it’s referencing one or the other.
 
Because it's the "Ceph cluster network" and not the "PVE cluster network" ;) In fact, even if both run on the same hosts, they are completely independent clusters with their own quorum management (corosync vs Ceph monitors), hence they each use their own "cluster network".
To make it a little bit more complex ;) : the Ceph cluster network is optional and useful if you want to place the replication traffic (storing the 2nd and 3rd replicas on other hosts) on a different physical network. Otherwise, the Ceph public network will be used for this as well, as it is used for all the Ceph client traffic (e.g. VMs).

Therefore I try to make it clear which cluster network I am talking about by stating whether I mean the PVE/corosync cluster network or the Ceph public/cluster network.

I think on my “cluster network” point I just meant I’d like the PVE documentation and UI to be more explicit when it’s referencing one or the other.
Thanks for that feedback. I'll keep that in mind when we are doing work on the docs/GUI. In some situations it should be clear enough from the context, e.g. in the initial Ceph config wizard when you install Ceph on the first node via the GUI.
 