[SOLVED] Ceph cluster network vs. Ceph public network: which data is transferred over which network?


Mar 4, 2014
I have completed the setup of a 6-node cluster running PVE and Ceph.

This is my ceph configuration:
root@ld3955:~# more /etc/pve/ceph.conf
auth client required = cephx
auth cluster required = cephx
auth service required = cephx
cluster network =
fsid = 6b1b5117-6e08-4843-93d6-xxxxxxxxxxxx
keyring = /etc/pve/priv/$cluster.$name.keyring
mon allow pool delete = true
osd journal size = 5120
osd pool default min size = 2
osd pool default size = 3
public network =
osd crush update on start = false

keyring = /var/lib/ceph/mds/ceph-$id/keyring

keyring = /var/lib/ceph/osd/ceph-$id/keyring

host = ld3955
mds standby for name = pve

host = ld3976
mds standby for name = pve

host = ld5505
mon addr =

host = ld5506
mon addr =

host = ld5507
mon addr =

The public network is configured on a 10GBit NIC, and the cluster network on a 40GBit NIC.
All OSDs are only connected to nodes

I have mounted an NFS share on ld3955 using a 40GBit NIC.

When I transfer data from the remote location to an RBD, I observe a maximum throughput of 10GBit/s on these interfaces:
vmbr0 (= public network)
bond0 (= cluster network)

My assumption was that all data that must be distributed to the OSDs would be transferred over the cluster network.
But this seems to be wrong. Otherwise, why do I see this traffic on vmbr0 with iftop?
353Mb 707Mb 1,04Gb 1,38Gb 1,73Gb
ld3955 => ld5505 973Mb 1,17Gb 1,15Gb
<= 881Kb 1,16Mb 1,08Mb
ld3955 => ld5506 1,20Gb 1,11Gb 1,12Gb
<= 1,49Mb 1,61Mb 1,53Mb
ld3955 => ld5508 1,06Gb 1,07Gb 1,08Gb
<= 1,31Mb 1,30Mb 1,25Mb
ld3955 => ld5507 1,13Gb 1,04Gb 1,03Gb
<= 1,34Mb 1,33Mb 1,28Mb
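
To put a single number on the output above, here is a minimal Python sketch that adds up the first rate column of the four outgoing (=>) lines. The helper name parse_rate is hypothetical, and the sketch assumes that iftop is reporting bits per second with a decimal comma, as the units printed above suggest.

```python
# Hypothetical helper: convert iftop rate strings such as "973Mb" or "1,20Gb"
# (decimal comma, as in the output above) to gigabits per second.
UNITS = {"Kb": 1e-6, "Mb": 1e-3, "Gb": 1.0}

def parse_rate(text: str) -> float:
    value, unit = text[:-2], text[-2:]
    return float(value.replace(",", ".")) * UNITS[unit]

# First rate column of the four outgoing ("=>") lines seen on vmbr0.
outgoing = ["973Mb", "1,20Gb", "1,06Gb", "1,13Gb"]
total_gbit = sum(parse_rate(rate) for rate in outgoing)
print(f"{total_gbit:.2f} Gbit/s leaving ld3955 on vmbr0")
```

Under these assumptions the four flows add up to roughly 4.4 Gbit/s of outgoing traffic on the public interface at that moment.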

Please comment and advise.

See here: a client sends its data over the public network. Since PVE is both client and server, it will transfer data on both networks.
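
As a rough illustration of that split, the following Python sketch estimates how many bytes a single replicated write puts on each network. It is a simplified model, not Ceph's actual accounting: the function name write_traffic is made up for this example, and it ignores protocol overhead, acknowledgements, and recovery traffic. It assumes a replicated pool in which the client sends the data once to the primary OSD and the primary forwards it to the remaining replicas.

```python
def write_traffic(client_bytes: int, pool_size: int) -> dict:
    """Bytes on each network for one replicated write (simplified model)."""
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    return {
        # The client sends the data once to the primary OSD.
        "public": client_bytes,
        # The primary OSD forwards one copy to each remaining replica.
        "cluster": client_bytes * (pool_size - 1),
    }

# "osd pool default size = 3" is taken from the posted ceph.conf.
print(write_traffic(1_000_000_000, 3))
```

With size = 3, every byte written by the client still crosses the public network once, and about twice that volume then travels between OSDs on the cluster network.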

Well, in this particular scenario I tried to outwit Ceph.
That is, the client communicates with the Ceph cluster over the cluster network:
the client's NFS share is mounted over the cluster network NIC.

However, this does not have the expected effect when I use host ld3955, which is neither a MON nor an OSD node.
Switching nodes, i.e. mounting the NFS share on ld5505, which is both a MON and an OSD node, works better: the throughput on bond0 is more than 10GBit/s.
The cluster network seen in the picture is used only by the OSDs for their replication and heartbeat traffic. All other communication with Ceph happens on the public network.
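
The rule stated here can be written down as a small Python sketch. The function name network_for and the role strings are invented for illustration and are not part of any Ceph API. The sketch only encodes the statement above: traffic between two OSDs uses the cluster network, everything else uses the public network.

```python
def network_for(src_role: str, dst_role: str) -> str:
    """Return which Ceph network carries traffic between two roles."""
    if src_role == "osd" and dst_role == "osd":
        return "cluster"  # OSD-to-OSD replication and heartbeat
    return "public"       # clients, MONs and MDSs talking to anything

for pair in [("client", "osd"), ("osd", "osd"), ("osd", "mon")]:
    print(pair, "->", network_for(*pair))
```

This is why mounting the NFS share over the 40GBit NIC on ld3955 does not help: ld3955 acts only as a client, so its writes to the OSDs still go over the public network.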

