Performance, Ceph cluster, NVMe, too slow: 130 MB/s instead of 900 MB/s, 6 times slower

As expected:

2022-10-13 06:27:42 migration active, transferred 894.6 MiB of 8.0 GiB VM-state, 111.7 MiB/s
2022-10-13 06:27:43 migration active, transferred 1006.7 MiB of 8.0 GiB VM-state, 113.7 MiB/s
2022-10-13 06:27:44 migration active, transferred 1.1 GiB of 8.0 GiB VM-state, 133.0 MiB/s
2022-10-13 06:27:45 migration active, transferred 1.2 GiB of 8.0 GiB VM-state, 119.3 MiB/s
2022-10-13 06:27:47 migration active, transferred 1.4 GiB of 8.0 GiB VM-state, 165.7 MiB/s
2022-10-13 06:27:48 migration active, transferred 1.5 GiB of 8.0 GiB VM-state, 195.2 MiB/s
 
It makes perfect sense: if I put the "public" Ceph network on 1 Gbit (the public network being the one used for the data replication), the bandwidth cannot be more than 1 Gbit, which is exactly what the numbers show (~130 MB/s). The cluster network, on the other hand, handles monitoring, status and similar traffic, which needs much less bandwidth.
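As a quick sanity check of that reasoning (a rough sketch in Python; the 1 Gbit link speed is the assumption from above, the ~130 MB/s figure comes from the migration log):

# Rough sanity check: what can a 1 Gbit/s link deliver at most?
link_bits_per_s = 1_000_000_000                       # assumed 1 Gbit/s public network

max_mb_per_s  = link_bits_per_s / 8 / 1_000_000       # decimal MB/s
max_mib_per_s = link_bits_per_s / 8 / (1024 * 1024)   # MiB/s, the unit used in the log

print(f"theoretical cap: {max_mb_per_s:.0f} MB/s (~{max_mib_per_s:.0f} MiB/s)")
# -> theoretical cap: 125 MB/s (~119 MiB/s), i.e. roughly the 110-130 MiB/s seen above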

This is not how I understand it.

[Image: ceph-networks.png]
The cluster network relieves OSD replication and heartbeat traffic from the public network.
It is possible to run a Ceph Storage Cluster with two networks:
a public (client, front-side) network and
a cluster (private, replication, back-side) network.
https://docs.ceph.com/en/latest/rad...k-config-ref/#network-configuration-reference

My understanding (for simplicity reduced to the heavy traffic only):
  • Public network = traffic between your guests/clients and a/the OSD(s). (Guests/Clients <=> OSD)
  • Cluster network = traffic between the OSDs themselves for replication. (OSD <=> OSDs)
So all data you write in your guests/clients first has to go through the public network to one OSD and, simultaneously(?!), from this OSD through the cluster network to the other OSDs for replication.
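To make that traffic split concrete, here is a minimal sketch (the replicated pool with size=3 is an assumption, not something stated in this thread):

# Sketch: which network carries how much traffic for a replicated client write.
# Assumption: replicated pool with size=3; the primary OSD forwards the two
# extra copies to its peer OSDs over the cluster network.
def traffic_per_network(client_write_mb_s: float, pool_size: int = 3):
    public_mb_s  = client_write_mb_s                    # client -> primary OSD
    cluster_mb_s = client_write_mb_s * (pool_size - 1)  # primary OSD -> replica OSDs
    return public_mb_s, cluster_mb_s

pub, clu = traffic_per_network(500.0, pool_size=3)
print(f"public: {pub} MB/s, cluster: {clu} MB/s")
# -> public: 500.0 MB/s, cluster: 1000.0 MB/s
# For write-heavy workloads the cluster network therefore needs at least as
# much bandwidth as the public network, usually more.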

It is entirely possible that my understanding is wrong; someone with more knowledge, please correct me if so!

[0] https://forum.proxmox.com/threads/proxmox-ve-ceph-benchmark-2020-09-hyper-converged-with-nvme.76516
 

To be honest, I understood it exactly like you.
Which would also mean that both networks need to have the same bandwidth: if the client can only push 1 Gbit of data, it doesn't make sense to have a 10 Gbit cluster network (except for 10 nodes - replication, depending on the replica sets).
Yesterday I reconfigured the cluster; Ceph now uses the cluster and public network together on 10 Gbit, with VLANs, but it is still slow. It still copies at roughly 130 MB/s. If I can't get full performance, I need to go back to ESXi. (It's a production cluster and it just needs to work; I have already spent too much time on this.)
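One quick way to rule out a raw network bottleneck independent of Ceph is a plain TCP throughput test between two nodes. iperf3 is the usual tool for that; below is only a minimal Python stand-in (port, chunk size and total transfer size are arbitrary choices):

# Minimal TCP throughput test between two nodes (a crude iperf3 stand-in).
# Run "python3 tput.py server" on one node and
# "python3 tput.py client <server-ip>" on the other.
import socket, sys, time

PORT  = 5201                # arbitrary port (same default as iperf3)
CHUNK = 1024 * 1024         # 1 MiB send/receive buffer
TOTAL = 4 * 1024**3         # send 4 GiB in total

def server():
    with socket.create_server(("", PORT)) as srv:
        conn, addr = srv.accept()
        received, start = 0, time.time()
        while True:
            data = conn.recv(CHUNK)
            if not data:
                break
            received += len(data)
        secs = time.time() - start
        print(f"received {received / 1024**2:.0f} MiB in {secs:.1f}s "
              f"= {received / 1024**2 / secs:.0f} MiB/s from {addr[0]}")

def client(host):
    payload = b"\0" * CHUNK
    start = time.time()
    with socket.create_connection((host, PORT)) as conn:
        sent = 0
        while sent < TOTAL:
            conn.sendall(payload)
            sent += len(payload)
    secs = time.time() - start
    print(f"sent {sent / 1024**2:.0f} MiB in {secs:.1f}s "
          f"= {sent / 1024**2 / secs:.0f} MiB/s")

if __name__ == "__main__":
    server() if sys.argv[1] == "server" else client(sys.argv[2])

If this test already tops out around 1 Gbit/s, the problem is in the network path (link speed, VLAN, routing), not in Ceph.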
 
I found the mistake.
Look at the numbers: 12 GB in 20 seconds.


create full clone of drive virtio0 (ceph_01:vm-102-disk-0)
drive mirror is starting for drive-virtio0
drive-virtio0: transferred 0.0 B of 32.0 GiB (0.00%) in 0s
drive-virtio0: transferred 744.0 MiB of 32.0 GiB (2.27%) in 1s
drive-virtio0: transferred 1.4 GiB of 32.0 GiB (4.49%) in 2s
drive-virtio0: transferred 2.2 GiB of 32.0 GiB (6.81%) in 3s
drive-virtio0: transferred 2.6 GiB of 32.0 GiB (8.13%) in 4s
drive-virtio0: transferred 3.3 GiB of 32.0 GiB (10.41%) in 5s
drive-virtio0: transferred 3.9 GiB of 32.0 GiB (12.12%) in 6s
drive-virtio0: transferred 4.4 GiB of 32.0 GiB (13.82%) in 7s
drive-virtio0: transferred 5.1 GiB of 32.0 GiB (15.82%) in 8s
drive-virtio0: transferred 5.4 GiB of 32.0 GiB (16.86%) in 9s
drive-virtio0: transferred 5.8 GiB of 32.0 GiB (18.24%) in 10s
drive-virtio0: transferred 6.2 GiB of 32.0 GiB (19.43%) in 11s
drive-virtio0: transferred 6.7 GiB of 32.0 GiB (21.07%) in 12s
drive-virtio0: transferred 7.3 GiB of 32.0 GiB (22.96%) in 13s
drive-virtio0: transferred 8.0 GiB of 32.0 GiB (25.08%) in 14s
drive-virtio0: transferred 8.8 GiB of 32.0 GiB (27.41%) in 15s
drive-virtio0: transferred 9.3 GiB of 32.0 GiB (29.17%) in 16s
drive-virtio0: transferred 10.0 GiB of 32.0 GiB (31.13%) in 17s
drive-virtio0: transferred 10.6 GiB of 32.0 GiB (33.11%) in 18s
drive-virtio0: transferred 11.3 GiB of 32.0 GiB (35.23%) in 19s
drive-virtio0: transferred 12.0 GiB of 32.0 GiB (37.44%) in 20s
drive-virtio0: transferred 12.6 GiB of 32.0 GiB (39.25%) in 21s
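For reference, those numbers work out to roughly 600 MiB/s. A small sketch that computes the rate from such a progress line (the regular expression just matches the log format shown above):

# Compute the transfer rate from a "drive-virtio0: transferred ... in Ns" line.
import re

line = "drive-virtio0: transferred 12.0 GiB of 32.0 GiB (37.44%) in 20s"

m = re.search(r"transferred ([\d.]+) GiB of .* in (\d+)s", line)
gib, secs = float(m.group(1)), int(m.group(2))

rate_mib  = gib * 1024 / secs                 # MiB/s
rate_gbit = gib * 1024**3 * 8 / secs / 1e9    # decimal Gbit/s on the wire
print(f"{gib} GiB in {secs}s = {rate_mib:.0f} MiB/s (~{rate_gbit:.1f} Gbit/s)")
# -> 12.0 GiB in 20s = 614 MiB/s (~5.2 Gbit/s), far more than a 1 Gbit link could carry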
 
The cluster network is only used for OSD->OSD replication. All other traffic (VM->OSD, VM->MON, OSD->MON, ...) goes over the public network.

If no cluster network is defined, OSD->OSD replication goes over the public network as well.

It doesn't make any sense to have different bandwidth for the public and private networks; you need at least the same capacity on the cluster network.

Write path
------------
client ----- public (10 Gbit/s) -----> OSD ----- private (10 Gbit/s) -----> OSD
 
I made a mistake in the configuration.
I noticed it yesterday: the Ceph 10 Gbit subnet was 10.10.10.0, and later I created an SDN with a subnet for the webservers that was also 10.10.10.0. After I changed the subnet for the webservers, it runs like a charm.
Now I face another issue; more on that in my new post.
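For anyone hitting something similar: overlapping subnets like this are easy to catch programmatically. A minimal sketch using Python's ipaddress module (the /24 prefix length is an assumption; the thread only mentions 10.10.10.0):

# Detect overlapping subnets, e.g. a Ceph network colliding with an SDN subnet.
import ipaddress

ceph_net = ipaddress.ip_network("10.10.10.0/24")   # Ceph public/cluster network
sdn_net  = ipaddress.ip_network("10.10.10.0/24")   # SDN subnet for the webservers

if ceph_net.overlaps(sdn_net):
    print(f"WARNING: {ceph_net} and {sdn_net} overlap - traffic may take the wrong path")
else:
    print("subnets are distinct")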
 
