Yes, I think so. Another question, seeing that enterprise NVMe drives do not saturate a 100GbE network: is there any benefit to using separate networks for the Ceph public and cluster networks, or is it safe to run all Ceph traffic over one 100GbE network? (I'm talking about a 3-node cluster with a minimum of 4 OSDs per node.)
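For context, keeping everything on one network just means pointing both settings at the same subnet in ceph.conf; a minimal sketch (the subnet here is made up, adjust to your environment):
[global]
    public_network = 192.168.100.0/24
    # Same subnet = all Ceph traffic on one network. Point this at a second
    # subnet later if you ever want to split replication traffic out.
    cluster_network = 192.168.100.0/24
If cluster_network is left out entirely, Ceph simply runs everything over the public network.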
8x nodes
2x 3.2 TB NVMe per server (will add more as I need it)
Dual 25G NICs connected to a vPC switch setup, bonded with LACP for redundancy and also speed. I will have 50G usable.
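For reference, on a Proxmox-style ifupdown2 setup an LACP bond along those lines would look roughly like this in /etc/network/interfaces (bond0, the enp65s0f* names and the address are placeholders):
auto bond0
iface bond0 inet static
    address 10.10.10.11/24
    bond-slaves enp65s0f0 enp65s0f1
    bond-mode 802.3ad
    bond-miimon 100
    # layer3+4 hashes per TCP connection, so parallel Ceph sessions can use
    # both 25G links while any single connection stays on one link
    bond-xmit-hash-policy layer3+4
The switch side (the vPC pair) of course needs a matching LACP port-channel.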
I'd think it should be safe with a 100G network, based on my personal experience. My cluster averages 10-15 Gbps of read traffic from 48 OSDs on 12 NVMe drives across 5 nodes.
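For anyone wanting to check where their own cluster sits, the live client throughput is visible with, for example:
ceph -s                # the io: section shows current client read/write throughput
ceph osd pool stats    # the same figures broken down per pool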
If you bond two 25G links with LACP you get 50G worth of bandwidth and also redundancy. So if you expect very busy I/O, 12 NVMe drives could in theory generate 60 to 80 Gbps of traffic.
The Ceph network is on a pair of 25G links in an LACP bond. However, I don't recall ever seeing transfers go above 25G. It might be the way my network guys set things up, or we're doing something wrong.
I feel like 50G is more than enough; that's roughly 5 GB/s of throughput (not sure how many IOPS), and that's per node. Across the entire cluster it's going to be way more.
With some asterisks/footnotes on that 50G figure ;-)
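The main asterisk, presumably, is the usual LACP caveat: the bond hashes each flow onto one member link, so a single TCP stream never exceeds 25G and you only approach 50G with many parallel flows, which would also explain never seeing a single transfer above 25G. Worth checking what the bond is actually doing, e.g.:
cat /proc/net/bonding/bond0    # shows the bond mode and transmit hash policy in use
ip -s link show                # per-interface byte counters, to see how evenly the members are loaded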
"Detail: I plan to use 6 physical ports of each switch for an LAG that will provide data communication between the switches to form the 'vPC' stack." -- At this point, I mean that I will use 6 ports for stacking (vPC). I don't know exactly how to calculate how many ports would be needed for communication between the stacked switches, but I figured that 6 ports (240Gbps) would be enough, since 12 servers with 2 10Gbps ports each would consume at most that bandwidth. So I made this choice. The above-mentioned RBD clients will use the Ceph Pools to store the virtual disks of the VMs. These include boot virtual disks with various operating systems, as well as virtual disks with file servers or databases such as PostgreSQL, Oracle, Microsoft SQL Server, and InterSystems IRIS.@adriano_da_silva the oversubscription option allows you to basically overcommit the available bandwidth and switching capacity.
In a pure Ceph workload you would be unlikely to notice the extra few microseconds of latency under normal load. Of course, if you load this to the edge and consume all the bandwidth at once (e.g. during a rebuild), you may start to run out of bandwidth.
Did you factor in your 2-4 stacking ports? (If I'm not mistaken, the Nexus requires you to dedicate ports to stacking.) I also don't know whether that model has the processing power to also handle LACP and VLANs once you start splitting out to many ports.
Can you post your system configuration and the command you have used? It looks OK, but if you're unsure, check 'ceph osd perf' while benchmarking to see if one of your NVMe drives is performing badly. 'ceph tell' is also useful to see if one NVMe is not performing like the others. But I guess you're fine.
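Concretely, those checks look something like this (osd.0 is just an example ID):
ceph osd perf           # per-OSD commit/apply latency; one outlier usually means one slow drive
ceph tell osd.0 bench   # writes ~1 GiB to that single OSD and reports the throughput
ceph tell osd.* bench   # the same across every OSD, to spot a drive that lags behind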
Hi, I have a similar setup with a 3-node mesh configuration.
AMD EPYC 7543P 32-core processor, 512 GB RAM
I have only 2x 7 TB 7450 PRO (MTFDKCB7T6TFR) per node for now. I did the FRR setup on 100 Gbps; testing with iperf gives about 96 Gbps.
A single NVMe, tested locally with:
fio --ioengine=libaio --filename=/dev/nvme1n1 --direct=1 --sync=1 --rw=write --bs=4M --numjobs=1 --iodepth=1 --runtime=60 --time_based --name=fio
[screenshot of fio results attached]
When I run rados bench from 3 nodes, I get a max of 11-12 GB/s on cluster read.
This is the rados bench read result.
Is this a good score?
I was expecting the read from a single rados bench instance to be a little better, at least in the range of the NVMe's 5000-6000 MB/s.
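For comparison, a typical rados bench run looks like this ('testpool' is a placeholder; -t sets the concurrent ops per client, 16 by default):
rados bench -p testpool 60 write --no-cleanup   # lay down benchmark objects first
rados bench -p testpool 60 seq -t 16            # sequential read of those objects
rados bench -p testpool 60 rand -t 16           # random read
rados -p testpool cleanup                       # remove the benchmark objects afterwards
A single instance is one client with limited concurrency, so it won't necessarily reach raw NVMe speed; the aggregate across all nodes is the more telling number.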
Sorry, it's a mesh config. I have dual 2x 100G but no switch, which is OK for me. I also have 6x 10 Gb/s (also no switch for this); maybe I could move the Ceph public network there, but I think it would probably be slower? What do you think?
@tane: rados bench simulates multiple clients at once. Your bandwidth is 100 Gbps and you get ~96 Gbps (12*8) of throughput, which seems acceptable. If you have the capacity to do 2x 100G per server, you may be able to get better results.
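If you want to push it further with the hardware you have, one option is to run several bench instances in parallel, one per node, each with its own run name so their objects don't clash, and add up the per-node results ('testpool' again a placeholder):
rados bench -p testpool 60 write --no-cleanup --run-name node1    # on node 1
rados bench -p testpool 60 write --no-cleanup --run-name node2    # on node 2, at the same time
rados bench -p testpool 60 seq --run-name node1                   # read node 1's objects back later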