Howdy,
So we've got a relatively small Ceph NVMe cluster consisting of 4x nodes, each with a Samsung datacenter 3.8TB M.2 SSD inside. Each node has 4x 10G connections: 2x in LACP for normal traffic, 2x in LACP for Ceph traffic. Everything is connected through a pair of S4048-ON switches.
We're looking at moving towards 25G, since two of these nodes already have Intel 25G NICs but are running at only 10G on the current switches. However, we're not ready to move all nodes to 25G yet, so we'd end up with two nodes on 10G and two on 25G.
My question is: how well does Ceph handle a mixed-speed setup like this? If a 25G node sends traffic to a 10G node at >10Gbps, that obviously risks buffering at the egress port. If we pick a switch without a very large packet buffer (models seem to have either ~16MB or ~8GB, nothing in between), are we likely to hit notable issues?
Thanks!