Anyone using a 10GbE mesh/ring network for Ceph?

gkovacs

So there is a howto on the wiki that details the setup of a 10 Gbit/s Ethernet network without using a network switch:
http://pve.proxmox.com/wiki/Full_Mesh_Network_for_Ceph_Server

If I understand correctly, you need a two-port 10GbE NIC (or two NICs) in each node, and you connect each port to a different adjacent node, thereby connecting all nodes in a circle. The wiki article recommends it for a 3-node cluster, but states that it should work the same for a 5-node cluster.

Then, according to Method 1, you simply set up a broadcast-mode bond across the two ports on each node, which would mean all traffic to and from the network eventually propagates to all nodes. (Or, according to Method 2, you set up routes for all destinations.)

So in a 5-node cluster, a packet would reach its destination in one hop if it is lucky, or in two hops if it is unlucky.
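The hop estimate above can be checked with a few lines of Python. This is only a sketch under stated assumptions: nodes are numbered 0 to n-1 around the circle, a "hop" means one cable traversed, and intermediate nodes are assumed to forward traffic for their neighbours (which a bond on its own may not do). The function names are made up for illustration.

```python
def ring_hops(n: int, src: int, dst: int) -> int:
    """Fewest cables a packet crosses between two nodes on a ring of n nodes.

    Assumes nodes are numbered 0..n-1 around the circle and that traffic
    may travel in either direction (hypothetical model, not from the wiki).
    """
    clockwise = (dst - src) % n
    return min(clockwise, n - clockwise)


def worst_case_hops(n: int) -> int:
    """Largest hop count between any two nodes on a ring of n nodes."""
    return max(ring_hops(n, 0, dst) for dst in range(1, n))


if __name__ == "__main__":
    n = 5
    for dst in range(1, n):
        print(f"node 0 -> node {dst}: {ring_hops(n, 0, dst)} hop(s)")
    print(f"worst case for {n} nodes: {worst_case_hops(n)} hop(s)")
```

For 5 nodes, the two direct neighbours are one hop away and the other two nodes are two hops away, which matches the "lucky / unlucky" reasoning.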

My questions:
- Has anyone built this successfully?
- If yes, how many nodes, and which method?
- What kind of NICs and cabling have you used?
- Is the performance good for Ceph?
- Is there a performance difference between Method 1 (broadcast bonds) and Method 2 (up/down routing)?
 
you misunderstood the article. a full mesh means one link to every other node - the article explicitly states that you need "n-1" ports per node, where "n" is the number of nodes. so a 5 node cluster that is "fully" connected needs 4 ports on each node, 1 for connecting to every other node. what you are describing is a "ring" network topology (a dual ring to be exact). a ring is vastly inferior to a (full) mesh, except for the number of ports and cables involved.
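To put numbers on the difference, here is a small Python sketch that counts ports and cables for both topologies. The "n-1 ports per node" figure comes from the reply above; the cable counts follow from simple counting (each cable occupies two ports). The function names are hypothetical, and the sketch assumes at least 3 nodes.

```python
def full_mesh(n: int) -> dict:
    """Ports and cables for a full mesh: one direct link to every other node."""
    if n < 3:
        raise ValueError("sketch assumes at least 3 nodes")
    return {"ports_per_node": n - 1, "cables": n * (n - 1) // 2}


def ring(n: int) -> dict:
    """Ports and cables for a ring: each node links to its two neighbours."""
    if n < 3:
        raise ValueError("sketch assumes at least 3 nodes")
    return {"ports_per_node": 2, "cables": n}


if __name__ == "__main__":
    for n in (3, 5):
        print(f"{n} nodes: full mesh {full_mesh(n)}, ring {ring(n)}")
```

With 3 nodes the two topologies come out identical (2 ports per node, 3 cables), which is probably where the confusion comes from. With 5 nodes a full mesh needs 4 ports per node and 10 cables, while a ring needs 2 ports per node and 5 cables.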
 

OK, thanks for clearing that up. So let's say I want to build a dual ring topology, because my test cluster consists of 5 nodes, and connecting every node to every other node would be impractical (cabling and available PCIe slots) and also not much cheaper than a switched setup. Also, with 5 nodes a dual ring should be similar in performance to a switched network.

- Will the dual ring topology work with the same broadcast bond method?

- Do I need to manually prevent loops in this setup somehow?
- Can I build this with Linux bonds, or would I need Open vSwitch for this?
 