Ceph over hyper-converged network

Nov 15, 2015
Hi good people.

I have a question about a Ceph deployment. My hardware configuration is:
4x servers, each with a dual-port 40 Gbps Mellanox NIC
2x Mellanox 40 Gbps switches

Can I bond the NIC ports and use the bond as the main network for Ceph, cluster and virtual machine traffic? I have built such a deployment with Microsoft Hyper-V 2013 and 2016 using Storage Spaces. As far as I can see in the kernel documentation, connecting one server to two different switches only allows active-backup mode. That is not the important part of the question, though; active-backup mode is acceptable. What matters more: can I use a single 40 Gbps network to carry Ceph, cluster and virtual machine traffic? The Ceph documentation notes that Ceph needs a separate 10 Gbps network. I think it might be good to bond the NIC ports and separate the traffic with VLANs, while carrying all of it over one 40 Gbps link.

The main idea is to get a fully highly available scheme across servers, switches and network links. If I instead use one 40 Gbps link for Ceph data and the other 40 Gbps link for cluster and virtual machine access, I lose high availability: if one switch goes down, either Ceph or the cluster becomes unavailable, and either way the cluster is down. I can't put more NICs in the servers: no space left (and no money left =)).

Thx.

PS: A sample scheme is attached. Maybe there is another way?
 

Attachments

  • cluster.PNG
Hi,

"But as i see in kernel documentation connecting one server to different switches can give me only active-backup mode"
If your switches support it, you can also use LACP (IEEE 802.3ad).
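
Whichever mode you end up with, on Linux you can check what the bond is actually doing by reading its status file under /proc/net/bonding/. Below is a rough Python sketch that parses that text. The bond and interface names (bond0, ens1f0, ens1f1) and the SAMPLE text are made up for illustration; only the field labels follow what the kernel bonding driver prints.

```python
def parse_bond_status(text):
    """Extract bonding mode, active slave and per-slave link state."""
    status = {"mode": None, "active_slave": None, "slaves": {}}
    current = None  # name of the slave section currently being parsed
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or line without a "key: value" pair
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            status["mode"] = value
        elif key == "Currently Active Slave":
            status["active_slave"] = value
        elif key == "Slave Interface":
            current = value
            status["slaves"][current] = None
        elif key == "MII Status" and current is not None:
            # MII Status lines before the first slave describe the bond itself
            status["slaves"][current] = value
    return status


# Illustrative sample only (hypothetical interface names), trimmed to the
# fields the parser looks at.
SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: ens1f0
MII Status: up

Slave Interface: ens1f0
MII Status: up

Slave Interface: ens1f1
MII Status: down
"""

print(parse_bond_status(SAMPLE))
```

On a real host you would pass in the contents of /proc/net/bonding/bond0 (assuming the bond is named bond0) instead of SAMPLE. With LACP the mode line should read "IEEE 802.3ad Dynamic link aggregation" rather than active-backup.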

The general problem with VLANs is latency, not bandwidth.
Corosync and Ceph are very sensitive to high latency.
In the worst case your cluster crashes and you lose data.
So you have to monitor your switches very closely.
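
As a starting point for that kind of monitoring, here is a small Python sketch that times TCP connects to a peer node and summarizes the samples. TCP connect time is only a rough proxy for network latency, and the peer address, port and the 2 ms limit in the comments are placeholder assumptions, not values taken from the Corosync or Ceph documentation.

```python
import socket
import statistics
import time


def tcp_connect_ms(host, port, timeout=1.0):
    """Return the time in milliseconds taken to open a TCP connection."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection is closed again right away
    return (time.perf_counter() - start) * 1000.0


def summarize(samples_ms, limit_ms):
    """Summarize latency samples and count those above limit_ms."""
    ordered = sorted(samples_ms)
    return {
        "median_ms": statistics.median(ordered),
        "max_ms": ordered[-1],
        "over_limit": sum(1 for s in ordered if s > limit_ms),
    }


# Hypothetical usage against a peer node that listens on SSH:
#   samples = [tcp_connect_ms("10.0.0.2", 22) for _ in range(20)]
#   print(summarize(samples, limit_ms=2.0))
```

Run it periodically from each node and watch max_ms and over_limit rather than the median, because short latency spikes are what hurt the cluster.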
 
As far as I know, LACP is only supported when all links go to the same switch, but I need to use different switches to get failover.
Can you explain what you mean by "In the worst case your cluster is crashing and your lose data."? With one switch, yes: if the switch is lost, the cluster is lost. That is why I need two switches, to get a failover network.
 
LACP over multiple switches can be achieved with
multi-chassis link aggregation, which Mellanox calls MLAG.