[SOLVED] Configuring cluster networks with Ceph

ph0x

Renowned Member
Jul 5, 2020
Hi everyone!
I recently read a lot about hyperconverged PVE/Ceph clusters and now struggle to see the forest for the trees. ;-)
I (plan to) have a three-node Proxmox cluster with 4 NICs in each node (2x 1 GbE, 2x 10 GbE) and would like some advice on which NIC should be dedicated to which service.
The first one is easy: one 1 GbE for Corosync, maybe with a fallback on one 10 GbE NIC (which is connected to a different switch).
Next, I would dedicate one 10 GbE NIC to VM traffic to the LAN.
And last, I would dedicate the other two NICs to Ceph, one for the public ("external") network and one for the cluster network.

Now, does this make sense? If so, should the 10 GbE NIC be dedicated to the public or to the cluster network? Or would an active/backup bond between those two NICs make more sense, given that they are connected to two different switches?

I appreciate any input and also further reading material. :)

Regards
Marco
 
You want more ports in your servers: 2x 10 GbE (or better) for Ceph, 10 or 1 GbE for VM/CT, and 2x 1 GbE for Corosync and management. And some extra for backups / migrations if needed.
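For reference, the split between the two Ceph networks is just two subnets in ceph.conf. A minimal sketch; the subnets are only examples, use whatever ranges fit your setup:

Code:
[global]
    # network the MONs and clients (i.e. the PVE nodes) use to reach Ceph
    public_network  = 10.10.10.0/24
    # network used only for OSD replication and heartbeat traffic
    cluster_network = 10.10.20.0/24

The cluster network only carries OSD-to-OSD replication; client I/O, the MONs and the rest of the PVE stack go over the public network.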
 
Please keep in mind that this is a home setup.
The boards are what they are; I just want to get the optimal configuration out of them.
Plus, I'm not very far from your recommendation. GUI management will be done via a VLAN, OOB management via IPMI.
So, which of the Ceph networks should get the 10 GbE? The public (external) one or the cluster one?

Best regards
Marco
 
Okay, thanks for the image, that clears things up.
I'll make up my mind about a solution, probably a VLAN setup on the public network.
 
This is my solution for documentation purposes:

The 2x 10 GbE NICs form a round-robin bond, and Ceph gets two VLANs on it for public and cluster communication.
Another bond, active-backup this time, is formed out of said round-robin bond and one of the 1 GbE NICs.
Corosync will run over the second 1 GbE interface and get a separate fallback VLAN on the 10 GbE bond.
This way I don't need another NIC, Corosync is redundant, and Ceph utilizes the maximum my hardware can provide while still having a fallback in case the 10 GbE switch dies. A sketch of the resulting /etc/network/interfaces is below.
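A minimal sketch of how this looks in /etc/network/interfaces (ifupdown2). Interface names, VLAN tags and addresses are placeholders rather than my exact values, and I haven't verified how well a bond stacked inside another bond behaves with every NIC driver, so treat it as an illustration of the layout rather than a tested config:

Code:
# 10 GbE ports, slaves of the round-robin bond
auto enp65s0f0
iface enp65s0f0 inet manual

auto enp65s0f1
iface enp65s0f1 inet manual

# first 1 GbE port, fallback slave for the active-backup bond
auto eno1
iface eno1 inet manual

# second 1 GbE port, primary Corosync ring
auto eno2
iface eno2 inet static
    address 10.10.40.11/24

# round-robin bond over the two 10 GbE ports
auto bond0
iface bond0 inet manual
    bond-slaves enp65s0f0 enp65s0f1
    bond-mode balance-rr
    bond-miimon 100

# active-backup bond: bond0 preferred, 1 GbE only as fallback
auto bond1
iface bond1 inet manual
    bond-slaves bond0 eno1
    bond-mode active-backup
    bond-primary bond0
    bond-miimon 100

# Ceph public network (VLAN 10 on the bond)
auto bond1.10
iface bond1.10 inet static
    address 10.10.10.11/24

# Ceph cluster network (VLAN 20 on the bond)
auto bond1.20
iface bond1.20 inet static
    address 10.10.20.11/24

# Corosync fallback ring (VLAN 30 on the bond)
auto bond1.30
iface bond1.30 inet static
    address 10.10.30.11/24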

A post on the Ceph mailing list also suggests a bond rather than separate links, so I will give this a try.
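And for completeness, the Corosync side: with corosync 3 / knet, the dedicated 1 GbE and the fallback VLAN simply become two links per node. A sketch of the relevant pieces of corosync.conf (on PVE that is /etc/pve/corosync.conf), again with the placeholder addresses from above:

Code:
nodelist {
  node {
    name: pve1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.10.40.11   # dedicated 1 GbE
    ring1_addr: 10.10.30.11   # fallback VLAN on the 10 GbE bond
  }
  # further node {} entries accordingly
}

totem {
  # link 0 (1 GbE) preferred, link 1 only used if it fails
  interface {
    linknumber: 0
    knet_link_priority: 10
  }
  interface {
    linknumber: 1
    knet_link_priority: 5
  }
  # remaining totem options unchanged
}

If I remember correctly, the second link can also be defined when creating or joining the cluster via pvecm's --link1 option instead of editing the file afterwards.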