Optimal Ceph Configuration

n7qnm

New Member
Jan 4, 2024
Prosser, WA, USA
www.n7qnm.net
I've been running Proxmox for about a year and have just added a 5th node to my cluster.

Currently:
Each node in the cluster has a 1TB SSD and boots from a 512GB NVMe. The 1TB SSDs all make up one Ceph pool.
Each node also has two 2.5Gb NICs and one 1Gb NIC. The Ceph public network is 192.168.75.0 (1Gb) and the cluster network is 192.168.70.0 (2.5Gb). The third NIC (2.5Gb) is configured but unused. The rest of my network is all 1Gb.

I'd like to:
1) Boot the nodes from a USB-attached NVMe, which I have, and then add the freed-up internal NVMe drives to my storage pool. Are there any downsides to just creating and adding 5 new OSDs to the current pool? Is there a "better" way?

2) Make use of the 3rd NIC to further isolate the disk traffic from "other" traffic. Is this even worth the hassle, and if so, what's the optimum configuration and how do I get there from here?

Thanks in advance!

Clay Jackson
 
One larger pool would spread I/O out more. With two smaller pools you could classify types of storage for certain VMs.
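
If you did want two pools, the usual approach is to split by device class rather than by individual disks: let the new NVMe OSDs get (or be assigned) the "nvme" device class and pin each pool to a class with a CRUSH rule. A rough sketch of the commands involved (pool/rule names are placeholders, and PG counts need tuning for your cluster):

```
# Create an OSD on each freed-up NVMe drive (run on every node)
pveceph osd create /dev/nvme0n1

# If the NVMe OSDs come up with device class "ssd", reassign them, e.g.:
#   ceph osd crush rm-device-class osd.5
#   ceph osd crush set-device-class nvme osd.5

# CRUSH rules that pick OSDs by device class
ceph osd crush rule create-replicated ssd-only  default host ssd
ceph osd crush rule create-replicated nvme-only default host nvme

# Pin the existing pool to the SSDs and create a separate NVMe pool
ceph osd pool set <existing-pool> crush_rule ssd-only
ceph osd pool create nvme_pool 128 128 replicated nvme-only
```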

Not sure how to move storage to a new subnet.
 
Update - I have 3 networks: 192.168.70.0, 192.168.71.0, and 192.168.75.0. .70 and .71 are 2.5Gb and local to my Proxmox cluster; .75 is my 1Gb "main" network.



When I moved the Ceph "cluster network" from .75 to .71, I got HUGE performance improvements - I/O latency as reported by Proxmox dropped from 10-15% to 0.5-1.5%.
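
In case it helps anyone else: the relevant setting is cluster_network in /etc/pve/ceph.conf; after changing it, the OSDs need a restart (one node at a time, e.g. `systemctl restart ceph-osd.target`) to start using the new subnet. A rough sketch with my subnets - adjust to yours:

```
# /etc/pve/ceph.conf (shared across the cluster)
[global]
    public_network  = 192.168.75.0/24
    cluster_network = 192.168.71.0/24
```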



Currently, my cluster is configured like this:

Proxmox ring0: .70
Ceph public: .75
Ceph cluster: .71



Given only 3 NICs, is this optimal, or could I move the Ceph public network to either .70 or .71 (both of which are 2.5Gb as opposed to 1Gb)?
 
The Ceph cluster network is used for replication traffic only; all client, MON, MDS, and OSD traffic flows through the public network. I would bond both 2.5Gb NICs in an LACP LAG to two stacked/MLAG switches and run both the public and cluster networks as VLANs over the bond. That would get you the most out of your NICs. Corosync ring0 should not be shared with the storage NICs, although you could add a second corosync link as another VLAN over the 2.5Gb bond.
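
As a rough sketch of what that could look like in /etc/network/interfaces on each node (NIC names enp2s0/enp3s0 and the host addresses are placeholders; the switch side needs a matching LACP LAG with both VLANs tagged):

```
auto bond0
iface bond0 inet manual
    bond-slaves enp2s0 enp3s0
    bond-miimon 100
    bond-mode 802.3ad
    bond-xmit-hash-policy layer3+4

# Ceph public network as a tagged VLAN on the bond
auto bond0.75
iface bond0.75 inet static
    address 192.168.75.11/24

# Ceph cluster (replication) network as a second tagged VLAN
auto bond0.71
iface bond0.71 inet static
    address 192.168.71.11/24
```

The 1Gb NIC then stays free for corosync ring0 and general management traffic.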
 
Can Corosync Ring0 be shared with my “admin” network?
Sure, why not?

Corosync can handle multiple "rings" (up to eight links with kronosnet, iirc). Just give it one ring per physical NIC (and ignore VLANs).
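
If you do add a second link later, it's just an extra ringX_addr per node in /etc/pve/corosync.conf (and remember to bump config_version when editing). A minimal sketch - the node name and the second-link addresses here are hypothetical:

```
# /etc/pve/corosync.conf (nodelist excerpt)
# ring0_addr is the existing corosync network; ring1_addr is the
# hypothetical second link (one entry like this per node)
nodelist {
  node {
    name: pve1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.168.70.11
    ring1_addr: 192.168.72.11
  }
}
```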