Ceph with SSDs vs. spindles

tycoonbob

Member
Aug 25, 2014
Hi guys. I've been doing some testing with a 3-node PVE 3.3 cluster running Ceph. I currently have one 512GB SSD in each box, and the I/O is just disappointing. As I learn more about Ceph, I'm finding that I simply need more OSDs, so I'm trying to determine how many OSDs I need for optimal performance.

I have three Dell R610s, each with 4 available drive bays, for a maximum of 12 OSDs. Obviously, 12 512GB SSDs would be costly (~$2,400), so that's not an option right now. However, I could get 3 more of these 512GB SSDs, giving me a total of 6 SSDs for OSDs.

For the cost of 3 more SSDs (~$200 each), I could instead get 12 300GB 10K drives (WD VelociRaptors would be cheapest at about $35 each, around $425 total). 300GB 10K SAS drives would cost more but would be more reliable. Reliability aside, would I see better performance from 6 SSDs as OSDs, or 12 10K drives as OSDs? None of my VMs will be hogging serious I/O, but let's say I'll have 20-25 CentOS-based VMs, each with a 10GB disk running a low load. Clearly I don't need a large amount of storage (backups and ISOs will live on an NFS share), and I suspect ~500GB will be plenty. So yeah...

1) 12 10K OSDs vs. 6 SSD OSDs -- which will give me better performance?

2) Would I see a decrease in performance using a replication factor of 3 versus a replication factor of 2?
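
For context, here is a rough back-of-envelope sketch of the comparison I'm trying to make. The per-device IOPS figures and the filestore journal-overhead factor are assumptions for illustration, not measurements, and CPU and network limits will push real numbers lower:

```python
# Rough upper-bound estimate of aggregate client 4K random-write IOPS.
# All per-device numbers are illustrative assumptions, not measurements.

def client_write_iops(num_osds, device_write_iops, replication, journal_overhead=2):
    # Each client write is multiplied by the replication factor, and with
    # filestore every OSD write also hits the journal (hence overhead ~2).
    # CPU and network limits are ignored, so treat this as an upper bound.
    return num_osds * device_write_iops / (replication * journal_overhead)

SSD_IOPS = 20000    # assumed raw 4K random-write IOPS for a SATA SSD
SPINDLE_IOPS = 200  # assumed raw 4K random-write IOPS for a 10K RPM drive

for repl in (2, 3):
    print("replication %d:" % repl)
    print("  6 SSD OSDs  -> ~%d IOPS" % client_write_iops(6, SSD_IOPS, repl))
    print("  12 10K OSDs -> ~%d IOPS" % client_write_iops(12, SPINDLE_IOPS, repl))
```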

Thanks!
 
>>1) 12 10K OSDs vs. 6 SSD OSDs -- which will give me better performance?
>>
>>2) Would I see a decrease in performance using a replication factor of 3 versus a replication factor of 2?

Hi, on my test cluster with 6 SSD OSDs (Intel S3500), I can reach:

random 4K read: 100,000 IOPS
random 4K write, replication x1: 25,000 IOPS
random 4K write, replication x2: 12,000 IOPS
random 4K write, replication x3: 8,000 IOPS

The write bottleneck is the OSD CPU at 100% (8-core Xeon E5-2603 v2 @ 1.80GHz).


Note that this is for random I/O.

If you are doing sequential I/O, maybe the 10K drives could perform great.
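
(If you want to do a quick sanity check against your own pool, a minimal sketch with the python-rados bindings could look like the one below; the pool name and object count are placeholders, and since the writes go out one at a time it measures latency more than peak IOPS. fio or "rados bench" will give more rigorous numbers.)

```python
# Minimal write-latency probe using the python-rados bindings.
# Pool name, object count and object size are placeholders.
import os
import time
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('rbd')   # assumes a pool named 'rbd' exists

num_objects = 1000
payload = os.urandom(4096)          # 4K payload, like the numbers above

start = time.time()
for i in range(num_objects):
    ioctx.write_full('bench-obj-%d' % i, payload)   # synchronous, one object at a time
elapsed = time.time() - start
print('%d x 4K writes in %.2fs -> ~%d IOPS' % (num_objects, elapsed, num_objects / elapsed))

# Clean up the test objects.
for i in range(num_objects):
    ioctx.remove_object('bench-obj-%d' % i)
ioctx.close()
cluster.shutdown()
```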
 

This is very interesting. My R610s have dual Xeon L5520s, which are less powerful than what you tested with. Since the OSDs share the same CPUs as my VMs, it seems the OSDs are hogging all the CPU (or at least a lot of it). Does that sound right?

I had some thoughts last night about what I can do: add 3 more SSDs, get 300GB 10K drives, or replace my three Dell R610s with three Dell C2100s. Either way, I'm now concerned about CPU usage when running Ceph.

Why replace my R610's with C2100? To completely redo my setup and eliminate my dedicated storage server. My thought is that each C2100 has 12 3.5" bays, so 2 drives of OS (RAID 1), 8 2-4TB 7200RPM drives for OSD's, and 2 60GB drives for journal (each SSD would journal for 4 of the spindle drives). The big question with this setup is whether I can run CephFS on top of this and share out NFS and CIFS, or if I can run a virtual file server (say FreeNAS or OpenMediaVault) and deliver storage that way. I'd probably only start with 12 of those drives (4 per server) to give me the storage I need for everything. I'd still want to use Proxmox for the OS on everything, but not sure if CephFS can be used in this case (even though CephFS is still beta), or if something like this would even be supported. Thoughts?