Ceph large file (images) resync

dietmar

Proxmox Staff Member
Staff member
Apr 28, 2005
17,124
522
133
Austria
www.proxmox.com
Regarding Ceph performance, I really have no idea whether your benchmarks are OK or not; I just started using CEPH last week.
I have 4 CEPH nodes, each with 4 SATA disks. One disk is used for journals and the other 3 disks are used for OSDs, for a total of 12 OSDs.


What kind of disks do you use? SSD for the journal? How fast is it without an extra journal disk?
 

symmcom

Renowned Member
Oct 28, 2012
1,084
36
68
Calgary, Canada
www.symmcom.com
What kind of disks do you use? SSD for the journal? How fast is it without an extra journal disk?

I have 8 SATA HDDs for OSDs. The journal is on the same disk as each OSD; there is no separate SSD. I was told on the CEPH IRC channel that when a large number of OSDs are used, it is a good idea to put the journal on the same disk instead of on an SSD. I only have 8 OSDs now, but that is going to grow very soon with the addition of some big-data Proxmox VMs.
 

e100

Renowned Member
Nov 6, 2010
1,250
35
68
Columbus, Ohio
ulbuilder.wordpress.com
I originally set up my CEPH cluster with 16 OSDs, with journals on the OSD disks. Writes were horrible. Dedicating one disk in each node to journals and having only 12 OSDs performed better.

I am confident that an SSD journal would perform even better. Reading over CEPH mailing list archives I got the impression that a single SSD can act as a journal for a few OSDs but having too many journals on a single SSD hurts performance.
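As a rough sketch of why too many journals on one SSD can hurt: every write lands in the journal before it reaches the data disk, so the SSD has to absorb the combined write stream of all the OSDs it serves. The function name and the throughput figures below are illustrative assumptions, not measurements from this thread.

```python
import math


def max_journals_per_ssd(ssd_write_mb_s: float, osd_write_mb_s: float) -> int:
    """Rough upper bound on how many OSD journals one SSD can host
    before the SSD itself becomes the write bottleneck.

    Both arguments are sustained sequential write rates in MB/s.
    """
    if ssd_write_mb_s <= 0 or osd_write_mb_s <= 0:
        raise ValueError("throughput values must be positive")
    return math.floor(ssd_write_mb_s / osd_write_mb_s)


# Hypothetical numbers for illustration only.
print(max_journals_per_ssd(ssd_write_mb_s=400.0, osd_write_mb_s=100.0))
```

This ignores latency, sync-write behaviour, and SSD wear, all of which can matter more than raw throughput, so treat the result as a ceiling and not as a recommendation.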
 

mir

Famous Member
Apr 14, 2012
3,559
120
83
Copenhagen, Denmark
I am confident that an SSD journal would perform even better. Reading over CEPH mailing list archives I got the impression that a single SSD can act as a journal for a few OSDs but having too many journals on a single SSD hurts performance.
Doesn't this depend on the specs of the SSD?
 

dietmar

Proxmox Staff Member
Staff member
Apr 28, 2005
17,124
522
133
Austria
www.proxmox.com
I originally set up my CEPH cluster with 16 OSDs, with journals on the OSD disks. Writes were horrible. Dedicating one disk in each node to journals and having only 12 OSDs performed better.

Strange - so you are using a normal disk for 3 journals?
 

e100

Renowned Member
Nov 6, 2010
1,250
35
68
Columbus, Ohio
ulbuilder.wordpress.com
Yes, a normal SATA disk with a partition for each journal.

Performance is not great but is better than having journals on the OSDs.

With 16 GB and 32 GB SSDs being so cheap, I was thinking of using one SSD journal per OSD disk.

But I am still stuck with limited read speeds; if I cannot fix this, I cannot use CEPH.
Maybe the aio issue you suspect is also reducing CEPH read performance.
 

symmcom

Renowned Member
Oct 28, 2012
1,084
36
68
Calgary, Canada
www.symmcom.com
Doesn't this depend on the specs of the SSD?
It did not seem to me that the specs of the SSD matter that much when we are talking about a large number of OSDs, let's say 36 or more. Even with the fastest SSD out there, performance will not benefit much beyond a certain point.

My personal preference, though, is to keep the journal on the OSDs even though it hurts performance a little. It eliminates the extra layer of RAID for journal SSDs. With the journal on the OSD, if one HDD fails I only lose the data for that one OSD. If the journal SSD (or SSDs) fails, I lose the entire host's OSDs and face longer downtime.
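To put that trade-off in numbers, here is a minimal sketch of how many OSDs a single device failure takes down under each layout. The function and the layout names are made up for illustration; the figure of 3 OSDs per journal disk matches the 4-disk nodes described earlier in the thread.

```python
def osds_lost_on_failure(layout: str, osds_sharing_journal: int) -> int:
    """Number of OSDs taken offline when one device fails.

    layout:
      "colocated" - the journal lives on the OSD disk, and that disk fails
      "shared"    - one journal device serves several OSDs, and it fails
    """
    if osds_sharing_journal < 1:
        raise ValueError("osds_sharing_journal must be at least 1")
    if layout == "colocated":
        # Only the OSD on the failed disk is affected.
        return 1
    if layout == "shared":
        # Every OSD whose journal was on the failed device goes down.
        return osds_sharing_journal
    raise ValueError(f"unknown layout: {layout!r}")


# 3 OSDs per node sharing one journal disk, as in the 4-disk nodes above.
print(osds_lost_on_failure("colocated", 3))
print(osds_lost_on_failure("shared", 3))
```

The sketch only counts OSDs; how long recovery takes afterwards depends on how much data has to be re-replicated.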

Following a trail left by e100, I came across some articles describing tweaks that I can still make to the CEPH cluster. I am going to try them and see where my performance goes.
Proxmox VMs seem to have slightly better performance when they are on CephFS, which is somewhat promising. None of the CephFS VMs has crashed so far.
 

symmcom

Renowned Member
Oct 28, 2012
1,084
36
68
Calgary, Canada
www.symmcom.com
Below is a comparison of the same VM on CephFS and RBD. I ran several benchmarks on each and had about the same numbers. Cache enabled.

VM on CephFS:
[Attached screenshot: cephfs-1.PNG]


VM on RBD:
[Attached screenshot: rbd-1.png]
 

symmcom

Renowned Member
Oct 28, 2012
1,084
36
68
Calgary, Canada
www.symmcom.com
I have used all of these:
Mellanox MHGA28-XTC (20Gbps dual port,pcie)

I have not played around with Infiniband yet. Does anybody know if I can use a 20Gbps Infiniband card with a 10Gbps Infiniband switch?
A 1Gbps Ethernet card works with a lower-bandwidth LAN switch. Does the same logic apply to Infiniband?
 

tarax

Member
Apr 2, 2010
43
5
8
This is new tech for me too... thank you very much for sharing this experience with us all!
My question is: is a switch needed, or can crossover/back-to-back links be created between a pair of servers?
 

symmcom

Renowned Member
Oct 28, 2012
1,084
36
68
Calgary, Canada
www.symmcom.com
Just wanted to confirm that increasing the number of OSDs does increase the performance of a CEPH cluster. I went from 6 OSDs to 10 OSDs and the performance increase was immediate, without changing anything else.
 
