Is Ceph stable for a critical production environment?

cesarpk

Well-Known Member
Mar 31, 2012
Hello everyone,

I have read this link:
http://pve.proxmox.com/wiki/Ceph_Server

But since that page says it is a technology preview, for the moment I think I will test Ceph running independently of the PVE nodes.

With that in mind:
1. How stable is Ceph for use in a critical production environment?
2. Can I run databases in the KVM VMs without disk-access performance problems? (I am planning either a single 10 Gb Ethernet NIC or dual 10 Gb Ethernet over LACP 802.3ad, in both cases dedicated exclusively to Ceph network traffic - a sketch of such a bond configuration follows below.)
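For reference, a minimal sketch of an 802.3ad bond dedicated to the Ceph network on a Debian/Proxmox host; the interface names and addresses are only examples, and the switch ports must also be configured for LACP:

Code:
# /etc/network/interfaces - hypothetical NICs eth2/eth3 reserved for Ceph traffic
auto bond0
iface bond0 inet static
        address 10.10.10.1
        netmask 255.255.255.0
        bond-slaves eth2 eth3
        bond-mode 802.3ad
        bond-miimon 100
        bond-xmit-hash-policy layer3+4

Note that a single TCP stream still only uses one physical link; LACP helps with many parallel connections, which is what Ceph generates.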

I welcome your comments, especially about the downsides of Ceph.

Best regards
 
Just to clarify, are you asking if Ceph itself is stable enough to run in a production environment? Or are you asking about the stability of the upcoming release of Proxmox Ceph Server, where Proxmox and Ceph run on the same node and all Ceph management can be done from the Proxmox GUI?

If you are asking about the stability of Ceph on its own, then my comment is that it is very much stable enough to be used in a production environment. I have 3 setups where Ceph is used as the storage backbone, serving about 30 users on average in each setup. Its stability and resiliency are hard to match with other solutions out there. But I would suggest that if you need faster I/O, such as for a database server, it helps to have a separate SSD-backed Ceph pool and put the virtual server there; it increases performance a lot. Things also speed up with a minimum of 3 nodes and 12 OSDs. None of my setups has a 10 Gb backbone, and file transfer is still acceptable.
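For reference, a rough sketch of how such a pool can be pinned to SSD OSDs, assuming a CRUSH rule (here rule id 1) that only selects the SSD OSDs has already been created; the pool name and PG counts are just examples:

Code:
ceph osd pool create ssd-pool 512 512          # pool name and pg_num/pgp_num are examples
ceph osd pool set ssd-pool crush_ruleset 1     # pin the pool to the SSD-only CRUSH rule

The disk images of the database VMs would then be placed on that pool.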

If your interest is in the new Proxmox Ceph Server, I would say hold off on that and give it time to mature a bit. That way it can guarantee safe operation after a few bugs have been taken care of.
 
Hi,
Ceph is stable, but you can also run into trouble with it.

This normally happens due to admin mistakes ;-) e.g. if your disks are too full, or you change too much at one time (reweighting disks, increasing the number of PGs...).
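For reference, a minimal sketch of the kind of checks and gradual changes meant here; the pool name "rbd" and the target pg_num are just examples:

Code:
ceph -s                            # overall cluster state
ceph health detail                 # lists near-full / full OSDs, stuck PGs, etc.

ceph osd pool get rbd pg_num       # current placement group count
ceph osd pool set rbd pg_num 256   # raise pg_num in modest steps...
ceph osd pool set rbd pgp_num 256  # ...then pgp_num, letting the cluster settle in between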

I have done some tests with the new pve-ceph solution and it looks good. On the other hand, I have a Ceph cluster in production where the performance is not good enough (but this is not a general Ceph problem).
I have been trying for weeks to isolate the issue, but it's not easy to find (the pve-ceph cluster from pvetest performs better, using only Gigabit, than the 10 Gbit Ceph cluster...).

Udo
 
Thanks, symmcom and udo, for your comments.

@udo:
What is the issue you are trying to isolate?

@experienced users:
Please share your feedback.

Best regards
Cesar
 
Just wanted to note that there will be a new Ceph release (Firefly) in a few weeks. This is expected
to be even more stable, and it is marked for long-term support.
 
udo said: "...I have a Ceph cluster in production where the performance is not good enough (but this is not a general Ceph problem)..."

Hi,

May I ask what "not so good" means in figures? I'm running a test Ceph installation: 3 nodes, each with 4 SATA disks (4 OSDs per node), using 2 Gbit links (bonding). I'm getting up to 120 MB/s, and I'm wondering whether 10 Gbit links would still improve rates in this case... :)
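For reference, a quick way to check whether the bonded network is already the limit here (the address is an assumption); with most bonding modes a single TCP stream only uses one 1 Gbit link, so ~120 MB/s may simply be one link saturated:

Code:
iperf -s                      # on one Ceph node
iperf -c 192.168.0.11 -P 4    # on another node: 4 parallel streams to exercise both bond links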
 
Meaning: will there only be releases with Ceph LTS, or also releases with non-LTS Ceph, so that newer Ceph features not yet propagated to LTS can be supported?

We will include the LTS libraries by default, but the user is free to run a newer version.
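For reference, a rough sketch of how one might pull a newer Ceph from the upstream repository instead of the bundled LTS packages; the repository URL and release codenames are assumptions based on the usual ceph.com layout at the time, and the Ceph release key also needs to be imported:

Code:
# /etc/apt/sources.list.d/ceph.list
deb http://ceph.com/debian-firefly/ wheezy main

apt-get update
apt-get install ceph-common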
 

Tell me exactly what benchmark you run and I can run a similar one in our lab.
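For reference, a common way to benchmark the cluster itself, independent of any VM, is rados bench; "test" is a hypothetical pool name:

Code:
rados bench -p test 60 write --no-cleanup   # 60-second write benchmark, keep the objects
rados bench -p test 60 seq                  # sequential read benchmark using those objects
rados -p test cleanup                       # remove the benchmark objects afterwards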
 
Hi,
In my setup I have 4 storage nodes with 52 x 4 TB HDDs (13 in each host), a separate 10 Gb Ceph cluster network, and 10 Gb to the PVE hosts. My reads are only approx. 40 MB/s (inside the VM), and with scrub and deep-scrub enabled only 25 MB/s...
If I clear the VM cache and read the file again (so it comes from the cache of the OSD hosts) I get 177 MB/s, which is OK for one thread.
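For reference, scrubbing can be paused temporarily while measuring, to separate its impact from the underlying read performance (remember to unset the flags afterwards):

Code:
ceph osd set noscrub
ceph osd set nodeep-scrub
# ... run the read test ...
ceph osd unset noscrub
ceph osd unset nodeep-scrub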

If I test the individual components, everything looks OK: network speed is 9.7 Gbit/s with iperf; reading from a single disk is fast; and if I move HDDs from one node to another, the "rebuild" reaches 400 MB/s, sometimes 1237 MB/s, and so on...
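For reference, a minimal way to check a single OSD disk's raw sequential read speed while bypassing the page cache; the device name is only an example:

Code:
dd if=/dev/sdb of=/dev/null bs=4M count=1024 iflag=direct   # raw sequential read of one OSD disk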
I moved the mon off the OSD host, switched the OSDs from self-formatted XFS to ceph-deploy (slightly different format parameters (inodes)), changed the OS from Debian to Ubuntu (perhaps a driver problem?), and right now I am moving all disks from one node to a new one, because the old one has a strange format for the omap files (/var/lib/ceph/osd/ceph-26/current/omap/003543.ldb instead of 003543.sst) - perhaps after all that it will be better?!

Udo
 
Is there a reason for staying with 0.67.x instead of going with 0.72.2 right now? I've been running a 0.72.2 cluster and it seems quite stable.
 
udo said: "...My reads are only approx. 40 MB/s (inside the VM), and with scrub and deep-scrub enabled only 25 MB/s..."

Interesting figures; do you use SSDs for the journals? I am using one SSD per 4 OSDs, which is the recommended maximum. Without these separate journals, my throughput is significantly lower; I also increased the journal size to 10 GB per OSD (the default was only 1 GB, if I remember correctly).
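For reference, the journal size is set in ceph.conf (the value is in MB) and only takes effect when a journal is created, e.g. for new OSDs or after flushing and recreating an existing journal:

Code:
# /etc/ceph/ceph.conf
[osd]
        osd journal size = 10240    # 10 GB journal per OSD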
I used a Win2k8R2 VM for testing; the image cache setting also made a difference in speed - I found "writeback" to be the fastest, and it also "scaled" quite well: running CrystalDiskMark on 3 different Ceph volumes in parallel, each volume still got > 60 MB/s (random read/write test, block size = 512 KB). I found this quite promising...
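For reference, the cache mode can be set per virtual disk from the Proxmox CLI as well as the GUI; the VM ID, storage name, and disk name below are just examples:

Code:
qm set 100 --virtio0 ceph-storage:vm-100-disk-1,cache=writeback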
 
Hi,
Yes, I'm using SSDs for journaling (a 3 GB file, not a partition), but only 2 SSDs for 13 OSD disks. But the performance issue also shows up during reads, and the SSDs are only used for writes, so I guess it has nothing to do with this...


Udo
 
I've not tested Ceph with SSDs for journals or for OSDs.
In my setup of 4 nodes and 12 crappy (old, used, ready-to-fail) SATA disks for OSDs, performance was barely acceptable.

I did find it promising that I was unable to permanently break Ceph, even when disks failed (as expected they would).
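For reference, that kind of resilience is easy to try out yourself: mark an OSD out and watch the cluster re-replicate, then bring it back (the OSD id is just an example):

Code:
ceph osd out 5    # take OSD 5 out of data placement; Ceph starts backfilling elsewhere
ceph -w           # watch recovery progress until HEALTH_OK
ceph osd in 5     # put it back in once the "failure" test is done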
 
