CEPH Cluster & Proxmox Cluster

felipe

Hi,

we are already running some Proxmox hosts, and now it is finally time to get Ceph running in production :-) Testing with a simple small 3-node Ceph cluster went very well.

We will use 3 identical machines, each with 1 Proxmox disk, 3 SSDs for journals (Firefly without journals is experimental only...) and 15 spinners. Plenty of RAM & CPU.

Every server will have 4x 1G network cards and 2 dual-port 10G cards, so the plan is:
- one bond with the 4x 1G cards for VM traffic
- one bond from 2 of the 10G ports for OSD traffic with round-robin
- one bond from the other 2 10G ports for the monitors with round-robin
- connected with link aggregation to two 10G switches

Should that provide the best performance?

What size of SSDs do you use? What kind of SSDs? Which speed worked for you (one SSD per 5 spinners)? Do you use slightly different SSDs? Because with a similar load, identical SSDs would fail at more or less the same time, killing everything. I know this will only happen in 1-3 years, so maybe nobody has experience with it yet (when they reach their TBW).

Some more suggestions?

Best regards
philipp
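A rough sketch of the /etc/network/interfaces layout we have in mind; interface names, addresses and subnets are just placeholders, not a tested config:

```
# 4x 1G bonded (LACP) for VM traffic, bridged for the Proxmox guests
auto bond0
iface bond0 inet manual
        bond-slaves eth0 eth1 eth2 eth3
        bond-mode 802.3ad
        bond-miimon 100

auto vmbr0
iface vmbr0 inet static
        address 192.168.10.11
        netmask 255.255.255.0
        gateway 192.168.10.1
        bridge_ports bond0
        bridge_stp off
        bridge_fd 0

# 2x 10G bonded with round-robin for the OSD / cluster network
auto bond1
iface bond1 inet static
        address 10.10.10.11
        netmask 255.255.255.0
        bond-slaves eth4 eth5
        bond-mode balance-rr
        bond-miimon 100

# 2x 10G bonded with round-robin for the monitor / public network
auto bond2
iface bond2 inet static
        address 10.10.20.11
        netmask 255.255.255.0
        bond-slaves eth6 eth7
        bond-mode balance-rr
        bond-miimon 100
```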
 
@felipe, if you are using 15 OSDs per node, I would highly suggest putting OSD and journal on the same HDD. With that many spinners you will probably get similar performance as with SSD journals, without a single point of failure. If an HDD dies, you only lose the OSD and the journal on that HDD, instead of all the OSDs whose journals sit on a failed SSD going down.
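For reference, with ceph-deploy the only difference is the optional journal argument; the host and device names here are just examples:

```
# journal colocated on the same spinner (a journal partition is created on sdb)
ceph-deploy osd create node1:sdb

# journal on a separate SSD partition instead
ceph-deploy osd create node1:sdb:/dev/sdm1
```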
 
I think my first setup will just be a single host without networking, to check the speed of pure spinners vs. SSD journals + spinners, the right ratio, etc. Firefly has the new option to not use journals anymore, but they write that it is experimental... so not an option for production in 2014 :-(
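For the comparison I would simply run rados bench against a test pool on each layout; the pool name, PG count and runtime below are arbitrary choices:

```
ceph osd pool create bench 128
rados bench -p bench 60 write --no-cleanup   # 60 s write test, keep the objects for the read pass
rados bench -p bench 60 seq                  # sequential read pass over the same objects
ceph osd pool delete bench bench --yes-i-really-really-mean-it
```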

@udo -> your SSD, the Corsair Force GS, has no TBW rating published; I can't find it anywhere. So maybe it just has a very, very low TBW and will die soon. I would never use SSDs rated under 150 TBW; with the write load of a cluster they will die very fast...
Did you try the Samsung SSD SM843T Data Center Series?
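You can at least keep an eye on the wear with smartctl; the exact attribute names differ per vendor, and /dev/sdX is just a placeholder:

```
# look for total data written and wear-level attributes in the SMART output
smartctl -a /dev/sdX | egrep -i 'Total_LBAs_Written|Lifetime_Writes|Wear_Leveling|Media_Wearout'
```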
 
...
@udo -> your SSD, the Corsair Force GS, has no TBW rating published; I can't find it anywhere. So maybe it just has a very, very low TBW and will die soon. I would never use SSDs rated under 150 TBW; with the write load of a cluster they will die very fast...
Did you try the Samsung SSD SM843T Data Center Series?
Hi,
no, I haven't tried the SM843T, but the Corsairs have been in use since the beginning of my Ceph cluster, and all data has gone through the cache disks more than once (because of extending the OSD disks, reformatting with ceph-deploy, moving VMs between pools and so on). The Ceph cluster holds 126 TB of data (2 * 63 TB) and the Corsairs are working without trouble so far...
I would not say that they "die very fast"... but only once one SSD dies will I be able to say which vendor/type is better.

Udo
 
So many options... ufff.
But if I use the ratio of 3 OSDs per SSD, I think with more than 10 OSDs it is not worth using SSDs, because for the price of 1 SSD I can buy 2-3 1 TB spinners. So I would get twice the number of spinners for the same money, but with less speed per disk because of the journal on the same disk.
But then I still have more HDD space in total. I will still run some tests with SSDs. But using 20 spinner OSDs per server with colocated journals may give roughly the same speed as 9 spinner OSDs plus 3 SSD journals, for the same price (and with more TB of storage). Rough numbers below.
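My back-of-the-envelope numbers, all pure assumptions (~110 MB/s per spinner, colocated journal roughly halves write bandwidth, SSDs fast enough not to be the bottleneck):

```
# 20 spinners, journal colocated:  20 * 110 MB/s / 2  = ~1100 MB/s aggregate write
#  9 spinners, journal on 3 SSDs:   9 * 110 MB/s      = ~ 990 MB/s aggregate write
# -> similar write throughput for similar money, but ~2x the raw capacity with 20 spinners
```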