ceph tuning

RobFantini

I'm going to make a new ceph test cluster and have some tuning questions.

We'll be using a 4-disk RAID-10 on each node plus a hot spare. There will be one OSD per node. We do not need super-fast I/O; instead our priority is high availability plus always-great keyboard response. We'll use this SSD for the journal: Intel DC S3700 Series 200GB.
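(Something like this in ceph.conf would point the journals at the S3700 - the journal size and the partition label scheme here are assumptions, not a tested setup:)
Code:
[osd]
    ; 10 GB journal per OSD, on a GPT-labelled partition of the SSD (label name assumed)
    osd journal size = 10240
    osd journal = /dev/disk/by-partlabel/ceph-journal-$id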

Does anyone know how we could set these:

* a replica count of 2
* permanently set OSD noout
* "mon osd downout subtree limit" set to "host"



thank you and best regards
Rob





 
Does anyone know how we could set these:
permanently set OSD noout?
Hi,
this has the disadvantage that your cluster is never reported as healthy...

I use a simple monitoring script which checks how many OSDs are down, and if "enough" OSDs are down, the noout flag is set.
So I have a healthy cluster, and if one node fails the OSDs are not reorganized - but I must bring the failed node back up myself ;)

icinga checks this every minute with the following script:
Code:
cat /usr/local/scripts/ceph_set_noout.sh
#!/bin/bash
#
# ceph_set_noout.sh automatically sets the noout flag (so that no recovery
# onto other nodes takes place) if more OSDs than max_osd_down are down.

max_osd_down=5
# parse the osdmap line of "ceph -s", e.g. "osdmap e123: 36 osds: 36 up, 36 in"
osdmap=`/usr/bin/ceph --keyring /var/lib/icinga/ceph.keyring -c /var/lib/icinga/ceph.conf -s | grep osdmap`
osd=`echo $osdmap | awk '{print $3 }'`
osd_up=`echo $osdmap | awk '{print $5 }'`
osd_in=`echo $osdmap | awk '{print $7 }'`
down=`echo "$osd - $osd_up" | bc`
perfdata="|osd=$osd;up=$osd_up;in=$osd_in"
if [ $down -gt $max_osd_down ]
  then
    echo "$down osd are down; ceph osd set noout $perfdata"
    /usr/bin/ceph --keyring /var/lib/icinga/ceph.keyring -c /var/lib/icinga/ceph.conf osd set noout
    exit 2
fi
if [ $down -eq 0 ]
  then
    echo "all $osd osd are up $perfdata"
    exit 0
  else
    echo "$down osd are down $perfdata"
    exit 1
fi
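(The exit codes follow the usual Nagios/icinga convention - 0 = OK, 1 = WARNING, 2 = CRITICAL - so the same script doubles as the alert check.)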
Udo
 
Udo, what % of OSDs should this be: max_osd_down=5?

And it looks like you use icinga? As I don't, I will attempt to run your script from cron instead (a sketch follows below).
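A cron.d entry for that could look like this (the log path, and running as root, are assumptions):
Code:
# /etc/cron.d/ceph-noout -- run the check every minute instead of via icinga
* * * * * root /usr/local/scripts/ceph_set_noout.sh >>/var/log/ceph_set_noout.log 2>&1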
Hi,
yes, this script is run from icinga - that way I also get an alarm if a node fails.

If the number of down OSDs is less than (or equal to) max_osd_down, the noout flag isn't set. In my case each OSD node has 12 OSDs - only if more than 5 OSDs are down does the noout flag prevent an automatic resync.

Udo
 
So if I have one OSD per node and 3 nodes, that's 3 OSDs.

What should max_osd_down be set to? 1?

And if 5 OSDs, then 3?
 
Udo: OK, I made the changes. Next question:

If I take a node offline for maintenance, is there anything that needs to be done when it is put back online?
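For reference, the usual sequence is roughly this (a sketch - the noout flag has to be cleared by hand once the script has set it):
Code:
ceph osd set noout     # before shutting the node down (the monitoring script may already have done this)
# ... do the maintenance, reboot the node ...
ceph osd unset noout   # once the node's OSDs are back up and in
ceph -s                # wait for HEALTH_OK before the next maintenance window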