ceph tuning

RobFantini

I'm going to make a new ceph test cluster and have some tuning questions.

We'll be using a 4-disk RAID-10 on each node plus a hot spare. There will be one OSD per node. We do not need super fast I/O; instead our priority is high availability and consistently great keyboard response. We'll use this SSD for the journal: Intel DC S3700 Series 200GB.

Does anyone know how we could set these (see the sketch after this list):

* a replica count of 2
* the OSD noout flag set permanently
* "mon osd downout subtree limit" set to "host"



thank you and best regards
Rob





 
Does anyone know how we could set these:

permanently set OSD noout ?
Hi,
this has the disadvantage that your cluster isn't healthy...

I use a simple monitoring script which checks how many OSDs are down, and if "enough" OSDs are down, the noout flag is set.
So I have a healthy cluster, and if one node fails the OSDs are not reorganized - but I must bring up the failed node again ;)

Icinga checks every minute with this script:
Code:
cat /usr/local/scripts/ceph_set_noout.sh
#!/bin/bash
#
# ceph_set_noout.sh automatically sets the noout flag (so that no recovery
# to other nodes takes place) when more OSDs are down than max_osd_down allows.

max_osd_down=5

# parse the "osdmap eNNN: X osds: Y up, Z in" line from "ceph -s"
osdmap=`/usr/bin/ceph --keyring /var/lib/icinga/ceph.keyring -c /var/lib/icinga/ceph.conf -s | grep osdmap`
osd=`echo $osdmap | awk '{print $3 }'`
osd_up=`echo $osdmap | awk '{print $5 }'`
osd_in=`echo $osdmap | awk '{print $7 }'`
down=`echo "$osd - $osd_up" | bc`
perfdata="|osd=$osd;up=$osd_up;in=$osd_in"

# too many OSDs down: set noout and report CRITICAL
if [ $down -gt $max_osd_down ]
  then
    echo "$down osd are down; ceph osd set noout $perfdata"
    /usr/bin/ceph --keyring /usr/local/icinga/ceph.keyring -c /usr/local/icinga/ceph.conf osd set noout
    exit 2
fi

# all OSDs up: OK, otherwise WARNING
if [ $down -eq 0 ]
  then
    echo "all $osd osd are up $perfdata"
    exit 0
  else
    echo "$down osd are down $perfdata"
    exit 1
fi
Udo
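A hedged sketch of how a one-minute check like this could be wired into an Icinga 1.x / Nagios-style configuration (the file, command, service and host names here are made up; only the script path matches the post above):

Code:
# /etc/icinga/objects/ceph_noout.cfg (hypothetical file name)
define command{
        command_name    check_ceph_set_noout
        command_line    /usr/local/scripts/ceph_set_noout.sh
        }

define service{
        use                     generic-service
        host_name               ceph-mon1
        service_description     Ceph OSD noout guard
        check_command           check_ceph_set_noout
        check_interval          1
        }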
 
Udo, what % of OSDs should this be: max_osd_down=5?

and it looks like you use Icinga? As I don't, I will attempt to run your script from cron.
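For the cron route, a minimal sketch (assuming the script is adapted to read a keyring and ceph.conf that root can access, e.g. /etc/ceph/ceph.client.admin.keyring, instead of the Icinga paths):

Code:
# /etc/cron.d/ceph-noout - run the check every minute as root
* * * * *  root  /usr/local/scripts/ceph_set_noout.sh >> /var/log/ceph_set_noout.log 2>&1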
 
Udo, what % of OSDs should this be: max_osd_down=5?

and it looks like you use Icinga? As I don't, I will attempt to run your script from cron.
Hi,
yes, this script is run from Icinga - also to get an alarm if one node fails.

If the number of down OSDs is less than or equal to max_osd_down, the noout flag isn't set. In my case each OSD node has 12 OSDs - only if more than 5 OSDs are down will the noout flag prevent an automatic resync.

Udo
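To make the threshold concrete (a worked note based on the script's -gt comparison, not part of the original posts):

Code:
# the script sets noout only when:  down > max_osd_down
#
# Udo's case: 12 OSDs per node, max_osd_down=5
#   up to 5 individual OSD failures -> normal recovery/rebalance still happens
#   a whole node down (12 OSDs)     -> noout is set, no rebalance to other nodes
#
# i.e. pick a value above the number of single-OSD failures you still want
# auto-recovered, but below the OSD count of one node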
 
So if I have one OSD per node and 3 nodes, that's 3 OSDs.

What should max_osd_down be set to? 1?

If 5 OSDs, then 3?
 
Udo: OK, I made the changes. Next question:

If I take a node offline for maintenance, is there anything that needs to be done when it is put back online?
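A hedged sketch of the usual node-maintenance routine (standard Ceph commands, not taken from this thread):

Code:
# before taking the node down: prevent its OSDs from being marked out
ceph osd set noout

# ... do the maintenance and reboot the node ...

# after the node is back: check that its OSDs rejoin and are up/in
ceph osd tree
ceph -s

# once the cluster is healthy again, clear the flag
ceph osd unset noout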
 
