Automatically run "ceph osd set noout" when a node reboots?

FrancisS

Apr 26, 2019
Hello,

On a cluster with hyper-converged Ceph, is it possible to automatically execute the command "ceph osd set noout" when a node is rebooted (or shut down)?

Best regards.

Francis
 
Hello FrancisS,

Can you explain why you would like to do that?

Automatically setting ceph osd set noout whenever a node is rebooted or shut down would also mean that Ceph would not react appropriately if a node unexpectedly goes offline. In that case, OSDs would remain marked as in, which is usually not desirable.

What problem are you trying to solve, and how long does the reboot process take in your environment?

For reference, the OSDs of a node are typically only marked out after they have been down for approximately 10 minutes. Therefore, a normal reboot should usually complete long before Ceph marks the OSDs as out.
 
Hello Torbho,

Thank you. I want to execute "ceph osd set noout" only when I reboot a node, not when a node crashes.

The problem is that sometimes the administrator does not execute "ceph osd set noout" before rebooting the node after an update.

After the updates, he has to manually execute "ceph osd unset noout".

When a node is rebooted, all the OSDs of the node go out...
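
For illustration, the manual sequence described above could be wrapped roughly like this. It is only a sketch: the function names are hypothetical, and it assumes the `ceph` CLI is available on the node with admin credentials.

```shell
#!/bin/sh
# Sketch of the manual maintenance sequence around a planned reboot.
# Function names are hypothetical; only the ceph commands themselves are real.

# Before rebooting: ask Ceph not to mark down OSDs as out (cluster-wide flag).
begin_maintenance() {
    ceph osd set noout
}

# After the node is back and its OSDs are up again: restore normal behaviour.
end_maintenance() {
    ceph osd unset noout
}

# Succeeds if the noout flag is currently set, so a forgotten flag can be spotted.
noout_is_set() {
    ceph osd dump | grep '^flags' | grep -q noout
}
```

One would call `begin_maintenance` before the reboot and `end_maintenance` once the node has rejoined. Note that the flag applies to the whole cluster, not only to the node being rebooted.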

Best regards.
Francis
 
Hello Francis,

I think there may be a misunderstanding.

With a normal reboot, setting noout is usually not required. When the OSDs on a node stop, they are first marked as down. They are only marked as out after the mon_osd_down_out_interval expires (600 seconds by default).

Therefore, during a normal reboot, the OSDs should come back before they are marked as out, making noout unnecessary in most environments.

Are the OSDs actually being marked as out during your maintenance reboots? If so, how long does the reboot take, and what is your configured mon_osd_down_out_interval?
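
To answer these questions, something like the following could be used. It is a rough sketch: the helper names are made up for illustration, and it assumes a Ceph release with the centralized config database (`ceph config get`).

```shell
#!/bin/sh
# Sketch: compare a measured reboot time with the down-to-out grace period.
# Helper names are hypothetical; the ceph command is the standard CLI.

# Print the configured interval in seconds (600 on a default cluster).
current_interval() {
    ceph config get mon mon_osd_down_out_interval
}

# Print "safe" if the reboot finishes before OSDs would be marked out,
# otherwise "risk". Arguments: reboot duration and interval, in seconds.
reboot_fits_interval() {
    reboot_secs=$1
    interval_secs=$2
    if [ "$reboot_secs" -lt "$interval_secs" ]; then
        echo "safe"
    else
        echo "risk"
    fi
}
```

For example, `reboot_fits_interval 180 "$(current_interval)"` checks a three-minute reboot against the configured value. During the reboot itself, `ceph osd tree` shows whether the node's OSDs are only down or have also been marked out.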

Understanding why the OSDs become out would help determine whether automating noout is the right solution or whether there is an underlying issue to investigate.
 
Hello torbho,

You are right: after a reboot the OSDs go "down", and the system restarts before the OSDs go "out".

For "mon_osd_down_out_interval" we have the default of 10 minutes.

So "noout" is not necessary, thank you.

Francis
 