Ceph maintenance question

Tmanok

Well-Known Member
Hi everyone,

Quick and simple set of similar questions:
  1. Before restarting a Ceph OSD, should you mark it as "Out"?
  2. Before restarting a monitor, should you do anything?
  3. Before restarting a node, should you do anything?
  4. Before restarting a manager or manager node, should you do anything?
Thanks! I'm just a little concerned that I've been working on production equipment without enough caution lately, and it's making me nervous.

Tmanok
 
Before restarting a Ceph OSD, should you mark it as "Out"?
Marking an OSD as out tells Ceph that this OSD should not be part of the cluster anymore. This in turn causes rebalancing, as Ceph recreates the data located on that OSD on other OSDs in the cluster.
This is only something that you want to do if you need to replace the OSD or want to destroy and recreate it.
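If you really do need to take an OSD out permanently (e.g. to replace the disk), the CLI side is roughly the following; osd.3 is just a placeholder ID here:

Code:
# Tell Ceph to stop placing data on this OSD (this triggers rebalancing)
ceph osd out osd.3

# If you change your mind before it has drained, bring it back in
ceph osd in osd.3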

Normally you don't need to do anything if you just restart a service (OSD, MON, MGR, MDS). OSDs have redundancy: you might see a warning for a short time until the OSD is back up, but as long as there are copies of the data on two other OSDs, restarting one OSD at a time is not a problem. The same goes for monitors, as long as enough of them remain to form a majority (a minimum of three, and usually three are enough). MGR and MDS work with standbys on the remaining nodes, so if you stop the active one, another node will take over and become active.
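As a rough sketch of the "one daemon at a time" approach on a Proxmox/Debian node (assuming the standard systemd unit names, with 3 and node1 as placeholder IDs):

Code:
# Check the cluster state first
ceph -s

# Restart a single OSD daemon, then wait until it shows as "up" again
systemctl restart ceph-osd@3.service
ceph osd tree

# Monitors and managers follow the same pattern, for example:
# systemctl restart ceph-mon@node1.service
# systemctl restart ceph-mgr@node1.service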

If you plan longer maintenance where a node might be down for some time, and you want to avoid the rebalancing of data, you can set the "noout", "norecover", and "norebalance" global OSD flags for that time. "noout" should be enough; the others are just to be on the safe side.
The default timeout is 10 minutes. If an OSD is not back up within that time, Ceph will automatically mark it as out.
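A minimal sketch of setting those flags before the maintenance window (the 10-minute default corresponds to the mon_osd_down_out_interval option, 600 seconds, on recent Ceph releases):

Code:
# Keep OSDs from being marked out and keep data from moving during maintenance
ceph osd set noout
ceph osd set norecover
ceph osd set norebalance

# Optional: check the current down -> out timeout (600 seconds by default)
ceph config get mon mon_osd_down_out_interval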

Once you are done, don't forget to clear these OSD flags again to restore Ceph's self-healing capabilities.
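And the matching cleanup afterwards:

Code:
# Re-enable self-healing once the maintenance is finished
ceph osd unset noout
ceph osd unset norecover
ceph osd unset norebalance

# Verify the flags are gone and the cluster returns to HEALTH_OK
ceph -s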
 
1)

When an OSD process stops, it first goes into the "down" state; then, after 10 minutes (by default), it goes into the "out" state.

When the "out" state is reached, the data is rebalanced across the cluster.

If you are doing a long maintenance (>10 min) and you don't want the data to rebalance, you can set the "noout" flag.
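To watch that happen (or confirm that it isn't happening), something like this shows the state of each OSD:

Code:
# Per-OSD up/down and in/out status
ceph osd tree

# Summary count of OSDs that are up and in
ceph osd stat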

2)
For monitors, you just need to check that you still have monitor quorum (so just restart the monitors one by one, and check that you have quorum between each restart).
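For example, to check quorum after each monitor restart (the exact output format varies between releases):

Code:
# Quick summary of which monitors exist and which are in quorum
ceph mon stat

# More detail, including the current quorum members and leader
ceph quorum_status --format json-pretty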

3) Always check that the cluster is healthy before restarting a node (check that no OSDs, monitors, etc. on other nodes are also down).
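A quick pre-reboot check could look like this:

Code:
# Overall state: should be HEALTH_OK (or only warn about flags you set yourself)
ceph health detail

# Make sure no other OSDs or monitors are already down
ceph -s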

4) The manager is not important for data access, but you can check that another manager daemon is still running in the cluster.
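For the managers, something like this shows whether a standby is available:

Code:
# Shows the active mgr and the number of standbys
ceph mgr stat

# The "mgr:" line here also lists the active and standby daemons
ceph -s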
 