Ceph OSD disk replacement: Prevent multiple rebalances of data

7thSon

Hi

We are running a Ceph cluster with 3 nodes and 2 OSDs per node on SSD drives.
One of the nodes has recently been causing problems: both OSDs on this node sporadically report "log_latency_fn slow operation observed for _txc_committed_kv" in the log, and write performance is sometimes poor. I can reproduce this by repeatedly creating 100 MB files via dd, for example. This is usually fairly fast (200-300 MB/s), but sometimes I get less than 10 MB/s, and exactly at those moments the message above is logged.
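For example, something like the following (the target path is just a placeholder for a directory backed by the affected pool):

# write a 100 MB file with direct I/O so the page cache doesn't hide OSD latency
dd if=/dev/zero of=/mnt/cephtest/testfile bs=1M count=100 oflag=direct

# watch per-OSD commit/apply latency while the test runs
ceph osd perf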

I would now like to replace the two affected OSDs / their SSDs.
What is the best way to avoid rebalancing the data multiple times (due to the CRUSH map changes when removing and adding OSDs)?
I had the following procedure in mind:

1. Set the norebalance flag
2. Stop the OSDs on the faulty disks and mark them out
3. Install the new disks and create new OSDs
4. Clear the norebalance flag
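In commands, that would be roughly the following (OSD IDs 4 and 5 and the device path are placeholders for our setup):

# 1. prevent data movement while the OSDs are out
ceph osd set norebalance

# 2. stop the affected OSDs and mark them out
systemctl stop ceph-osd@4 ceph-osd@5
ceph osd out 4
ceph osd out 5

# 3. after swapping the SSDs, create the new OSDs (Proxmox CLI)
pveceph osd create /dev/sdX

# 4. allow data movement again
ceph osd unset norebalance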

Does this make sense or does anyone have a better suggestion?
 
Assuming this is 3x replication (it isn't explicitly stated):

1. Set the norebalance and nobackfill flags
2. Stop the OSDs on the faulty disks, mark them out, and then destroy them (ceph osd destroy keeps the OSD IDs and their positions in the CRUSH map, so the replacement disks can reuse them)
3. Install the new disks and create the new OSDs
4. Clear the global OSD flags and let the rebalance begin
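A rough command sequence for one OSD (ID 4 and the device path are placeholders; repeat for the second OSD before clearing the flags):

# freeze data movement
ceph osd set norebalance
ceph osd set nobackfill

# stop, mark out, and destroy the OSD; destroy preserves the ID for reuse
systemctl stop ceph-osd@4
ceph osd out 4
ceph osd destroy 4 --yes-i-really-mean-it

# after installing the new SSD, recreate the OSD with the same ID
ceph-volume lvm create --osd-id 4 --data /dev/sdX

# once both replacements are in, let backfill/rebalance start
ceph osd unset nobackfill
ceph osd unset norebalance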
 
