In a 3-node cluster, how will a disk pool be affected if an older clone of a drive is inserted into it?

FSNaval

New Member
Jan 13, 2024
Hi redditors,

In my 3-node Ceph cluster, I have an NVMe pool, one drive of which shows 35% wear. I took this node down and removed the NVMe drive, and my plan is the following, so that I can use my drives until they reach the end of their life:

1. Clone this drive to another one of the same or bigger capacity.

2. Put the old drive back in the cluster and use it until it fails completely.

3. Replace the failed drive with the new cloned drive.
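
As an aside, the 35% wear figure mentioned above can be read from the drive itself. Here is a minimal sketch, assuming smartmontools 7+ (for JSON output) and a hypothetical device path /dev/nvme0; the exact key names can differ between smartctl releases:

# Sketch: read NVMe wear ("Percentage Used") via smartctl's JSON output.
# Assumes smartmontools 7+ and a hypothetical device path; adjust as needed.
import json
import subprocess

def nvme_wear_percent(device="/dev/nvme0"):
    out = subprocess.run(
        ["smartctl", "-j", "-a", device],
        capture_output=True, text=True, check=False,
    ).stdout
    data = json.loads(out)
    # NVMe health data usually exposes wear as "percentage_used";
    # key names may vary between smartctl releases.
    health = data.get("nvme_smart_health_information_log", {})
    return health.get("percentage_used")

if __name__ == "__main__":
    print(f"wear: {nvme_wear_percent()}%")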

The drawback of the above method is that, from the moment I put the old drive back in the pool until it fails completely, data will keep being written to the pool.

If my understanding is correct, when I insert the new cloned drive, Ceph will start rebuilding the pool; i.e. it will copy the missing data from the other two drives in the pool onto the new one.
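
For what it's worth, here is a minimal sketch of how that recovery could be watched from a script. It assumes the ceph CLI is available on a node; the JSON field names (pgmap, pgs_by_state) can differ slightly between Ceph releases:

# Sketch: poll "ceph status" and report recovering/degraded PG counts
# while the cluster backfills onto a newly added OSD.
# Assumes the ceph CLI is available; JSON key names may differ by release.
import json
import subprocess
import time

def pg_states():
    out = subprocess.run(
        ["ceph", "status", "--format", "json"],
        capture_output=True, text=True, check=True,
    ).stdout
    pgmap = json.loads(out).get("pgmap", {})
    return {s["state_name"]: s["count"] for s in pgmap.get("pgs_by_state", [])}

if __name__ == "__main__":
    while True:
        states = pg_states()
        busy = {k: v for k, v in states.items()
                if "backfill" in k or "recover" in k or "degraded" in k}
        if not busy:
            print("no recovery/backfill in progress")
            break
        print(busy)
        time.sleep(30)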

Is my understanding correct? Or am I going to screw everything up?
 
This isn't really how Ceph works. Each OSD contains fractions of placement groups, which are not really useful in and of themselves. The first layer of fault recovery is rebalancing onto available space on the other OSDs, so "spares" don't have any meaning in this context.

Just use your drives, replace them when they die, and quit worrying about it ;)
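
To see what "fractions of placement groups" looks like in practice, something like the sketch below counts how many PGs each OSD participates in. It assumes the ceph CLI is present, and the JSON layout of "pg dump" has changed between Ceph releases, so treat the parsing as an assumption:

# Sketch: count how many placement groups each OSD holds a share of.
# Assumes the ceph CLI is available; the JSON layout of "pg dump" has
# changed between Ceph releases, so the parsing below is best-effort.
import json
import subprocess
from collections import Counter

def pgs_per_osd():
    out = subprocess.run(
        ["ceph", "pg", "dump", "pgs_brief", "--format", "json"],
        capture_output=True, text=True, check=True,
    ).stdout
    data = json.loads(out)
    # Some releases return a bare list; others wrap it in "pg_stats",
    # possibly under "pg_map".
    pgs = data
    if isinstance(data, dict):
        pgs = data.get("pg_map", data).get("pg_stats", [])
    counts = Counter()
    for pg in pgs:
        for osd in pg.get("acting", []):
            counts[osd] += 1
    return counts

if __name__ == "__main__":
    for osd, n in sorted(pgs_per_osd().items()):
        print(f"osd.{osd}: member of {n} PGs")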
 
Understood your point.

With my method above, I want to avoid all the hassle of removing/adding the OSD, and maybe save some time on rebalancing, since some of the data will already be there (my LAN connection is only 1 Gbit). So you don't think the above will work or has any practical usefulness?
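
For a rough sense of what the 1 Gbit link costs, a back-of-envelope sketch; the usable throughput and the amount of data to move are assumptions, and real backfill is further throttled by Ceph's recovery settings:

# Sketch: rough backfill-time estimate over a 1 Gbit link.
# Both numbers below are assumptions; real backfill is further throttled
# by Ceph's recovery/backfill settings and competing client I/O.
link_gbit = 1.0                                   # nominal LAN speed
usable_bytes_per_s = link_gbit * 1e9 / 8 * 0.8    # assume ~80% of line rate
data_to_move_tb = 1.0                             # hypothetical amount to backfill

seconds = data_to_move_tb * 1e12 / usable_bytes_per_s
print(f"~{seconds / 3600:.1f} hours to move {data_to_move_tb} TB")
# ~2.8 hours for 1 TB at ~100 MB/s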
 
Consider that any write to a PG that has an "old" member and a "new" member will effectively make the "old" copy irrelevant, so the only way this makes any sense is if you don't write anything to the affected PGs. Which, of course, makes the whole point moot.
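
A toy illustration of that point (not Ceph's actual logic, and PgCopy/needs_backfill are invented names): each PG copy carries a version, and a copy whose version lags behind the live ones has to be brought up to date from its peers regardless of how much data it already holds.

# Toy model only (not Ceph's real code): a PG copy carries a version,
# and a stale copy must be caught up from its peers no matter how much
# data it already contains.
from dataclasses import dataclass

@dataclass
class PgCopy:
    osd: str
    version: int   # last write this copy has seen

def needs_recovery(copy: PgCopy, live_version: int) -> bool:
    # Any write after the clone was taken bumps the live version,
    # so the cloned copy is behind and gets brought up to date from peers.
    return copy.version < live_version

cloned = PgCopy(osd="osd.2 (cloned weeks ago)", version=100)
live_version = 157   # writes kept landing on the surviving replicas

print(needs_recovery(cloned, live_version))  # True: the clone saved nothing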
 
Won't my proposed method save me time by not having to remove the failed drive, wait for a rebalance, insert the new drive, and wait for a rebalance again? Won't I save one Ceph rebalance by inserting the cloned drive?