Removal of an OSD in Ceph

Straightman

New Member
Feb 15, 2025
I have a 4-node Proxmox cluster on which I set up Ceph Squid 19.2.0. I am just setting things up, so only Proxmox and Ceph are running on the nodes so far. I created an OSD on each of nodes 1, 2, and 3, and Ceph was reporting health OK. I have not gone any further, so there are no pools created, only the Ceph monitors, manager, and OSDs. Before I go further I want to replace the hard drive on node 2, so I started the process of removing its OSD. I used this series of commands:

Before I started, osd.0 (the one I am working on) showed in/up with a status of active+clean.
Commands issued:
ceph osd out osd.0 (result: the OSD was marked as out and still up, so good so far)
systemctl stop ceph-osd@osd.0 (result: the OSD still shows as out and up, and the status is active+clean+remapped)
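
In case it is useful, this is roughly how I have been checking the cluster state after each step (just a sketch of what I run, nothing exotic):

ceph osd tree   # in/out and up/down for every OSD
ceph -s         # overall health and PG states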

I read that in small clusters it can help to set the OSD's weight to zero before stopping it, so I marked it back in; Ceph confirmed the status change, however the PGs (I guess they are PGs) are reporting active+clean+remapped.
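
The weight change I was referring to would be, as far as I understand it, something like this (osd.0 being the OSD in question; I have not actually run it yet):

ceph osd crush reweight osd.0 0   # drop the CRUSH weight to 0 so data drains off this OSD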

The problem is that I cannot seem to get the OSD to show as down after it has been marked out.
In this setup there is no stored data so far, no services actively generating data, and no Ceph pools created yet.
Hopefully someone can help me correct things so I can cleanly remove this OSD.
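
For context, the full removal sequence I am aiming for is roughly the following, based on my reading of the Ceph docs (so the exact unit name and flags are my assumptions, not something I have verified on this cluster):

ceph osd out osd.0                        # stop mapping data to this OSD
systemctl stop ceph-osd@0                 # stop the daemon; the systemd template unit takes the numeric id
ceph osd purge 0 --yes-i-really-mean-it   # remove the OSD from the CRUSH map, auth keys and OSD map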
 
I think I have made some progress. I have one OSD on each of three PVE nodes, and the default number of replicas is set to 3 with a minimum of 2. Since I set one of those OSDs to out, the cluster only has two OSDs onto which to remap data, and with a replica count of three it cannot complete the remapping. I think this is what is responsible for the active+clean+remapped PG status.
To resolve this I set the OSD in question back to in, then changed the default replica size to 2 with a minimum of 2. After marking the OSD out again, the placement groups reported active+clean, which is healthy.
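
For reference, the size/min_size change was along these lines (just a sketch; whether the global defaults are even the right knob here is part of what I am unsure about):

ceph config set global osd_pool_default_size 2       # default replica count for new pools
ceph config set global osd_pool_default_min_size 2   # default minimum replicas required for I/O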

Good progress, however I still cannot stop the OSD (systemctl stop ceph-osd@osd.0 produces no error but does not change the OSD's status).
Any ideas to try or sources to read are appreciated.
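
In the meantime, the next thing I plan to check is whether the daemon is even running under the name I am using (the id 0 below is my assumption):

systemctl list-units 'ceph-osd@*'   # which OSD units actually exist on node 2
systemctl status ceph-osd@0         # is the daemon active?
journalctl -u ceph-osd@0 -n 50      # recent log output from the OSD daemon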
 