Unable to remove improperly added OSD

jefm

Member
Nov 15, 2021
Hi, I tried to add an osd.17 this morning.
Unfortunately, ceph.conf still contained a networking entry for a previous osd.17, which I hadn't cleaned up after deleting that OSD. I have a back-end network for OSD traffic, so each OSD gets a ceph.conf entry. This resulted in my first ghost OSD, probably because my hand-rolled networking config was pointing at some other host.
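
A leftover per-OSD section like the one described above can be spotted before creating a new OSD with the same ID. The following is a minimal sketch, not an official tool: `show_osd_section` is a hypothetical helper name, and it assumes the section header is written exactly as `[osd.N]` with no trailing whitespace.

```shell
#!/bin/sh
# Print the [osd.N] section of a ceph.conf-style file, if one exists.
# Assumption: the header line is exactly "[osd.N]".
show_osd_section() {
  conf="$1"
  id="$2"
  awk -v want="[osd.$id]" '
    /^\[/ { in_section = ($0 == want) }   # every header starts a new section
    in_section { print }                  # echo lines of the wanted section
  ' "$conf"
}

# Example (on Proxmox VE the cluster-wide file is /etc/pve/ceph.conf):
#   show_osd_section /etc/pve/ceph.conf 17
```

If this prints anything for an ID that no longer exists, the section is stale and is worth removing before the ID gets reused.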

I couldn't remove the OSD from the GUI, since as a ghost it was not listed on the OSD screen.
I ran two commands from another post on this forum to remove the OSD (I have forgotten the page, and my CLI scrollback did not keep them).

I thought I was in good shape: per the Ceph configuration page, osd.17 does not seem to be in the CRUSH map.
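
The two commands that were actually run are not known. For reference, the manual removal sequence commonly given in the Ceph documentation looks roughly like the sketch below. It only prints the commands (a dry run) so they can be reviewed first; `osd_removal_plan` is a hypothetical helper name, and whether every step applies depends on how far the OSD creation got.

```shell
#!/bin/sh
# Dry run: print the usual manual OSD removal steps for a given OSD ID.
# Nothing is executed against the cluster.
osd_removal_plan() {
  id="$1"
  echo "ceph osd out $id"                      # stop placing data on it
  echo "systemctl stop ceph-osd@$id.service"   # stop the daemon on its host
  echo "ceph osd crush remove osd.$id"         # drop it from the CRUSH map
  echo "ceph auth del osd.$id"                 # delete its auth key
  echo "ceph osd rm $id"                       # remove the OSD entry itself
}

osd_removal_plan 17
```

Afterwards, `ceph osd tree` should no longer list osd.17, which is the command-line equivalent of checking the CRUSH map on the configuration page.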

However:
In Disks, I am unable to wipe the osd.17 disk via the GUI; the disk "has a holder".
ceph-osd@17.service is still present and shows red when I run systemctl. I tried to disable it, but nothing changed.
The directory /var/lib/ceph/osd/ceph-17 is still present on the host and cannot be deleted or moved; it is "busy".
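
One possible explanation for the "busy" directory, offered as an assumption rather than a diagnosis: for BlueStore OSDs the directory is normally a tmpfs mount, and a mount point cannot be deleted or moved until it is unmounted. The sketch below checks a mounts table for the directory and prints follow-up commands. `is_mount_point` and `osd_dir_hint` are hypothetical helper names, and the mounts table is a parameter only so the check can be tried on a sample file.

```shell
#!/bin/sh
# Succeed if the given path is listed as a mount point in a mounts table
# (defaults to /proc/mounts).
is_mount_point() {
  awk -v p="$1" '$2 == p { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Print a hint for the data directory of OSD <id>; nothing is executed.
osd_dir_hint() {
  id="$1"
  table="${2:-/proc/mounts}"
  dir="/var/lib/ceph/osd/ceph-$id"
  if is_mount_point "$dir" "$table"; then
    echo "$dir is mounted"
    echo "systemctl disable --now ceph-osd@$id.service"  # stop and disable
    echo "umount $dir"                                   # release the mount
    echo "systemctl reset-failed ceph-osd@$id.service"   # clear failed state
  else
    echo "$dir is not mounted"
  fi
}

# Example: osd_dir_hint 17
```

Note that `systemctl disable` on its own only stops the unit from starting at boot. It does not clear a failed (red) state, which may be why disabling appeared to change nothing.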

I had other ideas but figured I should stop messing with it and ask for help.

Thanks in advance, I can post any command output needed.
 
In Disks, unable to wipe the osd.17 via gui, disk "has a holder"
There might be an underlying LVM volume on that device. Look under <node> --> Disks --> LVM and remove it. Then you can "wipe".
 
There might be an underlying LVM volume on that device. Look under <node> --> Disks --> LVM and remove it. Then you can "wipe".
Hello and thanks for writing.
Within the node, I went to Disks/LVM, and there was an entry for /dev/sdf that I was able to blow away.
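
The "has a holder" message can also be checked from the shell: the kernel lists whatever sits on top of a block device under its `holders` directory in sysfs, and an LVM logical volume shows up there as a `dm-N` entry. This is a minimal sketch; `list_holders` is a hypothetical helper name, and the sysfs root is a parameter only so the function can be tried on a sample tree.

```shell
#!/bin/sh
# List the kernel "holders" of a block device, e.g. dm-N devices from LVM.
# Fails if the device has no holders directory.
list_holders() {
  dev="$1"
  sysroot="${2:-/sys}"
  dir="$sysroot/block/$dev/holders"
  [ -d "$dir" ] || return 1
  ls "$dir"
}

# Example for the disk in this thread:
#   list_holders sdf
# Command-line counterpart of removing the LVM entry and wiping the disk
# (destructive, so double-check the device name first):
#   ceph-volume lvm zap --destroy /dev/sdf
```

An empty listing means nothing is holding the device any more, at which point the wipe in the GUI should go through.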

However, systemctl still showed an osd.17, and the untouchable directory was still in /var.
I wanted to clean that up, but I was feeling spicy and just added the OSD anyway. Probably not a good idea, but it worked.
Then I added the other OSD that I wanted and put them both on the fast network. So far so good.
And not just good, but better: the systemctl output and /var directories make sense now.