Recreating Ceph, cannot create OSD

OrdinarySteve

New Member
Jul 16, 2020
Hello!

I've been playing with ceph and after initially getting it working, I realised I had everything on the wrong network. So I deleted and removed everything, uninstalled and reinstalled ceph and began recreating the cluster using the correct network. I had to destroy / zap some drives on two servers and mostly, I've got things working again. However, when I try to create a new OSD on one of the servers, I get

unable to open file '/var/lib/ceph/bootstrap-osd/ceph.keyring.tmp.1694' - No such file or directory (500)

I have two out of three OSDs installed; I just cannot figure out how to proceed with this final server. Is this a leftover from the previous installation, or did something not get installed properly when I tried to re-create everything?
 
Hi!

For the next time you want to change the network: that really does not need a full reinstallation. You can simply edit the cluster_network and public_network keys in /etc/pve/ceph.conf. If you only change the cluster_network, it's normally enough to restart all OSDs once; if you change the public network or both networks, also restart all monitors and other Ceph daemons (or do a brute-force full cluster reboot if it's not yet a live production system).
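To illustrate the edit itself, here is a minimal Python sketch that rewrites the two network keys in a ceph.conf-style text. It assumes the keys live in the [global] section and are spelled with underscores, as named above; the function name set_ceph_networks and the subnets in the usage note are made up for the example. Note that configparser drops comments when writing, so treat this as an illustration rather than a tool to run against a live /etc/pve/ceph.conf.

```python
import configparser
import io


def set_ceph_networks(conf_text, public_network=None, cluster_network=None):
    """Return conf_text with the [global] network keys updated.

    Hypothetical helper: only keys passed as non-None are changed.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names exactly as written
    parser.read_string(conf_text)

    # Assumption: the network keys belong in [global].
    if not parser.has_section("global"):
        parser.add_section("global")

    if public_network is not None:
        parser.set("global", "public_network", public_network)
    if cluster_network is not None:
        parser.set("global", "cluster_network", cluster_network)

    out = io.StringIO()
    parser.write(out)  # comments from the input are not preserved
    return out.getvalue()
```

For example, set_ceph_networks(text, cluster_network="10.10.10.0/24") would change only the cluster network and leave public_network as it was. The daemons still have to be restarted afterwards as described above; the sketch does not do that.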


unable to open file '/var/lib/ceph/bootstrap-osd/ceph.keyring.tmp.1694' - No such file or directory (500)

Sounds like the bootstrap directories are gone. Does /var/lib/ceph/bootstrap-osd/ exist? If not, use mkdir to create it and retry adding the OSD.
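The check-and-create step can be sketched in Python as follows. The helper name ensure_bootstrap_dir is made up for the example, and the sketch does not touch ownership or permissions, so it may be worth comparing the result with the same directory on a node where OSD creation works.

```python
from pathlib import Path


def ensure_bootstrap_dir(path="/var/lib/ceph/bootstrap-osd"):
    """Create the directory if it is missing.

    Hypothetical helper: returns True if the directory had to be
    created, False if it already existed.
    """
    target = Path(path)
    if target.is_dir():
        return False
    # parents=True also creates missing parent directories, like mkdir -p
    target.mkdir(parents=True, exist_ok=True)
    return True
```

Run as root on the affected node, ensure_bootstrap_dir() returns True if the directory was missing and has now been created; after that, retry adding the OSD.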
 
Thank you for your help! I'm about to walk away from the PC for a while, but I'll give this a try tonight and see how it goes.
 
Creating the directory solved my problem exactly. I'm not sure why it gets created on some servers but not others.