Re-adding existing OSDs to a new installation of a 3-node Proxmox Ceph cluster

Gilberto Ferreira

Renowned Member
Hi

I have a 3-node Proxmox cluster with Ceph, and after a crash I had to reinstall Proxmox from scratch, along with Ceph.
The OSDs are intact.
I already ran ceph-volume lvm activate --all; the OSDs appear in ceph-volume lvm list, and I have a folder with the name of each OSD under /var/lib/ceph/osd.
However, they do not appear in ceph osd tree, ceph -s, or even in the web GUI.
Is there any way to re-add these OSDs to Proxmox Ceph?
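
For reference, a minimal sketch of the checks described above (default paths assumed; OSD ids will differ):
Code:
# shows OSDs that ceph-volume can discover from their LVM tags
ceph-volume lvm list
# mounts each OSD's metadata dir under /var/lib/ceph/osd/
ceph-volume lvm activate --all
# these only list OSDs once the monitors know about them
ceph osd tree
ceph -s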

Thanks a lot for any help.


Best Regards
 
Is there any way to re-add these OSDs to Proxmox Ceph?
Everything is possible in the web GUI, no CLI-fu required: 1) remove it "officially" from Ceph, 2) wipe the disk, 3) add it as a new OSD.

At least that's what I would do.
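
For completeness, a hedged CLI equivalent of those three GUI steps (the OSD id and device are placeholders; this permanently destroys the data on that OSD):
Code:
# 1) remove the OSD from Ceph (assumes it is already down/out)
pveceph osd destroy 0 --cleanup
# 2) wipe the disk (placeholder device; double-check before running!)
ceph-volume lvm zap /dev/sdX --destroy
# 3) create it again as a fresh OSD
pveceph osd create /dev/sdX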

If you are thinking of re-adding the content of that OSD: Ceph should have repaired/re-balanced itself a short while after the crash by shuffling around the actual content. The content of that old OSD is probably irrelevant and cannot be re-integrated into the Ceph pool, afaik.

Disclaimer: I am not using Ceph currently!
 
Thank you for your reply @UdoB
But if I wipe the disk, I will permanently lose the data, won't I?
Is there a way to do this without losing the data?

Perhaps there is a misunderstanding...
I lost all 3 servers, but not the OSDs' disks.
After reinstalling PVE and Ceph on the 3 nodes, I recreated the Ceph cluster.

Thanks
 
But if I wipe the disk, I will permanently lose the data, won't I?
Yes - all data stored on that OSD.

But: "If you think to re-add the content of that OSD: Ceph should have repaired/re-balanced itself a short while after the crash by shuffling around the actual content. The content of that old OSD is probably irrelevant and can not get re-integrated into the Ceph pool, afaik."

Someone else should confirm (or confute) my statement...
 
No, you can't wipe the disks if you want to use the data on them. I don't remember the exact steps ATM, can't check them right now, and they aren't trivial to carry out. You are essentially in a disaster-recovery scenario. Off the top of my head: you need to deploy one MON and one MGR, export the monmap/crushmap from one of those OSDs, and import it into the MON. Then you could use something like ceph-volume lvm activate --all on every node to activate the LVM volumes on each disk so Ceph can see them. There are probably more steps involved.

The only practical recommendation I can give you right now is to set up a test scenario using PVE as VMs and practice the needed steps before messing with the real data.
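
A very rough sketch of the first part of that outline, using the PVE wrappers (the network CIDR is a placeholder, not from this thread):
Code:
# recreate a minimal cluster skeleton on one node
pveceph init --network 10.0.0.0/24
pveceph mon create
pveceph mgr create
# then, on every node, make the old OSD volumes visible again
ceph-volume lvm activate --all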
 
No need to rush! It's a test environment... But if you guys could help me, I would be much obliged.
Thanks
 
Perhaps that's the way?
https://docs.ceph.com/en/latest/rados/troubleshooting/troubleshooting-mon/#recovery-using-osds
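
That page rebuilds the monitor store from the OSDs. A condensed, hedged sketch of its core steps, collapsed to a single host (see the link for the authoritative multi-host version; the keyring path and mon id below are taken from later in this thread):
Code:
ms=/root/mon-store
mkdir -p "$ms"
# accumulate cluster map info from EVERY OSD into ONE shared
# mon-store dir; the docs wrap this loop in rsync+ssh per host
for osd in /var/lib/ceph/osd/ceph-*; do
  ceph-objectstore-tool --data-path "$osd" --no-mon-config \
      --op update-mon-db --mon-store-path "$ms"
done
# rebuild a usable monitor store from the accumulated data
ceph-monstore-tool "$ms" rebuild -- \
    --keyring /etc/pve/priv/ceph.client.admin.keyring
# install it on one monitor (mon dir name is a placeholder)
mon=/var/lib/ceph/mon/ceph-pve01
mv "$mon/store.db" "$mon/store.db.corrupted"
cp -r "$ms/store.db" "$mon/store.db"
chown -R ceph:ceph "$mon/store.db"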
 
Hi
Do I need to add a mon/mgr AFTER or BEFORE I issue the command
Code:
ceph-volume lvm activate --all
?
Because as far as I can tell, the old OSDs are available only after ceph-volume lvm activate --all, right?
So in order to get some information from those OSDs, I need to run ceph-volume lvm activate --all beforehand.
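
For what it's worth, a hedged illustration of that ordering (OSD id 0 is a placeholder): activation only mounts the OSD's metadata directory, and the offline tools can then read the store as long as the daemon is stopped.
Code:
# make the OSD data dirs appear under /var/lib/ceph/osd/
ceph-volume lvm activate --all
# the daemon must not be running while offline tools touch the store
systemctl stop ceph-osd@0
# sanity check: list a few objects straight from the OSD's store
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \
    --no-mon-config --op list | head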
 
Hi again...
I have reinstalled all Proxmox nodes and installed Ceph on each node.
I created the MONs and MGRs on each node.
I issued the command ceph-volume lvm activate --all on each node, in order to bring up /var/lib/ceph/osd/<node>.
After that I ran these commands:
Code:
# bring the OSD data dirs back under /var/lib/ceph/osd/
ceph-volume lvm activate --all
# scratch directory for the rebuilt monitor store
mkdir /root/mon-store
# pull cluster map info out of the OSD's own store
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --no-mon-config --op update-mon-db --mon-store-path mon-store/
# rebuild a monitor store from it, listing all three mon ids
ceph-monstore-tool mon-store/ rebuild -- --keyring /etc/pve/priv/ceph.client.admin.keyring --mon-ids pve01 pve02 pve03
# swap the rebuilt store into the monitor's data dir
mv /var/lib/ceph/mon/ceph-pve01/store.db/ /var/lib/ceph/mon/ceph-pve01/store.db-bkp
cp -rf mon-store/store.db/ /var/lib/ceph/mon/ceph-pve01/
chown -R ceph:ceph /var/lib/ceph/mon/ceph-pve01/store.db

I ran all the commands above on each node.

But now I get nothing!
No monitor, no manager, no OSD, nothing!

Perhaps somebody can point out what I did wrong.

Thanks
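
Until someone answers, a few hedged diagnostic commands that might show why the daemons are down (the mon id pve01 comes from the commands above):
Code:
# is the monitor running, and what does it log on startup?
systemctl status ceph-mon@pve01
journalctl -u ceph-mon@pve01 --no-pager -n 50
# if a mon answers, this shows quorum, mgr and OSD status
ceph -s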
 

Attachments

  • Captura de imagem_20250820_133155.png (screenshot, 99.6 KB)