How to recover from a single Ceph node?

jsterr

Renowned Member
Jul 24, 2020
Hi Community,

I tried to recover (in a test setup only) from a single Ceph node after losing 2 nodes of a 3-node Ceph cluster (completely; the nodes won't ever come back).

I managed to fix the PVE corosync quorum, but I can't get Ceph running again, as I can't run any commands or actions via the command line, not even reading out the current monmap. I managed to get two new PVE nodes into the cluster, but can't get Ceph running again.

Does anyone know how I can recover from a single Ceph node (which still has my data) to a fully working 3-node Ceph cluster again?

Thanks for your help,
Jonas
 
First, what about the MONs? That is a bit unclear to me. If you have only one MON left, you need to modify the monmap.

Check the MON status first to see what state it is in: https://docs.ceph.com/en/latest/rad...hooting-mon/#using-the-monitor-s-admin-socket (and the "Understanding mon_status" section right after it).
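
As a rough illustration of that check, here is a minimal Python sketch that reads `mon_status` through the admin socket and reports the monitor's state and which other MONs are still listed in its monmap. The MON ID `pve1` and the `SAMPLE` data are made up for illustration, and the sample is trimmed to the few fields the sketch uses; real output contains many more.

```python
import json
import subprocess


def read_mon_status(mon_id: str) -> dict:
    """Query the local monitor through its admin socket.

    Must run on the node that hosts the monitor; this works even
    without quorum because it does not go through the cluster.
    """
    out = subprocess.run(
        ["ceph", "daemon", f"mon.{mon_id}", "mon_status"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def summarize(status: dict) -> dict:
    """Reduce mon_status output to the fields relevant for this recovery."""
    known = [m["name"] for m in status["monmap"]["mons"]]
    return {
        "state": status["state"],                # e.g. probing, electing, leader
        "in_quorum": bool(status.get("quorum")), # empty list means no quorum
        "other_mons": [n for n in known if n != status["name"]],
    }


# Hypothetical, heavily trimmed mon_status output (assumption, not real data).
SAMPLE = {
    "name": "pve1",
    "state": "probing",
    "quorum": [],
    "monmap": {"mons": [{"name": "pve1"}, {"name": "pve2"}, {"name": "pve3"}]},
}

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

On the real node you would call `summarize(read_mon_status("<your-mon-id>"))` instead of using the sample. If `other_mons` lists the lost nodes and the state stays at `probing`, the monmap most likely needs to be edited.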

If it has the other MONs still in the monmap, extract the monmap and remove them. See https://docs.ceph.com/en/reef/rados.../#removing-monitors-from-an-unhealthy-cluster
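
To make the order of operations concrete, here is a small Python sketch that only builds and prints the command sequence from the linked documentation (stop the MON, extract the monmap, remove the dead MONs, inject the map, start the MON). It does not execute anything. The MON IDs, the `/tmp/monmap` path and the `ceph-mon@<id>` systemd unit name are assumptions for illustration; check them against your own setup before running anything by hand.

```python
import shlex


def monmap_repair_commands(surviving_mon, dead_mons, monmap_path="/tmp/monmap"):
    """Return the commands, in order, to drop dead MONs from the monmap.

    All names passed in are placeholders chosen by the caller.
    """
    cmds = [
        # The monitor must be stopped while its store is modified.
        ["systemctl", "stop", f"ceph-mon@{surviving_mon}"],
        # Dump the monmap the surviving monitor currently holds.
        ["ceph-mon", "-i", surviving_mon, "--extract-monmap", monmap_path],
    ]
    # Remove every monitor that will never come back.
    cmds += [["monmaptool", monmap_path, "--rm", mon] for mon in dead_mons]
    # Write the edited map back and start the monitor again.
    cmds.append(["ceph-mon", "-i", surviving_mon, "--inject-monmap", monmap_path])
    cmds.append(["systemctl", "start", f"ceph-mon@{surviving_mon}"])
    return cmds


if __name__ == "__main__":
    # Hypothetical IDs: pve1 survived, pve2 and pve3 are gone for good.
    for cmd in monmap_repair_commands("pve1", ["pve2", "pve3"]):
        print(shlex.join(cmd))
```

Printing the commands first rather than running them is deliberate: injecting a wrong monmap can make things worse, so it is worth reviewing each line (and running `monmaptool --print` on the extracted map) before executing them.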

After that, the single remaining MON should start and be quorate. Then you can add more nodes, new MONs, OSDs and so on to get back to a healthy cluster.
This assumes that the single remaining node holds one replica of every PG, that is, there are no "unfound" PGs in the cluster.


I hope this helps :)
 
Hi Aaron! Thanks, it did help! Steps: recover the PVE quorum, edit the monmap on the last node, delete the old OSDs, add 2 new nodes, create the MONs, create the OSDs, recover.
 