HELP! Deleted everything under /var/lib/ceph/mon on one node in a 4 node cluster

shadyabhi

I'm stupid :/, and I really need your help. I was following this thread on clearing a dead monitor: https://forum.proxmox.com/threads/ceph-cant-remove-monitor-with-unknown-status.63613/post-452396

As instructed there, I deleted the folder named "ceph-nuc10" (nuc10 is my node name) under /var/lib/ceph/mon. I know, I messed up.

Now I get a 500 error when opening any of the Ceph panels in the Proxmox UI. Is there a way to recover?

Code:
root@nuc10:/var/lib/ceph/mon# ceph status
2025-02-07T00:43:42.438-0800 7cd377a006c0  0 monclient(hunting): authenticate timed out after 300

[errno 110] RADOS timed out (error connecting to the cluster)
root@nuc10:/var/lib/ceph/mon#

root@nuc10:~# pveceph status
command 'ceph -s' failed: got timeout
root@nuc10:~#

screenshot.png
 
@ness1602
>Can you log onto Proxmox and Ceph on the other 3 nodes?

Thanks for the response. Yes, on the bad node this is what I see:

Code:
root@nuc10:~# systemctl | grep ceph
  var-lib-ceph-osd-ceph\x2d1.mount                                                     loaded active     mounted   /var/lib/ceph/osd/ceph-1
  ceph-crash.service                                                                   loaded active     running   Ceph crash dump collector
  ceph-mds@nuc10.service                                                               loaded active     running   Ceph metadata server daemon
  ceph-mgr@nuc10.service                                                               loaded active     running   Ceph cluster manager daemon
● ceph-mon@nuc10.service                                                               loaded failed     failed    Ceph cluster monitor daemon
  ceph-osd@1.service                                                                   loaded active     running   Ceph object storage daemon osd.1
  system-ceph\x2dmds.slice                                                             loaded active     active    Slice /system/ceph-mds
  system-ceph\x2dmgr.slice                                                             loaded active     active    Slice /system/ceph-mgr
  system-ceph\x2dmon.slice                                                             loaded active     active    Slice /system/ceph-mon
  system-ceph\x2dosd.slice                                                             loaded active     active    Slice /system/ceph-osd
  system-ceph\x2dvolume.slice                                                          loaded active     active    Slice /system/ceph-volume
  ceph-fuse.target                                                                     loaded active     active    ceph target allowing to start/stop all ceph-fuse@.service instances at once
  ceph-mds.target                                                                      loaded active     active    ceph target allowing to start/stop all ceph-mds@.service instances at once
  ceph-mgr.target                                                                      loaded active     active    ceph target allowing to start/stop all ceph-mgr@.service instances at once
  ceph-mon.target                                                                      loaded active     active    ceph target allowing to start/stop all ceph-mon@.service instances at once
  ceph-osd.target                                                                      loaded active     active    ceph target allowing to start/stop all ceph-osd@.service instances at once
  ceph.target                                                                          loaded active     active    ceph target allowing to start/stop all ceph*@.service instances at once
root@nuc10:~#


On the other nodes, all the configured services are running too, but `ceph status` hangs there as well:

Code:
root@r730:~# ceph status
^CCluster connection aborted
root@r730:~#

root@r730:~# systemctl | grep ceph
  ceph-crash.service                                                                               loaded active     running   Ceph crash dump collector
  system-ceph\x2dvolume.slice                                                                      loaded active     active    Slice /system/ceph-volume
  ceph-fuse.target                                                                                 loaded active     active    ceph target allowing to start/stop all ceph-fuse@.service instances at once
  ceph-mds.target                                                                                  loaded active     active    ceph target allowing to start/stop all ceph-mds@.service instances at once
  ceph-mgr.target                                                                                  loaded active     active    ceph target allowing to start/stop all ceph-mgr@.service instances at once
  ceph-mon.target                                                                                  loaded active     active    ceph target allowing to start/stop all ceph-mon@.service instances at once
  ceph-osd.target                                                                                  loaded active     active    ceph target allowing to start/stop all ceph-osd@.service instances at once
  ceph.target                                                                                      loaded active     active    ceph target allowing to start/stop all ceph*@.service instances at once
root@r730:~#
 
Just delete that monitor from the other nodes, and start clean.
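(For reference, with a healthy quorum the usual way to drop a dead monitor is something like the following; the mon name nuc10 is assumed from the node name above.)

Code:
# only works while the remaining monitors still have quorum
ceph mon remove nuc10
# or, on a Proxmox node:
pveceph mon destroy nuc10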
The issue is that `ceph` commands are not working on the other hosts either; they simply hang.

Code:
root@r730:~# ceph auth get mon
^CCluster connection aborted
root@r730:~# ceph mon remove
^CCluster connection aborted
root@r730:~#

Do you have a suggestion on how to go forward? The UI also gives a 500.
 
>Are the backups working? Do backups of all VMs/CTs if they are.

I do have backups from last week. I was dumb: trying to save electricity, I put my PBS VM on Ceph too. /facepalm. For now, I'm setting up a new server to act as PBS so I can see whether I'll be able to restore from the backups.

>Does your cluster not work fine when you turn off node 4?

I'm scared to turn it off, thinking the VMs won't come back up. I'm currently migrating VMs off this node so I can issue a restart.

Until then, if there are more ideas, please do let me know. All VMs are functional and Ceph itself is functional; it's just that the UI is broken and Proxmox isn't aware of these daemons.
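A quick diagnostic sketch that may help here: the surviving monitors can be queried over their local admin sockets even while the networked `ceph` CLI hangs (this assumes the mon id matches the node's short hostname; adjust if not).

Code:
# run on a node that still hosts a monitor; the admin socket bypasses the network
ceph daemon mon.$(hostname -s) mon_status
ceph daemon mon.$(hostname -s) quorum_status
# check whether the monitor is listening at all
ss -tlnp | grep -E '3300|6789'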
 
...

Was it your ONLY monitor?! Post the contents of your /etc/pve/ceph.conf.

No, I had 3-4 monitors. Also, `/var/lib/ceph/mon/` is somehow empty on the other nodes too :/

Code:
root@r730:~# cat  /etc/pve/ceph.conf
[global]
        auth_client_required = cephx
        auth_cluster_required = cephx
        auth_service_required = cephx
        cluster_network = 192.168.1.3/24
        fsid = c3c25528-cbda-4f9b-a805-583d16b93e8f
        mon_allow_pool_delete = true
        mon_host = 192.168.1.4 192.168.1.6 192.168.1.7 192.168.1.8
        ms_bind_ipv4 = true
        ms_bind_ipv6 = false
        osd_pool_default_min_size = 2
        osd_pool_default_size = 3
        public_network = 192.168.1.3/24

[client]
        keyring = /etc/pve/priv/$cluster.$name.keyring

[client.crash]
        keyring = /etc/pve/ceph/$cluster.$name.keyring

[mds]
        keyring = /var/lib/ceph/mds/ceph-$id/keyring

[mds.beelink-dualnic]
        host = beelink-dualnic
        mds_standby_for_name = pve

[mds.hp800g9-1]
        host = hp800g9-1
        mds_standby_for_name = pve

[mds.nuc10]
        host = nuc10
        mds_standby_for_name = pve

[mon.beelink-dualnic]
        public_addr = 192.168.1.6

[mon.hp800g9-1]
        public_addr = 192.168.1.8

[mon.nuc10]
        public_addr = 192.168.1.4
 
Looks like you should still have 2 living monitors, so this should be recoverable. The admonition to back everything up is a good one.

Once you have that done, turn EVERYTHING off and then turn on all nodes EXCEPT nuc10. Your cluster should be back.
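(If the cluster still refuses to form quorum after that restart, the usual fallback is to strip the dead monitor out of the monmap on a surviving mon. A rough sketch, using the mon id beelink-dualnic from the ceph.conf above; repeat on each surviving monitor.)

Code:
# on a node with a surviving monitor, with that monitor stopped
systemctl stop ceph-mon@beelink-dualnic
ceph-mon -i beelink-dualnic --extract-monmap /tmp/monmap
monmaptool /tmp/monmap --print
monmaptool /tmp/monmap --rm nuc10
ceph-mon -i beelink-dualnic --inject-monmap /tmp/monmap
systemctl start ceph-mon@beelink-dualnic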
 
Hey guys, today is a sad day. When I rebooted, I wasn't able to recover the system; Ceph was still in a bad state. I ended up reinstalling Proxmox and then using PBS to recover the VMs, but many VMs had their "chunk data" corrupted :/

Thanks again, and I'll go ahead and close this.
 
Hi @ness1602 and @alexskysilk, I've posted a new thread asking for help on how to recover from the existing OSDs. I'd really appreciate some direction.

 
Thank you for the response, Alex. That thread is pending approval :/ Pasting it here, since that data is important to me.


This 3-node cluster also had a 4th node, which didn't have any OSDs assigned, and I have ceph.conf and ceph.client.admin.keyring available from that node (r730).

Now I have 3 Proxmox nodes reinstalled as a brand-new cluster, and I want to revive the Ceph cluster from the existing OSDs.

Overall goal: how can I recover just the VM images? That way, I can start them up as new VMs. For recovery, I'm open to adding the "r730" node again if it simplifies things.
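(For context, once a monitor quorum and the OSDs are back, pulling individual VM disks out is typically just an RBD export; the pool, image, and VMID below are placeholders.)

Code:
# list the images in the pool, then dump one disk to a raw file
rbd -p <pool> ls
rbd export <pool>/vm-101-disk-0 /mnt/backup/vm-101-disk-0.raw
# attach the exported disk to a freshly created VM on the new cluster
qm importdisk 101 /mnt/backup/vm-101-disk-0.raw <target-storage>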

To confirm that the OSDs still exist, I've run the following commands, so far on one node only, and the output hints that recovery may be possible.

Code:
root@hp800g9-1:~# sudo ceph-volume lvm activate --all
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/ceph-authtool --gen-print-key
--> Activating OSD ID 0 FSID 8df70b91-28bf-4a7c-96c4-51f1e63d2e03
Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0
Running command: /usr/bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-a7873caa-1ef2-4b84-acfb-53448242a9c8/osd-block-8df70b91-28bf-4a7c-96c4-51f1e63d2e03 --path /var/lib/ceph/osd/ceph-0 --no-mon-config
Running command: /usr/bin/ln -snf /dev/ceph-a7873caa-1ef2-4b84-acfb-53448242a9c8/osd-block-8df70b91-28bf-4a7c-96c4-51f1e63d2e03 /var/lib/ceph/osd/ceph-0/block
Running command: /usr/bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-0/block
Running command: /usr/bin/chown -R ceph:ceph /dev/dm-0
Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0
Running command: /usr/bin/systemctl enable ceph-volume@lvm-0-8df70b91-28bf-4a7c-96c4-51f1e63d2e03
Running command: /usr/bin/systemctl enable --runtime ceph-osd@0
Running command: /usr/bin/systemctl start ceph-osd@0
--> ceph-volume lvm activate successful for osd ID: 0
root@hp800g9-1:~#

root@hp800g9-1:~# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --op update-mon-db --mon-store-path /mnt/osd-0/ --no-mon-config
osd.0   : 5593 osdmaps trimmed, 0 osdmaps added.
root@hp800g9-1:~#

How do we proceed?
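For reference, the update-mon-db run above is the first half of the "recovery using OSDs" procedure in the Ceph docs. A rough sketch of the remaining steps, assuming the collected store lives in /mnt/osd-0, the old admin keyring is at /etc/pve/priv/ceph.client.admin.keyring, and the rebuilt store goes to mon.hp800g9-1; note the result describes the old cluster (fsid c3c25528...), not the freshly created one shown below.

Code:
# 1. repeat the update-mon-db step for every OSD on every host, feeding the same --mon-store-path
# 2. rebuild a monitor store from the collected maps (the keyring must contain the admin key)
ceph-monstore-tool /mnt/osd-0 rebuild -- --keyring /etc/pve/priv/ceph.client.admin.keyring
# 3. back up the monitor's current store and swap in the rebuilt one
mv /var/lib/ceph/mon/ceph-hp800g9-1/store.db /var/lib/ceph/mon/ceph-hp800g9-1/store.db.bak
cp -r /mnt/osd-0/store.db /var/lib/ceph/mon/ceph-hp800g9-1/store.db
chown -R ceph:ceph /var/lib/ceph/mon/ceph-hp800g9-1/store.db
systemctl restart ceph-mon@hp800g9-1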

This is the cluster state right now. I've only installed ceph packages so far, nothing else.

Ceph Status:-
Code:
root@hp800g9-1:~# ceph -s
  cluster:
    id:     9c9daac0-736e-4dc1-8380-e6a3fa7d2c23
    health: HEALTH_WARN
            OSD count 0 < osd_pool_default_size 3

  services:
    mon: 1 daemons, quorum hp800g9-1 (age 17h)
    mgr: hp800g9-1(active, since 17h)
    osd: 0 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:

root@hp800g9-1:~#

Nodes:-
Code:
root@hp800g9-1:~# pvecm nodes

Membership information
----------------------
    Nodeid      Votes Name
         1          1 hp800g9-1 (local)
         2          1 intelnuc10
         3          1 beelinku59pro
root@hp800g9-1:~#
 


Thanks. For now, I'm not even sure how to mount the OSDs from those instructions (a rough sketch follows the listing below); I'll keep digging. Thank you @alexskysilk
Code:
root@hp800g9-1:~# ceph-volume lvm list


====== osd.0 =======

  [block]       /dev/ceph-a7873caa-1ef2-4b84-acfb-53448242a9c8/osd-block-8df70b91-28bf-4a7c-96c4-51f1e63d2e03

      block device              /dev/ceph-a7873caa-1ef2-4b84-acfb-53448242a9c8/osd-block-8df70b91-28bf-4a7c-96c4-51f1e63d2e03
      block uuid                s7LJFW-5jYi-TFEj-w9hS-5ep5-jOLy-ZibL8t
      cephx lockbox secret
      cluster fsid              c3c25528-cbda-4f9b-a805-583d16b93e8f
      cluster name              ceph
      crush device class
      encrypted                 0
      osd fsid                  8df70b91-28bf-4a7c-96c4-51f1e63d2e03
      osd id                    0
      osdspec affinity
      type                      block
      vdo                       0
      devices                   /dev/nvme1n1
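On the mounting question: a BlueStore OSD cannot be mounted like a regular filesystem, but ceph-objectstore-tool can expose an offline OSD over FUSE for inspection. A rough sketch, assuming osd.0 from the listing above and an empty mountpoint at /mnt/osd-0-fuse:

Code:
# stop the OSD first, then expose its object store over FUSE
systemctl stop ceph-osd@0
mkdir -p /mnt/osd-0-fuse
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --op fuse --mountpoint /mnt/osd-0-fuse --no-mon-config
# browse the objects under /mnt/osd-0-fuse, then unmount with: umount /mnt/osd-0-fuse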