[SOLVED] Ceph OSD recovery

starblazer

New Member
May 17, 2020
Hi! So, I lost two server boot drives and I need to recreate my cluster and get Ceph started again.

I apparently need to recreate the /var/lib/ceph/osd/ceph-* directories and get them mounted... however, for the life of me (and Google) I cannot figure out how to get them mounted.

I see the LVs/VGs.
I activate them...
They still don't mount.

If I didn't have reduced data availability, I'd just blow the other OSDs away and recreate.

Ideas?
 
Hi! So, I lost two server boot drives and I need to recreate my cluster and get Ceph started again.
From the same node? And did you have more than one MON? What does ceph -s & ceph osd tree show?

I apparently need to recreate the /var/lib/ceph/osd/ceph-* directories and get them mounted... however, for the life of me (and Google) I cannot figure out how to get them mounted.
If the Ceph packages are installed, the base directories will be created.
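As a first check, ceph-volume should still be able to see the OSDs on the LVM volumes; something like this lists them (the output is per node, your IDs and devices will differ):
Code:
# list all OSDs that ceph-volume can find on this node's LVM volumes
ceph-volume lvm list

# the underlying volume groups / logical volumes
vgs
lvs

If the OSDs show up there, the data is still intact and only the activation step (tmpfs mount plus systemd unit) is missing.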
 
Two separate servers in a three-server cluster.

From the only machine that lived:
Code:
root@supermicro:~# ceph -s
  cluster:
    id:     d62464d5-4e1f-4167-8177-c82896881270
    health: HEALTH_WARN
            1 filesystem is degraded
            insufficient standby MDS daemons available
            1 MDSs report slow metadata IOs
            7 osds down
            1 host (7 osds) down
            2 pool(s) have non-power-of-two pg_num
            Reduced data availability: 250 pgs inactive
            Degraded data redundancy: 3481906/5222859 objects degraded (66.667%), 215 pgs degraded, 250 pgs undersized
            73 pgs not deep-scrubbed in time
            1 daemons have recently crashed
            too few PGs per OSD (20 < min 30)

  services:
    mon: 2 daemons, quorum supermicro,pve2 (age 92m)
    mgr: supermicro(active, since 92m), standbys: pve2
    mds: cephfs:1/1 {0=supermicro=up:replay}
    osd: 18 osds: 5 up (since 96m), 12 in (since 46m)

  data:
    pools:   2 pools, 250 pgs
    objects: 1.74M objects, 6.6 TiB
    usage:   6.6 TiB used, 21 TiB / 27 TiB avail
    pgs:     100.000% pgs not active
             3481906/5222859 objects degraded (66.667%)
             215 undersized+degraded+peered
             35  undersized+peered

root@supermicro:~#

Code:
root@supermicro:~# ceph osd tree
ID CLASS WEIGHT   TYPE NAME           STATUS REWEIGHT PRI-AFF
-1       62.76027 root default
-3       19.09940     host pve2
 0   hdd  2.72849         osd.0         down  1.00000 1.00000
 1   hdd  2.72849         osd.1         down  1.00000 1.00000
 2   hdd  2.72849         osd.2         down  1.00000 1.00000
 3   hdd  2.72849         osd.3         down  1.00000 1.00000
 9   hdd  2.72849         osd.9         down  1.00000 1.00000
10   hdd  2.72849         osd.10        down  1.00000 1.00000
11   hdd  2.72849         osd.11        down  1.00000 1.00000
-7       16.37091     host pve3
12   hdd  2.72849         osd.12        down        0 1.00000
13   hdd  2.72849         osd.13        down        0 1.00000
14   hdd  2.72849         osd.14        down        0 1.00000
15   hdd  2.72849         osd.15        down        0 1.00000
16   hdd  2.72849         osd.16        down        0 1.00000
17   hdd  2.72849         osd.17        down        0 1.00000
-5       27.28996     host supermicro
 4   hdd  5.45799         osd.4           up  1.00000 1.00000
 5   hdd  5.45799         osd.5           up  1.00000 1.00000
 6   hdd  5.45799         osd.6           up  1.00000 1.00000
 7   hdd  5.45799         osd.7           up  1.00000 1.00000
 8   hdd  5.45799         osd.8           up  1.00000 1.00000
root@supermicro:~#

If the Ceph packages are installed, the base directories will be created.
The base directory is created, but none of the per-OSD directories (/var/lib/ceph/osd/ceph-*) were created from the disks.
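For what it's worth, with BlueStore/LVM OSDs those ceph-* directories are only small tmpfs mounts that get created when an OSD is activated, so something along these lines shows whether anything has been mounted at all (standard paths, nothing node-specific):
Code:
# empty output here just means the OSDs have not been activated yet
mount | grep /var/lib/ceph/osd
ls /var/lib/ceph/osd/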
 
Okay, after the 123rd time switching up my Google search terms... "mount ceph lvm recovery" isn't useful, lol.

Code:
ceph-volume lvm activate --all

fixed it.
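After that, the OSDs came back; roughly these commands can be used to double-check (nothing here is specific to my setup):
Code:
# the OSD daemons should be running again
systemctl list-units 'ceph-osd@*'

# and the cluster should start peering/recovering
ceph osd tree
ceph -s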
 
A reboot should do this as well.
Well, that didn't happen on either of the two servers, both with a fresh install of 6.2.

I really should test to make sure everything comes up after a reboot. I'm sure it will now that the OSDs have been activated on the new machines.
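If it helps anyone later: as far as I understand it, ceph-volume lvm activate also enables a per-OSD systemd unit (named after the OSD id and fsid), and that unit is what recreates the tmpfs mount and starts the OSD on boot. A rough way to check that the units got enabled (default path on a Debian/Proxmox install):
Code:
# one enabled ceph-volume unit per activated OSD should show up here
ls /etc/systemd/system/multi-user.target.wants/ | grep ceph-volume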
 
Okay, after the 123rd time switching up my Google search terms... "mount ceph lvm recovery" isn't useful, lol.

Code:
ceph-volume lvm activate --all

fixed it.
Like you, I found that rebooting was not enough; only this command brought the OSDs back! Thanks a million, I had been looking for a way to start the Ceph OSDs for three days!