[SOLVED] Problem with first OSD after upgrade and restart to "Octopus" (version 15)

hape

Renowned Member
Jun 10, 2013
Hello all,

As recommended, I've updated Ceph and the first OSD of a 3-node cluster on a PVE 6.x installation. I want to upgrade to PVE 7 and just want to upgrade Ceph first. After executing the command

Code:
systemctl restart ceph-osd.target

as described in the wiki, the OSD has now been shown as down for about 20 hours.
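For reference, which OSD is down and whether its daemon is still running can be checked with something like the following (osd.0 is just a placeholder for the affected OSD ID):

Code:
# show the OSD tree and which OSDs are marked down
ceph osd tree
# check whether the OSD daemon process is still active
systemctl status ceph-osd@0.service
# look at recent log output of that daemon
journalctl -u ceph-osd@0.service --since "1 hour ago"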

The command "ceph status" currently shows the following:

Code:
root@pmx-xx-02:~# ceph status
  cluster:
    id:     b72d333f-88b6-4d41-aff0-d286a62b3c56
    health: HEALTH_WARN
            clock skew detected on mon.2
            noout flag(s) set
            1 osds down
            1 host (1 osds) down
            Degraded data redundancy: 171244/513732 objects degraded (33.333%), 65 pgs degraded, 65 pgs undersized
 
  services:
    mon: 3 daemons, quorum 0,1,2 (age 10h)
    mgr: pmx-mm-03(active, since 20h), standbys: pmx-mm-01, pmx-mm-02
    osd: 3 osds: 2 up, 3 in
         flags noout
 
  data:
    pools:   2 pools, 65 pgs
    objects: 171.24k objects, 665 GiB
    usage:   1.9 TiB used, 2.9 TiB / 4.9 TiB avail
    pgs:     171244/513732 objects degraded (33.333%)
             65 active+undersized+degraded
 
  io:
    client:   2.9 MiB/s rd, 27 KiB/s wr, 24 op/s rd, 2 op/s wr

Is it possible to see the state of the conversion process, or has that process died?
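A rough way to check whether the omap conversion is still making progress, assuming osd.0 stands in for the down OSD and the default log location is used, would be something like:

Code:
# follow the OSD's own log for conversion/startup messages
tail -f /var/log/ceph/ceph-osd.0.log
# query the daemon via its admin socket (only answers if the process is still alive)
ceph daemon osd.0 status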
 
Hello,

the status has not improved.

Do I have to remove the OSD and add it as a new one?

Any idea what I should do?
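If removal turns out to be necessary, a rough sketch of the usual sequence on PVE would look like this (the OSD ID 0 is a placeholder for the affected OSD; the --cleanup flag is meant to wipe the disk afterwards):

Code:
# mark the OSD out and stop its daemon
ceph osd out 0
systemctl stop ceph-osd@0.service
# remove it from the cluster and clean up the disk (Proxmox tooling)
pveceph osd destroy 0 --cleanup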
 
Hello all,

I got it. When the first OSD started converting to the new format, the process hung. I had to clean the physical disk of all LVM/Ceph information; after that, recreating the OSD was successful.
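For anyone hitting the same issue, cleaning a disk of leftover LVM/Ceph metadata and recreating the OSD would typically look something like this (/dev/sdX is a placeholder for the physical disk that backed the hung OSD):

Code:
# wipe all LVM/Ceph metadata from the disk (this destroys its data)
ceph-volume lvm zap /dev/sdX --destroy
# recreate the OSD on the cleaned disk (Proxmox tooling)
pveceph osd create /dev/sdX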

Btw, I reinstalled all 3 nodes from scratch with version 7.1 and restored all VMs into the new Ceph cluster.

Now everything looks fine and the Ceph storage is running well.