[SOLVED] Newly released Ceph Luminous 12.2.8 on "debian stretch pve-no-subscription" cannot activate OSDs

judexzhu

Hi everyone,

I did a fresh install of Proxmox + Ceph with Ceph Luminous 12.2.8 three times today. (I believe it was released only recently.)

Installing Proxmox, creating and joining the cluster, pveceph install, and pveceph createmon all seemed successful.

But whatever I did, I could not bring the OSDs up and in with "pveceph createosd /dev/sdX". It failed all three times.
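
For reference, a minimal sketch of the commands involved, assuming a plain PVE 5 setup (the device name is an example; cluster creation and joining were done beforehand):

Code:
# rough per-node sequence; /dev/sdb is an example device
pveceph install              # install the Ceph packages
pveceph createmon            # create a monitor on each node
pveceph createosd /dev/sdb   # this is the step that fails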

The OSD status always shows down and out.

But Ceph itself reports healthy:

Code:
root@sspve01:/etc/pve# ceph -v
ceph version 12.2.8 (6f01265ca03a6b9d7f3b7f759d8894bb9dbb6840) luminous (stable)

Code:
root@sspve01:/etc/pve# ceph -s
  cluster:
    id:     9ed02f4c-f2d2-452a-a8a2-868c7456d691
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum sspve01,sspve02,sspve03
    mgr: sspve01(active), standbys: sspve02, sspve03
    osd: 3 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0B
    usage:   0B used, 0B / 0B avail
    pgs:

Code:
root@sspve01:/etc/pve# ceph osd tree
ID CLASS WEIGHT TYPE NAME    STATUS REWEIGHT PRI-AFF
-1            0 root default
 0            0 osd.0          down        0 1.00000
 1            0 osd.1          down        0 1.00000
 2            0 osd.2          down        0 1.00000

Code:
root@sspve01:/etc/pve# ceph osd status
+----+------+-------+-------+--------+---------+--------+---------+------------+
| id | host |  used | avail | wr ops | wr data | rd ops | rd data |   state    |
+----+------+-------+-------+--------+---------+--------+---------+------------+
| 0  |      |    0  |    0  |    0   |     0   |    0   |     0   | exists,new |
| 1  |      |    0  |    0  |    0   |     0   |    0   |     0   | exists,new |
| 2  |      |    0  |    0  |    0   |     0   |    0   |     0   | exists,new |
+----+------+-------+-------+--------+---------+--------+---------+------------+

Code:
root@sspve01:/etc/pve# systemctl status ceph-osd@0
● ceph-osd@0.service - Ceph object storage daemon osd.0
   Loaded: loaded (/lib/systemd/system/ceph-osd@.service; disabled; vendor preset: enabled)
  Drop-In: /lib/systemd/system/ceph-osd@.service.d
           └─ceph-after-pve-cluster.conf
   Active: failed (Result: exit-code) since Thu 2018-09-13 20:05:20 CST; 54min ago
  Process: 11905 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id 0 (code=exited, status=1/FAILURE)

Sep 13 20:04:59 sspve01 systemd[1]: ceph-osd@0.service: Unit entered failed state.
Sep 13 20:04:59 sspve01 systemd[1]: ceph-osd@0.service: Failed with result 'exit-code'.
Sep 13 20:05:20 sspve01 systemd[1]: ceph-osd@0.service: Service hold-off time over, scheduling restart.
Sep 13 20:05:20 sspve01 systemd[1]: Stopped Ceph object storage daemon osd.0.
Sep 13 20:05:20 sspve01 systemd[1]: ceph-osd@0.service: Start request repeated too quickly.
Sep 13 20:05:20 sspve01 systemd[1]: Failed to start Ceph object storage daemon osd.0.
Sep 13 20:05:20 sspve01 systemd[1]: ceph-osd@0.service: Unit entered failed state.
Sep 13 20:05:20 sspve01 systemd[1]: ceph-osd@0.service: Failed with result 'exit-code'.


Code:
root@sspve01:/etc/pve# tail -f /var/log/ceph/ceph-osd.0.log
2018-09-13 19:47:28.329246 7fce30773e00  1 bluestore(/var/lib/ceph/tmp/mnt.bZfwdD) mkfs already created
2018-09-13 19:47:28.329257 7fce30773e00  1 bluestore(/var/lib/ceph/tmp/mnt.bZfwdD) _fsck repair (shallow) start
2018-09-13 19:47:28.329316 7fce30773e00  1 bdev create path /var/lib/ceph/tmp/mnt.bZfwdD/block type kernel
2018-09-13 19:47:28.329327 7fce30773e00  1 bdev(0x563691f12b40 /var/lib/ceph/tmp/mnt.bZfwdD/block) open path /var/lib/ceph/tmp/mnt.bZfwdD/block
2018-09-13 19:47:28.329620 7fce30773e00  1 bdev(0x563691f12b40 /var/lib/ceph/tmp/mnt.bZfwdD/block) open size 1199532126208 (0x11749afb000, 1.09TiB) block_size 4096 (4KiB) rotational
2018-09-13 19:47:28.329729 7fce30773e00 -1 bluestore(/var/lib/ceph/tmp/mnt.bZfwdD/block) _check_or_set_bdev_label bdev /var/lib/ceph/tmp/mnt.bZfwdD/block fsid c4e63d4c-b31e-4a33-8a7a-fb791d420709 does not match our fsid a0ee6c99-ffb0-47a8-8d2a-0b5945e88048
2018-09-13 19:47:28.329746 7fce30773e00  1 bdev(0x563691f12b40 /var/lib/ceph/tmp/mnt.bZfwdD/block) close
2018-09-13 19:47:28.628763 7fce30773e00 -1 bluestore(/var/lib/ceph/tmp/mnt.bZfwdD) mkfs fsck found fatal error: (5) Input/output error
2018-09-13 19:47:28.628810 7fce30773e00 -1 OSD::mkfs: ObjectStore::mkfs failed with error (5) Input/output error
2018-09-13 19:47:28.628914 7fce30773e00 -1  ** ERROR: error creating empty object store in /var/lib/ceph/tmp/mnt.bZfwdD: (5) Input/output error

It looks like it cannot create files in the /var/lib/ceph/osd folder:

Code:
-- Unit ceph-osd@0.service has begun starting up.
Sep 13 21:01:34 sspve01 ceph-osd-prestart.sh[16446]: OSD data directory /var/lib/ceph/osd/ceph-0 does not exist; bailing out.
Sep 13 21:01:34 sspve01 systemd[1]: ceph-osd@0.service: Control process exited, code=exited status=1
Sep 13 21:01:34 sspve01 systemd[1]: Failed to start Ceph object storage daemon osd.0.
-- Subject: Unit ceph-osd@0.service has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Unit ceph-osd@0.service has failed.
--
-- The result is failed.
Sep 13 21:01:34 sspve01 systemd[1]: ceph-osd@0.service: Unit entered failed state.
Sep 13 21:01:34 sspve01 systemd[1]: ceph-osd@0.service: Failed with result 'exit-code'.

The /var/lib/ceph/osd directory is indeed empty, but the permissions seem okay.

Code:
root@sspve01:/etc/pve# ls -al /var/lib/ceph/osd/
total 8
drwxr-xr-x  2 ceph ceph 4096 Sep  5 16:18 .
drwxr-x--- 11 ceph ceph 4096 Sep 13 19:38 ..

Has anyone run into the same situation?

Is this a bug? If so, when can it be fixed? I have installed Proxmox and Ceph many times before; this is the first time I failed to bring the Ceph OSDs up and in.

Thanks in advance for any ideas or suggestions. If you need me to provide more information, I would be glad to.

Thanks again.
 
Please post your:

> ceph versions
 
Sorry, here it is.

Code:
root@sspve01:/etc/pve# ceph versions
{
    "mon": {
        "ceph version 12.2.8 (6f01265ca03a6b9d7f3b7f759d8894bb9dbb6840) luminous (stable)": 3
    },
    "mgr": {
        "ceph version 12.2.8 (6f01265ca03a6b9d7f3b7f759d8894bb9dbb6840) luminous (stable)": 3
    },
    "osd": {},
    "mds": {},
    "overall": {
        "ceph version 12.2.8 (6f01265ca03a6b9d7f3b7f759d8894bb9dbb6840) luminous (stable)": 6
    }
}
 
Try to overwrite the first ~200MB of the disk with dd, especially if it was used before, as there may be some leftovers.
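
The "_check_or_set_bdev_label ... does not match our fsid" line in your OSD log points to a stale bluestore label left over from a previous deployment. If you want to confirm that before wiping, something along these lines should dump the old label (the partition name is a guess for the ceph-disk "block" partition; adjust to your actual layout):

Code:
# show the on-disk bluestore label; /dev/sdb2 is an assumed
# ceph-disk "block" partition -- adjust to your actual layout
ceph-bluestore-tool show-label --dev /dev/sdb2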
 
@Alwin Sorry for the late reply.

You're awesome!

Code:
dd if=/dev/zero of=/dev/sdb bs=200M count=1

fixed my problem.

Thank you so much.
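
For anyone who lands here later: zeroing the first 200 MB clears the primary partition table and the stale bluestore label at the start of the disk, but the GPT backup header at the end of the disk survives. Something like the following may be a more thorough wipe (untested here; adjust the device name):

Code:
# more thorough alternatives to plain dd
ceph-disk zap /dev/sdb       # ships with Luminous' ceph-disk
# or, with gdisk installed:
sgdisk --zap-all /dev/sdb    # wipes primary and backup GPT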

PS: Is there any way I can mark this thread as "Solved"?
 