[SOLVED] The newly released Ceph Luminous 12.2.8 on "debian stretch pve-no-subscription" cannot activate OSDs

judexzhu
Hi everyone,

I did a fresh install of Proxmox + Ceph with Ceph Luminous 12.2.8 today, three times. (I believe it was just released recently.)

Installing Proxmox, creating and joining the cluster, pveceph install, and pveceph createmon all seemed to succeed.

But whatever I did, I could not bring the OSDs up and in with "pveceph createosd /dev/sdX". It failed all three times.

The OSD status always shows down and out.

But Ceph itself reports HEALTH_OK.

Code:
root@sspve01:/etc/pve# ceph -v
ceph version 12.2.8 (6f01265ca03a6b9d7f3b7f759d8894bb9dbb6840) luminous (stable)

Code:
root@sspve01:/etc/pve# ceph -s
  cluster:
    id:     9ed02f4c-f2d2-452a-a8a2-868c7456d691
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum sspve01,sspve02,sspve03
    mgr: sspve01(active), standbys: sspve02, sspve03
    osd: 3 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0B
    usage:   0B used, 0B / 0B avail
    pgs:

Code:
root@sspve01:/etc/pve# ceph osd tree
ID CLASS WEIGHT TYPE NAME    STATUS REWEIGHT PRI-AFF
-1            0 root default
 0            0 osd.0          down        0 1.00000
 1            0 osd.1          down        0 1.00000
 2            0 osd.2          down        0 1.00000

Code:
root@sspve01:/etc/pve# ceph osd status
+----+------+-------+-------+--------+---------+--------+---------+------------+
| id | host |  used | avail | wr ops | wr data | rd ops | rd data |   state    |
+----+------+-------+-------+--------+---------+--------+---------+------------+
| 0  |      |    0  |    0  |    0   |     0   |    0   |     0   | exists,new |
| 1  |      |    0  |    0  |    0   |     0   |    0   |     0   | exists,new |
| 2  |      |    0  |    0  |    0   |     0   |    0   |     0   | exists,new |
+----+------+-------+-------+--------+---------+--------+---------+------------+

Code:
root@sspve01:/etc/pve# systemctl status ceph-osd@0
● ceph-osd@0.service - Ceph object storage daemon osd.0
   Loaded: loaded (/lib/systemd/system/ceph-osd@.service; disabled; vendor preset: enabled)
  Drop-In: /lib/systemd/system/ceph-osd@.service.d
           └─ceph-after-pve-cluster.conf
   Active: failed (Result: exit-code) since Thu 2018-09-13 20:05:20 CST; 54min ago
  Process: 11905 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id 0 (code=exited, status=1/FAILURE)

Sep 13 20:04:59 sspve01 systemd[1]: ceph-osd@0.service: Unit entered failed state.
Sep 13 20:04:59 sspve01 systemd[1]: ceph-osd@0.service: Failed with result 'exit-code'.
Sep 13 20:05:20 sspve01 systemd[1]: ceph-osd@0.service: Service hold-off time over, scheduling restart.
Sep 13 20:05:20 sspve01 systemd[1]: Stopped Ceph object storage daemon osd.0.
Sep 13 20:05:20 sspve01 systemd[1]: ceph-osd@0.service: Start request repeated too quickly.
Sep 13 20:05:20 sspve01 systemd[1]: Failed to start Ceph object storage daemon osd.0.
Sep 13 20:05:20 sspve01 systemd[1]: ceph-osd@0.service: Unit entered failed state.
Sep 13 20:05:20 sspve01 systemd[1]: ceph-osd@0.service: Failed with result 'exit-code'.


Code:
root@sspve01:/etc/pve# tail -f /var/log/ceph/ceph-osd.0.log
2018-09-13 19:47:28.329246 7fce30773e00  1 bluestore(/var/lib/ceph/tmp/mnt.bZfwdD) mkfs already created
2018-09-13 19:47:28.329257 7fce30773e00  1 bluestore(/var/lib/ceph/tmp/mnt.bZfwdD) _fsck repair (shallow) start
2018-09-13 19:47:28.329316 7fce30773e00  1 bdev create path /var/lib/ceph/tmp/mnt.bZfwdD/block type kernel
2018-09-13 19:47:28.329327 7fce30773e00  1 bdev(0x563691f12b40 /var/lib/ceph/tmp/mnt.bZfwdD/block) open path /var/lib/ceph/tmp/mnt.bZfwdD/block
2018-09-13 19:47:28.329620 7fce30773e00  1 bdev(0x563691f12b40 /var/lib/ceph/tmp/mnt.bZfwdD/block) open size 1199532126208 (0x11749afb000, 1.09TiB) block_size 4096 (4KiB) rotational
2018-09-13 19:47:28.329729 7fce30773e00 -1 bluestore(/var/lib/ceph/tmp/mnt.bZfwdD/block) _check_or_set_bdev_label bdev /var/lib/ceph/tmp/mnt.bZfwdD/block fsid c4e63d4c-b31e-4a33-8a7a-fb791d420709 does not match our fsid a0ee6c99-ffb0-47a8-8d2a-0b5945e88048
2018-09-13 19:47:28.329746 7fce30773e00  1 bdev(0x563691f12b40 /var/lib/ceph/tmp/mnt.bZfwdD/block) close
2018-09-13 19:47:28.628763 7fce30773e00 -1 bluestore(/var/lib/ceph/tmp/mnt.bZfwdD) mkfs fsck found fatal error: (5) Input/output error
2018-09-13 19:47:28.628810 7fce30773e00 -1 OSD::mkfs: ObjectStore::mkfs failed with error (5) Input/output error
2018-09-13 19:47:28.628914 7fce30773e00 -1  ** ERROR: error creating empty object store in /var/lib/ceph/tmp/mnt.bZfwdD: (5) Input/output error
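
The fsid mismatch in the log ("_check_or_set_bdev_label ... does not match our fsid") suggests there is leftover BlueStore metadata on the disk from an earlier attempt. If the block partition ended up as /dev/sdb2 (just an assumption on my part, the actual device may differ), the stale label could be checked with something like:

Code:
ceph-bluestore-tool show-label --dev /dev/sdb2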

It looks like it cannot create files in the /var/lib/ceph/osd folder:

Code:
-- Unit ceph-osd@0.service has begun starting up.
Sep 13 21:01:34 sspve01 ceph-osd-prestart.sh[16446]: OSD data directory /var/lib/ceph/osd/ceph-0 does not exist; bailing out.
Sep 13 21:01:34 sspve01 systemd[1]: ceph-osd@0.service: Control process exited, code=exited status=1
Sep 13 21:01:34 sspve01 systemd[1]: Failed to start Ceph object storage daemon osd.0.
-- Subject: Unit ceph-osd@0.service has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Unit ceph-osd@0.service has failed.
--
-- The result is failed.
Sep 13 21:01:34 sspve01 systemd[1]: ceph-osd@0.service: Unit entered failed state.
Sep 13 21:01:34 sspve01 systemd[1]: ceph-osd@0.service: Failed with result 'exit-code'.

The /var/lib/ceph/osd directory is empty, but the permissions seem okay.

Code:
root@sspve01:/etc/pve# ls -al /var/lib/ceph/osd/
total 8
drwxr-xr-x  2 ceph ceph 4096 Sep  5 16:18 .
drwxr-x--- 11 ceph ceph 4096 Sep 13 19:38 ..
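
If it helps with diagnosis, I can also post what ceph-disk and lsblk report for the disks (assuming pveceph createosd goes through ceph-disk on this version, which is my understanding), e.g.:

Code:
ceph-disk list
lsblk /dev/sdb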

Has anyone run into the same situation?

Is this a bug? If so, when can it be fixed? I have installed Proxmox and Ceph many times before, and this is the first time I have failed to bring the Ceph OSDs up and in.

Thanks in advance for any ideas or suggestions. If you need me to provide more information, I would be happy to.

Thanks again.
 
Please post your:

> ceph versions
 
Sorry, here it is.

Code:
root@sspve01:/etc/pve# ceph versions
{
    "mon": {
        "ceph version 12.2.8 (6f01265ca03a6b9d7f3b7f759d8894bb9dbb6840) luminous (stable)": 3
    },
    "mgr": {
        "ceph version 12.2.8 (6f01265ca03a6b9d7f3b7f759d8894bb9dbb6840) luminous (stable)": 3
    },
    "osd": {},
    "mds": {},
    "overall": {
        "ceph version 12.2.8 (6f01265ca03a6b9d7f3b7f759d8894bb9dbb6840) luminous (stable)": 6
    }
}
 
Try to overwrite the first ~200MB of the disk with dd, especially if it was used before, as there may be some leftovers.
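
For example (assuming the OSD disk is /dev/sdb; double-check the device name first, as this destroys whatever is on it):

Code:
# wipe the first ~200MB of the disk
dd if=/dev/zero of=/dev/sdb bs=1M count=200
# or clear the partition table and Ceph leftovers in one step
ceph-disk zap /dev/sdb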
 
@Alwin Sorry for the late reply.

You're awesome!

Code:
dd if=/dev/zero of=/dev/sdb bs=200M count=1

fixed my problem.

Thank you so much.

PS: Is there any way I can mark this thread as "Solved"?
 