ceph create osd

cnu80

New Member
Apr 20, 2016
Hi, I am playing with Ceph and Proxmox.

I have installed it several times and documented the commands to get "up and running" fast in our IT environment.
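Roughly, the documented sequence looks like this (just a sketch; the network below is only an example based on our mon address range, adjust it to your setup):

pveceph install                          # install the Ceph packages on the node
pveceph init --network 10.255.150.0/24   # example network, use your own cluster network
pveceph createmon                        # create the first monitor
pveceph createosd /dev/nvme0n1           # first OSD, this one works
pveceph createosd /dev/sda               # second OSD, this is the one that fails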

But now I have the problem that one "create OSD" command does not work anymore.

I cannot create the second OSD on /dev/sda (/dev/sdb --> Proxmox):

Output from the ceph-disk and pveceph commands:
ZAP:
root@cmprox01:/# ceph-disk zap /dev/sda
Caution: invalid backup GPT header, but valid main header; regenerating
backup header from main header.

Warning! Main and backup partition tables differ! Use the 'c' and 'e' options
on the recovery & transformation menu to examine the two tables.

Warning! One or more CRCs don't match. You should repair the disk!

****************************************************************************
Caution: Found protective or hybrid MBR and corrupt GPT. Using GPT, but disk
verification and recovery are STRONGLY recommended.
****************************************************************************
GPT data structures destroyed! You may now partition the disk using fdisk or
other utilities.
Creating new GPT entries.
The operation has completed successfully.
CREATE:
root@cmprox01:/# pveceph createosd /dev/sda
create OSD on /dev/sda (xfs)
Caution: invalid backup GPT header, but valid main header; regenerating
backup header from main header.

****************************************************************************
Caution: Found protective or hybrid MBR and corrupt GPT. Using GPT, but disk
verification and recovery are STRONGLY recommended.
****************************************************************************
GPT data structures destroyed! You may now partition the disk using fdisk or
other utilities.
Creating new GPT entries.
The operation has completed successfully.
Setting name!
partNum is 1
REALLY setting name!
The operation has completed successfully.
Setting name!
partNum is 0
REALLY setting name!
The operation has completed successfully.
meta-data=/dev/sda1              isize=2048   agcount=4, agsize=121766917 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=487067665, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal log           bsize=4096   blocks=237826, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
The operation has completed successfully.
But the disk is not mounted and the OSD is not created. Any ideas how to debug this?
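These are the places I can think of looking so far (just a sketch, happy to post more output):

ceph-disk list                            # does /dev/sda1 show up as a prepared ceph data partition?
lsblk /dev/sda                            # partitions created by pveceph createosd
mount | grep /var/lib/ceph/osd            # is anything mounted for the new OSD?
tail -n 50 /var/log/ceph/ceph-osd.*.log   # any OSD startup errors? (only if an OSD log exists yet)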

An OSD on another disk (osd.0 on /dev/nvme0n1) works:

root@cmprox01:/# ceph -s
    cluster 4452a4bf-aa32-4b4b-927f-4923f7f9ac16
     health HEALTH_WARN
            64 pgs degraded
            64 pgs stuck degraded
            64 pgs stuck unclean
            64 pgs stuck undersized
            64 pgs undersized
     monmap e1: 1 mons at {0=10.255.150.131:6789/0}
            election epoch 2, quorum 0 0
     osdmap e5: 1 osds: 1 up, 1 in
      pgmap v9: 64 pgs, 1 pools, 0 bytes data, 0 objects
            33832 kB used, 233 GB / 233 GB avail
                  64 active+undersized+degraded
root@cmprox01:/#

root@cmprox01:/# ceph osd tree
ID WEIGHT  TYPE NAME          UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.23000 root default
-2 0.23000     host cmprox01
 0 0.23000         osd.0           up  1.00000          1.00000
root@cmprox01:/#


Maybe there is still old data on the disk from the previous installation? But I did use the "zap" command to erase the disk.
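If it is stale data, I assume something like this would wipe the disk more thoroughly than zap alone (destructive, and the device path is assumed to be the right one):

wipefs -a /dev/sda                             # remove old filesystem/GPT signatures
dd if=/dev/zero of=/dev/sda bs=1M count=200    # clear the start of the disk (destroys all data!)
ceph-disk zap /dev/sda                         # recreate a clean GPT
partprobe /dev/sda                             # make the kernel re-read the partition table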

thanks
 
Hi, osd.1 is up now.

But I had to mount the filesystem manually via
mount -t xfs /dev/sda1 /var/lib/ceph/osd/ceph-1/

and

ceph-disk activate /var/lib/ceph/osd/ceph-1
got monmap epoch 1
2016-04-28 14:47:40.976616 7f4c8ccdb880 -1 journal check: ondisk fsid 00000000-0000-0000-0000-000000000000 doesn't match expected 3a1b335e-57cf-4f65-9387-a1ebfd5ab312, invalid (someone else's?) journal
2016-04-28 14:47:41.227290 7f4c8ccdb880 -1 filestore(/var/lib/ceph/osd/ceph-1) could not find 23c2fcde/osd_superblock/0//-1 in index: (2) No such file or directory
2016-04-28 14:47:41.536567 7f4c8ccdb880 -1 created object store /var/lib/ceph/osd/ceph-1 journal /var/lib/ceph/osd/ceph-1/journal for osd.1 fsid 4452a4bf-aa32-4b4b-927f-4923f7f9ac16
2016-04-28 14:47:41.536604 7f4c8ccdb880 -1 auth: error reading file: /var/lib/ceph/osd/ceph-1/keyring: can't open /var/lib/ceph/osd/ceph-1/keyring: (2) No such file or directory
2016-04-28 14:47:41.541768 7f4c8ccdb880 -1 created new key in keyring /var/lib/ceph/osd/ceph-1/keyring
added key for osd.1
=== osd.1 ===
create-or-move updating item name 'osd.1' weight 1.81 at location {host=cmprox01,root=default} to crush map
Starting Ceph osd.1 on cmprox01...
Running as unit ceph-osd.1.1461847661.775779701.service.
root@cmprox01:~#
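To confirm osd.1 really joined, I just checked (nothing special, for completeness):

ceph osd tree                        # osd.1 should now show up under host cmprox01
df -h /var/lib/ceph/osd/ceph-1       # the xfs partition should be mounted here
ceph -s                              # the pgs should start recovering with two OSDs in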
 
This is a usual scenario when you are recreating an OSD with the same ID. Did you have osd.1 in the cluster before?
The zap command prepares the disk itself, but it does not remove the old Ceph OSD folder. When you remove an OSD, there are some steps that need to be followed, especially if you are doing it entirely through the CLI. The following is what I use (spelled out as one sequence below the steps):
1. Stop OSD: ceph osd down osd.1
2. Out OSD: ceph osd out osd.1
3. Remove OSD: ceph osd rm osd.1
4. Remove authentication: ceph auth del osd.1

In some cases I had to manually delete the old OSD folder in /var/lib/ceph/<osd folder>
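Put together as one sequence (a sketch; the crush remove, umount and rm steps are the extra cleanup I sometimes add, and the OSD id and service name depend on your Ceph version and setup):

systemctl stop ceph-osd@1            # on older sysvinit setups: service ceph stop osd.1
ceph osd out osd.1
ceph osd crush remove osd.1          # drop it from the crush map
ceph auth del osd.1                  # remove its authentication key
ceph osd rm osd.1
umount /var/lib/ceph/osd/ceph-1      # if the data partition is still mounted
rm -rf /var/lib/ceph/osd/ceph-1      # the old OSD folder that zap does not touch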
 
