OSD Creation issues

Paspao
Aug 1, 2017
Hello,

I am testing an installation of Proxmox VE 5 on an OVH dedicated server (EG-64G, E5-1650v3).

I have 2 SSD disks.

I have issues creating an OSD: I only get a 100 MB partition.

- ceph version 12.1.1

I already tried to zap the disk with both:

- ceph-disk zap /dev/nvme1n1

or (as found in another thread)

- dd if=/dev/zero of=/dev/nvme1n1 bs=10000000000 count=1

I created the OSD with:

pveceph createosd /dev/nvme1n1


create OSD on /dev/nvme1n1 (xfs)
Caution: invalid backup GPT header, but valid main header; regenerating
backup header from main header.

****************************************************************************
Caution: Found protective or hybrid MBR and corrupt GPT. Using GPT, but disk
verification and recovery are STRONGLY recommended.
****************************************************************************
GPT data structures destroyed! You may now partition the disk using fdisk or
other utilities.
Creating new GPT entries.
The operation has completed successfully.
Setting name!
partNum is 0
REALLY setting name!
The operation has completed successfully.
Setting name!
partNum is 1
REALLY setting name!
The operation has completed successfully.
The operation has completed successfully.
meta-data=/dev/nvme1n1p1         isize=2048   agcount=4, agsize=6400 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=0, rmapbt=0, reflink=0
data     =                       bsize=4096   blocks=25600, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=864, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot or after you
run partprobe(8) or kpartx(8)
The operation has completed successfully.


Disk /dev/nvme1n1: 419.2 GiB, 450098159616 bytes, 879097968 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 8618651E-3E2B-4D1B-BEF5-CD4858B33471

Device            Start       End   Sectors   Size Type
/dev/nvme1n1p1     2048    206847    204800   100M Ceph OSD
/dev/nvme1n1p2   206848 879097934 878891087 419.1G unknown


pveceph status
{
   "fsmap" : {
      "epoch" : 1,
      "by_rank" : []
   },
   "quorum_names" : [
      "0",
      "1"
   ],
   "health" : {
      "checks" : {
         "MGR_DOWN" : {
            "message" : "no active mgr",
            "severity" : "HEALTH_WARN"
         }
      },
      "status" : "HEALTH_WARN"
   },
   "pgmap" : {
      "bytes_total" : 0,
      "data_bytes" : 0,
      "pgs_by_state" : [],
      "num_pools" : 0,
      "num_objects" : 0,
      "num_pgs" : 0,
      "bytes_used" : 0,
      "bytes_avail" : 0
   },
   "mgrmap" : {
      "modules" : [
         "restful",
         "status"
      ],
      "epoch" : 31550,
      "standbys" : [],
      "active_name" : "",
      "active_addr" : "-",
      "available_modules" : [],
      "available" : false,
      "active_gid" : 0
   },
   "quorum" : [
      0,
      1
   ],
   "election_epoch" : 12,
   "osdmap" : {
      "osdmap" : {
         "num_osds" : 2,
         "num_up_osds" : 1,
         "full" : false,
         "epoch" : 29,
         "num_in_osds" : 1,
         "num_remapped_pgs" : 0,
         "nearfull" : false
      }
   },
   "monmap" : {
      "created" : "2017-08-04 10:46:13.602020",
      "fsid" : "6a0235aa-895c-4772-8ca5-f4ba8bd39307",
      "modified" : "2017-08-04 10:49:01.462958",
      "epoch" : 2,
      "features" : {
         "optional" : [],
         "persistent" : [
            "kraken",
            "luminous"
         ]
      },
      "mons" : [
         {
            "addr" : "172.16.0.1:6789/0",
            "name" : "0",
            "rank" : 0,
            "public_addr" : "172.16.0.1:6789/0"
         },
         {
            "rank" : 1,
            "public_addr" : "172.16.0.2:6789/0",
            "name" : "1",
            "addr" : "172.16.0.2:6789/0"
         }
      ]
   },
   "servicemap" : {
      "services" : {},
      "modified" : "0.000000",
      "epoch" : 1
   },
   "fsid" : "6a0235aa-895c-4772-8ca5-f4ba8bd39307"
}

Could you please give me suggestions to debug this issue?

What is the best way to clean up, unmount, and delete the wrong OSD? I get a "device is in use" error when trying to delete it:

pveceph destroyosd 0
osd is in use (in == 1)

Thank you.
 
Are you on the enterprise repository? If yes, there are some changes which are not yet available there.

By default, Ceph Luminous creates BlueStore OSDs, which only have a 100 MB metadata partition and use the remaining device directly, so the layout you are seeing is expected.
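
If you want to double-check this, something along these lines should work (osd.0 is taken from your output; adjust the id and device as needed):

# show how ceph-disk interprets the partitions (data + block for a BlueStore OSD)
ceph-disk list /dev/nvme1n1

# confirm the objectstore type the OSD reports to the cluster
ceph osd metadata 0 | grep osd_objectstore
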
Then you have to create a manager; in an upcoming version we will do this automatically when creating a monitor, but for now it has to be created by hand, e.g. as sketched below.
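
A minimal sketch, following the upstream Luminous manual-deployment docs (the daemon name is just an example; adjust it and the paths to your node):

# create a keyring for a manager daemon named after this node
NAME=$(hostname -s)
mkdir -p /var/lib/ceph/mgr/ceph-$NAME
ceph auth get-or-create mgr.$NAME mon 'allow profile mgr' osd 'allow *' mds 'allow *' \
    -o /var/lib/ceph/mgr/ceph-$NAME/keyring
chown -R ceph:ceph /var/lib/ceph/mgr/ceph-$NAME

# start and enable the manager daemon
systemctl start ceph-mgr@$NAME
systemctl enable ceph-mgr@$NAME

That should clear the MGR_DOWN warning in your status output.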

Also, if you want to destroy an OSD, you first have to mark it 'out' and stop its daemon; for osd.0 from your output, that would look roughly like the commands below.
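
For example (assuming the daemon is managed by systemd, as on a standard PVE 5 install):

# mark the OSD out (wait for any rebalancing to finish if it holds data)
ceph osd out 0

# stop the daemon, then remove the OSD from the cluster
systemctl stop ceph-osd@0
pveceph destroyosd 0

After that you can zap the disk and create the OSD again.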

Please also note that Ceph Luminous is still only a release candidate and not yet suited for production.
 
Thank you for your reply.

The first install was done with pve-enterprise (OVH image template); then I moved to the pvetest repo.

I am a newbie with Ceph. Your Ceph server documentation only gives the pveceph commands to install/create/add.
But when things go wrong, it does not explain how to delete or clean things up in order to start again, and falling back to the upstream Ceph docs is not easy, because the Proxmox Ceph configuration is different and some commands are missing, so it is really difficult to fix things.

Could you please add some more troubleshooting documentation for the Ceph server?

I need to set up a hyper-converged production environment; is the best option still Proxmox VE 4 with Ceph Jewel?

Thank you
P.
 