pveceph createosd makes "partitions" instead of "osd.x"

ravib123

Just setting up a new 8-node cluster.

Each node offers two OSDs.

From what I am seeing, I seem to be capped at 14 OSDs for the whole cluster.

I was curious whether this is just a limit set in Ceph.pm, because I found this block:

pg_bits => {
    description => "Placement group bits, used to specify the " .
        "default number of placement groups.\n\nNOTE: 'osd pool " .
        "default pg num' does not work for default pools.",
    type => 'integer',
    default => 6,
    optional => 1,
    minimum => 6,
    maximum => 14,
},
 
That maximum is for placement groups, and it is expressed in 'bits', i.e. up to 2^14 = 16384 placement groups (that should be enough). It is not related to the number of OSDs.
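If it helps to rule that out, the actual counts can be checked from any node with a working /etc/ceph/ceph.conf (plain Ceph commands, nothing Proxmox-specific):

ceph -s          # summary, including how many OSDs are up/in and the total PG count
ceph osd tree    # every OSD the cluster knows about, grouped by host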
 
Try to remove all old data/partitions on these disks, e.g. go to "Disks" and click "Initialize Disk with GPT".

Sometimes old data prevents the creation; I remember a similar issue with old disks from a ZFS setup. I had to completely clean the first part of the disk (with dd) to get it working for Ceph.
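A rough sketch of that kind of cleanup (/dev/sdX is a placeholder and the sizes are arbitrary; note that GPT keeps a backup header at the end of the disk, so wiping only the first sectors does not always remove everything):

sgdisk --zap-all /dev/sdX                       # destroy GPT and MBR data structures at both ends of the disk
dd if=/dev/zero of=/dev/sdX bs=1M count=200     # then clear the start of the disk, where ZFS and filesystem labels live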
 
I saw your previous post about that. These were fresh disks, but I did the following:

dd if=/dev/zero of=/dev/sda bs=1000000000 count=1
dd if=/dev/zero of=/dev/sdb bs=1000000000 count=1

then

ceph-disk zap /dev/sda
ceph-disk zap /dev/sdb

then

pveceph createosd /dev/sda
pveceph createosd /dev/sdb

I still end up with only partitions, and osd.14 and osd.15 are never created.
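To see how far it got each time, the on-disk state looked roughly like this (ceph-disk being the Jewel-era tool; exact output will differ):

ceph-disk list                 # shows each partition and whether it is a prepared/activated Ceph data partition
ls /var/lib/ceph/osd/          # an activated OSD should show up here as a mounted ceph-<id> directory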
 
I have seen this myself: before you create the OSD in Proxmox, create the OSD root folder.

It should then be fine. Since the update to Jewel I have noticed that Proxmox quite often fails to create the root OSD folder, so when it gets to mounting the OSD it just fails.
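A minimal sketch of that workaround, assuming the next OSD IDs will be 14 and 15 (since Jewel the daemons run as the ceph user, so matching ownership may be needed too):

mkdir -p /var/lib/ceph/osd/ceph-14 /var/lib/ceph/osd/ceph-15
chown ceph:ceph /var/lib/ceph/osd/ceph-14 /var/lib/ceph/osd/ceph-15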
 
So I created:

/var/lib/ceph/osd/ceph-14
/var/lib/ceph/osd/ceph-15

Then re-ran the steps as described above.

It failed to produce different results.

Manually mounting those did change the "partitions" markings under Disks to osd.14 and osd.15, but apparently the script fails completely at that point, because "osd.14" and "osd.15" are not seen by Ceph.
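For reference, the manual mounting looked roughly like this (sda1/sdb1 are placeholders for the OSD data partitions that ceph-disk created):

mount /dev/sda1 /var/lib/ceph/osd/ceph-14
mount /dev/sdb1 /var/lib/ceph/osd/ceph-15
systemctl start ceph-osd@14 ceph-osd@15    # only works if the activate step actually registered the OSDs
ceph osd tree                              # osd.14 and osd.15 still did not appear here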
 
So my next steps:

Remove node7, reformat it, and add it back to the cluster as node7a.

dd the first 1 GB of the drives.
Zap the drives with ceph-disk zap.
Initialize the drives and create the OSDs from the CLI or from the GUI (the per-disk commands are sketched below).
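The per-disk sequence for those last three steps, same commands as earlier in the thread (/dev/sdX standing in for each drive):

dd if=/dev/zero of=/dev/sdX bs=1M count=1000   # wipe the first ~1 GB
ceph-disk zap /dev/sdX                         # rebuild a clean GPT
pveceph createosd /dev/sdX                     # create the OSD (or use the GUI equivalent)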

It took many full removals of everything, and many attempts.

Realistically I didn't do anything different, just a fresh install. Out of the gate there was the same result: it failed to fully create/mount the OSD. The script failed at a different point every time, but I was left with my disks showing as partitions and no usable OSD.

I just kept deleting and re-adding the entire OSD (osd rm, osd auth del, etc.; the removal loop is sketched after the list below). I suspect the core issue was twofold:

1. The Proxmox Perl scripting failed to create /var/lib/ceph/osd/ceph-13 and above (I tried IDs above 15 and it still failed).
2. Manually created OSD mount points may have needed different permissions to be usable.
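For reference, the delete/re-add loop followed the add-or-rm-osds doc linked below, roughly like this (N being the OSD id):

ceph osd out N                     # mark the OSD out
systemctl stop ceph-osd@N          # stop the daemon if it is running
ceph osd crush remove osd.N        # remove it from the CRUSH map
ceph auth del osd.N                # delete its authentication key
ceph osd rm N                      # remove the OSD itself
umount /var/lib/ceph/osd/ceph-N    # unmount the data partition if it was mounted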

Useful notes in my resolution:
docs.ceph.com/docs/jewel/rados/operations/add-or-rm-osds/
lists.ceph.com/pipermail/ceph-users-ceph.com/2014-April/038664.html
github.com/ceph/ceph-docker/issues/171
 
