Unable to create Ceph OSD

Jul 30, 2019
Hello there, I recently upgraded from Proxmox 5 to 6, as well as Ceph Luminous to Nautilus. I wanted to go through and re-create the OSDs I have in my cluster. I ran into an issue with the second OSD I wanted to convert (the first went fine). Here's what I get after I zap the disk:

Code:
pveceph createosd /dev/nvme8n2
create OSD on /dev/nvme8n2 (bluestore)
wipe disk/partition: /dev/nvme8n2
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 0.417411 s, 502 MB/s
-->  OSError: [Errno 5] Input/output error: '/var/lib/ceph/osd/ceph-2'
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 7f015a78-6062-4026-9a8a-7895aecf2eda
Running command: /sbin/vgcreate -s 1G --force --yes ceph-a449d0c2-542e-42cb-a5c3-395070ab51ca /dev/nvme8n2
 stdout: Physical volume "/dev/nvme8n2" successfully created.
 stdout: Volume group "ceph-a449d0c2-542e-42cb-a5c3-395070ab51ca" successfully created
Running command: /sbin/lvcreate --yes -l 100%FREE -n osd-block-7f015a78-6062-4026-9a8a-7895aecf2eda ceph-a449d0c2-542e-42cb-a5c3-395070ab51ca
 stdout: Logical volume "osd-block-7f015a78-6062-4026-9a8a-7895aecf2eda" created.
Running command: /usr/bin/ceph-authtool --gen-print-key
--> Was unable to complete a new OSD, will rollback changes
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd purge-new osd.2 --yes-i-really-mean-it
 stderr: 2019-07-30 11:13:50.300 7f3e0f70b700 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.bootstrap-osd.keyring: (2) No such file or directory
2019-07-30 11:13:50.300 7f3e0f70b700 -1 AuthRegistry(0x7f3e0807ed58) no keyring found at /etc/pve/priv/ceph.client.bootstrap-osd.keyring, disabling cephx
 stderr: purged osd.2
command 'ceph-volume lvm create --cluster-fsid 0504a312-6d92-4443-b250-5e790079210e --data /dev/nvme8n2' failed: exit code 1

I can see it complaining about a missing keyring, but I'm not really sure where that key should be coming from.
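For reference, these are the two keyring locations that show up in the output, and a quick way to check whether either file actually exists:

Code:
ls -l /var/lib/ceph/bootstrap-osd/ceph.keyring
ls -l /etc/pve/priv/ceph.client.bootstrap-osd.keyring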
 
What does your ceph.conf look like?
 
Here's my ceph.conf
Code:
[global]
     auth client required = cephx
     auth cluster required = cephx
     auth service required = cephx
     cluster network = 172.29.11.0/24
     fsid = this-is-the-fsid
     mon allow pool delete = true
     osd journal size = 5120
     osd pool default min size = 2
     osd pool default size = 3
     public network = 172.29.11.0/24
     mon_host = 172.29.11.11 172.29.11.12 172.29.11.13
[mds]

[osd]

[client]
    keyring = /etc/pve/priv/$cluster.$name.keyring

[mds.prox-ceph2]
     host = prox-ceph2
     mds standby for name = pve

[mds.prox-ceph1]
     host = prox-ceph1
     mds standby for name = pve

[mds.prox-ceph3]
     host = prox-ceph3
     mds standby for name = pve

[mon.prox-ceph2]
     host = prox-ceph2
     mon addr = 172.29.11.12:6789

[mon.prox-ceph1]
     host = prox-ceph1
     mon addr = 172.29.11.11:6789

[mon.prox-ceph3]
     host = prox-ceph3
     mon addr = 172.29.11.13:6789
 
stderr: 2019-07-30 11:13:50.300 7f3e0f70b700 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.bootstrap-osd.keyring: (2) No such file or directory
It seems ceph-volume is looking for the bootstrap keyring in the wrong location. Can you please comment out the keyring line in the client section and try again?
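That is, leave the section like this:

Code:
[client]
    # keyring = /etc/pve/priv/$cluster.$name.keyring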
 
So the problem seems to have been an OS issue. I ended up rebooting the server, and the drives came up in a different order under /dev/nvme*. After that reboot I was able to create the new OSD. I left the keyring line under the client section of my ceph.conf, and everything is working fine.
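One note for anyone who hits the same thing: /dev/nvmeXnY names are not guaranteed to be stable across reboots, so it can help to identify the disks by their stable IDs instead:

Code:
# by-id links are stable across reboots, unlike /dev/nvme* enumeration order:
ls -l /dev/disk/by-id/ | grep nvme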
 
Hello!

We have the same issue. Some of our OSDs are backfill-full, so we added more OSDs, and when we add an OSD we get a similar error message. The file it is looking for does not exist on any of the nodes; only the pool keyring is there.

We tried commenting out the client section, but then the Ceph GUI reports 'No such file or directory'.

Code:
stderr: 2020-09-12 16:03:16.675 7ff0bb288700 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.bootstrap-osd.keyring: (2) No such file or directory
2020-09-12 16:03:16.675 7ff0bb288700 -1 AuthRegistry(0x7ff0b40817b8) no keyring found at /etc/pve/priv/ceph.client.bootstrap-osd.keyring, disabling cephx
stderr: got monmap epoch 3

We need help quickly, as the pool is nearfull.
 
@szucs10, please open a new thread with the current ceph -s, ceph osd df tree, and pveversion -v.
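That is, the output of:

Code:
ceph -s
ceph osd df tree
pveversion -v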
 
I ran into the same problem and resolved it.
Code:
# ceph-volume lvm create --bluestore --data /dev/sda --block.wal /dev/nvme0n1p1 --block.db /dev/nvme0n1p7
Running command: /bin/ceph-authtool --gen-print-key
Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 17e6764c-c233-4a08-addf-3479bdab4671
stderr: 2021-01-13 11:10:12.695621 7f4304b52700 -1 auth: unable to find a keyring on /var/lib/ceph/bootstrap-osd/ceph.keyring: (2) No such file or directory
stderr: 2021-01-13 11:10:12.695719 7f4304b52700 -1 monclient: ERROR: missing keyring, cannot use cephx for authentication
stderr: 2021-01-13 11:10:12.695720 7f4304b52700  0 librados: client.bootstrap-osd initialization error (2) No such file or directory
stderr: [errno 2] error connecting to the cluster
-->  RuntimeError: Unable to create a new OSD id

Then I compared the keyring file across the nodes:
Code:
ansible pve31 -uroot -m shell -a 'md5sum /var/lib/ceph/bootstrap-osd/ceph.keyring'
172.31.254.1 | CHANGED | rc=0 >>
8ff95ce9ce219ea35ab95de8dc4b89d4  /var/lib/ceph/bootstrap-osd/ceph.keyring

172.31.254.2 | CHANGED | rc=0 >>
8ff95ce9ce219ea35ab95de8dc4b89d4  /var/lib/ceph/bootstrap-osd/ceph.keyring

172.31.254.5 | CHANGED | rc=0 >>
1a37b4090b058a48fffad6da24e8538c  /var/lib/ceph/bootstrap-osd/ceph.keyring

172.31.254.3 | CHANGED | rc=0 >>
8ff95ce9ce219ea35ab95de8dc4b89d4  /var/lib/ceph/bootstrap-osd/ceph.keyring

172.31.254.4 | CHANGED | rc=0 >>
1a37b4090b058a48fffad6da24e8538c  /var/lib/ceph/bootstrap-osd/ceph.keyring

172.31.254.7 | FAILED | rc=1 >>
md5sum: /var/lib/ceph/bootstrap-osd/ceph.keyring: No such file or directory
non-zero return code

172.31.254.9 | FAILED | rc=1 >>
md5sum: /var/lib/ceph/bootstrap-osd/ceph.keyring: No such file or directory
non-zero return code

172.31.254.6 | CHANGED | rc=0 >>
1a37b4090b058a48fffad6da24e8538c  /var/lib/ceph/bootstrap-osd/ceph.keyring

172.31.254.8 | CHANGED | rc=0 >>
1a37b4090b058a48fffad6da24e8538c  /var/lib/ceph/bootstrap-osd/ceph.keyring

So I just copied the keyring to the node where it was missing:
Code:
scp /var/lib/ceph/bootstrap-osd/ceph.keyring 172.31.254.7:/var/lib/ceph/bootstrap-osd/ceph.keyring

After that, it worked.
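Note that in the output above there are actually two different checksums among the nodes that do have the file; the bootstrap-osd keyring is a cluster-wide key, so the file should normally be identical on every node, and the nodes with the other checksum may need the same fix. Re-running the check should then show one checksum everywhere:

Code:
ansible pve31 -uroot -m shell -a 'md5sum /var/lib/ceph/bootstrap-osd/ceph.keyring'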
 
I've been able to solve this just by creating the directory:

Code:
mkdir /var/lib/ceph/bootstrap-osd

Hope this helps.
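Combining this with the post above, a minimal recovery on a broken node might look like this (the source host is just a placeholder; use any node that still has the correct keyring):

Code:
# Create the directory ceph-volume expects, then pull the keyring from a healthy node:
mkdir -p /var/lib/ceph/bootstrap-osd
scp <healthy-node>:/var/lib/ceph/bootstrap-osd/ceph.keyring /var/lib/ceph/bootstrap-osd/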
 
In case it's of any use to anyone: I had a similar problem, and this thread helped me pinpoint the answer.
In my case, a newly added node that wasn't completely new had the wrong Ceph key in a few files. Replacing the key value in the following two files with the key value from an existing cluster member did the trick:

- /var/lib/ceph/bootstrap-osd/ceph.keyring
- /etc/ceph/ceph.client.admin.keyring

I suspect that some old residual config caused the Proxmox replication process to fail to update these keys when I added the node to the cluster.
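If no node with the correct files is available to copy from, the same keys can also be exported from the cluster itself. This is only a sketch, and it assumes at least one node still has a working admin keyring:

Code:
# Run on a node where 'ceph -s' still works, then copy the files to the broken node:
ceph auth get client.admin -o /etc/ceph/ceph.client.admin.keyring
ceph auth get client.bootstrap-osd -o /var/lib/ceph/bootstrap-osd/ceph.keyring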
 
