Proxmox 6.0 - unable to create Ceph OSDs (Unable to create a new OSD id)

victorhooi

Well-Known Member
I have just installed Proxmox 6.0 beta on a 3-node cluster.

I have set up the cluster, and also set up Ceph Managers/Monitors on each node.

I’m now at the stage of creating OSDs. I’m using Intel Optane drives, which benefit from multiple OSDs per drive. However, when I try to run the command to create them on the first node, I get an error:
Code:
root@vwnode1:~# ceph-volume lvm batch --osds-per-device 4 /dev/nvme0n1

Total OSDs: 4

  Type            Path                                                    LV Size         % of device
----------------------------------------------------------------------------------------------------
  [data]          /dev/nvme0n1                                            111.50 GB       25%
----------------------------------------------------------------------------------------------------
  [data]          /dev/nvme0n1                                            111.50 GB       25%
----------------------------------------------------------------------------------------------------
  [data]          /dev/nvme0n1                                            111.50 GB       25%
----------------------------------------------------------------------------------------------------
  [data]          /dev/nvme0n1                                            111.50 GB       25%
--> The above OSDs would be created if the operation continues
--> do you want to proceed? (yes/no) yes
Running command: /usr/sbin/vgcreate -s 1G --force --yes ceph-7d521072-a146-4dcd-ba30-86b44fc1b0a6 /dev/nvme0n1
 stdout: Physical volume "/dev/nvme0n1" successfully created.
 stdout: Volume group "ceph-7d521072-a146-4dcd-ba30-86b44fc1b0a6" successfully created
Running command: /usr/sbin/lvcreate --yes -l 111 -n osd-data-ac34507a-d697-4366-a4ab-4bb4dcf96a5e ceph-7d521072-a146-4dcd-ba30-86b44fc1b0a6
 stdout: Logical volume "osd-data-ac34507a-d697-4366-a4ab-4bb4dcf96a5e" created.
Running command: /usr/sbin/lvcreate --yes -l 111 -n osd-data-c735f58d-ab6f-40dd-bd13-0edd5d423803 ceph-7d521072-a146-4dcd-ba30-86b44fc1b0a6
 stdout: Logical volume "osd-data-c735f58d-ab6f-40dd-bd13-0edd5d423803" created.
Running command: /usr/sbin/lvcreate --yes -l 111 -n osd-data-40fb4974-d072-447d-b123-0ce2f4f69af6 ceph-7d521072-a146-4dcd-ba30-86b44fc1b0a6
 stdout: Logical volume "osd-data-40fb4974-d072-447d-b123-0ce2f4f69af6" created.
Running command: /usr/sbin/lvcreate --yes -l 111 -n osd-data-a00a4dff-b131-4e45-abcb-68e54b98c196 ceph-7d521072-a146-4dcd-ba30-86b44fc1b0a6
 stdout: Logical volume "osd-data-a00a4dff-b131-4e45-abcb-68e54b98c196" created.
Running command: /bin/ceph-authtool --gen-print-key
Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new ebe32748-be47-41d5-88fb-d500852c297d
 stderr: [errno 2] error connecting to the cluster
-->  RuntimeError: Unable to create a new OSD id
Any ideas what’s going on?

(I did try to enable the Ceph dashboard before, but then had to disable it, as it was looking for some routes package).
 
I mean, once you start using Ceph's native tools instead of ours (pveceph), you're a bit on your own.

But I had this working. The thing that was probably different in my setup was that I had already created another OSD through our tooling, so all the required setup (config, keyring, bootstrap key, etc.) was already correctly in place.
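For reference, a rough sketch of what that looks like (the device path here is just a placeholder, and the exact subcommand spelling may differ slightly between PVE versions):
Code:
# placeholder device - creating any OSD via our tooling also puts the required keyrings in place
pveceph createosd /dev/sdX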
 
Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new ebe32748-be47-41d5-88fb-d500852c297d
 stderr: [errno 2] error connecting to the cluster
As my colleague hinted at, this indicates a missing bootstrap keyring file.

The easiest way to get it is to run:

Code:
ceph auth get client.bootstrap-osd

The content should be saved to the indicated location (/var/lib/ceph/bootstrap-osd/ceph.keyring).
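i.e. something along these lines (just a sketch, assuming the standard bootstrap-osd path from the error above):
Code:
# write the bootstrap key to the keyring path ceph-volume is looking for
ceph auth get client.bootstrap-osd -o /var/lib/ceph/bootstrap-osd/ceph.keyring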
 
Thank you! I will try this tonight as soon as I get home, I really want to get this working.

So basically I just run that one command, and then the ceph-volume command should work as is?

Do you think it might make sense to add an option in the Proxmox Ceph GUI to specify the number of OSDs per disk?

Also, I noticed the docs now suggest using the GUI over the Proxmox Ceph commands. I'm curious if there's a particular reason for that? (I find the CLI easier to reproduce across multiple systems, and also easier to explain remotely.)
 
So basically I just run that one command, and then the ceph-volume command should work as is?
Exactly.

Do you think it might make sense to add an option in the Proxmox Ceph GUI to specify the number of OSDs per disk?

Maybe as an advanced option somewhere, yes. But we still need to evaluate how much benefit there is here; it normally only makes sense for NVMe drives, with their multiple parallel channels, anyway.

Also, I noticed the docs now suggest using the GUI over the Proxmox Ceph commands. I'm curious if there's a particular reason for that? (I find the CLI easier to reproduce across multiple systems, and also easier to explain remotely.)

The CLI is perfectly fine, especially if you're used to it. But for users new to Ceph or PVE in general we just recommend the web UI, as it can be a bit simpler at first, and there's often less chance of messing up the order of commands or command arguments for people not too familiar with the CLI :)
 
I can confirm this worked.

To clarify, ceph auth get client.bootstrap-osd simply prints out the key information; you actually need to redirect it to the correct location (this got me at first, haha):
Code:
root@vwnode1:~# ceph auth get client.bootstrap-osd > /var/lib/ceph/bootstrap-osd/ceph.keyring
exported keyring for client.bootstrap-osd
Then, the ceph-volume command above worked.
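If it helps anyone else, the result can be sanity-checked afterwards with the usual Ceph commands (nothing Proxmox-specific, just an example):
Code:
ceph-volume lvm list   # lists the OSD logical volumes created on this node
ceph osd tree          # confirms the new OSDs have joined the cluster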

I still feel like this would make sense in the GUI, even if behind the "Advanced" section.

NVMe drives are becoming quite affordable/accessible these days, and I'd posit that many people setting up a Ceph cluster nowadays will look into them.
 
