Unable to create OSDs in new Proxmox/Ceph cluster - RADOS object not found (error connecting to the cluster)

victorhooi

I've set up a new four-node Proxmox/Ceph cluster.

I have run pveceph install on each node.

I have also set up a Ceph mon and mgr on each node.
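For reference, the per-node setup was roughly along these lines (a sketch only; the mon/mgr creation can also be done from the GUI, and the --network value is the /24 subnet assumed from the config below):
Code:
# pveceph install                        # install the Ceph packages on this node
# pveceph init --network 10.7.15.0/24    # first node only - writes /etc/pve/ceph.conf
# pveceph mon create                     # one monitor per node
# pveceph mgr create                     # one manager per node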

Here is the output of /etc/pve/ceph.conf:
Code:
# cat /etc/pve/ceph.conf
[global]
     auth_client_required = cephx
     auth_cluster_required = cephx
     auth_service_required = cephx
     cluster_network = 10.7.15.3/24
     fsid = f17ee24c-0562-44c3-80ab-e7ba8366db86
     mon_allow_pool_delete = true
     mon_host = 10.7.15.3 10.7.15.4 10.7.15.5 10.7.15.6
     osd_pool_default_min_size = 2
     osd_pool_default_size = 3
     public_network = 10.7.15.3/24

[client]
     keyring = /etc/pve/priv/$cluster.$name.keyring

[mon.examplemtv-vm01]
     public_addr = 10.7.15.3

[mon.examplemtv-vm02]
     public_addr = 10.7.15.4

[mon.examplemtv-vm03]
     public_addr = 10.7.15.5

[mon.examplemtv-vm04]
     public_addr = 10.7.15.6

My OSD tree:
Code:
# ceph osd tree
ID  CLASS  WEIGHT  TYPE NAME     STATUS  REWEIGHT  PRI-AFF
-1              0  root default
Ceph status:
Code:
# ceph status
  cluster:
    id:     f17ee24c-0562-44c3-80ab-e7ba8366db86
    health: HEALTH_WARN
            Module 'volumes' has failed dependency: No module named 'distutils.util'
            Reduced data availability: 1 pg inactive
            OSD count 0 < osd_pool_default_size 3

  services:
    mon: 4 daemons, quorum examplemtv-vm01,examplemtv-vm02,examplemtv-vm03,examplemtv-vm04 (age 3h)
    mgr: examplemtv-vm02(active, since 3h), standbys: examplemtv-vm03, examplemtv-vm01, examplemtv-vm04
    osd: 0 osds: 0 up, 0 in

  data:
    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:     100.000% pgs unknown
             1 unknown
Here are the available disks for the first node:
Code:
# ceph-volume inventory

Device Path               Size         rotates available Model name
/dev/nvme0n1              894.25 GB    False   True      INTEL SSDPED1D960GAY
/dev/nvme2n1              3.64 TB      False   True      INTEL SSDPE2KX040T7
/dev/nvme3n1              3.64 TB      False   True      INTEL SSDPE2KX040T7
/dev/nvme4n1              3.64 TB      False   True      INTEL SSDPE2KX040T7
/dev/nvme5n1              3.64 TB      False   True      INTEL SSDPE2KX040T7
/dev/nvme6n1              3.64 TB      False   True      INTEL SSDPE2KX040T7
/dev/nvme7n1              3.64 TB      False   True      INTEL SSDPE2KX040T7
/dev/nvme1n1              931.51 GB    False   False     Samsung SSD 960 EVO 1TB

I am now trying to create OSDs on the first node. However, I get a "RADOS object not found (error connecting to the cluster)" error:

Code:
# ceph-volume lvm batch --osds-per-device 4 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1 /dev/nvme5n1 /dev/nvme6n1 /dev/nvme7n1 --db-devices /dev/nvme0n1

Total OSDs: 24

Solid State VG:
  Targets:   block.db                  Total size: 893.00 GB
  Total LVs: 96                        Size per LV: 37.21 GB
  Devices:   /dev/nvme0n1

  Type            Path                                                    LV Size         % of device
----------------------------------------------------------------------------------------------------
  [data]          /dev/nvme2n1                                            931.25 GB       25.0%
  [block.db]      vg: vg/lv                                               37.21 GB        4%
----------------------------------------------------------------------------------------------------
  [data]          /dev/nvme2n1                                            931.25 GB       25.0%
  [block.db]      vg: vg/lv                                               37.21 GB        4%
----------------------------------------------------------------------------------------------------
  [data]          /dev/nvme2n1                                            931.25 GB       25.0%
  [block.db]      vg: vg/lv                                               37.21 GB        4%
----------------------------------------------------------------------------------------------------
  [data]          /dev/nvme2n1                                            931.25 GB       25.0%
  [block.db]      vg: vg/lv                                               37.21 GB        4%
----------------------------------------------------------------------------------------------------
  [data]          /dev/nvme3n1                                            931.25 GB       25.0%
  [block.db]      vg: vg/lv                                               37.21 GB        4%
----------------------------------------------------------------------------------------------------
  [data]          /dev/nvme3n1                                            931.25 GB       25.0%
  [block.db]      vg: vg/lv                                               37.21 GB        4%
----------------------------------------------------------------------------------------------------
  [data]          /dev/nvme3n1                                            931.25 GB       25.0%
  [block.db]      vg: vg/lv                                               37.21 GB        4%
----------------------------------------------------------------------------------------------------
  [data]          /dev/nvme3n1                                            931.25 GB       25.0%
  [block.db]      vg: vg/lv                                               37.21 GB        4%
----------------------------------------------------------------------------------------------------
  [data]          /dev/nvme4n1                                            931.25 GB       25.0%
  [block.db]      vg: vg/lv                                               37.21 GB        4%
----------------------------------------------------------------------------------------------------
  [data]          /dev/nvme4n1                                            931.25 GB       25.0%
  [block.db]      vg: vg/lv                                               37.21 GB        4%
----------------------------------------------------------------------------------------------------
  [data]          /dev/nvme4n1                                            931.25 GB       25.0%
  [block.db]      vg: vg/lv                                               37.21 GB        4%
----------------------------------------------------------------------------------------------------
  [data]          /dev/nvme4n1                                            931.25 GB       25.0%
  [block.db]      vg: vg/lv                                               37.21 GB        4%
----------------------------------------------------------------------------------------------------
  [data]          /dev/nvme5n1                                            931.25 GB       25.0%
  [block.db]      vg: vg/lv                                               37.21 GB        4%
----------------------------------------------------------------------------------------------------
  [data]          /dev/nvme5n1                                            931.25 GB       25.0%
  [block.db]      vg: vg/lv                                               37.21 GB        4%
----------------------------------------------------------------------------------------------------
  [data]          /dev/nvme5n1                                            931.25 GB       25.0%
  [block.db]      vg: vg/lv                                               37.21 GB        4%
----------------------------------------------------------------------------------------------------
  [data]          /dev/nvme5n1                                            931.25 GB       25.0%
  [block.db]      vg: vg/lv                                               37.21 GB        4%
----------------------------------------------------------------------------------------------------
  [data]          /dev/nvme6n1                                            931.25 GB       25.0%
  [block.db]      vg: vg/lv                                               37.21 GB        4%
----------------------------------------------------------------------------------------------------
  [data]          /dev/nvme6n1                                            931.25 GB       25.0%
  [block.db]      vg: vg/lv                                               37.21 GB        4%
----------------------------------------------------------------------------------------------------
  [data]          /dev/nvme6n1                                            931.25 GB       25.0%
  [block.db]      vg: vg/lv                                               37.21 GB        4%
----------------------------------------------------------------------------------------------------
  [data]          /dev/nvme6n1                                            931.25 GB       25.0%
  [block.db]      vg: vg/lv                                               37.21 GB        4%
----------------------------------------------------------------------------------------------------
  [data]          /dev/nvme7n1                                            931.25 GB       25.0%
  [block.db]      vg: vg/lv                                               37.21 GB        4%
----------------------------------------------------------------------------------------------------
  [data]          /dev/nvme7n1                                            931.25 GB       25.0%
  [block.db]      vg: vg/lv                                               37.21 GB        4%
----------------------------------------------------------------------------------------------------
  [data]          /dev/nvme7n1                                            931.25 GB       25.0%
  [block.db]      vg: vg/lv                                               37.21 GB        4%
----------------------------------------------------------------------------------------------------
  [data]          /dev/nvme7n1                                            931.25 GB       25.0%
  [block.db]      vg: vg/lv                                               37.21 GB        4%
--> The above OSDs would be created if the operation continues
--> do you want to proceed? (yes/no) y
Running command: /usr/sbin/vgcreate --force --yes ceph-block-f4847ec1-5108-4438-9bd6-3c7ecdf496d4 /dev/nvme2n1
 stdout: Physical volume "/dev/nvme2n1" successfully created.
 stdout: Volume group "ceph-block-f4847ec1-5108-4438-9bd6-3c7ecdf496d4" successfully created
Running command: /usr/sbin/vgcreate --force --yes ceph-block-cf6ae065-c2c2-45bd-a9fa-96ddc70d654f /dev/nvme3n1
 stdout: Physical volume "/dev/nvme3n1" successfully created.
 stdout: Volume group "ceph-block-cf6ae065-c2c2-45bd-a9fa-96ddc70d654f" successfully created
Running command: /usr/sbin/vgcreate --force --yes ceph-block-fd3bf495-1907-4eb3-8122-df5ec83c2c10 /dev/nvme4n1
 stdout: Physical volume "/dev/nvme4n1" successfully created.
 stdout: Volume group "ceph-block-fd3bf495-1907-4eb3-8122-df5ec83c2c10" successfully created
Running command: /usr/sbin/vgcreate --force --yes ceph-block-a3526a34-d438-41ce-9d03-3e101b84dfb7 /dev/nvme5n1
 stdout: Physical volume "/dev/nvme5n1" successfully created.
 stdout: Volume group "ceph-block-a3526a34-d438-41ce-9d03-3e101b84dfb7" successfully created
Running command: /usr/sbin/vgcreate --force --yes ceph-block-657a14c8-c9d9-48ed-9537-5bd1354b93b4 /dev/nvme6n1
 stdout: Physical volume "/dev/nvme6n1" successfully created.
 stdout: Volume group "ceph-block-657a14c8-c9d9-48ed-9537-5bd1354b93b4" successfully created
Running command: /usr/sbin/vgcreate --force --yes ceph-block-0c4edee0-3221-456a-9bec-49128de4f2b5 /dev/nvme7n1
 stdout: Physical volume "/dev/nvme7n1" successfully created.
 stdout: Volume group "ceph-block-0c4edee0-3221-456a-9bec-49128de4f2b5" successfully created
Running command: /usr/sbin/vgcreate --force --yes ceph-block-dbs-6d1c858b-b673-4cc5-a0ce-f32e6791c96c /dev/nvme0n1
 stdout: Physical volume "/dev/nvme0n1" successfully created.
 stdout: Volume group "ceph-block-dbs-6d1c858b-b673-4cc5-a0ce-f32e6791c96c" successfully created
Running command: /usr/sbin/lvcreate --yes -l 238465 -n osd-block-e43407fc-85c4-47a6-9549-c5cfd1275e06 ceph-block-f4847ec1-5108-4438-9bd6-3c7ecdf496d4
 stdout: Logical volume "osd-block-e43407fc-85c4-47a6-9549-c5cfd1275e06" created.
Running command: /usr/sbin/lvcreate --yes -l 9538 -n osd-block-db-efc00ed9-9c8a-4745-8e7b-3c73952b77f5 ceph-block-dbs-6d1c858b-b673-4cc5-a0ce-f32e6791c96c
 stdout: Logical volume "osd-block-db-efc00ed9-9c8a-4745-8e7b-3c73952b77f5" created.
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 859045ef-e2fc-4798-bf4c-b437b2b21eea
 stderr: [errno 2] RADOS object not found (error connecting to the cluster)
-->  RuntimeError: Unable to create a new OSD id

Does anybody know what the above error means?

Thanks,
Victor
 
Yes, there is a ceph.conf symlink in /etc/ceph:
Code:
# ls -lah /etc/ceph
total 23K
drwxr-xr-x  2 ceph ceph   5 Nov 12 11:29 .
drwxr-xr-x 90 root root 179 Nov 20 05:58 ..
-rw-------  1 ceph ceph 151 Nov 12 11:29 ceph.client.admin.keyring
lrwxrwxrwx  1 root root  18 Nov 12 11:29 ceph.conf -> /etc/pve/ceph.conf
-rw-r--r--  1 root root  92 Aug 28  2019 rbdmap
However, I think I found the issue. It was actually covered in one of my older threads (link); thanks for helping me there =).

To save time, the fix was this:
Code:
# ceph auth get client.bootstrap-osd > /var/lib/ceph/bootstrap-osd/ceph.keyring
exported keyring for client.bootstrap-osd
It seems that when Proxmox sets up Ceph, it doesn't set up the above keyring, so standard Ceph commands like ceph-volume won't work.
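For anyone hitting the same error, a quick sanity check before re-running ceph-volume is to confirm the bootstrap keyring now exists at the default path shown in the error output above:
Code:
# ls -l /var/lib/ceph/bootstrap-osd/ceph.keyring
# cat /var/lib/ceph/bootstrap-osd/ceph.keyring    # should contain [client.bootstrap-osd] and a key = ... line
After that, re-running the ceph-volume lvm batch command should get past the "Unable to create a new OSD id" step.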

Does it make sense to integrate setting up the keyring as part of the normal Proxmox Ceph install process?
 
It looks like both the symlink shown by @victorhooi AND installing python3-distutils (apt install python3-distutils) are needed to make out-of-the-box Ceph on Proxmox actually work properly.
 
The symlink is needed (and automatically created) by Ceph itself, since all the tooling looks for it. The distutils package is a missing dependency that was only recently introduced by upstream.
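For completeness, a rough sketch of fixing both on a node (assuming the missing Python package is the only failed dependency; the mgr has to be restarted so the 'volumes' module can load again):
Code:
# apt install python3-distutils
# systemctl restart ceph-mgr.target              # or fail over the active mgr
# ln -s /etc/pve/ceph.conf /etc/ceph/ceph.conf   # only if the symlink is somehow missing
# ceph health detail                             # the failed-dependency warning should be gone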
 
