[SOLVED] Unable to use /dev/rssd[x] devices in Ceph, "unable to get device info for '/dev/rssda'"

kmorwath

Jan 21, 2025
I'm evaluating Proxmox (8.3.1) to replace our VMware infrastructure. Unfortunately, I have to run the PoC on some old hardware I was given: six identical Dell R720 servers. They came with some spinning disks behind a hardware RAID controller and some SSDs connected via PCIe. The SSDs appear as /dev/rssd[x], e.g. /dev/rssda:

root@hephaestus:~# lsblk
NAME                 MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
sda                    8:0    0 408.4G  0 disk
├─sda1                 8:1    0  1007K  0 part
├─sda2                 8:2    0     1G  0 part /boot/efi
└─sda3                 8:3    0 407.4G  0 part
  ├─pve-swap         252:0    0     8G  0 lvm  [SWAP]
  ├─pve-root         252:1    0    96G  0 lvm  /
  ├─pve-data_tmeta   252:2    0   2.9G  0 lvm
  │ └─pve-data       252:4    0 281.6G  0 lvm
  └─pve-data_tdata   252:3    0 281.6G  0 lvm
    └─pve-data       252:4    0 281.6G  0 lvm
rssda                251:0    0   326G  0 disk
rssdb                251:16   0   326G  0 disk

I installed Proxmox on the RAID volume and tried to assign the SSDs to Ceph (version 19; Ceph needs to be part of the evaluation). But:

  • In the UI these disks are not shown, although they were listed by the Proxmox installer. If I try to create an OSD, "No disk unused" is shown.
  • If I try to create an OSD from the command line, e.g. pveceph osd create /dev/rssda, I get: unable to get device info for '/dev/rssda'
  • I can create an LVM volume on these disks from the command line, and it then appears in the LVM page
In a similar test I ran with Ubuntu Sunbeam OpenStack I hit a similar issue. There it was caused by the AppArmor profiles for Ceph, which did not allow access to many of the disk device types Linux can expose, such as "rssd" and "md" devices. I was able to fix it by modifying the profiles, but I didn't find anything similar in Proxmox.

Is there a way to allow Proxmox + Ceph to see those devices?
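Before looking at Proxmox or Ceph, it can help to confirm what the kernel itself exposes for the device. The following is a minimal sketch, not something taken from Proxmox: describe_blockdev is a hypothetical helper name, and SYSFS_ROOT is an assumed override that exists only so the function can be exercised against a fake directory tree.

```shell
# Sketch: print name, major:minor and size of a block device as seen in sysfs.
# describe_blockdev and SYSFS_ROOT are illustrative names, not Proxmox tooling.
describe_blockdev() {
    local name root dir majmin sectors gib
    name=$1
    root=${SYSFS_ROOT:-/sys}
    dir="$root/block/$name"
    if [ ! -d "$dir" ]; then
        echo "$name: not present under $root/block" >&2
        return 1
    fi
    majmin=$(cat "$dir/dev")
    sectors=$(cat "$dir/size")   # sysfs reports the size in 512-byte sectors
    gib=$(( sectors * 512 / 1073741824 ))
    echo "$name $majmin ${gib}GiB"
}

# Example on the host from this thread:
#   describe_blockdev rssda
```

If this prints sensible values for rssda, the device is visible to the kernel and the "unable to get device info" message more likely comes from how the management tooling identifies disks.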
 
I found an old post (six years old, Ceph 14) that uses ceph-volume lvm create, but it didn't help:

Code:
root@hephaestus:~# ceph-volume lvm create --data /dev/rssda
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 971639fd-4a45-46d9-8f10-1bdcad61595b
 stderr: 2025-01-21T13:58:40.428+0100 7bc64fa006c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.bootstrap-osd.keyring: (2) No such file or directory
 stderr: 2025-01-21T13:58:40.428+0100 7bc64fa006c0 -1 AuthRegistry(0x7bc648065d50) no keyring found at /etc/pve/priv/ceph.client.bootstrap-osd.keyring, disabling cephx
 stderr: 2025-01-21T13:58:40.433+0100 7bc64fa006c0 -1 auth: unable to find a keyring on /var/lib/ceph/bootstrap-osd/ceph.keyring: (2) No such file or directory
 stderr: 2025-01-21T13:58:40.433+0100 7bc64fa006c0 -1 AuthRegistry(0x7bc648065d50) no keyring found at /var/lib/ceph/bootstrap-osd/ceph.keyring, disabling cephx
 stderr: 2025-01-21T13:58:40.433+0100 7bc64fa006c0 -1 auth: unable to find a keyring on /var/lib/ceph/bootstrap-osd/ceph.keyring: (2) No such file or directory
 stderr: 2025-01-21T13:58:40.433+0100 7bc64fa006c0 -1 AuthRegistry(0x7bc64806b790) no keyring found at /var/lib/ceph/bootstrap-osd/ceph.keyring, disabling cephx
 stderr: 2025-01-21T13:58:40.434+0100 7bc64fa006c0 -1 auth: unable to find a keyring on /var/lib/ceph/bootstrap-osd/ceph.keyring: (2) No such file or directory
 stderr: 2025-01-21T13:58:40.434+0100 7bc64fa006c0 -1 AuthRegistry(0x7bc64f9ff3b0) no keyring found at /var/lib/ceph/bootstrap-osd/ceph.keyring, disabling cephx
 stderr: [errno 2] RADOS object not found (error connecting to the cluster)
-->  RuntimeError: Unable to create a new OSD id

I also found this: https://forum.proxmox.com/threads/pveceph-unable-to-get-device-info.44927/#post-214510. But again, that was six years ago. Does Proxmox still whitelist only a subset of Linux disk device names? Unfortunately, Linux can be very inventive in naming new kinds of devices, and my next step would be to test remote block devices (e.g. iSCSI, FC). Would those be available or not?
 
Bash:
root@hephaestus:~# ceph-volume inventory
 stderr: Unknown device "/dev/pve/data_tmeta": No such device
 stderr: Unknown device "/dev/pve/data_tdata": No such device

Device Path               Size         Device nodes    rotates available Model name
/dev/rssda                326.04 GB    rssda           False   True
/dev/rssdb                326.04 GB    rssdb           False   True
/dev/pve/data             281.62 GB    dm-2,dm-3       True    False
/dev/pve/data_tdata       281.62 GB    sda3            True    False
/dev/pve/data_tmeta       2.87 GB      sda3            True    False
/dev/sda                  408.38 GB    sda             True    False     PERC H710
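The inventory above already shows that ceph-volume itself considers both rssd devices usable. As a rough sketch, the available device paths can be pulled out of that plain-text table with awk. It assumes the column layout shown above, where the size takes two tokens (e.g. "326.04 GB"), so "available" is the sixth whitespace-separated field. available_devices is a hypothetical helper name. Parsing the plain output is fragile; ceph-volume inventory also accepts --format json, which is a better basis for anything long-lived.

```shell
# Sketch: read `ceph-volume inventory` plain output on stdin and print the
# device paths whose "available" column is True.
# Field positions are an assumption based on the table shown in this thread.
available_devices() {
    awk '$1 ~ /^\/dev\// && $6 == "True" { print $1 }'
}

# Usage:
#   ceph-volume inventory 2>/dev/null | available_devices
```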

But again:

Bash:
root@hephaestus:~# ceph-volume raw prepare --bluestore --data /dev/rssda

Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 6197a97e-317f-492e-80ac-61b5f72ed004
 stderr: 2025-01-21T14:18:50.333+0100 7f6e45a006c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.bootstrap-osd.keyring: (2) No such file or directory
 stderr: 2025-01-21T14:18:50.333+0100 7f6e45a006c0 -1 AuthRegistry(0x7f6e40065d50) no keyring found at /etc/pve/priv/ceph.client.bootstrap-osd.keyring, disabling cephx
 stderr: 2025-01-21T14:18:50.337+0100 7f6e45a006c0 -1 auth: unable to find a keyring on /var/lib/ceph/bootstrap-osd/ceph.keyring: (2) No such file or directory
 stderr: 2025-01-21T14:18:50.337+0100 7f6e45a006c0 -1 AuthRegistry(0x7f6e40065d50) no keyring found at /var/lib/ceph/bootstrap-osd/ceph.keyring, disabling cephx
 stderr: 2025-01-21T14:18:50.337+0100 7f6e45a006c0 -1 auth: unable to find a keyring on /var/lib/ceph/bootstrap-osd/ceph.keyring: (2) No such file or directory
 stderr: 2025-01-21T14:18:50.337+0100 7f6e45a006c0 -1 AuthRegistry(0x7f6e4006b790) no keyring found at /var/lib/ceph/bootstrap-osd/ceph.keyring, disabling cephx
 stderr: 2025-01-21T14:18:50.338+0100 7f6e45a006c0 -1 auth: unable to find a keyring on /var/lib/ceph/bootstrap-osd/ceph.keyring: (2) No such file or directory
 stderr: 2025-01-21T14:18:50.338+0100 7f6e45a006c0 -1 AuthRegistry(0x7f6e459ff3b0) no keyring found at /var/lib/ceph/bootstrap-osd/ceph.keyring, disabling cephx
 stderr: [errno 2] RADOS object not found (error connecting to the cluster)
-->  RuntimeError: Unable to create a new OSD id
 
This command helped:

ceph auth get client.bootstrap-osd > /var/lib/ceph/bootstrap-osd/ceph.keyring
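A slightly more defensive variant of that step, as a sketch only: it exports the key only when the keyring file is missing or empty, and writes to a temporary file first so that a failed ceph auth get does not leave an empty keyring behind. ensure_bootstrap_keyring is a hypothetical helper name, and the chmod 600 is my own choice, not something required by the thread.

```shell
# Sketch: create the bootstrap-osd keyring only if it is missing or empty.
# The default path is the one ceph-volume reported as missing above.
ensure_bootstrap_keyring() {
    local keyring tmpfile
    keyring=${1:-/var/lib/ceph/bootstrap-osd/ceph.keyring}
    if [ -s "$keyring" ]; then
        echo "keyring already present: $keyring"
        return 0
    fi
    mkdir -p "$(dirname "$keyring")" || return 1
    tmpfile="$keyring.tmp.$$"
    # Export to a temp file, restrict permissions (assumption), then move into place.
    if ceph auth get client.bootstrap-osd > "$tmpfile" \
        && chmod 600 "$tmpfile" \
        && mv "$tmpfile" "$keyring"; then
        echo "keyring written: $keyring"
    else
        rm -f "$tmpfile"
        echo "could not export client.bootstrap-osd" >&2
        return 1
    fi
}

# Usage:
#   ensure_bootstrap_keyring
```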

Then ceph-volume raw prepare --bluestore --data /dev/rssda executed, but I'm not sure the OSD was correctly created:

Bash:
root@hephaestus:/var/lib/ceph/bootstrap-osd# ceph-volume raw prepare --bluestore --data /dev/rssda
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 392840f9-f973-411c-a4ac-cf89addf5b8d
Running command: /usr/bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-0
--> Executable selinuxenabled not in PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
Running command: /usr/bin/chown -R ceph:ceph /dev/rssda
Running command: /usr/bin/ln -s /dev/rssda /var/lib/ceph/osd/ceph-0/block
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-0/activate.monmap
 stderr: 2025-01-21T14:26:14.127+0100 7e6b41e006c0 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.bootstrap-osd.keyring: (2) No such file or directory
2025-01-21T14:26:14.127+0100 7e6b41e006c0 -1 AuthRegistry(0x7e6b3c065d50) no keyring found at /etc/pve/priv/ceph.client.bootstrap-osd.keyring, disabling cephx
 stderr: got monmap epoch 1
--> Creating keyring file for osd.0
Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0/keyring
Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0/
Running command: /usr/bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 0 --monmap /var/lib/ceph/osd/ceph-0/activate.monmap --keyfile - --osd-data /var/lib/ceph/osd/ceph-0/ --osd-uuid 392840f9-f973-411c-a4ac-cf89addf5b8d --setuser ceph --setgroup ceph
 stderr: 2025-01-21T14:26:14.482+0100 75102bf2a840 -1 bluestore(/var/lib/ceph/osd/ceph-0//block) _read_bdev_label unable to decode label /var/lib/ceph/osd/ceph-0//block at offset 102: void bluestore_bdev_label_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) decode past end of struct encoding: Malformed input [buffer:3]
 stderr: 2025-01-21T14:26:14.482+0100 75102bf2a840 -1 bluestore(/var/lib/ceph/osd/ceph-0//block) _read_bdev_label unable to decode label /var/lib/ceph/osd/ceph-0//block at offset 102: void bluestore_bdev_label_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) decode past end of struct encoding: Malformed input [buffer:3]
 stderr: 2025-01-21T14:26:14.482+0100 75102bf2a840 -1 bluestore(/var/lib/ceph/osd/ceph-0//block) _read_bdev_label unable to decode label /var/lib/ceph/osd/ceph-0//block at offset 102: void bluestore_bdev_label_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) decode past end of struct encoding: Malformed input [buffer:3]
 stderr: 2025-01-21T14:26:14.482+0100 75102bf2a840 -1 bluestore(/var/lib/ceph/osd/ceph-0/) _read_fsid unparsable uuid
--> ceph-volume raw dmcrypt prepare successful for: /dev/rssda

Was it successful or not? And what does "AuthRegistry(0x7e6b3c065d50) no keyring found at /etc/pve/priv/ceph.client.bootstrap-osd.keyring, disabling cephx" imply?
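One rough way to tell the two runs in this thread apart is to look at the markers they printed: the failed runs end with a RuntimeError line, while this one ends with a "prepare successful" line. The sketch below checks a captured log for exactly those two strings. prepare_succeeded is a hypothetical helper name, the strings are taken from the output above rather than from any documented contract, and the command's exit status and ceph osd tree are still worth checking.

```shell
# Sketch: classify a captured ceph-volume log read from stdin.
# Returns 0 only if a "prepare successful" line is present and no RuntimeError.
prepare_succeeded() {
    local log
    log=$(cat)
    if printf '%s\n' "$log" | grep -q 'RuntimeError'; then
        return 1
    fi
    printf '%s\n' "$log" | grep -q 'prepare successful'
}

# Usage:
#   ceph-volume raw prepare --bluestore --data /dev/rssda > prepare.log 2>&1
#   prepare_succeeded < prepare.log && echo "prepare looks OK"
```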
 
I also had to run:

ceph auth get client.bootstrap-osd > /etc/pve/priv/ceph.client.bootstrap-osd.keyring

Then

ceph-volume raw prepare --bluestore --data /dev/rssda

It looks like some of the errors displayed can be ignored. Next, run

ceph-osd --no-mon-config --get-device-fsid /dev/rssda

to get the OSD UUID, and

ceph-volume raw activate --osd-uuid <UUID>

to activate the OSDs.
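The raw-mode sequence above can be sketched as a small script. It only strings together the three commands already shown; run and raw_osd_steps are hypothetical helper names, and DRY_RUN is an assumed switch that defaults to printing the commands instead of executing them, since running them modifies the disk.

```shell
# Sketch of the raw-mode steps from this post. With DRY_RUN=1 (the default
# here) commands are printed with a leading "+", not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

raw_osd_steps() {
    local dev uuid
    dev=$1
    run ceph-volume raw prepare --bluestore --data "$dev" || return 1
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ ceph-osd --no-mon-config --get-device-fsid $dev"
        uuid='<UUID>'   # placeholder; the real value comes from the command above
    else
        uuid=$(ceph-osd --no-mon-config --get-device-fsid "$dev") || return 1
    fi
    run ceph-volume raw activate --osd-uuid "$uuid"
}

# Usage:
#   raw_osd_steps /dev/rssda               # print the commands only
#   DRY_RUN=0 raw_osd_steps /dev/rssda     # actually run them
```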

Now, even though the Ceph cluster health is "OK", I can't see the OSDs in the UI. Moreover, the page shows "ghost OSDs".
 
I repeated the setup using ceph-volume lvm instead of raw, and it worked; now everything looks OK.

I think the OSDs weren't really active before, but I did not investigate further. I purged the OSDs and recreated them.