[SOLVED] Ceph OSD adding issues

aloprozam

Member
Greetings community!

After a few months of using Ceph on Proxmox, I decided to add a new disk and got stuck with this issue.

ceph version 17.2.7 (2dd3854d5b35a35486e86e2616727168e244f470) quincy (stable)

Code:
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 55cd9159-1cd1-4fd3-b2d8-57efd310b8f3
Running command: vgcreate --force --yes ceph-9883ffe0-5382-49ed-8d52-be901c78cb21 /dev/sde
 stdout: Physical volume "/dev/sde" successfully created.
 stdout: Volume group "ceph-9883ffe0-5382-49ed-8d52-be901c78cb21" successfully created
Running command: lvcreate --yes -l 381546 -n osd-block-55cd9159-1cd1-4fd3-b2d8-57efd310b8f3 ceph-9883ffe0-5382-49ed-8d52-be901c78cb21
 stdout: Logical volume "osd-block-55cd9159-1cd1-4fd3-b2d8-57efd310b8f3" created.
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-9
--> Executable selinuxenabled not in PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
Running command: /usr/bin/chown -h ceph:ceph /dev/ceph-9883ffe0-5382-49ed-8d52-be901c78cb21/osd-block-55cd9159-1cd1-4fd3-b2d8-57efd310b8f3
Running command: /usr/bin/chown -R ceph:ceph /dev/dm-8
Running command: /usr/bin/ln -s /dev/ceph-9883ffe0-5382-49ed-8d52-be901c78cb21/osd-block-55cd9159-1cd1-4fd3-b2d8-57efd310b8f3 /var/lib/ceph/osd/ceph-9/block
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-9/activate.monmap
 stderr: got monmap epoch 5
--> Creating keyring file for osd.9
Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-9/keyring
Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-9/
Running command: /usr/bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 9 --monmap /var/lib/ceph/osd/ceph-9/activate.monmap --keyfile - --osd-data /var/lib/ceph/osd/ceph-9/ --osd-uuid 55cd9159-1cd1-4fd3-b2d8-57efd310b8f3 --setuser ceph --setgroup ceph
 stderr: 2024-03-23T12:28:13.314+0200 7e95a66383c0 -1 bluestore(/var/lib/ceph/osd/ceph-9/) _read_fsid unparsable uuid
 stderr: 2024-03-23T12:28:13.314+0200 7e95a66383c0 -1 bdev(0x62c489750c00 /var/lib/ceph/osd/ceph-9//block.db) open open got: (22) Invalid argument
 stderr: 2024-03-23T12:28:13.314+0200 7e95a66383c0 -1 bluestore(/var/lib/ceph/osd/ceph-9/) _minimal_open_bluefs add block device(/var/lib/ceph/osd/ceph-9//block.db) returned: (22) Invalid argument
 stderr: 2024-03-23T12:28:13.314+0200 7e95a66383c0 -1 bluestore(/var/lib/ceph/osd/ceph-9/) _open_db failed to prepare db environment:
 stderr: 2024-03-23T12:28:13.602+0200 7e95a66383c0 -1 bluestore(/var/lib/ceph/osd/ceph-9/) mkfs failed, (5) Input/output error
 stderr: 2024-03-23T12:28:13.602+0200 7e95a66383c0 -1 OSD::mkfs: ObjectStore::mkfs failed with error (5) Input/output error
 stderr: 2024-03-23T12:28:13.602+0200 7e95a66383c0 -1  ** ERROR: error creating empty object store in /var/lib/ceph/osd/ceph-9/: (5) Input/output error
--> Was unable to complete a new OSD, will rollback changes
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd purge-new osd.9 --yes-i-really-mean-it
 stderr: purged osd.9
--> Zapping: /dev/ceph-9883ffe0-5382-49ed-8d52-be901c78cb21/osd-block-55cd9159-1cd1-4fd3-b2d8-57efd310b8f3
--> Unmounting /var/lib/ceph/osd/ceph-9
Running command: /usr/bin/umount -v /var/lib/ceph/osd/ceph-9
 stderr: umount: /var/lib/ceph/osd/ceph-9 unmounted
Running command: /usr/bin/dd if=/dev/zero of=/dev/ceph-9883ffe0-5382-49ed-8d52-be901c78cb21/osd-block-55cd9159-1cd1-4fd3-b2d8-57efd310b8f3 bs=1M count=10 conv=fsync
 stderr: 10+0 records in
10+0 records out
 stderr: 10485760 bytes (10 MB, 10 MiB) copied, 0.0356262 s, 294 MB/s
--> Only 1 LV left in VG, will proceed to destroy volume group ceph-9883ffe0-5382-49ed-8d52-be901c78cb21
Running command: vgremove -v -f ceph-9883ffe0-5382-49ed-8d52-be901c78cb21
 stderr: Removing ceph--9883ffe0--5382--49ed--8d52--be901c78cb21-osd--block--55cd9159--1cd1--4fd3--b2d8--57efd310b8f3 (252:8)
 stderr: Releasing logical volume "osd-block-55cd9159-1cd1-4fd3-b2d8-57efd310b8f3"
  Archiving volume group "ceph-9883ffe0-5382-49ed-8d52-be901c78cb21" metadata (seqno 5).
 stdout: Logical volume "osd-block-55cd9159-1cd1-4fd3-b2d8-57efd310b8f3" successfully removed.
 stderr: Removing physical volume "/dev/sde" from volume group "ceph-9883ffe0-5382-49ed-8d52-be901c78cb21"
 stdout: Volume group "ceph-9883ffe0-5382-49ed-8d52-be901c78cb21" successfully removed
 stderr: Creating volume group backup "/etc/lvm/backup/ceph-9883ffe0-5382-49ed-8d52-be901c78cb21" (seqno 6).
Running command: pvremove -v -f -f /dev/sde
 stdout: Labels on physical volume "/dev/sde" successfully wiped.
--> Zapping successful for OSD: 9
-->  RuntimeError: Command failed with exit code 250: /usr/bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 9 --monmap /var/lib/ceph/osd/ceph-9/activate.monmap --keyfile - --osd-data /var/lib/ceph/osd/ceph-9/ --osd-uuid 55cd9159-1cd1-4fd3-b2d8-57efd310b8f3 --setuser ceph --setgroup ceph



[global]
auth_client_required = cephx
auth_cluster_required = cephx
auth_service_required = cephx
cluster_network = 10.10.x0.1/24
debug_asok = 0/0
debug_auth = 0/0
debug_buffer = 0/0
debug_client = 0/0
debug_context = 0/0
debug_crush = 0/0
debug_filer = 0/0
debug_filestore = 0/0
debug_finisher = 0/0
debug_heartbeatmap = 0/0
debug_journal = 0/0
debug_journaler = 0/0
debug_lockdep = 0/0
debug_mon = 0/0
debug_monc = 0/0
debug_ms = 0/0
debug_objclass = 0/0
debug_objectcatcher = 0/0
debug_objecter = 0/0
debug_optracker = 0/0
debug_osd = 0/0
debug_paxos = 0/0
debug_perfcounter = 0/0
debug_rados = 0/0
debug_rbd = 0/0
debug_rgw = 0/0
debug_throttle = 0/0
debug_timer = 0/0
debug_tp = 0/0
fsid = 0b1f7638-2c49-40a5-bb7c-13ec486a1626
mon_allow_pool_delete = true
mon_host = 10.10.x0.1 10.10.x0.2 10.10.x0.3
ms_bind_ipv4 = true
ms_bind_ipv6 = false
osd_pool_default_min_size = 2
osd_pool_default_size = 3
public_network = 10.10.x0.1/24

[client]
keyring = /etc/pve/priv/ceph.client.admin.keyring
rbd_cache = true
rbd_cache_max_dirty = 50331648
rbd_cache_max_dirty_age = 2
rbd_cache_size = 67108864
rbd_cache_target_dirty = 33554432
rbd_cache_writethrough_until_flush = true
rbd_concurrent_management_ops = 10
rbd_default_format = 2

[mds]
keyring = /var/lib/ceph/mds/ceph-$id/keyring

[mon]
mon_allow_pool_delete = true
mon_max_pg_per_osd = 300
mon_osd_backfillfull_ratio = 0.9
mon_osd_down_out_interval = 5
mon_osd_full_ratio = 0.95
mon_osd_nearfull_ratio = 0.9
mon_pg_warn_max_per_osd = 520

[osd]
bluestore_block_db_create = true
bluestore_block_db_size = 5368709120
bluestore_block_wal_create = true
bluestore_block_wal_size = 1073741824
bluestore_cache_size_hdd = 3221225472
bluestore_cache_size_ssd = 9663676416
journal_aio = true
journal_block_align = true
journal_dio = true
journal_max_write_bytes = 1073714824
journal_max_write_entries = 10000
journal_queue_max_bytes = 10485760000
journal_queue_max_ops = 50000
keyring = /var/lib/ceph/osd/ceph-$id/keyring
osd_client_message_size_cap = 1073741824
osd_disk_thread_ioprio_class = idle
osd_disk_thread_ioprio_priority = 7
osd_disk_threads = 2
osd_failsafe_full_ratio = 0.95
osd_heartbeat_grace = 5
osd_heartbeat_interval = 3
osd_map_dedup = true
osd_max_backfills = 4
osd_max_write_size = 256
osd_mon_heartbeat_interval = 5
osd_op_num_threads_per_shard = 1
osd_op_num_threads_per_shard_hdd = 2
osd_op_num_threads_per_shard_ssd = 2
osd_op_threads = 16
osd_pool_default_min_size = 1
osd_pool_default_size = 2
osd_recovery_delay_start = 10.0
osd_recovery_max_active = 1
osd_recovery_max_chunk = 1048576
osd_recovery_max_single_start = 3
osd_recovery_op_priority = 1
osd_recovery_priority = 1
osd_recovery_sleep = 2
osd_scrub_chunk_max = 4
osd_scrub_chunk_min = 2
osd_scrub_sleep = 0.1
rocksdb_separate_wal_dir = true
 
While preparing the OSD there is an I/O error; that is why it fails.

Are there any entries in the kernel log?
Hm, here is some data from the kernel log from while I was trying to create the OSD.

A plain LVM volume, an LVM-thin pool, or even a directory is easy to create; only when creating a Ceph OSD do I get errors.
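Since plain LVM operations succeed, it may be worth ruling out the raw device itself with a direct write test. A minimal sketch (the `TARGET` variable and its scratch-file default are my own, not from the thread; point it at /dev/sde ONLY if the disk holds no data, since the write is destructive):

```shell
#!/bin/sh
# Hypothetical raw-write sanity check: rules out the block device itself.
# TARGET defaults to a scratch file for safety; point it at /dev/sde ONLY
# if the disk holds no data -- the write is destructive.
TARGET=${TARGET:-$(mktemp)}
if dd if=/dev/zero of="$TARGET" bs=1M count=10 conv=fsync 2>/dev/null; then
    echo "raw write OK: $TARGET"
else
    echo "raw write FAILED: $TARGET" >&2
    exit 1
fi
```

A clean write here does not fully clear the disk, though: the kernel log below shows the errors surfacing as SCSI sense events rather than hard write failures.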

I'm searching for information about this error:
Code:
stderr: 2024-03-23T12:28:13.314+0200 7e95a66383c0 -1 bluestore(/var/lib/ceph/osd/ceph-9/) _read_fsid unparsable uuid

 stderr: 2024-03-23T12:28:13.314+0200 7e95a66383c0 -1 bdev(0x62c489750c00 /var/lib/ceph/osd/ceph-9//block.db) open open got: (22) Invalid argument

 stderr: 2024-03-23T12:28:13.314+0200 7e95a66383c0 -1 bluestore(/var/lib/ceph/osd/ceph-9/) _minimal_open_bluefs add block device(/var/lib/ceph/osd/ceph-9//block.db) returned: (22) Invalid argument

 stderr: 2024-03-23T12:28:13.314+0200 7e95a66383c0 -1 bluestore(/var/lib/ceph/osd/ceph-9/) _open_db failed to prepare db environment:

 stderr: 2024-03-23T12:28:13.602+0200 7e95a66383c0 -1 bluestore(/var/lib/ceph/osd/ceph-9/) mkfs failed, (5) Input/output error




Code:
[100493.914208] perf: interrupt took too long (7820 > 7818), lowering kernel.perf_event_max_sample_rate to 25500
[127963.356953] sd 0:0:1:0: [sdb] tag#487 Sense Key : Recovered Error [current]
[127963.356966] sd 0:0:1:0: [sdb] tag#487 Add. Sense: Grown defect list not found
[127967.312958] sd 0:0:2:0: [sdc] tag#502 Sense Key : Recovered Error [current]
[127967.312979] sd 0:0:2:0: [sdc] tag#502 Add. Sense: Defect list not found
[127970.842646] sd 0:0:3:0: [sdd] tag#491 Sense Key : Recovered Error [current]
[127970.842660] sd 0:0:3:0: [sdd] tag#491 Add. Sense: Defect list not found
[157329.482301] sd 0:0:1:0: [sdb] tag#742 Sense Key : Recovered Error [current]
[157329.482313] sd 0:0:1:0: [sdb] tag#742 Add. Sense: Grown defect list not found
[157339.088172] sd 0:0:4:0: [sde] tag#739 Sense Key : Recovered Error [current]
[157339.088183] sd 0:0:4:0: [sde] tag#739 Add. Sense: Defect list not found
[157373.177463] sd 0:0:1:0: [sdb] tag#732 Sense Key : Recovered Error [current]
[157373.177475] sd 0:0:1:0: [sdb] tag#732 Add. Sense: Grown defect list not found
[157386.020222] sd 0:0:4:0: [sde] tag#764 Sense Key : Recovered Error [current]
[157386.020236] sd 0:0:4:0: [sde] tag#764 Add. Sense: Defect list not found
[157501.777368] sd 0:0:4:0: [sde] tag#323 Sense Key : Recovered Error [current]
[157501.777383] sd 0:0:4:0: [sde] tag#323 Add. Sense: Defect list not found
[157531.230567] sd 0:0:4:0: [sde] tag#220 Sense Key : Recovered Error [current]
[157531.230583] sd 0:0:4:0: [sde] tag#220 Add. Sense: Defect list not found
[157591.285338] sd 0:0:4:0: [sde] tag#415 Sense Key : Recovered Error [current]
[157591.285351] sd 0:0:4:0: [sde] tag#415 Add. Sense: Defect list not found
[157625.172006] sd 0:0:4:0: [sde] tag#425 Sense Key : Recovered Error [current]
[157625.172019] sd 0:0:4:0: [sde] tag#425 Add. Sense: Defect list not found
 
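Those repeated "Recovered Error" sense keys suggest the drives themselves are reporting media issues. A minimal sketch for summarizing such events per device from a saved dmesg excerpt (the `count_sense_errors` helper is hypothetical; the sample lines are copied from the log above, and on a live node you could pipe `dmesg` into the helper instead of the here-document):

```shell
#!/bin/sh
# Hypothetical helper: count SCSI "Sense Key" events per device from a
# dmesg excerpt, most-affected device first.
count_sense_errors() {
    grep 'Sense Key' \
        | sed -n 's/.*\[\(sd[a-z]\)\].*/\1/p' \
        | sort | uniq -c | sort -rn
}
# Sample lines copied from the kernel log above:
count_sense_errors <<'EOF'
[127963.356953] sd 0:0:1:0: [sdb] tag#487 Sense Key : Recovered Error [current]
[157339.088172] sd 0:0:4:0: [sde] tag#739 Sense Key : Recovered Error [current]
[157386.020222] sd 0:0:4:0: [sde] tag#764 Sense Key : Recovered Error [current]
EOF
```

Devices that show up repeatedly (sde here, the very disk being added) are prime suspects; for those, `smartctl -a /dev/sdX` from smartmontools is worth a look.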
Resolved.
I removed these lines from ceph.conf:

Code:
[osd]
bluestore_block_db_create = true
bluestore_block_db_size = 5368709120
bluestore_block_wal_create = true
bluestore_block_wal_size = 1073741824
bluestore_cache_size_hdd = 3221225472
bluestore_cache_size_ssd = 9663676416
 
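A quick way to confirm the overrides are really gone before retrying, sketched under the assumption of the standard /etc/ceph/ceph.conf path (the `check_conf` helper is my own, not a Ceph tool):

```shell
#!/bin/sh
# Hypothetical helper: flag any leftover bluestore_block_db/wal overrides
# in a ceph.conf before retrying the OSD creation.
check_conf() {
    if grep -En 'bluestore_block_(db|wal)_(create|size)' "$1"; then
        echo "offending overrides still present in $1" >&2
        return 1
    fi
    echo "no bluestore_block_* overrides left in $1"
}
check_conf "${1:-/etc/ceph/ceph.conf}"
```

With the overrides gone, recreating the OSD with `pveceph osd create /dev/sde` (or `ceph-volume lvm create --data /dev/sde`) should no longer try to open the bogus block.db path.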
