Hi everyone,
My cluster died and now I am trying to rescue it.
The status:

ceph osd tree
root@vsrv01:~# ceph osd tree
ID   CLASS  WEIGHT    TYPE NAME        STATUS  REWEIGHT  PRI-AFF
-1          73.77718  root default
-3          17.82758      host vsrv01
 0   hdd    14.55269          osd.0    down          0   1.00000
 3   hdd     1.81940          osd.3    up      1.00000   1.00000
12   nvme    1.45549          osd.12   down          0   1.00000
-5          22.73965      host vsrv02
 2   hdd    14.55269          osd.2    down    1.00000   1.00000
 4   hdd     2.72899          osd.4    down          0   1.00000
13   hdd     2.72899          osd.13   down          0   1.00000
 6   nvme    2.72899          osd.6    up      1.00000   1.00000
-7          18.65726      host vsrv03
 7   hdd     2.72899          osd.7    up      1.00000   1.00000
 8   hdd     2.72899          osd.8    up      1.00000   1.00000
14   hdd    10.91409          osd.14   up      1.00000   1.00000
 9   nvme    1.81940          osd.9    up      1.00000   1.00000
10   nvme    0.46579          osd.10   up      1.00000   1.00000
-13         14.55269      host vsrv04
 1   hdd    14.55269          osd.1    up      1.00000   1.00000
It seems the OSDs cannot start because of an empty PG. I tried to recover PG 2.d2, which fails massively.
How can I delete, or better, recover this PG? From the log it looks like this PG is the reason why OSDs 0 and 2 won't start.
The error from journalctl -xe:
2022-03-13T15:17:29.208+0100 7f40b40d0f00 -1 osd.0 99749 log_to_monitors {default=true}
2022-03-13T15:17:30.856+0100 7f40995b8700 -1 log_channel(cluster) log [ERR] : 2.d2 past_intervals [90070,99748) start interval does not contain the required bound [76853,99748) start
2022-03-13T15:17:30.856+0100 7f40995b8700 -1 osd.0 pg_epoch: 99749 pg[2.d2( empty local-lis/les=0/0 n=0 ec=90070/90070 lis/c=84387/76852 les/c/f=84388/76853/0 sis=99748) [0,8] r=0 lpr=99748 pi=[90070,99748)/22 crt=0'0 mlcod 0'0 unknown mbc={}] 2.d2 past_intervals [90070,99748) start interval does not contain the required bound [76853,99748) start
./src/osd/PeeringState.cc: In function 'void PeeringState::check_past_interval_bounds() const' thread 7f40995b8700 time 2022-03-13T15:17:30.856945+0100
./src/osd/PeeringState.cc: 991: ceph_abort_msg("past_interval start interval mismatch")
ceph version 16.2.7 (f9aa029788115b5df5eeee328f584156565ee5b7) pacific (stable)
1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xd3) [0x562f885b00df]
2: (PeeringState::check_past_interval_bounds() const+0x67c) [0x562f889135dc]
3: (PeeringState::Reset::react(PeeringState::AdvMap const&)+0x292) [0x562f889252e2]
4: (boost::statechart::simple_state<PeeringState::Reset, PeeringState::PeeringMachine, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x206) [0x562f88971bb6]
The output of ceph pg 2.d2 query is in the attachment; in the last block you can see osd 0 and osd 2.
I removed OSDs 0 and 2 and recreated them, but the error is still there.
So I tried to say: okay, the PG is lost, just remove it.
The tool reports the removal as successful, but the PG is still there anyway. What should I do?
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0/ --pgid 2.d2 --op remove --force
marking collection for removal
setting '_remove' omap key
finish_remove_pgs 2.d2_head removing 2.d2
Remove successful
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-2/ --pgid 2.d2 --op remove --force
marking collection for removal
setting '_remove' omap key
finish_remove_pgs 2.d2_head removing 2.d2
Remove successful
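For reference, this is roughly how I check that the PG collection is still listed on the OSD after the removal (OSD stopped, same paths as above), and I guess I should also export it as a backup before touching it again (the export file is just an example path):

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0/ --op list-pgs | grep 2.d2
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0/ --pgid 2.d2 --op export --file /root/pg-2.d2.export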
There are more OSDs that won't start, but with a different error; it seems those OSDs are full? Because they won't start, I don't know how full they really are. Here is df -h (it only shows the tmpfs mounts, not the BlueStore devices):
df -h
Filesystem        Size  Used Avail Use% Mounted on
udev               63G     0   63G   0% /dev
tmpfs              13G  1,4M   13G   1% /run
rpool/ROOT/pve-1  281G   88G  193G  32% /
tmpfs              63G   51M   63G   1% /dev/shm
tmpfs             5,0M     0  5,0M   0% /run/lock
rpool             193G  128K  193G   1% /rpool
rpool/ROOT        193G  128K  193G   1% /rpool/ROOT
rpool/data        193G  128K  193G   1% /rpool/data
/dev/fuse         128M   56K  128M   1% /etc/pve
tmpfs              13G     0   13G   0% /run/user/0
tmpfs              63G   24K   63G   1% /var/lib/ceph/osd/ceph-13
tmpfs              63G   28K   63G   1% /var/lib/ceph/osd/ceph-6
tmpfs              63G   28K   63G   1% /var/lib/ceph/osd/ceph-2
tmpfs              63G   24K   63G   1% /var/lib/ceph/osd/ceph-4
2022-03-13T00:15:59.393+0100 7fefbbc76f00 -1 bluefs _allocate allocation failed, needed 0x71ee4
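To see how much space BlueFS actually thinks it has on the device, I guess something like this might work (not sure it even runs while the OSD is in this state):

ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-4 --command bluefs-bdev-sizes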
I tried all of these commands on the affected OSD:
CEPH_ARGS="--bluestore_rocksdb_options avoid_flush_during_recovery=1" ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-4/ repair
then
ceph-bluestore-tool --log-level 30 --path /var/lib/ceph/osd/ceph-4 --command bluefs-bdev-expand
ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-4/ --allocator block free-score
ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-4/ --allocator block free-dump
This command crashed with the same error as at the beginning. It tried to recover, without luck...
ceph-osd --setuser ceph --setgroup ceph -i 4 -d --bluefs_allocator=bitmap --bluestore_allocator=bitmap
Log excerpt:
2022-03-14T14:39:32.745+0100 7f9ee3ebdf00 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1647265172748882, "job": 1, "event": "recovery_started", "log_files": [743]}
2022-03-14T14:39:32.745+0100 7f9ee3ebdf00 4 rocksdb: [db_impl/db_impl_open.cc:758] Recovering log #743 mode 2
....
and then bam
-8> 2022-03-14T14:40:56.412+0100 7f9ee3ebdf00 4 rocksdb: [version_set.cc:4574] Column family [P] (ID 11), log number is 715
-7> 2022-03-14T14:40:56.412+0100 7f9ee3ebdf00 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1647265256416854, "job": 1, "event": "recovery_started", "log_files": [743]}
-6> 2022-03-14T14:40:56.412+0100 7f9ee3ebdf00 4 rocksdb: [db_impl/db_impl_open.cc:758] Recovering log #743 mode 2
-5> 2022-03-14T14:41:17.005+0100 7f9ee3ebdf00 3 rocksdb: [le/block_based/filter_policy.cc:579] Using legacy Bloom filter with high (20) bits/key. Dramatic filter space and/or accuracy improvement is available with format_version>=5.
-4> 2022-03-14T14:41:23.437+0100 7f9ee3ebdf00 1 bluefs _allocate unable to allocate 0x1c2000 on bdev 1, allocator name block, allocator type bitmap, capacity 0x2baa1000000, block size 0x1000, free 0xae000, fragmentation 1, allocated 0xae000
-3> 2022-03-14T14:41:23.437+0100 7f9ee3ebdf00 -1 bluefs _allocate allocation failed, needed 0x1c1951
-2> 2022-03-14T14:41:23.437+0100 7f9ee3ebdf00 -1 bluefs _flush_range allocated: 0x1515000 offset: 0x15147b7 length: 0x1c219a
-1> 2022-03-14T14:41:23.457+0100 7f9ee3ebdf00 -1 ./src/os/bluestore/BlueFS.cc: In function 'int BlueFS::_flush_range(BlueFS::FileWriter*, uint64_t, uint64_t)' thread 7f9ee3ebdf00 time 2022-03-14T14:41:23.442701+0100
./src/os/bluestore/BlueFS.cc: 2768: ceph_abort_msg("bluefs enospc")
Then I tried:
CEPH_ARGS="--bluefs-shared-alloc-size 4096 --bluestore_allocator stupid --bluefs_allocator stupid --debug_bluefs 20/20" ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-4/ compact 2>&1
Output
....
0xc73c85f000~1000,1:0xc75b761000~1000,1:0xc75b7b5000~1000,1:0xc76dc07000~1000,1:0xc77845f000~1000,1:0xc7aaffa000~1000,1:0xc7bec83000~1000,1:0xc7d6ce5000~1000,1:0xc7d6f09000~1000,1:0xc7e1429000~1000,1:0xc7f96db000~1000,1:0xc86a443000~1000,1:0xc8d482f000~1000,1:0xc96f54b000~1000,1:0xc97354f000~1000,1:0xc99278e000~1000,1:0xce5144f000~1000,1:0xd162c5a000~1000,1:0xd5f244e000~1000,1:0xd5f884f000~1000,1:0xd697458000~1000,1:0xd94ac5f000~1000,1:0xd951c5f000~1000,1:0xd9a085f000~1000,1:0xda69c49000~1000,1:0xdb7585f000~1000,1:0xde2e1e9000~1000])
-7> 2022-03-14T14:57:17.367+0100 7f6ccfb02240 10 bluefs _allocate len 0x1c1951 from 1
-6> 2022-03-14T14:57:17.367+0100 7f6ccfb02240 1 bluefs _allocate unable to allocate 0x1c2000 on bdev 1, allocator name block, allocator type stupid, capacity 0x2baa1000000, block size 0x1000, free 0xae000, fragmentation 1, allocated 0xae000
-5> 2022-03-14T14:57:17.367+0100 7f6ccfb02240 20 bluefs _allocate fallback to bdev 2
-4> 2022-03-14T14:57:17.367+0100 7f6ccfb02240 10 bluefs _allocate len 0x1c1951 from 2
-3> 2022-03-14T14:57:17.367+0100 7f6ccfb02240 -1 bluefs _allocate allocation failed, needed 0x1c1951
-2> 2022-03-14T14:57:17.367+0100 7f6ccfb02240 -1 bluefs _flush_range allocated: 0x1515000 offset: 0x15147b7 length: 0x1c219a
-1> 2022-03-14T14:57:17.379+0100 7f6ccfb02240 -1 ./src/os/bluestore/BlueFS.cc: In function 'int BlueFS::_flush_range(BlueFS::FileWriter*, uint64_t, uint64_t)' thread 7f6ccfb02240 time 2022-03-14T14:57:17.373523+0100
./src/os/bluestore/BlueFS.cc: 2768: ceph_abort_msg("bluefs enospc")
It crashed again.
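One idea I have not tried yet: give the full OSD a new separate DB device on a spare disk so BlueFS gets free space again, roughly like this (/dev/sdX is only a placeholder and I am not sure about the exact options):

ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-4 --command bluefs-bdev-new-db --dev-target /dev/sdX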
What can I do?
Any ideas?
Thank you!
Sincerely Bonkersdeluxe