Hi everyone,
My cluster died and now I am trying to rescue it.
The status:

ceph osd tree
root@vsrv01:~# ceph osd tree
ID   CLASS  WEIGHT    TYPE NAME        STATUS  REWEIGHT  PRI-AFF
-1          73.77718  root default
-3          17.82758      host vsrv01
 0   hdd    14.55269          osd.0    down          0   1.00000
 3   hdd     1.81940          osd.3    up      1.00000   1.00000
12   nvme    1.45549          osd.12   down          0   1.00000
-5          22.73965      host vsrv02
 2   hdd    14.55269          osd.2    down    1.00000   1.00000
 4   hdd     2.72899          osd.4    down          0   1.00000
13   hdd     2.72899          osd.13   down          0   1.00000
 6   nvme    2.72899          osd.6    up      1.00000   1.00000
-7          18.65726      host vsrv03
 7   hdd     2.72899          osd.7    up      1.00000   1.00000
 8   hdd     2.72899          osd.8    up      1.00000   1.00000
14   hdd    10.91409          osd.14   up      1.00000   1.00000
 9   nvme    1.81940          osd.9    up      1.00000   1.00000
10   nvme    0.46579          osd.10   up      1.00000   1.00000
-13         14.55269      host vsrv04
 1   hdd    14.55269          osd.1    up      1.00000   1.00000
It seems the OSDs cannot start because of an empty PG. I tried to recover PG 2.d2, which fails massively.
How can I delete, or better, recover this PG? From the log it looks like this PG is the reason why OSDs 0 and 2 won't start.
The error from journalctl -xe:
2022-03-13T15:17:29.208+0100 7f40b40d0f00 -1 osd.0 99749 log_to_monitors {default=true}
2022-03-13T15:17:30.856+0100 7f40995b8700 -1 log_channel(cluster) log [ERR] : 2.d2 past_intervals [90070,99748) start interval does not contain the required bound [76853,99748) start
2022-03-13T15:17:30.856+0100 7f40995b8700 -1 osd.0 pg_epoch: 99749 pg[2.d2( empty local-lis/les=0/0 n=0 ec=90070/90070 lis/c=84387/76852 les/c/f=84388/76853/0 sis=99748) [0,8] r=0 lpr=99748 pi=[90070,99748)/22 crt=0'0 mlcod 0'0 unknown mbc={}] 2.d2 past_intervals [90070,99748) start interval does not contain the required bound [76853,99748) start
./src/osd/PeeringState.cc: In function 'void PeeringState::check_past_interval_bounds() const' thread 7f40995b8700 time 2022-03-13T15:17:30.856945+0100
./src/osd/PeeringState.cc: 991: ceph_abort_msg("past_interval start interval mismatch")
ceph version 16.2.7 (f9aa029788115b5df5eeee328f584156565ee5b7) pacific (stable)
1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xd3) [0x562f885b00df]
2: (PeeringState::check_past_interval_bounds() const+0x67c) [0x562f889135dc]
3: (PeeringState::Reset::react(PeeringState::AdvMap const&)+0x292) [0x562f889252e2]
4: (boost::statechart::simple_state<PeeringState::Reset, PeeringState::PeeringMachine, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x206) [0x562f88971bb6]
The output of ceph pg 2.d2 query is in the attachment; in the last block you can see osd 0 and osd 2.
I removed OSDs 0 and 2 and recreated them, but the error is still there.
So I tried to say: okay, the PG is lost, just remove it.
The tool reports the removal as successful, but the PG is still there anyway. What should I do?
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0/ --pgid 2.d2 --op remove --force
marking collection for removal
setting '_remove' omap key
finish_remove_pgs 2.d2_head removing 2.d2
Remove successful
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-2/ --pgid 2.d2 --op remove --force
marking collection for removal
setting '_remove' omap key
finish_remove_pgs 2.d2_head removing 2.d2
Remove successful
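For reference, this is roughly how I check that the PG collection is still listed on the OSD after the removal (OSD stopped, same paths as above), and I guess I should also export it as a backup before touching it again (the export file is just an example path):

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0/ --op list-pgs | grep 2.d2
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0/ --pgid 2.d2 --op export --file /root/pg-2.d2.export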
There are more OSDs that won't start, but with a different error; it seems those OSDs are full? Because they won't start, I don't know how full they really are. Here is df -h (it only shows the tmpfs mounts, not the BlueStore devices):
df -h
Filesystem        Size  Used Avail Use% Mounted on
udev               63G     0   63G   0% /dev
tmpfs              13G  1,4M   13G   1% /run
rpool/ROOT/pve-1  281G   88G  193G  32% /
tmpfs              63G   51M   63G   1% /dev/shm
tmpfs             5,0M     0  5,0M   0% /run/lock
rpool             193G  128K  193G   1% /rpool
rpool/ROOT        193G  128K  193G   1% /rpool/ROOT
rpool/data        193G  128K  193G   1% /rpool/data
/dev/fuse         128M   56K  128M   1% /etc/pve
tmpfs              13G     0   13G   0% /run/user/0
tmpfs              63G   24K   63G   1% /var/lib/ceph/osd/ceph-13
tmpfs              63G   28K   63G   1% /var/lib/ceph/osd/ceph-6
tmpfs              63G   28K   63G   1% /var/lib/ceph/osd/ceph-2
tmpfs              63G   24K   63G   1% /var/lib/ceph/osd/ceph-4
2022-03-13T00:15:59.393+0100 7fefbbc76f00 -1 bluefs _allocate allocation failed, needed 0x71ee4
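To see how much space BlueFS actually thinks it has on the device, I guess something like this might work (not sure it even runs while the OSD is in this state):

ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-4 --command bluefs-bdev-sizes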
I tried all of these commands on the affected OSD:
CEPH_ARGS="--bluestore_rocksdb_options avoid_flush_during_recovery=1" ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-4/ repair
then
ceph-bluestore-tool --log-level 30 --path /var/lib/ceph/osd/ceph-4 --command bluefs-bdev-expand
ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-4/ --allocator block free-score
ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-4/ --allocator block free-dump
This command crashed with the same error as at the beginning. It tried to recover, without luck...
ceph-osd --setuser ceph --setgroup ceph -i 4 -d --bluefs_allocator=bitmap --bluestore_allocator=bitmap
Log excerpt:
2022-03-14T14:39:32.745+0100 7f9ee3ebdf00 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1647265172748882, "job": 1, "event": "recovery_started", "log_files": [743]}
2022-03-14T14:39:32.745+0100 7f9ee3ebdf00 4 rocksdb: [db_impl/db_impl_open.cc:758] Recovering log #743 mode 2
....
and then bam
-8> 2022-03-14T14:40:56.412+0100 7f9ee3ebdf00 4 rocksdb: [version_set.cc:4574] Column family [P] (ID 11), log number is 715
-7> 2022-03-14T14:40:56.412+0100 7f9ee3ebdf00 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1647265256416854, "job": 1, "event": "recovery_started", "log_files": [743]}
-6> 2022-03-14T14:40:56.412+0100 7f9ee3ebdf00 4 rocksdb: [db_impl/db_impl_open.cc:758] Recovering log #743 mode 2
-5> 2022-03-14T14:41:17.005+0100 7f9ee3ebdf00 3 rocksdb: [le/block_based/filter_policy.cc:579] Using legacy Bloom filter with high (20) bits/key. Dramatic filter space and/or accuracy improvement is available with format_version>=5.
-4> 2022-03-14T14:41:23.437+0100 7f9ee3ebdf00 1 bluefs _allocate unable to allocate 0x1c2000 on bdev 1, allocator name block, allocator type bitmap, capacity 0x2baa1000000, block size 0x1000, free 0xae000, fragmentation 1, allocated 0xae000
-3> 2022-03-14T14:41:23.437+0100 7f9ee3ebdf00 -1 bluefs _allocate allocation failed, needed 0x1c1951
-2> 2022-03-14T14:41:23.437+0100 7f9ee3ebdf00 -1 bluefs _flush_range allocated: 0x1515000 offset: 0x15147b7 length: 0x1c219a
-1> 2022-03-14T14:41:23.457+0100 7f9ee3ebdf00 -1 ./src/os/bluestore/BlueFS.cc: In function 'int BlueFS::_flush_range(BlueFS::FileWriter*, uint64_t, uint64_t)' thread 7f9ee3ebdf00 time 2022-03-14T14:41:23.442701+0100
./src/os/bluestore/BlueFS.cc: 2768: ceph_abort_msg("bluefs enospc")
Then I tried:
CEPH_ARGS="--bluefs-shared-alloc-size 4096 --bluestore_allocator stupid --bluefs_allocator stupid --debug_bluefs 20/20" ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-4/ compact 2>&1
Output
....
0xc73c85f000~1000,1:0xc75b761000~1000,1:0xc75b7b5000~1000,1:0xc76dc07000~1000,1:0xc77845f000~1000,1:0xc7aaffa000~1000,1:0xc7bec83000~1000,1:0xc7d6ce5000~1000,1:0xc7d6f09000~1000,1:0xc7e1429000~1000,1:0xc7f96db000~1000,1:0xc86a443000~1000,1:0xc8d482f000~1000,1:0xc96f54b000~1000,1:0xc97354f000~1000,1:0xc99278e000~1000,1:0xce5144f000~1000,1:0xd162c5a000~1000,1:0xd5f244e000~1000,1:0xd5f884f000~1000,1:0xd697458000~1000,1:0xd94ac5f000~1000,1:0xd951c5f000~1000,1:0xd9a085f000~1000,1:0xda69c49000~1000,1:0xdb7585f000~1000,1:0xde2e1e9000~1000])
-7> 2022-03-14T14:57:17.367+0100 7f6ccfb02240 10 bluefs _allocate len 0x1c1951 from 1
-6> 2022-03-14T14:57:17.367+0100 7f6ccfb02240 1 bluefs _allocate unable to allocate 0x1c2000 on bdev 1, allocator name block, allocator type stupid, capacity 0x2baa1000000, block size 0x1000, free 0xae000, fragmentation 1, allocated 0xae000
-5> 2022-03-14T14:57:17.367+0100 7f6ccfb02240 20 bluefs _allocate fallback to bdev 2
-4> 2022-03-14T14:57:17.367+0100 7f6ccfb02240 10 bluefs _allocate len 0x1c1951 from 2
-3> 2022-03-14T14:57:17.367+0100 7f6ccfb02240 -1 bluefs _allocate allocation failed, needed 0x1c1951
-2> 2022-03-14T14:57:17.367+0100 7f6ccfb02240 -1 bluefs _flush_range allocated: 0x1515000 offset: 0x15147b7 length: 0x1c219a
-1> 2022-03-14T14:57:17.379+0100 7f6ccfb02240 -1 ./src/os/bluestore/BlueFS.cc: In function 'int BlueFS::_flush_range(BlueFS::FileWriter*, uint64_t, uint64_t)' thread 7f6ccfb02240 time 2022-03-14T14:57:17.373523+0100
./src/os/bluestore/BlueFS.cc: 2768: ceph_abort_msg("bluefs enospc")
It crashed again.
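One idea I have not tried yet: give the full OSD a new separate DB device on a spare disk so BlueFS gets free space again, roughly like this (/dev/sdX is only a placeholder and I am not sure about the exact options):

ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-4 --command bluefs-bdev-new-db --dev-target /dev/sdX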
What can I do?
Any ideas?
Thank you!
Sincerely Bonkersdeluxe