Failed OSD for ceph stale pg

It doesn't look like there is actually a second copy for the PGs mentioned. It looks more like this data is lost.

Before you replace the disk, you should really make sure whether OSD 1203 is so irreparably damaged that it will never start again. From my point of view, getting it to start again seems to be your only chance of not suffering any data loss.
 
Yes, I agree, thank you. We are currently trying to recover the data from the disk to a new device.

If I can recover the data, is it possible to extract the missing PGs and add them back to the cluster, or am I just wasting my time?
 
If I can recover the data, is it possible to extract the missing PGs and add them back to the cluster, or am I just wasting my time?
Yes, that is basically possible. You can either salvage the entire OSD or just import individual PGs back into Ceph.
However, it is important not to run commands such as repair, scrub or mark_unfound_lost, as these could jeopardize the recovery.

But I have to do some quick research myself, it's been a while since I had to do that.
 
Yes, that is basically possible. You can either salvage the entire OSD or just import individual PGs back into Ceph.
However, it is important not to run commands such as repair, scrub or mark_unfound_lost, as these could jeopardize the recovery.

But I have to do some quick research myself, it's been a while since I had to do that.
Is it worth disabling scrubbing at this point?
 
If we decide the data is lost, can I just delete the dead PGs, see what does and doesn't start from a VM point of view, and restore what doesn't from backup?
In theory, yes, but in practice you won't easily find out what you lost. It may be that you only lost data from one VM, but it may also be that you lost data from 100 VMs. It may also be that you don't notice it in the next 3 days but only sometime in 2 years, when a document needs to be read from the server again or a PHP file on the web server no longer works.
If there is even a chance of being able to save this data, I would pursue it and do everything possible to achieve it.
Is it worth disabling scrubbing at this point?
I think you can safely do this for 2-3 days, but it definitely shouldn't stay disabled for weeks. I also think that 2-3 days are enough to reach a decision.
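For reference, a short sketch of the standard commands for this (plain Ceph CLI, nothing cluster-specific assumed):
Code:
# Disable scrubbing cluster-wide while the recovery attempt is running
ceph osd set noscrub
ceph osd set nodeep-scrub

# Re-enable both once a decision has been made
ceph osd unset noscrub
ceph osd unset nodeep-scrub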
 
OK, that's great, thanks for your help. I have set noscrub and nodeep-scrub for now.

I think at this point all I can do is wait for the recovered OSD device to return and see if:

a) it can be activated, and if not, b) whether I can extract the PGs.

I can't even start what looks like all the VMs with disks on this pool, so recovery at all costs seems the best option.
 
Please consider these instructions incomplete; it's been a really long time since I had to do this and the syntax may have changed in the meantime. You may also have to specify the journal path or other parameters explicitly. But this is how you can import a PG again.

It is always better if you can get the OSD running again in the server and fix the errors there, rather than taking this detour.

You have to get the OSD mounted somehow. The "ceph-objectstore-tool" must be available on that machine; it doesn't matter whether you plug the OSD into a server or connect it to a Linux laptop via USB.

Please make sure there is enough free space under "/root". Such a PG can quickly be several GB in size. Once you have the first export, it's best to check how big it is and whether there is enough space for the next ones.
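A minimal sketch of how to check that (standard coreutils; the file name is just the first export from the block below):
Code:
# Check free space on the target filesystem
df -h /root

# After the first export, check its size before running the next one
ls -lh /root/8.11.export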

Code:
# Export PGs from the old defective OSD 1203
ceph-objectstore-tool --op export --pgid 8.11 --data-path /var/lib/ceph/osd/ceph-1203 --file /root/8.11.export
ceph-objectstore-tool --op export --pgid 8.1e0 --data-path /var/lib/ceph/osd/ceph-1203 --file /root/8.1e0.export
ceph-objectstore-tool --op export --pgid 8.351 --data-path /var/lib/ceph/osd/ceph-1203 --file /root/8.351.export

# Import the missing PGs onto an existing OSD, e.g. OSD 1204
ceph-objectstore-tool --op import --pgid 8.11 --data-path /var/lib/ceph/osd/ceph-1204 --file /root/8.11.export
ceph-objectstore-tool --op import --pgid 8.1e0 --data-path /var/lib/ceph/osd/ceph-1204 --file /root/8.1e0.export
ceph-objectstore-tool --op import --pgid 8.351 --data-path /var/lib/ceph/osd/ceph-1204 --file /root/8.351.export

# Now initiate a deep scrub for the re-imported PGs so that Ceph can reconcile itself again.
ceph pg deep-scrub 8.11
ceph pg deep-scrub 8.1e0
ceph pg deep-scrub 8.351

Source: https://docs.ceph.com/en/latest/man/8/ceph-objectstore-tool/

//EDIT:
This page might also help you: https://docs.ceph.com/en/latest/rados/troubleshooting/troubleshooting-pg/
 
Aside from your current problem, a few recommendations for your Ceph:
mon: 9 daemons, quorum vms-ceph110,vms-ceph112,vms-ceph113,vms-ceph114,vms-ceph117,vms-ceph120,vms-ceph119,vms-ceph106,vms-ceph121 (age 29m)
mgr: vms-ceph113(active, since 7w), standbys: vms-ceph106, vms-ceph110, vms-ceph114, vms-ceph117, vms-ceph119, vms-ceph120, vms-ceph121, vms-ceph112
That's clearly too many; you only need 3 mons/mgrs for your 100 OSDs. Once you get over 1,000 OSDs you can think about 5, but definitely not 9.

The Ceph documentation itself also states that 3 are sufficient for small deployments, 5 can be used for larger ones, and 7 could be justified for very specific requirements.
=> https://docs.ceph.com/en/latest/rados/operations/add-or-rm-mons/#adding-monitors
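If you want to shrink the mon count, a sketch of the usual way (the node name here is just one from your status output; on Proxmox VE the pveceph wrapper can be used instead):
Code:
# Remove a surplus monitor from the quorum (plain Ceph CLI)
ceph mon remove vms-ceph121

# Or, on the Proxmox VE node itself, via the pveceph wrapper
pveceph mon destroy vms-ceph121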

Long shot, but we have an old 2/1 pool on our hyperconverged Proxmox install and have lost an OSD.
Regarding the Replica 2 setup, you have now seen what effects it can have.
I can therefore only strongly recommend that you switch everything to Replica 3. If the VMs are configured accordingly (virtio-scsi-single, disk attached as scsi with the discard flag), then you could possibly reclaim a lot of storage space and perhaps not have to expand in order to get to Replica 3.
But with Replica 3, make sure that the data is actually distributed per host. So if you have adjusted the CRUSH map so that data is distributed per OSD, you should definitely undo this.
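For the pool itself, the switch is just two pool settings. A sketch; "old-pool" is a placeholder for the name of your 2/1 pool:
Code:
# Raise the replica count of the old 2/1 pool to 3/2
ceph osd pool set old-pool size 3
ceph osd pool set old-pool min_size 2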

You could also consider using compression. You don't even need a particularly powerful CPU for it; perhaps a simple algorithm is enough to save you a lot of data. Please note that compression is only applied on write, so the existing contents of the pool are not compressed in one go and it can take some time until a positive effect shows.

=> https://docs.ceph.com/en/latest/rados/operations/pools/#setting-pool-values
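Enabling compression is likewise just a couple of pool settings. A sketch; the algorithm and mode are examples and "old-pool" is a placeholder:
Code:
# Enable lightweight compression on a pool; only newly written data is compressed
ceph osd pool set old-pool compression_algorithm lz4
ceph osd pool set old-pool compression_mode aggressive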

health: HEALTH_WARN
noout,nobackfill,norecover flag(s) set
2 osds down
1 host (1 osds) down
2 nearfull osd(s)
Basically, under all circumstances you should ensure that the cluster status is HEALTH_OK. A situation where an OSD or a node is down should only ever be an exception. In production operation, no global flags such as noout should be set.

Especially when you have other problems, it is more difficult for me to understand what is going on and where to start. Of course, this significantly delays help.
For example, you have set nobackfill and norecover, and at the same time your Ceph says you have 2 nearfull OSDs. Now I first have to know which OSDs those are and what their fill level is before I could perhaps tell you to remove norecover and nobackfill. Otherwise, your Ceph may run completely full.

The nearfull condition should also be remedied promptly. To do this, you should reduce the weight of the affected OSDs or add new ones. The biggest problem here is always that changes to the CRUSH map mean that the already very full OSDs are still taken into account and, in the first phase when a lot of data is actually moved, may even receive more data. If you have a nearfull OSD that is just below or at the full ratio, then adding additional hardware could cause all pools that touch the nearfull OSD to go read-only until the usage falls below the full ratio again.
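A sketch of the usual commands for this (the OSD ID and weight below are placeholders, not taken from your cluster):
Code:
# Show utilisation per OSD to identify the nearfull ones
ceph osd df tree

# Temporarily lower the reweight of a nearfull OSD so data moves away from it
ceph osd reweight <osd-id> 0.9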

It also seems to me that the distribution of the PGs is suboptimal. Here you should consider activating the balancer with ceph balancer mode upmap.
What you should always keep in mind regarding the fill level of Ceph itself: it is always evaluated based on the fullest OSD. The more evenly your data is distributed across the nodes and the closer each OSD comes to the average, the more usable space you will have afterwards.

=> https://docs.ceph.com/en/latest/rados/operations/balancer/
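A sketch of enabling it with the standard mgr commands:
Code:
# upmap mode requires that all clients are at least luminous
ceph osd set-require-min-compat-client luminous

# Enable the balancer in upmap mode and check what it is doing
ceph balancer mode upmap
ceph balancer on
ceph balancer status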

While it generally doesn't seem to me that your pools have too many PGs, they may have slightly too few. The autoscaler could also help you here.

To enable the module: ceph mgr module enable pg_autoscaler
Then, with the command ceph osd pool set rbd pg_autoscale_mode warn, you can set a pool to warn only (so you can check the result first).

=> https://docs.ceph.com/en/latest/rados/operations/placement-groups/
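Put together as a short sketch (the pool name rbd is just the example from above):
Code:
# Enable the autoscaler module and set a pool to warn-only mode
ceph mgr module enable pg_autoscaler
ceph osd pool set rbd pg_autoscale_mode warn

# Review what the autoscaler would recommend
ceph osd pool autoscale-status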

You use different sizes of HDDs. Many nodes have 2 TB hard drives and others have 10 TB hard drives. This leads to a very uneven distribution of your data if you have not separated the 2 TB and 10 TB nodes using a CRUSH rule. You should only ever use one type of disk with a relatively uniform size per CRUSH rule or, ideally, per Ceph cluster; this works best. If you don't pay attention to this, you have to treat the individual classes or groupings as logical units and ensure the correct distribution within each of these units.
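One way to keep such groups apart is via device classes and matching CRUSH rules. A sketch only; the class name, rule name and pool name below are made-up placeholders:
Code:
# Assign a custom device class to each of the large disks
ceph osd crush rm-device-class osd.<id>
ceph osd crush set-device-class hdd-10tb osd.<id>

# Create a replicated rule that only targets that class, with host as the failure domain
ceph osd crush rule create-replicated rule-10tb default host hdd-10tb

# Point the pool for the large disks at the new rule
ceph osd pool set backup-pool crush_rule rule-10tb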

What I don't know is how your servers are connected and what your network looks like. But I would always recommend at least 2x 10 GbE in LACP, plus enterprise switches that can hash on layer 3+4 and handle an MTU of 9000. I would generally recommend devices with MLAG, e.g. Arista or Juniper QFX5100, but no EX4550 etc.

I would be happy if my comments help you optimize the cluster a little and perhaps make it a little more resilient against individual hard drive failures. :)

Otherwise, I look forward to hearing from you how the case turned out.
 
Thanks for your detailed reply.

Yes, all our pools are now 3/2 except this old one.

We have set different CRUSH rules for the different disk sizes across the various pools we have created. So the 10 TB disks are in a 3/2 pool under the backup CRUSH rule, whereas the 2 TB disks are under the CRUSH rule for what we have called the 3/2 general pool.

For networking, the Ceph cluster currently runs on 40 Gb InfiniBand.

I will investigate your other points, which are very helpful.
 
I'm afraid the new disk shows no valid partitions and won't activate. I have put the broken OSD back in and it does show up, albeit it won't start. If I try to extract the PG as above I get:

Code:
root@vms-ceph112:/var/lib/ceph/osd# ceph-objectstore-tool --op export --pgid 8.11 --data-path /var/lib/ceph/osd/ceph-1203 --file /root/8.11.export
./src/os/bluestore/BlueFS.cc: In function 'int BlueFS::_flush_range(BlueFS::FileWriter*, uint64_t, uint64_t)' thread 7f035eac2e80 time 2023-11-25T13:29:51.083912+0000
./src/os/bluestore/BlueFS.cc: 2810: ceph_abort_msg("bluefs enospc")
 ceph version 16.2.13 (b81a1d7f978c8d41cf452da7af14e190542d2ee2) pacific (stable)
 1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xd3) [0x7f035f6ea5d7]
 2: (BlueFS::_flush_range(BlueFS::FileWriter*, unsigned long, unsigned long)+0x9bd) [0x561ed721598d]
 3: (BlueFS::_flush(BlueFS::FileWriter*, bool, bool*)+0x9a) [0x561ed7215f9a]
 4: (BlueFS::_flush(BlueFS::FileWriter*, bool, std::unique_lock<std::mutex>&)+0x2f) [0x561ed7230c2f]
 5: (BlueRocksWritableFile::Append(rocksdb::Slice const&)+0x100) [0x561ed72408e0]
 6: (rocksdb::LegacyWritableFileWrapper::Append(rocksdb::Slice const&, rocksdb::IOOptions const&, rocksdb::IODebugContext*)+0x48) [0x561ed7313b4c]
 7: (rocksdb::WritableFileWriter::WriteBuffered(char const*, unsigned long)+0x338) [0x561ed74ef418]
 8: (rocksdb::WritableFileWriter::Append(rocksdb::Slice const&)+0x5d7) [0x561ed74ed99b]
 9: (rocksdb::BlockBasedTableBuilder::WriteRawBlock(rocksdb::Slice const&, rocksdb::CompressionType, rocksdb::BlockHandle*, bool)+0x11d) [0x561ed76b8eb1]
 10: (rocksdb::BlockBasedTableBuilder::WriteBlock(rocksdb::Slice const&, rocksdb::BlockHandle*, bool)+0x7d0) [0x561ed76b8c98]
 11: (rocksdb::BlockBasedTableBuilder::WriteBlock(rocksdb::BlockBuilder*, rocksdb::BlockHandle*, bool)+0x48) [0x561ed76b84b4]
 12: (rocksdb::BlockBasedTableBuilder::Flush()+0x9a) [0x561ed76b8464]
 13: (rocksdb::BlockBasedTableBuilder::Add(rocksdb::Slice const&, rocksdb::Slice const&)+0x197) [0x561ed76b7f99]
 14: (rocksdb::BuildTable(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rocksdb::Env*, rocksdb::FileSystem*, rocksdb::ImmutableCFOptions const&, rocksdb::MutableCFOptions const&, rocksdb::FileOptions const&, rocksdb::TableCache*, rocksdb::InternalIteratorBase<rocksdb::Slice>*, std::vector<std::unique_ptr<rocksdb::FragmentedRangeTombstoneIterator, std::default_delete<rocksdb::FragmentedRangeTombstoneIterator> >, std::allocator<std::unique_ptr<rocksdb::FragmentedRangeTombstoneIterator, std::default_delete<rocksdb::FragmentedRangeTombstoneIterator> > > >, rocksdb::FileMetaData*, rocksdb::InternalKeyComparator const&, std::vector<std::unique_ptr<rocksdb::IntTblPropCollectorFactory, std::default_delete<rocksdb::IntTblPropCollectorFactory> >, std::allocator<std::unique_ptr<rocksdb::IntTblPropCollectorFactory, std::default_delete<rocksdb::IntTblPropCollectorFactory> > > > const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<unsigned long, std::allocator<unsigned long> >, unsigned long, rocksdb::SnapshotChecker*, rocksdb::CompressionType, unsigned long, rocksdb::CompressionOptions const&, bool, rocksdb::InternalStats*, rocksdb::TableFileCreationReason, rocksdb::EventLogger*, int, rocksdb::Env::IOPriority, rocksdb::TableProperties*, int, unsigned long, unsigned long, rocksdb::Env::WriteLifeTimeHint, unsigned long)+0x782) [0x561ed763af4a]
 15: (rocksdb::DBImpl::WriteLevel0TableForRecovery(int, rocksdb::ColumnFamilyData*, rocksdb::MemTable*, rocksdb::VersionEdit*)+0x5ea) [0x561ed73b21ea]
 16: (rocksdb::DBImpl::RecoverLogFiles(std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long*, bool, bool*)+0x1ad1) [0x561ed73b0e61]
 17: (rocksdb::DBImpl::Recover(std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, bool, bool, bool, unsigned long*)+0x159e) [0x561ed73ae398]
 18: (rocksdb::DBImpl::Open(rocksdb::DBOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle*, std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**, bool, bool)+0x677) [0x561ed73b3691]
 19: (rocksdb::DB::Open(rocksdb::DBOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle*, std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**)+0x52) [0x561ed73b2a68]
 20: (RocksDBStore::do_open(std::ostream&, bool, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x1096) [0x561ed72c1506]
 21: (BlueStore::_open_db(bool, bool, bool)+0xa19) [0x561ed7142499]
 22: (BlueStore::_open_db_and_around(bool, bool)+0x332) [0x561ed718a5a2]
 23: (BlueStore::_mount()+0x191) [0x561ed718cf41]
 24: main()
 25: __libc_start_main()
 26: _start()
*** Caught signal (Aborted) **
 in thread 7f035eac2e80 thread_name:ceph-objectstor
 ceph version 16.2.13 (b81a1d7f978c8d41cf452da7af14e190542d2ee2) pacific (stable)
 1: /lib/x86_64-linux-gnu/libpthread.so.0(+0x13140) [0x7f035f195140]
 2: gsignal()
 3: abort()
 4: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x18a) [0x7f035f6ea68e]
 5: (BlueFS::_flush_range(BlueFS::FileWriter*, unsigned long, unsigned long)+0x9bd) [0x561ed721598d]
 6: (BlueFS::_flush(BlueFS::FileWriter*, bool, bool*)+0x9a) [0x561ed7215f9a]
 7: (BlueFS::_flush(BlueFS::FileWriter*, bool, std::unique_lock<std::mutex>&)+0x2f) [0x561ed7230c2f]
 8: (BlueRocksWritableFile::Append(rocksdb::Slice const&)+0x100) [0x561ed72408e0]
 9: (rocksdb::LegacyWritableFileWrapper::Append(rocksdb::Slice const&, rocksdb::IOOptions const&, rocksdb::IODebugContext*)+0x48) [0x561ed7313b4c]
 10: (rocksdb::WritableFileWriter::WriteBuffered(char const*, unsigned long)+0x338) [0x561ed74ef418]
 11: (rocksdb::WritableFileWriter::Append(rocksdb::Slice const&)+0x5d7) [0x561ed74ed99b]
 12: (rocksdb::BlockBasedTableBuilder::WriteRawBlock(rocksdb::Slice const&, rocksdb::CompressionType, rocksdb::BlockHandle*, bool)+0x11d) [0x561ed76b8eb1]
 13: (rocksdb::BlockBasedTableBuilder::WriteBlock(rocksdb::Slice const&, rocksdb::BlockHandle*, bool)+0x7d0) [0x561ed76b8c98]
 14: (rocksdb::BlockBasedTableBuilder::WriteBlock(rocksdb::BlockBuilder*, rocksdb::BlockHandle*, bool)+0x48) [0x561ed76b84b4]
 15: (rocksdb::BlockBasedTableBuilder::Flush()+0x9a) [0x561ed76b8464]
 16: (rocksdb::BlockBasedTableBuilder::Add(rocksdb::Slice const&, rocksdb::Slice const&)+0x197) [0x561ed76b7f99]
 17: (rocksdb::BuildTable(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rocksdb::Env*, rocksdb::FileSystem*, rocksdb::ImmutableCFOptions const&, rocksdb::MutableCFOptions const&, rocksdb::FileOptions const&, rocksdb::TableCache*, rocksdb::InternalIteratorBase<rocksdb::Slice>*, std::vector<std::unique_ptr<rocksdb::FragmentedRangeTombstoneIterator, std::default_delete<rocksdb::FragmentedRangeTombstoneIterator> >, std::allocator<std::unique_ptr<rocksdb::FragmentedRangeTombstoneIterator, std::default_delete<rocksdb::FragmentedRangeTombstoneIterator> > > >, rocksdb::FileMetaData*, rocksdb::InternalKeyComparator const&, std::vector<std::unique_ptr<rocksdb::IntTblPropCollectorFactory, std::default_delete<rocksdb::IntTblPropCollectorFactory> >, std::allocator<std::unique_ptr<rocksdb::IntTblPropCollectorFactory, std::default_delete<rocksdb::IntTblPropCollectorFactory> > > > const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<unsigned long, std::allocator<unsigned long> >, unsigned long, rocksdb::SnapshotChecker*, rocksdb::CompressionType, unsigned long, rocksdb::CompressionOptions const&, bool, rocksdb::InternalStats*, rocksdb::TableFileCreationReason, rocksdb::EventLogger*, int, rocksdb::Env::IOPriority, rocksdb::TableProperties*, int, unsigned long, unsigned long, rocksdb::Env::WriteLifeTimeHint, unsigned long)+0x782) [0x561ed763af4a]
 18: (rocksdb::DBImpl::WriteLevel0TableForRecovery(int, rocksdb::ColumnFamilyData*, rocksdb::MemTable*, rocksdb::VersionEdit*)+0x5ea) [0x561ed73b21ea]
 19: (rocksdb::DBImpl::RecoverLogFiles(std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long*, bool, bool*)+0x1ad1) [0x561ed73b0e61]
 20: (rocksdb::DBImpl::Recover(std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, bool, bool, bool, unsigned long*)+0x159e) [0x561ed73ae398]
 21: (rocksdb::DBImpl::Open(rocksdb::DBOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle*, std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**, bool, bool)+0x677) [0x561ed73b3691]
 22: (rocksdb::DB::Open(rocksdb::DBOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle*, std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**)+0x52) [0x561ed73b2a68]
 23: (RocksDBStore::do_open(std::ostream&, bool, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x1096) [0x561ed72c1506]
 24: (BlueStore::_open_db(bool, bool, bool)+0xa19) [0x561ed7142499]
 25: (BlueStore::_open_db_and_around(bool, bool)+0x332) [0x561ed718a5a2]
 26: (BlueStore::_mount()+0x191) [0x561ed718cf41]
 27: main()
 28: __libc_start_main()
 29: _start()
Aborted
 
Is the path /var/lib/ceph/osd/ceph-1203 correct and mounted? Could you do an ls on it?
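For example (a minimal sketch of the check):
Code:
# Check whether the OSD directory is populated and where it is mounted from
ls -l /var/lib/ceph/osd/ceph-1203
findmnt /var/lib/ceph/osd/ceph-1203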
 
Interestingly, I have run an fsck against the path from the command you sent:

Code:
root@vms-ceph112:/var/lib/ceph/osd# ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-1203
fsck success

And I get a success.
 
As an FYI, this is the output when activating the OSD:

Code:
root@vms-ceph112:~# journalctl -u ceph-osd@1203.service -n 50
-- Journal begins at Fri 2023-06-30 13:25:22 BST, ends at Sat 2023-11-25 14:19:28 GMT. --
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  20: (RocksDBStore::do_open(std::ostream&, bool, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x1096) [0x563e274951c6]
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  21: (BlueStore::_open_db(bool, bool, bool)+0xa19) [0x563e26f08669]
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  22: (BlueStore::_open_db_and_around(bool, bool)+0x332) [0x563e26f50772]
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  23: (BlueStore::_mount()+0x191) [0x563e26f53111]
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  24: (OSD::init()+0x58d) [0x563e269ef80d]
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  25: main()
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  26: __libc_start_main()
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  27: _start()
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:      0> 2023-11-25T14:18:56.318+0000 7f3aa85ab080 -1 *** Caught signal (Aborted) **
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  in thread 7f3aa85ab080 thread_name:ceph-osd
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  ceph version 16.2.13 (b81a1d7f978c8d41cf452da7af14e190542d2ee2) pacific (stable)
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  1: /lib/x86_64-linux-gnu/libpthread.so.0(+0x13140) [0x7f3aa8c10140]
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  2: gsignal()
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  3: abort()
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  4: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x18a) [0x563e268f76da]
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  5: (BlueFS::_flush_range(BlueFS::FileWriter*, unsigned long, unsigned long)+0x9bd) [0x563e26feea1d]
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  6: (BlueFS::_flush(BlueFS::FileWriter*, bool, bool*)+0x9a) [0x563e26fef02a]
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  7: (BlueFS::_flush(BlueFS::FileWriter*, bool, std::unique_lock<std::mutex>&)+0x2f) [0x563e27009cbf]
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  8: (BlueRocksWritableFile::Append(rocksdb::Slice const&)+0x100) [0x563e27019970]
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  9: (rocksdb::LegacyWritableFileWrapper::Append(rocksdb::Slice const&, rocksdb::IOOptions const&, rocksdb::IODebugContext*)+0x48) [0x563e274e742e]
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  10: (rocksdb::WritableFileWriter::WriteBuffered(char const*, unsigned long)+0x338) [0x563e276c1ef8]
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  11: (rocksdb::WritableFileWriter::Append(rocksdb::Slice const&)+0x5d7) [0x563e276c047b]
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  12: (rocksdb::BlockBasedTableBuilder::WriteRawBlock(rocksdb::Slice const&, rocksdb::CompressionType, rocksdb::BlockHandle*, bool)+0x11d) [0x563e2788adbd]
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  13: (rocksdb::BlockBasedTableBuilder::WriteBlock(rocksdb::Slice const&, rocksdb::BlockHandle*, bool)+0x7d0) [0x563e2788aba4]
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  14: (rocksdb::BlockBasedTableBuilder::WriteBlock(rocksdb::BlockBuilder*, rocksdb::BlockHandle*, bool)+0x48) [0x563e2788a3c0]
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  15: (rocksdb::BlockBasedTableBuilder::Flush()+0x9a) [0x563e2788a370]
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  16: (rocksdb::BlockBasedTableBuilder::Add(rocksdb::Slice const&, rocksdb::Slice const&)+0x197) [0x563e27889ea5]
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  17: (rocksdb::BuildTable(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rocksdb::Env*, rocksdb::FileSystem*, rocksdb::ImmutableCFOptions const&, rocksdb::Mut>
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  18: (rocksdb::DBImpl::WriteLevel0TableForRecovery(int, rocksdb::ColumnFamilyData*, rocksdb::MemTable*, rocksdb::VersionEdit*)+0x5ea) [0x563e27585406]
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  19: (rocksdb::DBImpl::RecoverLogFiles(std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long*, bool, bool*)+0x1ad1) [0x563e2758407d]
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  20: (rocksdb::DBImpl::Recover(std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, bool, bool, bool, unsigned long*)+0x159e) [0x563e275815b4]
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  21: (rocksdb::DBImpl::Open(rocksdb::DBOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::alloca>
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  22: (rocksdb::DB::Open(rocksdb::DBOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<>
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  23: (RocksDBStore::do_open(std::ostream&, bool, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x1096) [0x563e274951c6]
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  24: (BlueStore::_open_db(bool, bool, bool)+0xa19) [0x563e26f08669]
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  25: (BlueStore::_open_db_and_around(bool, bool)+0x332) [0x563e26f50772]
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  26: (BlueStore::_mount()+0x191) [0x563e26f53111]
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  27: (OSD::init()+0x58d) [0x563e269ef80d]
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  28: main()
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  29: __libc_start_main()
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  30: _start()
Nov 25 14:18:56 vms-ceph112 ceph-osd[116397]:  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Nov 25 14:18:56 vms-ceph112 systemd[1]: ceph-osd@1203.service: Main process exited, code=killed, status=6/ABRT
Nov 25 14:18:56 vms-ceph112 systemd[1]: ceph-osd@1203.service: Failed with result 'signal'.
Nov 25 14:18:56 vms-ceph112 systemd[1]: ceph-osd@1203.service: Consumed 1min 40.997s CPU time.
Nov 25 14:19:06 vms-ceph112 systemd[1]: ceph-osd@1203.service: Scheduled restart job, restart counter is at 2.
Nov 25 14:19:06 vms-ceph112 systemd[1]: Stopped Ceph object storage daemon osd.1203.
Nov 25 14:19:06 vms-ceph112 systemd[1]: ceph-osd@1203.service: Consumed 1min 40.997s CPU time.
Nov 25 14:19:06 vms-ceph112 systemd[1]: Starting Ceph object storage daemon osd.1203...
Nov 25 14:19:06 vms-ceph112 systemd[1]: Started Ceph object storage daemon osd.1203.
root@vms-ceph112:~#
 
