Hi
I did a few things to our (Proxmox) Ceph cluster:
- Added an additional node with three more HDD OSDs (yielding a 3-node cluster with 3 HDDs each)
- Increased pg_num and pgp_num for one of the pools from 128 to 256 (the pool has size 3 and min_size 1)
- Set mon_max_pg_per_osd = 1000 to resolve an issue with blocked requests
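For context, here is a rough sketch of the PG-per-OSD arithmetic behind the last two steps. It assumes PGs are spread evenly across all 9 OSDs and counts only the one pool mentioned above. Other pools are not counted, so the real per-OSD figure is higher. The function name is just illustrative.

```python
def pg_replicas_per_osd(pg_num: int, size: int, num_osds: int) -> float:
    """Average number of PG replicas per OSD contributed by a single pool.

    Assumes an even distribution; real clusters with mixed disk sizes
    will place more PGs on the larger OSDs.
    """
    return pg_num * size / num_osds


NUM_OSDS = 9          # 3 nodes x 3 HDD OSDs each
POOL_SIZE = 3         # replica count of the pool
MAX_PG_PER_OSD = 1000 # the mon_max_pg_per_osd value that was set

before = pg_replicas_per_osd(128, POOL_SIZE, NUM_OSDS)
after = pg_replicas_per_osd(256, POOL_SIZE, NUM_OSDS)

print(f"pg_num=128: about {before:.1f} PG replicas per OSD from this pool")
print(f"pg_num=256: about {after:.1f} PG replicas per OSD from this pool")
print(f"above configured limit: {after > MAX_PG_PER_OSD}")
```

On its own this pool stays far below the limit of 1000, so the total across all pools (and the uneven disk sizes) is what matters for the per-OSD count.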
This got I/O going again.
But now I have one (large HDD) OSD that will not start: it crashes while loading PGs. See the attached log file; an excerpt is below:
Code:
2019-08-02 10:08:21.021207 7fea86d7be00 0 osd.1 1844 load_pgs
2019-08-02 10:08:39.370112 7fea86d7be00 -1 *** Caught signal (Aborted) **
in thread 7fea86d7be00 thread_name:ceph-osd
ceph version 12.2.12 (39cfebf25a7011204a9876d2950e4b28aba66d11) luminous (stable)
1: (()+0xa59c94) [0x55b835a6dc94]
2: (()+0x110e0) [0x7fea843800e0]
3: (gsignal()+0xcf) [0x7fea83347fff]
4: (abort()+0x16a) [0x7fea8334942a]
5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7fea83c600ad]
6: (()+0x8f066) [0x7fea83c5e066]
7: (()+0x8f0b1) [0x7fea83c5e0b1]
8: (()+0x8f2c9) [0x7fea83c5e2c9]
9: (pg_log_entry_t::decode_with_checksum(ceph::buffer::list::iterator&)+0x156) [0x55b8356f57c6]
10: (void PGLog::read_log_and_missing<pg_missing_set<true> >(ObjectStore*, coll_t, coll_t, ghobject_t, pg_info_t const&, PGLog::IndexedLog&, pg_missing_set<true>&, bool, std::__cxx11::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> >&, bool, bool*, DoutPrefixProvider const*, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >*, bool)+0x1ab4) [0x55b8355a6584]
11: (PG::read_state(ObjectStore*, ceph::buffer::list&)+0x38b) [0x55b83554b7eb]
12: (OSD::load_pgs()+0x8b8) [0x55b835496678]
13: (OSD::init()+0x2237) [0x55b8354b75c7]
14: (main()+0x3092) [0x55b8353bf1c2]
15: (__libc_start_main()+0xf1) [0x7fea833352e1]
16: (_start()+0x2a) [0x55b83544b8ca]
Any ideas?
Best regards,
Jesper