[SOLVED] Ceph monitor start problem after upgrade to Ceph 16.2

Daniel Keller

Hello,

I tried to upgrade from Ceph 15.2 to 16.2. When I try to start a monitor on an upgraded node, it crashes immediately:

-3> 2021-07-08T22:40:05.153+0200 7f28cb2ab580 1 mon.gcd-virthost2@-1(???) e30 preinit fsid 63b215c4-1240-42f3-83fa-feb0d06089a8
-2> 2021-07-08T22:40:05.153+0200 7f28cb2ab580 5 mon.gcd-virthost2@-1(???).mds e0 Unable to load 'last_metadata'
-1> 2021-07-08T22:40:05.157+0200 7f28cb2ab580 -1 ./src/mds/FSMap.cc: In function 'void FSMap::decode(ceph::buffer::v15_2_0::list::const_iterator&)' thread 7f28cb2ab580 time 2021-07-08T22:40:05.158689+0200
./src/mds/FSMap.cc: 648: ceph_abort_msg("abort() called")
ceph version 16.2.4 (a912ff2c95b1f9a8e2e48509e602ee008d5c9434) pacific (stable)
1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xd3) [0x7f28cc1847af]
2: (FSMap::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0x898) [0x7f28cc6e1db8]
3: (MDSMonitor::update_from_paxos(bool*)+0x257) [0x5621f08e4ba7]
4: (Monitor::refresh_from_paxos(bool*)+0x163) [0x5621f06b26f3]
5: (Monitor::preinit()+0x9af) [0x5621f06de51f]
6: main()
7: __libc_start_main()
8: _start()
0> 2021-07-08T22:40:05.161+0200 7f28cb2ab580 -1 *** Caught signal (Aborted) **
in thread 7f28cb2ab580 thread_name:ceph-mon
ceph version 16.2.4 (a912ff2c95b1f9a8e2e48509e602ee008d5c9434) pacific (stable)
1: /lib/x86_64-linux-gnu/libpthread.so.0(+0x14140) [0x7f28cbc67140]
2: gsignal()
3: abort()
4: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x18a) [0x7f28cc184866]
5: (FSMap::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0x898) [0x7f28cc6e1db8]
6: (MDSMonitor::update_from_paxos(bool*)+0x257) [0x5621f08e4ba7]
7: (Monitor::refresh_from_paxos(bool*)+0x163) [0x5621f06b26f3]
8: (Monitor::preinit()+0x9af) [0x5621f06de51f]
9: main()
10: __libc_start_main()
11: _start()
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
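For anyone trying to read a dump like this quickly, here is a rough Python sketch that pulls out the abort location and the frames of the first backtrace. It assumes the log text still has its original line breaks (one frame per line). The function name `summarize_crash` and the two regular expressions are my own illustration, not part of any Ceph tooling, and the sample text is copied from the log in this post.

```python
import re

# Numbered backtrace frames, e.g. "3: (MDSMonitor::update_from_paxos(bool*)+0x257) [0x...]"
# or "6: main()". The trailing "[0x...]" address is optional and is dropped.
FRAME_RE = re.compile(r"^\s*(\d+):\s+(.*?)(?:\s+\[0x[0-9a-f]+\])?\s*$")
# Abort location, e.g. './src/mds/FSMap.cc: 648: ceph_abort_msg("abort() called")'.
ABORT_RE = re.compile(r"^\s*(\S+\.cc):\s*(\d+):\s*ceph_abort_msg\b")


def summarize_crash(log_text):
    """Return ((source_file, line_number) or None, frames of the first backtrace)."""
    abort_location = None
    frames = []
    for line in log_text.splitlines():
        m = ABORT_RE.match(line)
        if m and abort_location is None:
            abort_location = (m.group(1), int(m.group(2)))
            continue
        m = FRAME_RE.match(line)
        if m:
            # A frame numbered 1 after frames were already collected marks the
            # start of the second (signal handler) backtrace, so stop there.
            if int(m.group(1)) == 1 and frames:
                break
            frames.append(m.group(2))
    return abort_location, frames


# Excerpt copied from the crash log in this post.
SAMPLE = """
./src/mds/FSMap.cc: 648: ceph_abort_msg("abort() called")
ceph version 16.2.4 (a912ff2c95b1f9a8e2e48509e602ee008d5c9434) pacific (stable)
1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xd3) [0x7f28cc1847af]
2: (FSMap::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0x898) [0x7f28cc6e1db8]
3: (MDSMonitor::update_from_paxos(bool*)+0x257) [0x5621f08e4ba7]
4: (Monitor::refresh_from_paxos(bool*)+0x163) [0x5621f06b26f3]
5: (Monitor::preinit()+0x9af) [0x5621f06de51f]
6: main()
7: __libc_start_main()
8: _start()
0> 2021-07-08T22:40:05.161+0200 7f28cb2ab580 -1 *** Caught signal (Aborted) **
1: /lib/x86_64-linux-gnu/libpthread.so.0(+0x14140) [0x7f28cbc67140]
"""

location, frames = summarize_crash(SAMPLE)
print("abort at:", location)
for frame in frames:
    print("  ", frame)
```

Read this way, the trace shows the monitor aborting in `FSMap::decode`, called from `MDSMonitor::update_from_paxos` during `Monitor::preinit`, that is, while the 16.2.4 monitor is decoding the stored filesystem map at startup.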


Any ideas what to do? The cluster is still running with the old monitors on the other nodes.
 
Yes, I had the 7.0 beta running, then upgraded to the 7.0 release, and then tried the upgrade to Ceph 16.2.

I changed the package sources for Ceph, then ran apt update and apt full-upgrade. When I restart the first monitor, it throws the error.