[SOLVED] Upgrade to PVE 7 / Ceph Pacific failed - mons no longer start, no quorum

markusd

Hello,
after upgrading to Ceph Pacific, the mons would no longer start.

I then manually forced the Octopus packages back in, which at least let me start the monitors again.
A second upgrade attempt went completely wrong, and I apparently no longer have quorum.

One Octopus monitor is still running, but I can no longer access the cluster: the usual tools such as "ceph health" hang with
" 0 monclient(hunting): authenticate timed out after 300 "

systemctl status ceph-mon@virt02.service reports:
"..-1 mon.virt02@0(probing) e42 get_health_metrics reporting 11675 slow ops, oldest is osd_beacon(pgs [2.3c,2.37,2.18b..

Sorry for the muddled summary.
Can someone please help me narrow down why the Pacific monitors won't start?

What information is needed: ceph.conf?

Thanks

Regards

Markus
 
Hello,
I have now edited the monmap and can access the cluster again.
Three Octopus monitors are now running.
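
Editorial note for readers in the same situation: the post does not say how the monmap was edited. A minimal sketch of the usual procedure (extract the map from a surviving monitor, remove the unreachable one, inject it back) might look like the following. The monitor IDs `mon-a` and `mon-b` and the path `/tmp/monmap` are placeholders, not values from this thread, and the script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only: remove an unreachable monitor from a surviving monitor's monmap.
# SURVIVOR, DEAD and MONMAP are hypothetical placeholders, not taken from the thread.
SURVIVOR="${SURVIVOR:-mon-a}"   # ID of a monitor whose store is still intact
DEAD="${DEAD:-mon-b}"           # ID of the monitor to drop from the map
MONMAP="${MONMAP:-/tmp/monmap}" # scratch file for the extracted map
DRY_RUN="${DRY_RUN:-1}"         # 1 = only print the commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

edit_monmap() {
    # The monitor must be stopped while its map is extracted and injected.
    run systemctl stop "ceph-mon@${SURVIVOR}.service"
    run ceph-mon -i "$SURVIVOR" --extract-monmap "$MONMAP"
    # Inspect the current members before changing anything.
    run monmaptool --print "$MONMAP"
    run monmaptool "$MONMAP" --rm "$DEAD"
    run ceph-mon -i "$SURVIVOR" --inject-monmap "$MONMAP"
    run systemctl start "ceph-mon@${SURVIVOR}.service"
}

edit_monmap
```

Run as-is, it lists the six commands for review. Backing up the monitor's data directory before running it with `DRY_RUN=0` is advisable, since an injected map replaces the one the monitor had.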
After updating one node to Pacific, its mon again fails to start:

# ceph-mon@storage01.service: Start request repeated too quickly.
# Jul 27 11:36:51 storage01 systemd[1]: ceph-mon@storage01.service: Failed with result 'signal'.
# Jul 27 11:36:51 storage01 systemd[1]: Failed to start Ceph cluster monitor daemon.
# ceph-mon -f -i storage01
./src/mds/FSMap.cc: In function 'void FSMap::decode(ceph::buffer::v15_2_0::list::const_iterator&)' thread 7f4ebbd36580 time 2021-07-27T12:18:03.538913+0200
./src/mds/FSMap.cc: 648: ceph_abort_msg("abort() called")
ceph version 16.2.5 (9b9dd76e12f1907fe5dcc0c1fadadbb784022a42) pacific (stable)
1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xd3) [0x7f4ebcc117ad]
2: (FSMap::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0x898) [0x7f4ebd16f1f8]
3: (MDSMonitor::update_from_paxos(bool*)+0x257) [0x5610d11374c7]
4: (Monitor::refresh_from_paxos(bool*)+0x163) [0x5610d0f04643]
5: (Monitor::preinit()+0x9af) [0x5610d0f3088f]
6: main()
7: __libc_start_main()
8: _start()
*** Caught signal (Aborted) **
in thread 7f4ebbd36580 thread_name:ceph-mon
2021-07-27T12:18:03.541+0200 7f4ebbd36580 -1 ./src/mds/FSMap.cc: In function 'void FSMap::decode(ceph::buffer::v15_2_0::list::const_iterator&)' thread 7f4ebbd36580 time 2021-07-27T12:18:03.538913+0200
./src/mds/FSMap.cc: 648: ceph_abort_msg("abort() called")

ceph version 16.2.5 (9b9dd76e12f1907fe5dcc0c1fadadbb784022a42) pacific (stable)
1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xd3) [0x7f4ebcc117ad]
2: (FSMap::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0x898) [0x7f4ebd16f1f8]
3: (MDSMonitor::update_from_paxos(bool*)+0x257) [0x5610d11374c7]
4: (Monitor::refresh_from_paxos(bool*)+0x163) [0x5610d0f04643]
5: (Monitor::preinit()+0x9af) [0x5610d0f3088f]
6: main()
7: __libc_start_main()
8: _start()

ceph version 16.2.5 (9b9dd76e12f1907fe5dcc0c1fadadbb784022a42) pacific (stable)
1: /lib/x86_64-linux-gnu/libpthread.so.0(+0x14140) [0x7f4ebc6f2140]
2: gsignal()
3: abort()
4: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x18a) [0x7f4ebcc11864]
5: (FSMap::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0x898) [0x7f4ebd16f1f8]
6: (MDSMonitor::update_from_paxos(bool*)+0x257) [0x5610d11374c7]
7: (Monitor::refresh_from_paxos(bool*)+0x163) [0x5610d0f04643]
8: (Monitor::preinit()+0x9af) [0x5610d0f3088f]
9: main()
10: __libc_start_main()
11: _start()
2021-07-27T12:18:03.561+0200 7f4ebbd36580 -1 *** Caught signal (Aborted) **
in thread 7f4ebbd36580 thread_name:ceph-mon

ceph version 16.2.5 (9b9dd76e12f1907fe5dcc0c1fadadbb784022a42) pacific (stable)
1: /lib/x86_64-linux-gnu/libpthread.so.0(+0x14140) [0x7f4ebc6f2140]
2: gsignal()
3: abort()
4: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x18a) [0x7f4ebcc11864]
5: (FSMap::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0x898) [0x7f4ebd16f1f8]
6: (MDSMonitor::update_from_paxos(bool*)+0x257) [0x5610d11374c7]
7: (Monitor::refresh_from_paxos(bool*)+0x163) [0x5610d0f04643]
8: (Monitor::preinit()+0x9af) [0x5610d0f3088f]
9: main()
10: __libc_start_main()
11: _start()
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

-1> 2021-07-27T12:18:03.541+0200 7f4ebbd36580 -1 ./src/mds/FSMap.cc: In function 'void FSMap::decode(ceph::buffer::v15_2_0::list::const_iterator&)' thread 7f4ebbd36580 time 2021-07-27T12:18:03.538913+0200
./src/mds/FSMap.cc: 648: ceph_abort_msg("abort() called")

ceph version 16.2.5 (9b9dd76e12f1907fe5dcc0c1fadadbb784022a42) pacific (stable)
1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xd3) [0x7f4ebcc117ad]
2: (FSMap::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0x898) [0x7f4ebd16f1f8]
3: (MDSMonitor::update_from_paxos(bool*)+0x257) [0x5610d11374c7]
4: (Monitor::refresh_from_paxos(bool*)+0x163) [0x5610d0f04643]
5: (Monitor::preinit()+0x9af) [0x5610d0f3088f]
6: main()
7: __libc_start_main()
8: _start()

0> 2021-07-27T12:18:03.561+0200 7f4ebbd36580 -1 *** Caught signal (Aborted) **
in thread 7f4ebbd36580 thread_name:ceph-mon

ceph version 16.2.5 (9b9dd76e12f1907fe5dcc0c1fadadbb784022a42) pacific (stable)
1: /lib/x86_64-linux-gnu/libpthread.so.0(+0x14140) [0x7f4ebc6f2140]
2: gsignal()
3: abort()
4: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x18a) [0x7f4ebcc11864]
5: (FSMap::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0x898) [0x7f4ebd16f1f8]
6: (MDSMonitor::update_from_paxos(bool*)+0x257) [0x5610d11374c7]
7: (Monitor::refresh_from_paxos(bool*)+0x163) [0x5610d0f04643]
8: (Monitor::preinit()+0x9af) [0x5610d0f3088f]
9: main()
10: __libc_start_main()
11: _start()
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Aborted


What could this point to?

Thanks and regards

Markus
 