Ceph - got timeout (500) - again

MarvinE

Hi everyone,

We have been using Ceph for a few days now. The VMs are still running, but Ceph seems to be "offline": the overview shows "got timeout (500)" on every node.
[Screenshots: Bildschirmfoto 2024-03-22 um 07.48.54.png, Bildschirmfoto 2024-03-22 um 07.49.00.png]

Code:
# pveceph status
command 'ceph -s' failed: got timeout

# ceph -s
2024-03-22T07:58:06.965+0100 736240a846c0  0 monclient(hunting): authenticate timed out after 300
[errno 110] RADOS timed out (error connecting to the cluster)

All Ceph mon, mgr, and osd services are running, but some report the following:
Code:
Mar 22 07:54:48 host04 ceph-mon[723835]: 2024-03-22T07:54:48.727+0100 73dafe6356c0 -1 mon.host04@1(probing) e8 get_health_metrics reporting 1642 slow ops, oldest is log(7 entries from seq 66 at 2024-03-21T05:06:34.539449+0100)
Mar 22 07:54:53 host04 ceph-mon[723835]: 2024-03-22T07:54:53.727+0100 73dafe6356c0 -1 mon.host04@1(probing) e8 get_health_metrics reporting 1651 slow ops, oldest is log(7 entries from seq 66 at 2024-03-21T05:06:34.539449+0100)
Mar 22 07:54:58 host04 ceph-mon[723835]: 2024-03-22T07:54:58.727+0100 73dafe6356c0 -1 mon.host04@1(probing) e8 get_health_metrics reporting 1661 slow ops, oldest is log(7 entries from seq 66 at 2024-03-21T05:06:34.539449+0100)
Mar 22 07:55:03 host04 ceph-mon[723835]: 2024-03-22T07:55:03.727+0100 73dafe6356c0 -1 mon.host04@1(probing) e8 get_health_metrics reporting 1670 slow ops, oldest is log(7 entries from seq 66 at 2024-03-21T05:06:34.539449+0100)
Mar 22 07:55:08 host04 ceph-mon[723835]: 2024-03-22T07:55:08.727+0100 73dafe6356c0 -1 mon.host04@1(probing) e8 get_health_metrics reporting 1676 slow ops, oldest is log(7 entries from seq 66 at 2024-03-21T05:06:34.539449+0100)
Mar 22 07:55:13 host04 ceph-mon[723835]: 2024-03-22T07:55:13.728+0100 73dafe6356c0 -1 mon.host04@1(probing) e8 get_health_metrics reporting 1688 slow ops, oldest is log(7 entries from seq 66 at 2024-03-21T05:06:34.539449+0100)
Mar 22 07:55:18 host04 ceph-mon[723835]: 2024-03-22T07:55:18.728+0100 73dafe6356c0 -1 mon.host04@1(probing) e8 get_health_metrics reporting 1705 slow ops, oldest is log(7 entries from seq 66 at 2024-03-21T05:06:34.539449+0100)
Mar 22 07:55:23 host04 ceph-mon[723835]: 2024-03-22T07:55:23.728+0100 73dafe6356c0 -1 mon.host04@1(probing) e8 get_health_metrics reporting 1708 slow ops, oldest is log(7 entries from seq 66 at 2024-03-21T05:06:34.539449+0100)
Mar 22 07:55:28 host04 ceph-mon[723835]: 2024-03-22T07:55:28.728+0100 73dafe6356c0 -1 mon.host04@1(probing) e8 get_health_metrics reporting 1718 slow ops, oldest is log(7 entries from seq 66 at 2024-03-21T05:06:34.539449+0100)
Mar 22 07:55:33 host04 ceph-mon[723835]: 2024-03-22T07:55:33.728+0100 73dafe6356c0 -1 mon.host04@1(probing) e8 get_health_metrics reporting 1731 slow ops, oldest is log(7 entries from seq 66 at 2024-03-21T05:06:34.539449+0100)

PVE Versions:
Code:
proxmox-ve: 8.1.0 (running kernel: 6.5.13-1-pve)
pve-manager: 8.1.4 (running version: 8.1.4/ec5affc9e41f1d79)
proxmox-kernel-helper: 8.1.0
proxmox-kernel-6.5.13-1-pve-signed: 6.5.13-1
proxmox-kernel-6.5: 6.5.13-1
proxmox-kernel-6.5.11-8-pve-signed: 6.5.11-8
ceph: 18.2.1-pve2
ceph-fuse: 18.2.1-pve2
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx8
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.0
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.3
libpve-access-control: 8.1.2
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.1.1
libpve-guest-common-perl: 5.0.6
libpve-http-server-perl: 5.0.5
libpve-network-perl: 0.9.5
libpve-rs-perl: 0.8.8
libpve-storage-perl: 8.1.0
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve4
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.1.4-1
proxmox-backup-file-restore: 3.1.4-1
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.3
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.5
proxmox-widget-toolkit: 4.1.4
pve-cluster: 8.0.5
pve-container: 5.0.8
pve-docs: 8.1.4
pve-edk2-firmware: 4.2023.08-4
pve-firewall: 5.0.3
pve-firmware: 3.9-2
pve-ha-manager: 4.0.3
pve-i18n: 3.2.1
pve-qemu-kvm: 8.1.5-3
pve-xtermjs: 5.3.0-3
qemu-server: 8.0.10
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.2-pve2

Any idea how to fix it?
 
I also had system freezes on kernel 6.5. I solved it by rolling back to the older kernel with: proxmox-boot-tool kernel pin 6.2.16-20-pve
 
@biotim thanks for this tip!

It's very strange: a different node triggered the problem.

Code:
# pvecm status
Cluster information
-------------------
Name:             cluster01
Config Version:   8
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Fri Mar 22 08:02:01 2024
Quorum provider:  corosync_votequorum
Nodes:            3
Node ID:          0x00000004
Ring ID:          3.2f4d4
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   4
Highest expected: 4
Total votes:      3
Quorum:           3 
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000003          1 10.XXX.X.X
0x00000004          1 10.XXX.X.X (local)
0x00000005          1 10.XXX.X.X
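
A note on the vote numbers above: corosync's votequorum needs a strict majority of the expected votes, that is floor(expected / 2) + 1. With 4 expected votes that is 3, so the cluster is quorate with exactly 3 votes present, and losing one more node would drop it below quorum. A minimal shell sketch of that arithmetic (the function name is made up for illustration and is not part of pvecm or corosync):

```shell
#!/bin/sh
# votes_needed is a hypothetical helper, not a Proxmox or corosync tool.
# It computes a strict majority: floor(expected / 2) + 1.
votes_needed() {
    expected=$1
    echo $(( expected / 2 + 1 ))
}

# Values taken from the pvecm status output above.
expected_votes=4
total_votes=3
needed=$(votes_needed "$expected_votes")

if [ "$total_votes" -ge "$needed" ]; then
    echo "quorate: $total_votes of $expected_votes votes, $needed needed"
else
    echo "not quorate: $total_votes of $expected_votes votes, $needed needed"
fi
```

With the values from this cluster the first branch is taken, which matches the "Quorate: Yes" line in the output.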

Code:
Mar 21 01:38:45 host03 ceph-mon[1980312]: [239B blob data]
Mar 21 01:38:45 host03 ceph-mon[1980312]: PutCF( prefix = paxos key = '2317815' value size = 1672)
Mar 21 01:38:45 host03 ceph-mon[1980312]: PutCF( prefix = paxos key = 'pending_v' value size = 8)
Mar 21 01:38:45 host03 ceph-mon[1980312]: PutCF( prefix = paxos key = 'pending_pn' value size = 8)
Mar 21 01:38:45 host03 ceph-mon[1980312]: ./src/mon/MonitorDBStore.h: In function 'int MonitorDBStore::apply_transaction(TransactionRef)' thread 71a8b18c66c0 time 2024-03-21T01:38:45.721560+0100
Mar 21 01:38:45 host03 ceph-mon[1980312]: ./src/mon/MonitorDBStore.h: 355: ceph_abort_msg("failed to write to db")
Mar 21 01:38:45 host03 ceph-mon[1980312]:  ceph version 18.2.1 (850293cdaae6621945e1191aa8c28ea2918269c3) reef (stable)
Mar 21 01:38:45 host03 ceph-mon[1980312]:  1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xd4) [0x71a8ba69bd83]
Mar 21 01:38:45 host03 ceph-mon[1980312]:  2: (MonitorDBStore::apply_transaction(std::shared_ptr<MonitorDBStore::Transaction>)+0xa41) [0x5ecc8a0359d1]
... -> see attachment

Then this node crashed, apparently due to a power supply failure!?
[Screenshot: Bildschirmfoto 2024-03-22 um 08.20.24.png]

After a fresh boot, everything is up and running again.
 

Attachment: ceph-mon_log.txt (17.7 KB)

A few days later the bug is back, on the same node at the same time of night:

Code:
Mar 23 01:38:54 host03 ceph-mon[1327]: [239B blob data]
Mar 23 01:38:54 host03 ceph-mon[1327]: PutCF( prefix = paxos key = '2403805' value size = 728)
Mar 23 01:38:54 host03 ceph-mon[1327]: PutCF( prefix = paxos key = 'pending_v' value size = 8)
Mar 23 01:38:54 host03 ceph-mon[1327]: PutCF( prefix = paxos key = 'pending_pn' value size = 8)
Mar 23 01:38:54 host03 ceph-mon[1327]: ./src/mon/MonitorDBStore.h: In function 'int MonitorDBStore::apply_transaction(TransactionRef)' thread 742cb2f2b6c0 time 2024-03-23T01:38:54.721702+0100
Mar 23 01:38:54 host03 ceph-mon[1327]: ./src/mon/MonitorDBStore.h: 355: ceph_abort_msg("failed to write to db")
Mar 23 01:38:54 host03 ceph-mon[1327]:  ceph version 18.2.1 (850293cdaae6621945e1191aa8c28ea2918269c3) reef (stable)
Mar 23 01:38:54 host03 ceph-mon[1327]:  1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xd4) [0x742cbbc9bd83]
Mar 23 01:38:54 host03 ceph-mon[1327]:  2: (MonitorDBStore::apply_transaction(std::shared_ptr<MonitorDBStore::Transaction>)+0xa41) [0x55d822e719d1]
Mar 23 01:38:54 host03 ceph-mon[1327]:  3: (Paxos::handle_begin(boost::intrusive_ptr<MonOpRequest>)+0x3c8) [0x55d822f714f8]
Mar 23 01:38:54 host03 ceph-mon[1327]:  4: (Paxos::dispatch(boost::intrusive_ptr<MonOpRequest>)+0x223) [0x55d822f7c223]
Mar 23 01:38:54 host03 ceph-mon[1327]:  5: (Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0xfa6) [0x55d822e42c76]
Mar 23 01:38:54 host03 ceph-mon[1327]:  6: (Monitor::_ms_dispatch(Message*)+0x3e8) [0x55d822e43248]
Mar 23 01:38:54 host03 ceph-mon[1327]:  7: (Dispatcher::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x45) [0x55d822e733d5]
Mar 23 01:38:54 host03 ceph-mon[1327]:  8: (Messenger::ms_deliver_dispatch(boost::intrusive_ptr<Message> const&)+0x398) [0x742cbbf15688]
Mar 23 01:38:54 host03 ceph-mon[1327]:  9: (DispatchQueue::entry()+0x6ef) [0x742cbbf132ef]
Mar 23 01:38:54 host03 ceph-mon[1327]:  10: (DispatchQueue::DispatchThread::entry()+0xd) [0x742cbbfcd10d]
Mar 23 01:38:54 host03 ceph-mon[1327]:  11: /lib/x86_64-linux-gnu/libc.so.6(+0x89134) [0x742cbb8a8134]
Mar 23 01:38:54 host03 ceph-mon[1327]:  12: /lib/x86_64-linux-gnu/libc.so.6(+0x1097dc) [0x742cbb9287dc]
Mar 23 01:38:54 host03 ceph-mon[1327]: *** Caught signal (Aborted) **
Mar 23 01:38:54 host03 ceph-mon[1327]:  in thread 742cb2f2b6c0 thread_name:ms_dispatch
Mar 23 01:38:54 host03 ceph-mon[1327]: 2024-03-23T01:38:54.732+0100 742cb2f2b6c0 -1 ./src/mon/MonitorDBStore.h: In function 'int MonitorDBStore::apply_transaction(TransactionRef)' thread 742cb2f2b6c0 time 2024-03-23T01:38:54.721702+0100
Mar 23 01:38:54 host03 ceph-mon[1327]: ./src/mon/MonitorDBStore.h: 355: ceph_abort_msg("failed to write to db")
Mar 23 01:38:54 host03 ceph-mon[1327]:  ceph version 18.2.1 (850293cdaae6621945e1191aa8c28ea2918269c3) reef (stable)
Mar 23 01:38:54 host03 ceph-mon[1327]:  1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xd4) [0x742cbbc9bd83]
Mar 23 01:38:54 host03 ceph-mon[1327]:  2: (MonitorDBStore::apply_transaction(std::shared_ptr<MonitorDBStore::Transaction>)+0xa41) [0x55d822e719d1]
Mar 23 01:38:54 host03 ceph-mon[1327]:  3: (Paxos::handle_begin(boost::intrusive_ptr<MonOpRequest>)+0x3c8) [0x55d822f714f8]
Mar 23 01:38:54 host03 ceph-mon[1327]:  4: (Paxos::dispatch(boost::intrusive_ptr<MonOpRequest>)+0x223) [0x55d822f7c223]
Mar 23 01:38:54 host03 ceph-mon[1327]:  5: (Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0xfa6) [0x55d822e42c76]
Mar 23 01:38:54 host03 ceph-mon[1327]:  6: (Monitor::_ms_dispatch(Message*)+0x3e8) [0x55d822e43248]
Mar 23 01:38:54 host03 ceph-mon[1327]:  7: (Dispatcher::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x45) [0x55d822e733d5]
Mar 23 01:38:54 host03 ceph-mon[1327]:  8: (Messenger::ms_deliver_dispatch(boost::intrusive_ptr<Message> const&)+0x398) [0x742cbbf15688]
Mar 23 01:38:54 host03 ceph-mon[1327]:  9: (DispatchQueue::entry()+0x6ef) [0x742cbbf132ef]
Mar 23 01:38:54 host03 ceph-mon[1327]:  10: (DispatchQueue::DispatchThread::entry()+0xd) [0x742cbbfcd10d]
Mar 23 01:38:54 host03 ceph-mon[1327]:  11: /lib/x86_64-linux-gnu/libc.so.6(+0x89134) [0x742cbb8a8134]
Mar 23 01:38:54 host03 ceph-mon[1327]:  12: /lib/x86_64-linux-gnu/libc.so.6(+0x1097dc) [0x742cbb9287dc]
Mar 23 01:38:54 host03 ceph-mon[1327]:  ceph version 18.2.1 (850293cdaae6621945e1191aa8c28ea2918269c3) reef (stable)
Mar 23 01:38:54 host03 ceph-mon[1327]:  1: /lib/x86_64-linux-gnu/libc.so.6(+0x3c050) [0x742cbb85b050]
Mar 23 01:38:54 host03 ceph-mon[1327]:  2: /lib/x86_64-linux-gnu/libc.so.6(+0x8ae2c) [0x742cbb8a9e2c]
Mar 23 01:38:54 host03 ceph-mon[1327]:  3: gsignal()
Mar 23 01:38:54 host03 ceph-mon[1327]:  4: abort()
Mar 23 01:38:54 host03 ceph-mon[1327]:  5: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x18a) [0x742cbbc9be39]
Mar 23 01:38:54 host03 ceph-mon[1327]:  6: (MonitorDBStore::apply_transaction(std::shared_ptr<MonitorDBStore::Transaction>)+0xa41) [0x55d822e719d1]
Mar 23 01:38:54 host03 ceph-mon[1327]:  7: (Paxos::handle_begin(boost::intrusive_ptr<MonOpRequest>)+0x3c8) [0x55d822f714f8]
Mar 23 01:38:54 host03 ceph-mon[1327]:  8: (Paxos::dispatch(boost::intrusive_ptr<MonOpRequest>)+0x223) [0x55d822f7c223]
Mar 23 01:38:54 host03 ceph-mon[1327]:  9: (Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0xfa6) [0x55d822e42c76]
Mar 23 01:38:54 host03 ceph-mon[1327]:  10: (Monitor::_ms_dispatch(Message*)+0x3e8) [0x55d822e43248]
Mar 23 01:38:54 host03 ceph-mon[1327]:  11: (Dispatcher::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x45) [0x55d822e733d5]
Mar 23 01:38:54 host03 ceph-mon[1327]:  12: (Messenger::ms_deliver_dispatch(boost::intrusive_ptr<Message> const&)+0x398) [0x742cbbf15688]
Mar 23 01:38:54 host03 ceph-mon[1327]:  13: (DispatchQueue::entry()+0x6ef) [0x742cbbf132ef]
Mar 23 01:38:54 host03 ceph-mon[1327]:  14: (DispatchQueue::DispatchThread::entry()+0xd) [0x742cbbfcd10d]
Mar 23 01:38:54 host03 ceph-mon[1327]:  15: /lib/x86_64-linux-gnu/libc.so.6(+0x89134) [0x742cbb8a8134]
Mar 23 01:38:54 host03 ceph-mon[1327]:  16: /lib/x86_64-linux-gnu/libc.so.6(+0x1097dc) [0x742cbb9287dc]
Mar 23 01:38:54 host03 ceph-mon[1327]: 2024-03-23T01:38:54.748+0100 742cb2f2b6c0 -1 *** Caught signal (Aborted) **
Mar 23 01:38:54 host03 ceph-mon[1327]:  in thread 742cb2f2b6c0 thread_name:ms_dispatch
Mar 23 01:38:54 host03 ceph-mon[1327]:  ceph version 18.2.1 (850293cdaae6621945e1191aa8c28ea2918269c3) reef (stable)
Mar 23 01:38:54 host03 ceph-mon[1327]:  1: /lib/x86_64-linux-gnu/libc.so.6(+0x3c050) [0x742cbb85b050]
Mar 23 01:38:54 host03 ceph-mon[1327]:  2: /lib/x86_64-linux-gnu/libc.so.6(+0x8ae2c) [0x742cbb8a9e2c]
Mar 23 01:38:54 host03 ceph-mon[1327]:  3: gsignal()
Mar 23 01:38:54 host03 ceph-mon[1327]:  4: abort()
Mar 23 01:38:54 host03 ceph-mon[1327]:  5: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x18a) [0x742cbbc9be39]
Mar 23 01:38:54 host03 ceph-mon[1327]:  6: (MonitorDBStore::apply_transaction(std::shared_ptr<MonitorDBStore::Transaction>)+0xa41) [0x55d822e719d1]
Mar 23 01:38:54 host03 ceph-mon[1327]:  7: (Paxos::handle_begin(boost::intrusive_ptr<MonOpRequest>)+0x3c8) [0x55d822f714f8]
Mar 23 01:38:54 host03 ceph-mon[1327]:  8: (Paxos::dispatch(boost::intrusive_ptr<MonOpRequest>)+0x223) [0x55d822f7c223]
Mar 23 01:38:54 host03 ceph-mon[1327]:  9: (Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0xfa6) [0x55d822e42c76]
Mar 23 01:38:54 host03 ceph-mon[1327]:  10: (Monitor::_ms_dispatch(Message*)+0x3e8) [0x55d822e43248]
Mar 23 01:38:54 host03 ceph-mon[1327]:  11: (Dispatcher::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x45) [0x55d822e733d5]
Mar 23 01:38:54 host03 ceph-mon[1327]:  12: (Messenger::ms_deliver_dispatch(boost::intrusive_ptr<Message> const&)+0x398) [0x742cbbf15688]
Mar 23 01:38:54 host03 ceph-mon[1327]:  13: (DispatchQueue::entry()+0x6ef) [0x742cbbf132ef]
Mar 23 01:38:54 host03 ceph-mon[1327]:  14: (DispatchQueue::DispatchThread::entry()+0xd) [0x742cbbfcd10d]
Mar 23 01:38:54 host03 ceph-mon[1327]:  15: /lib/x86_64-linux-gnu/libc.so.6(+0x89134) [0x742cbb8a8134]
Mar 23 01:38:54 host03 ceph-mon[1327]:  16: /lib/x86_64-linux-gnu/libc.so.6(+0x1097dc) [0x742cbb9287dc]
Mar 23 01:38:54 host03 ceph-mon[1327]:  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Mar 23 01:38:54 host03 ceph-mon[1327]: [247B blob data]
Mar 23 01:38:54 host03 ceph-mon[1327]: PutCF( prefix = paxos key = '2403805' value size = 728)
Mar 23 01:38:54 host03 ceph-mon[1327]: PutCF( prefix = paxos key = 'pending_v' value size = 8)
Mar 23 01:38:54 host03 ceph-mon[1327]: PutCF( prefix = paxos key = 'pending_pn' value size = 8)
Mar 23 01:38:54 host03 ceph-mon[1327]:  -9998> 2024-03-23T01:38:54.732+0100 742cb2f2b6c0 -1 ./src/mon/MonitorDBStore.h: In function 'int MonitorDBStore::apply_transaction(TransactionRef)' thread 742cb2f2b6c0 time 2024-03-23T01:38:54.7217>
Mar 23 01:38:54 host03 ceph-mon[1327]: ./src/mon/MonitorDBStore.h: 355: ceph_abort_msg("failed to write to db")
Mar 23 01:38:54 host03 ceph-mon[1327]:  ceph version 18.2.1 (850293cdaae6621945e1191aa8c28ea2918269c3) reef (stable)
Mar 23 01:38:54 host03 ceph-mon[1327]:  1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xd4) [0x742cbbc9bd83]
Mar 23 01:38:54 host03 ceph-mon[1327]:  2: (MonitorDBStore::apply_transaction(std::shared_ptr<MonitorDBStore::Transaction>)+0xa41) [0x55d822e719d1]
Mar 23 01:38:54 host03 ceph-mon[1327]:  3: (Paxos::handle_begin(boost::intrusive_ptr<MonOpRequest>)+0x3c8) [0x55d822f714f8]
Mar 23 01:38:54 host03 ceph-mon[1327]:  4: (Paxos::dispatch(boost::intrusive_ptr<MonOpRequest>)+0x223) [0x55d822f7c223]
Mar 23 01:38:54 host03 ceph-mon[1327]:  5: (Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0xfa6) [0x55d822e42c76]
Mar 23 01:38:54 host03 ceph-mon[1327]:  6: (Monitor::_ms_dispatch(Message*)+0x3e8) [0x55d822e43248]
Mar 23 01:38:54 host03 ceph-mon[1327]:  7: (Dispatcher::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x45) [0x55d822e733d5]
Mar 23 01:38:54 host03 ceph-mon[1327]:  8: (Messenger::ms_deliver_dispatch(boost::intrusive_ptr<Message> const&)+0x398) [0x742cbbf15688]
Mar 23 01:38:54 host03 ceph-mon[1327]:  9: (DispatchQueue::entry()+0x6ef) [0x742cbbf132ef]
Mar 23 01:38:54 host03 ceph-mon[1327]:  10: (DispatchQueue::DispatchThread::entry()+0xd) [0x742cbbfcd10d]
Mar 23 01:38:54 host03 ceph-mon[1327]:  11: /lib/x86_64-linux-gnu/libc.so.6(+0x89134) [0x742cbb8a8134]
Mar 23 01:38:54 host03 ceph-mon[1327]:  12: /lib/x86_64-linux-gnu/libc.so.6(+0x1097dc) [0x742cbb9287dc]
Mar 23 01:38:54 host03 ceph-mon[1327]:  -9997> 2024-03-23T01:38:54.748+0100 742cb2f2b6c0 -1 *** Caught signal (Aborted) **
Mar 23 01:38:54 host03 ceph-mon[1327]:  in thread 742cb2f2b6c0 thread_name:ms_dispatch
Mar 23 01:38:54 host03 ceph-mon[1327]:  ceph version 18.2.1 (850293cdaae6621945e1191aa8c28ea2918269c3) reef (stable)
Mar 23 01:38:54 host03 ceph-mon[1327]:  1: /lib/x86_64-linux-gnu/libc.so.6(+0x3c050) [0x742cbb85b050]
Mar 23 01:38:54 host03 ceph-mon[1327]:  2: /lib/x86_64-linux-gnu/libc.so.6(+0x8ae2c) [0x742cbb8a9e2c]
Mar 23 01:38:54 host03 ceph-mon[1327]:  3: gsignal()
Mar 23 01:38:54 host03 ceph-mon[1327]:  4: abort()
Mar 23 01:38:54 host03 ceph-mon[1327]:  5: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x18a) [0x742cbbc9be39]
Mar 23 01:38:54 host03 ceph-mon[1327]:  6: (MonitorDBStore::apply_transaction(std::shared_ptr<MonitorDBStore::Transaction>)+0xa41) [0x55d822e719d1]
Mar 23 01:38:54 host03 ceph-mon[1327]:  7: (Paxos::handle_begin(boost::intrusive_ptr<MonOpRequest>)+0x3c8) [0x55d822f714f8]
Mar 23 01:38:54 host03 ceph-mon[1327]:  8: (Paxos::dispatch(boost::intrusive_ptr<MonOpRequest>)+0x223) [0x55d822f7c223]
Mar 23 01:38:54 host03 ceph-mon[1327]:  9: (Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0xfa6) [0x55d822e42c76]
Mar 23 01:38:54 host03 ceph-mon[1327]:  10: (Monitor::_ms_dispatch(Message*)+0x3e8) [0x55d822e43248]
Mar 23 01:38:54 host03 ceph-mon[1327]:  11: (Dispatcher::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x45) [0x55d822e733d5]
Mar 23 01:38:54 host03 ceph-mon[1327]:  12: (Messenger::ms_deliver_dispatch(boost::intrusive_ptr<Message> const&)+0x398) [0x742cbbf15688]
Mar 23 01:38:54 host03 ceph-mon[1327]:  13: (DispatchQueue::entry()+0x6ef) [0x742cbbf132ef]
Mar 23 01:38:54 host03 ceph-mon[1327]:  14: (DispatchQueue::DispatchThread::entry()+0xd) [0x742cbbfcd10d]
Mar 23 01:38:54 host03 ceph-mon[1327]:  4: abort()
Mar 23 01:38:54 host03 ceph-mon[1327]:  5: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x18a) [0x742cbbc9be39]
Mar 23 01:38:54 host03 ceph-mon[1327]:  6: (MonitorDBStore::apply_transaction(std::shared_ptr<MonitorDBStore::Transaction>)+0xa41) [0x55d822e719d1]
... see attachment for more.
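
To check whether the aborts really line up at the same time of day, the journal lines can be reduced to one timestamp per crash. The sketch below assumes the ceph-mon journal is available as text in the format shown above; the function name is a placeholder chosen for illustration.

```shell
#!/bin/sh
# abort_times is a hypothetical helper: it reads journal lines on stdin and
# prints the syslog timestamp and host of every "failed to write to db" abort.
# uniq collapses consecutive identical lines, so the repeated dumps of one
# crash show up only once.
abort_times() {
    grep -F 'ceph_abort_msg("failed to write to db")' |
        awk '{ print $1, $2, $3, $4 }' |
        uniq
}

# Sample input copied from the two log excerpts in this thread.
abort_times <<'EOF'
Mar 21 01:38:45 host03 ceph-mon[1980312]: ./src/mon/MonitorDBStore.h: 355: ceph_abort_msg("failed to write to db")
Mar 23 01:38:54 host03 ceph-mon[1327]: ./src/mon/MonitorDBStore.h: 355: ceph_abort_msg("failed to write to db")
Mar 23 01:38:54 host03 ceph-mon[1327]: ./src/mon/MonitorDBStore.h: 355: ceph_abort_msg("failed to write to db")
EOF
```

On a live node the input could come from something like journalctl -u ceph-mon@host03 piped into the function; that unit name is an assumption based on the usual ceph-mon@<id> naming.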

Any idea?
 

Attachment: ceph-mon_log.txt (18.6 KB)

Something interesting:

I think this could be the problem.
When our LXC backups are running, they fill up the root disk.
[Screenshot: Bildschirmfoto 2024-03-26 um 08.17.42.png]

I resized pve/root and will keep an eye on it.
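
A full root filesystem would fit the "failed to write to db" abort, assuming the monitor store sits on the root filesystem (on a default install it lives under /var/lib/ceph/mon). A small sketch for watching usage during the backup window follows; the function name and the 90% threshold are arbitrary choices for illustration.

```shell
#!/bin/sh
# capacity_percent is a hypothetical helper: it reads "df -P" output on stdin
# and prints the Capacity column of the first filesystem line without the "%".
capacity_percent() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

threshold=90   # arbitrary warning level, adjust as needed
used=$(df -P / | capacity_percent)

if [ "$used" -ge "$threshold" ]; then
    echo "WARNING: / is ${used}% full"
else
    echo "/ is ${used}% full"
fi
```

Run from cron or a loop while the backup job is active, this would show whether the root disk approaches 100% around the time the monitor aborts.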
 
