Ceph OSD stopped and out

frantek

Member
May 30, 2009
160
4
18
Hi,

I've a problem with one OSD in my Ceph cluster:

Code:
# ceph health detail
HEALTH_ERR 1 scrub errors; Possible data damage: 1 pg inconsistent
OSD_SCRUB_ERRORS 1 scrub errors
PG_DAMAGED Possible data damage: 1 pg inconsistent
    pg 7.2fa is active+clean+inconsistent, acting [13,6,16]
Code:
# systemctl status ceph-osd@14.service
● ceph-osd@14.service - Ceph object storage daemon osd.14
   Loaded: loaded (/lib/systemd/system/ceph-osd@.service; enabled-runtime; vendor preset: enabled)
  Drop-In: /lib/systemd/system/ceph-osd@.service.d
           └─ceph-after-pve-cluster.conf
   Active: failed (Result: signal) since Wed 2019-05-15 10:20:26 CEST; 6min ago
  Process: 3225166 ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id 14 --setuser ceph --setgroup ceph (code=killed, signal=ABRT)
  Process: 3225161 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id 14 (code=exited, status=0/SUCCESS)
 Main PID: 3225166 (code=killed, signal=ABRT)

Mai 15 10:20:26 pve03 systemd[1]: ceph-osd@14.service: Start request repeated too quickly.
Mai 15 10:20:26 pve03 systemd[1]: Failed to start Ceph object storage daemon osd.14.
Mai 15 10:20:26 pve03 systemd[1]: ceph-osd@14.service: Unit entered failed state.
Mai 15 10:20:26 pve03 systemd[1]: ceph-osd@14.service: Failed with result 'signal'.
Mai 15 10:23:20 pve03 systemd[1]: ceph-osd@14.service: Start request repeated too quickly.
Mai 15 10:23:20 pve03 systemd[1]: Failed to start Ceph object storage daemon osd.14.
Mai 15 10:23:20 pve03 systemd[1]: ceph-osd@14.service: Failed with result 'signal'.
Mai 15 10:24:14 pve03 systemd[1]: ceph-osd@14.service: Start request repeated too quickly.
Mai 15 10:24:14 pve03 systemd[1]: Failed to start Ceph object storage daemon osd.14.
Mai 15 10:24:14 pve03 systemd[1]: ceph-osd@14.service: Failed with result 'signal'.
Code:
# pveversion --verbose
proxmox-ve: 5.4-1 (running kernel: 4.15.18-12-pve)
pve-manager: 5.4-5 (running version: 5.4-5/c6fdb264)
pve-kernel-4.15: 5.4-2
pve-kernel-4.15.18-14-pve: 4.15.18-38
pve-kernel-4.15.18-13-pve: 4.15.18-37
pve-kernel-4.15.18-12-pve: 4.15.18-36
pve-kernel-4.15.18-11-pve: 4.15.18-34
pve-kernel-4.15.18-10-pve: 4.15.18-32
ceph: 12.2.12-pve1
corosync: 2.4.4-pve1
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.1-9
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-51
libpve-guest-common-perl: 2.0-20
libpve-http-server-perl: 2.0-13
libpve-storage-perl: 5.0-42
libqb0: 1.0.3-1~bpo9
lvm2: 2.02.168-pve6
lxc-pve: 3.1.0-3
lxcfs: 3.0.3-pve1
novnc-pve: 1.0.0-3
proxmox-widget-toolkit: 1.0-26
pve-cluster: 5.0-37
pve-container: 2.0-37
pve-docs: 5.4-2
pve-edk2-firmware: 1.20190312-1
pve-firewall: 3.0-20
pve-firmware: 2.0-6
pve-ha-manager: 2.0-9
pve-i18n: 1.1-4
pve-libspice-server1: 0.14.1-2
pve-qemu-kvm: 3.0.1-2
pve-xtermjs: 3.12.0-1
qemu-server: 5.0-51
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.13-pve1~bpo2

How can I fix this?

TIA
 

frantek

Hmm, the scrub errors have now gone away without my doing anything. Now I get:

Code:
~# ceph health detail
HEALTH_WARN 1 osds down; 44423/801015 objects misplaced (5.546%)
OSD_DOWN 1 osds down
    osd.14 (root=default,host=pve03) is down
OBJECT_MISPLACED 44423/801015 objects misplaced (5.546%)
Code:
# systemctl status ceph-osd@14.service
● ceph-osd@14.service - Ceph object storage daemon osd.14
   Loaded: loaded (/lib/systemd/system/ceph-osd@.service; enabled-runtime; vendor preset: enabled)
  Drop-In: /lib/systemd/system/ceph-osd@.service.d
           └─ceph-after-pve-cluster.conf
   Active: activating (auto-restart) (Result: signal) since Wed 2019-05-15 19:38:15 CEST; 7s ago
  Process: 3633324 ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id 14 --setuser ceph --setgroup ceph (code=killed, signal=ABRT)
  Process: 3633319 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id 14 (code=exited, status=0/SUCCESS)
 Main PID: 3633324 (code=killed, signal=ABRT)

Mai 15 19:38:15 pve03 systemd[1]: ceph-osd@14.service: Failed with result 'signal'.
I can start the OSD, but then it crashes with:

Code:
--
-- Unit pvesr.service has begun starting up.
Mai 15 19:39:01 pve03 systemd[1]: Started Proxmox VE replication runner.
-- Subject: Unit pvesr.service has finished start-up
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Unit pvesr.service has finished starting up.
--
-- The start-up result is done.
Mai 15 19:39:05 pve03 systemd[1]: ceph-osd@14.service: Service hold-off time over, scheduling restart.
Mai 15 19:39:05 pve03 systemd[1]: Stopped Ceph object storage daemon osd.14.
-- Subject: Unit ceph-osd@14.service has finished shutting down
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Unit ceph-osd@14.service has finished shutting down.
Mai 15 19:39:05 pve03 systemd[1]: Starting Ceph object storage daemon osd.14...
-- Subject: Unit ceph-osd@14.service has begun start-up
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Unit ceph-osd@14.service has begun starting up.
Mai 15 19:39:05 pve03 systemd[1]: Started Ceph object storage daemon osd.14.
-- Subject: Unit ceph-osd@14.service has finished start-up
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Unit ceph-osd@14.service has finished starting up.
--
-- The start-up result is done.
Mai 15 19:39:05 pve03 ceph-osd[3634089]: starting osd.14 at - osd_data /var/lib/ceph/osd/ceph-14 /var/lib/ceph/osd/ceph-14/journal
Mai 15 19:39:08 pve03 kernel: sd 4:0:17:0: [sdg] tag#1 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Mai 15 19:39:08 pve03 kernel: sd 4:0:17:0: [sdg] tag#1 Sense Key : Medium Error [current]
Mai 15 19:39:08 pve03 kernel: sd 4:0:17:0: [sdg] tag#1 Add. Sense: Unrecovered read error
Mai 15 19:39:08 pve03 kernel: sd 4:0:17:0: [sdg] tag#1 CDB: Read(10) 28 00 05 d8 4c 60 00 00 08 00
Mai 15 19:39:08 pve03 kernel: print_req_error: critical medium error, dev sdg, sector 98061408
Mai 15 19:39:09 pve03 kernel: sd 4:0:17:0: [sdg] tag#1 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Mai 15 19:39:09 pve03 kernel: sd 4:0:17:0: [sdg] tag#1 Sense Key : Medium Error [current]
Mai 15 19:39:09 pve03 kernel: sd 4:0:17:0: [sdg] tag#1 Add. Sense: Unrecovered read error
Mai 15 19:39:09 pve03 kernel: sd 4:0:17:0: [sdg] tag#1 CDB: Read(10) 28 00 05 d8 4c 60 00 00 08 00
Mai 15 19:39:09 pve03 kernel: print_req_error: critical medium error, dev sdg, sector 98061408
Mai 15 19:39:12 pve03 kernel: sd 4:0:17:0: [sdg] tag#1 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Mai 15 19:39:12 pve03 kernel: sd 4:0:17:0: [sdg] tag#1 Sense Key : Medium Error [current]
Mai 15 19:39:12 pve03 kernel: sd 4:0:17:0: [sdg] tag#1 Add. Sense: Unrecovered read error
Mai 15 19:39:12 pve03 kernel: sd 4:0:17:0: [sdg] tag#1 CDB: Read(10) 28 00 05 d8 4c 60 00 00 08 00
Mai 15 19:39:12 pve03 kernel: print_req_error: critical medium error, dev sdg, sector 98061408
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 2019-05-15 19:39:12.065699 7fccb4415e00 -1 filestore(/var/lib/ceph/osd/ceph-14) _write(3544): wri
Mai 15 19:39:12 pve03 ceph-osd[3634089]: /mnt/pve/store/tlamprecht/sources/ceph/ceph-12.2.12/src/os/filestore/FileStore.cc: In function 'v
Mai 15 19:39:12 pve03 ceph-osd[3634089]: /mnt/pve/store/tlamprecht/sources/ceph/ceph-12.2.12/src/os/filestore/FileStore.cc: 3185: FAILED a
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 2019-05-15 19:39:12.065753 7fccb4415e00 -1 filestore(/var/lib/ceph/osd/ceph-14) error (5) Input/
Mai 15 19:39:12 pve03 ceph-osd[3634089]: ceph version 12.2.12 (39cfebf25a7011204a9876d2950e4b28aba66d11) luminous (stable)
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x563d3983f262]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 2: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHand
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 3: (FileStore::_do_transactions(std::vector<ObjectStore::Transaction, std::allocator<ObjectStore
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 4: (JournalingObjectStore::journal_replay(unsigned long)+0xdda) [0x563d395fa2ea]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 5: (FileStore::mount()+0x48f8) [0x563d395e4d18]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 6: (OSD::init()+0x3e2) [0x563d3923e772]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 7: (main()+0x3092) [0x563d391481c2]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 8: (__libc_start_main()+0xf1) [0x7fccb09d22e1]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 9: (_start()+0x2a) [0x563d391d48ca]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 2019-05-15 19:39:12.069017 7fccb4415e00 -1 /mnt/pve/store/tlamprecht/sources/ceph/ceph-12.2.12/sr
Mai 15 19:39:12 pve03 ceph-osd[3634089]: /mnt/pve/store/tlamprecht/sources/ceph/ceph-12.2.12/src/os/filestore/FileStore.cc: 3185: FAILED a
Mai 15 19:39:12 pve03 ceph-osd[3634089]: ceph version 12.2.12 (39cfebf25a7011204a9876d2950e4b28aba66d11) luminous (stable)
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x563d3983f262]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 2: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHand
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 3: (FileStore::_do_transactions(std::vector<ObjectStore::Transaction, std::allocator<ObjectStore
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 4: (JournalingObjectStore::journal_replay(unsigned long)+0xdda) [0x563d395fa2ea]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 5: (FileStore::mount()+0x48f8) [0x563d395e4d18]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 6: (OSD::init()+0x3e2) [0x563d3923e772]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 7: (main()+0x3092) [0x563d391481c2]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 8: (__libc_start_main()+0xf1) [0x7fccb09d22e1]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 9: (_start()+0x2a) [0x563d391d48ca]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Mai 15 19:39:12 pve03 ceph-osd[3634089]: -4> 2019-05-15 19:39:12.065699 7fccb4415e00 -1 filestore(/var/lib/ceph/osd/ceph-14) _write(35
Mai 15 19:39:12 pve03 ceph-osd[3634089]: -3> 2019-05-15 19:39:12.065753 7fccb4415e00 -1 filestore(/var/lib/ceph/osd/ceph-14) error (5
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 0> 2019-05-15 19:39:12.069017 7fccb4415e00 -1 /mnt/pve/store/tlamprecht/sources/ceph/ceph-12
Mai 15 19:39:12 pve03 ceph-osd[3634089]: /mnt/pve/store/tlamprecht/sources/ceph/ceph-12.2.12/src/os/filestore/FileStore.cc: 3185: FAILED a
Mai 15 19:39:12 pve03 ceph-osd[3634089]: ceph version 12.2.12 (39cfebf25a7011204a9876d2950e4b28aba66d11) luminous (stable)
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x563d3983f262]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 2: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHand
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 3: (FileStore::_do_transactions(std::vector<ObjectStore::Transaction, std::allocator<ObjectStore
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 4: (JournalingObjectStore::journal_replay(unsigned long)+0xdda) [0x563d395fa2ea]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 5: (FileStore::mount()+0x48f8) [0x563d395e4d18]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 6: (OSD::init()+0x3e2) [0x563d3923e772]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 7: (main()+0x3092) [0x563d391481c2]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 8: (__libc_start_main()+0xf1) [0x7fccb09d22e1]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 9: (_start()+0x2a) [0x563d391d48ca]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Mai 15 19:39:12 pve03 ceph-osd[3634089]: *** Caught signal (Aborted) **
Mai 15 19:39:12 pve03 ceph-osd[3634089]: in thread 7fccb4415e00 thread_name:ceph-osd
Mai 15 19:39:12 pve03 ceph-osd[3634089]: ceph version 12.2.12 (39cfebf25a7011204a9876d2950e4b28aba66d11) luminous (stable)
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 1: (()+0xa59c94) [0x563d397f6c94]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 2: (()+0x110e0) [0x7fccb1a1d0e0]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 3: (gsignal()+0xcf) [0x7fccb09e4fff]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 4: (abort()+0x16a) [0x7fccb09e642a]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x28e) [0x563d3983f3ee]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 6: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHand
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 7: (FileStore::_do_transactions(std::vector<ObjectStore::Transaction, std::allocator<ObjectStore
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 8: (JournalingObjectStore::journal_replay(unsigned long)+0xdda) [0x563d395fa2ea]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 9: (FileStore::mount()+0x48f8) [0x563d395e4d18]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 10: (OSD::init()+0x3e2) [0x563d3923e772]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 11: (main()+0x3092) [0x563d391481c2]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 12: (__libc_start_main()+0xf1) [0x7fccb09d22e1]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 13: (_start()+0x2a) [0x563d391d48ca]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 2019-05-15 19:39:12.072802 7fccb4415e00 -1 *** Caught signal (Aborted) **
Mai 15 19:39:12 pve03 ceph-osd[3634089]: in thread 7fccb4415e00 thread_name:ceph-osd
Mai 15 19:39:12 pve03 ceph-osd[3634089]: ceph version 12.2.12 (39cfebf25a7011204a9876d2950e4b28aba66d11) luminous (stable)
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 1: (()+0xa59c94) [0x563d397f6c94]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 2: (()+0x110e0) [0x7fccb1a1d0e0]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 3: (gsignal()+0xcf) [0x7fccb09e4fff]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 4: (abort()+0x16a) [0x7fccb09e642a]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x28e) [0x563d3983f3ee]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 6: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHand
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 7: (FileStore::_do_transactions(std::vector<ObjectStore::Transaction, std::allocator<ObjectStore
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 8: (JournalingObjectStore::journal_replay(unsigned long)+0xdda) [0x563d395fa2ea]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 9: (FileStore::mount()+0x48f8) [0x563d395e4d18]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 10: (OSD::init()+0x3e2) [0x563d3923e772]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 11: (main()+0x3092) [0x563d391481c2]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 12: (__libc_start_main()+0xf1) [0x7fccb09d22e1]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 13: (_start()+0x2a) [0x563d391d48ca]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 0> 2019-05-15 19:39:12.072802 7fccb4415e00 -1 *** Caught signal (Aborted) **
Mai 15 19:39:12 pve03 ceph-osd[3634089]: in thread 7fccb4415e00 thread_name:ceph-osd
Mai 15 19:39:12 pve03 ceph-osd[3634089]: ceph version 12.2.12 (39cfebf25a7011204a9876d2950e4b28aba66d11) luminous (stable)
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 1: (()+0xa59c94) [0x563d397f6c94]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 2: (()+0x110e0) [0x7fccb1a1d0e0]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 3: (gsignal()+0xcf) [0x7fccb09e4fff]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 4: (abort()+0x16a) [0x7fccb09e642a]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x28e) [0x563d3983f3ee]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 6: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHand
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 7: (FileStore::_do_transactions(std::vector<ObjectStore::Transaction, std::allocator<ObjectStore
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 8: (JournalingObjectStore::journal_replay(unsigned long)+0xdda) [0x563d395fa2ea]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 9: (FileStore::mount()+0x48f8) [0x563d395e4d18]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 10: (OSD::init()+0x3e2) [0x563d3923e772]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 11: (main()+0x3092) [0x563d391481c2]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 12: (__libc_start_main()+0xf1) [0x7fccb09d22e1]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: 13: (_start()+0x2a) [0x563d391d48ca]
Mai 15 19:39:12 pve03 ceph-osd[3634089]: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Mai 15 19:39:12 pve03 systemd[1]: ceph-osd@14.service: Main process exited, code=killed, status=6/ABRT
Mai 15 19:39:12 pve03 systemd[1]: ceph-osd@14.service: Unit entered failed state.
Mai 15 19:39:12 pve03 systemd[1]: ceph-osd@14.service: Failed with result 'signal'.

Code:
Mai 15 19:39:08 pve03 kernel: print_req_error: critical medium error, dev sdg, sector 98061408
That sounds like I should replace the disk, right?
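
As a cross-check, the Read(10) CDB in the kernel messages should encode the same block address that print_req_error reports. Below is a minimal sketch, assuming the standard 10-byte Read(10) layout (opcode, flags, 4-byte big-endian LBA, group number, 2-byte transfer length, control). `decode_read10` is a hypothetical helper name, not part of any tool.

```shell
# Decode the LBA and transfer length from a SCSI Read(10) CDB that is
# given as ten space-separated hex bytes (as printed in the kernel log).
# decode_read10 is a hypothetical helper name used only for illustration.
decode_read10() {
    # Split the string into positional parameters:
    # $1 = opcode, $3..$6 = LBA bytes, $8..$9 = transfer length bytes
    set -- $1
    lba=$(( 0x$3$4$5$6 ))
    blocks=$(( 0x$8$9 ))
    echo "lba=$lba blocks=$blocks"
}

decode_read10 "28 00 05 d8 4c 60 00 00 08 00"
# prints: lba=98061408 blocks=8
```

The decoded LBA matches the sector number in the print_req_error line, so the same 8-block read appears to fail on every start attempt, which would fit a bad spot on the disk rather than a transient error.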

Well, the smartctl output does not look good ...

Code:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   091   091   006    Pre-fail  Always       -       181999400
  3 Spin_Up_Time            0x0003   098   098   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       15
  5 Reallocated_Sector_Ct   0x0033   097   097   036    Pre-fail  Always       -       584
  7 Seek_Error_Rate         0x000f   087   060   030    Pre-fail  Always       -       538350167
  9 Power_On_Hours          0x0032   076   076   000    Old_age   Always       -       21203 (23 243 0)
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       15
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       664
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   069   064   045    Old_age   Always       -       31 (Min/Max 29/33)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       2
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       11
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       71
194 Temperature_Celsius     0x0022   031   040   000    Old_age   Always       -       31 (0 25 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       16
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       16
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   076   076   000    Old_age   Offline      -       21202 (114 252 0)
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       48702137992
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       41224941657
254 Free_Fall_Sensor        0x0032   100   100   000    Old_age   Always       -       0
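
To pick the worrying rows out of a table like this, something along the following lines could work. It is only a sketch: `flag_smart` is a hypothetical helper name, and the list of attributes treated as signs of media damage is an illustrative assumption, not an official rule.

```shell
# Print the attributes from a smartctl attribute table whose raw value
# (column 10) is non-zero, limited to counters that usually point at
# media damage. The attribute list is an illustrative assumption.
flag_smart() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Reported_Uncorrect|Current_Pending_Sector|Offline_Uncorrectable)$/ && $10 + 0 > 0 { print $2 "=" $10 }'
}

# Sample rows copied from the table above.
flag_smart <<'EOF'
  5 Reallocated_Sector_Ct   0x0033   097   097   036    Pre-fail  Always       -       584
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       664
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       16
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       16
EOF
```

On the live system the same filter could be fed with `smartctl -A /dev/sdg | flag_smart`. With the values above, all four counters are non-zero, which supports replacing the disk.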
 

adamb

Well-Known Member
Mar 1, 2012
1,028
32
48
Sounds like I should replace the disk, right?

Well, the smartctl output does not look good ...

Code:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   091   091   006    Pre-fail  Always       -       181999400
  3 Spin_Up_Time            0x0003   098   098   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       15
  5 Reallocated_Sector_Ct   0x0033   097   097   036    Pre-fail  Always       -       584
  7 Seek_Error_Rate         0x000f   087   060   030    Pre-fail  Always       -       538350167
  9 Power_On_Hours          0x0032   076   076   000    Old_age   Always       -       21203 (23 243 0)
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       15
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       664
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   069   064   045    Old_age   Always       -       31 (Min/Max 29/33)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       2
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       11
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       71
194 Temperature_Celsius     0x0022   031   040   000    Old_age   Always       -       31 (0 25 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       16
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       16
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   076   076   000    Old_age   Offline      -       21202 (114 252 0)
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       48702137992
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       41224941657
254 Free_Fall_Sensor        0x0032   100   100   000    Old_age   Always       -       0
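
For anyone who wants to scan a table like this automatically, here is a minimal Python sketch. It assumes the attribute table has the column layout shown above, and the list of attributes to watch (`WATCHED`) is my own choice for illustration, not a vendor specification. The function name `suspicious_attributes` is hypothetical as well.

```python
import re

# Attributes whose non-zero raw value commonly points at failing media.
# This watch list is an assumption for illustration, not a vendor rule.
WATCHED = {
    "Reallocated_Sector_Ct",
    "Reported_Uncorrect",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
}

# ID, name, flag, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, raw value
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+\S+\s+\S+\s+\S+\s+(\d+)"
)


def suspicious_attributes(smart_text):
    """Return {attribute_name: raw_value} for watched attributes with raw > 0."""
    found = {}
    for line in smart_text.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # header line or anything that is not an attribute row
        name, raw = match.group(2), int(match.group(6))
        if name in WATCHED and raw > 0:
            found[name] = raw
    return found


# A few rows copied from the table above.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   097   097   036    Pre-fail  Always       -       584
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       664
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       16
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       16
"""

if __name__ == "__main__":
    for name, raw in suspicious_attributes(SAMPLE).items():
        print(f"{name}: {raw}")
```

On the rows above this flags the reallocated, uncorrectable, and pending sector counters, which fits the kernel's medium error on sdg. Raw values are vendor specific, so treat the result as a hint rather than a verdict.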

I agree, it sounds like a bad disk. I am new to Ceph as well, but I am trying to keep up with threads like this so I can learn from others' experience. Let me know how you make out. Hopefully a new disk will get you up and running.
 
