OSD keeps going down and out

Adam Koczarski

I have an OSD which keeps toggling to down and out. Here's what I'm seeing in the syslog. Any clue here why this would be happening?

Sep 23 02:22:26 SeaC01N02 kernel: [533115.376053] sd 0:0:16:0: [sdo] tag#262 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Sep 23 02:22:26 SeaC01N02 kernel: [533115.376846] sd 0:0:16:0: [sdo] tag#262 Sense Key : Medium Error [current]
Sep 23 02:22:26 SeaC01N02 kernel: [533115.377630] sd 0:0:16:0: [sdo] tag#262 Add. Sense: Unrecovered read error
Sep 23 02:22:26 SeaC01N02 kernel: [533115.378294] sd 0:0:16:0: [sdo] tag#262 CDB: Read(16) 88 00 00 00 00 00 00 54 e8 80 00 00 00 80 00 00
Sep 23 02:22:26 SeaC01N02 kernel: [533115.378964] print_req_error: critical medium error, dev sdo, sector 5564592 flags 0
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: /root/sources/pve/ceph/ceph-14.2.2/src/os/bluestore/BlueStore.cc: In function 'int BlueStore::_do_read(BlueStore::Collection*, BlueStore::OnodeRef, uint64_t, size_t, ceph::bufferlist&, uint32_t, uint64_t)' threa
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: /root/sources/pve/ceph/ceph-14.2.2/src/os/bluestore/BlueStore.cc: 8786: FAILED ceph_assert(r == 0)
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 2019-09-23 02:22:26.599 7ff715337700 -1 bdev(0x5580918fa000 /var/lib/ceph/osd/ceph-28/block) read stalled read 0xa9c10000~10000 (direct) since 533150s, timeout is 5s
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 2019-09-23 02:22:26.599 7ff715337700 -1 bluestore(/var/lib/ceph/osd/ceph-28) _do_read bdev-read failed: (61) No data available
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: ceph version 14.2.2 (a887fe9a5d3d97fe349065d3c1c9dbd7b8870855) nautilus (stable)
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0x5580850d0e84]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 2: (()+0x51905c) [0x5580850d105c]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 3: (BlueStore::_do_read(BlueStore::Collection*, boost::intrusive_ptr<BlueStore::Onode>, unsigned long, unsigned long, ceph::buffer::v14_2_0::list&, unsigned int, unsigned long)+0x3e6a) [0x5580856e601a]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 4: (BlueStore::read(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ghobject_t const&, unsigned long, unsigned long, ceph::buffer::v14_2_0::list&, unsigned int)+0x1d3) [0x5580856e6303]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 5: (ReplicatedBackend::be_deep_scrub(hobject_t const&, ScrubMap&, ScrubMapBuilder&, ScrubMap::object&)+0x2cb) [0x5580855645db]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 6: (PGBackend::be_scan_list(ScrubMap&, ScrubMapBuilder&)+0x6db) [0x5580854816fb]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 7: (PG::build_scrub_map_chunk(ScrubMap&, ScrubMapBuilder&, hobject_t, hobject_t, bool, ThreadPool::TPHandle&)+0x83) [0x558085320b13]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 8: (PG::chunky_scrub(ThreadPool::TPHandle&)+0x194b) [0x55808534cf7b]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 9: (PG::scrub(unsigned int, ThreadPool::TPHandle&)+0x4bb) [0x55808534e09b]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 10: (PGScrub::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x1a) [0x5580855043ca]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 11: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x7d7) [0x558085282667]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 12: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5b4) [0x55808585f7d4]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 13: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x5580858621d0]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 14: (()+0x7fa3) [0x7ff730066fa3]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 15: (clone()+0x3f) [0x7ff72fc164cf]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: *** Caught signal (Aborted) **
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: in thread 7ff715337700 thread_name:tp_osd_tp
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 2019-09-23 02:22:26.603 7ff715337700 -1 /root/sources/pve/ceph/ceph-14.2.2/src/os/bluestore/BlueStore.cc: In function 'int BlueStore::_do_read(BlueStore::Collection*, BlueStore::OnodeRef, uint64_t, size_t, ceph:
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: /root/sources/pve/ceph/ceph-14.2.2/src/os/bluestore/BlueStore.cc: 8786: FAILED ceph_assert(r == 0)
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: ceph version 14.2.2 (a887fe9a5d3d97fe349065d3c1c9dbd7b8870855) nautilus (stable)
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0x5580850d0e84]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 2: (()+0x51905c) [0x5580850d105c]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 3: (BlueStore::_do_read(BlueStore::Collection*, boost::intrusive_ptr<BlueStore::Onode>, unsigned long, unsigned long, ceph::buffer::v14_2_0::list&, unsigned int, unsigned long)+0x3e6a) [0x5580856e601a]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 4: (BlueStore::read(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ghobject_t const&, unsigned long, unsigned long, ceph::buffer::v14_2_0::list&, unsigned int)+0x1d3) [0x5580856e6303]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 5: (ReplicatedBackend::be_deep_scrub(hobject_t const&, ScrubMap&, ScrubMapBuilder&, ScrubMap::object&)+0x2cb) [0x5580855645db]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 6: (PGBackend::be_scan_list(ScrubMap&, ScrubMapBuilder&)+0x6db) [0x5580854816fb]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 7: (PG::build_scrub_map_chunk(ScrubMap&, ScrubMapBuilder&, hobject_t, hobject_t, bool, ThreadPool::TPHandle&)+0x83) [0x558085320b13]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 8: (PG::chunky_scrub(ThreadPool::TPHandle&)+0x194b) [0x55808534cf7b]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 9: (PG::scrub(unsigned int, ThreadPool::TPHandle&)+0x4bb) [0x55808534e09b]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 10: (PGScrub::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x1a) [0x5580855043ca]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 11: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x7d7) [0x558085282667]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 12: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5b4) [0x55808585f7d4]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 13: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x5580858621d0]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 14: (()+0x7fa3) [0x7ff730066fa3]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 15: (clone()+0x3f) [0x7ff72fc164cf]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: ceph version 14.2.2 (a887fe9a5d3d97fe349065d3c1c9dbd7b8870855) nautilus (stable)
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 1: (()+0x12730) [0x7ff730071730]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 2: (gsignal()+0x10b) [0x7ff72fb547bb]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 3: (abort()+0x121) [0x7ff72fb3f535]
Sep 23 02:22:26 SeaC01N02 ceph-osd[1518564]: 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a3) [0x5580850d0ed5]
 
Sep 23 02:22:26 SeaC01N02 kernel: [533115.376053] sd 0:0:16:0: [sdo] tag#262 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Sep 23 02:22:26 SeaC01N02 kernel: [533115.376846] sd 0:0:16:0: [sdo] tag#262 Sense Key : Medium Error [current]
Sep 23 02:22:26 SeaC01N02 kernel: [533115.377630] sd 0:0:16:0: [sdo] tag#262 Add. Sense: Unrecovered read error
Sep 23 02:22:26 SeaC01N02 kernel: [533115.378294] sd 0:0:16:0: [sdo] tag#262 CDB: Read(16) 88 00 00 00 00 00 00 54 e8 80 00 00 00 80 00 00
Sep 23 02:22:26 SeaC01N02 kernel: [533115.378964] print_req_error: critical medium error, dev sdo, sector 5564592 flags 0
This looks like a failing disk - the kernel is reporting unrecovered medium errors on sdo, so replace sdo.
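If you want to double-check from the shell before pulling the drive, something along these lines should do it (assuming smartmontools is installed; the device and OSD id are the ones from your log):

smartctl -H /dev/sdo     # overall SMART health verdict for the suspect disk
smartctl -a /dev/sdo     # full SMART report, including pending/reallocated sector counters
ceph-volume lvm list     # confirm which OSD id (ceph-28 in your log) sits on which device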

I hope this helps!
 
Just ran smartctl and got the following. I also ran it on the other 16 drives in this node; sdm and sdn also show Raw_Read_Errors. The rest look clean. I might have a few drives that need replacing??

(attached screenshot: smartctl output for the affected drive)
 
Hmm - the raw read error rate would not worry me that much; the 4600 current pending sectors and the offline uncorrectable count are what would worry me.
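For reference, you can pull just those counters without scrolling the whole report - a rough sketch, with the device name taken from the syslog above:

smartctl -A /dev/sdo | grep -E 'Raw_Read_Error_Rate|Current_Pending_Sector|Offline_Uncorrectable|Power_On_Hours'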

Since the disk looks rather new (949 hours), maybe also check the cables and power supply.

What kind of disk is this?
 
These are 8TB spinners in five brand-new Dell R740xd servers. I'll contact Dell and get this drive replaced. I'll also check the Raw_Read_Errors on the other spinners. I did have a couple of the adjacent drives go down and out last week. I *think* it was the other drives with Raw_Read_Errors.
 
You should be checking every disk for errors before putting them into production. Run a test on the disk before trying to RMA - if it's fine, your issue is elsewhere.
 
I'm in pre-production with Proxmox/Ceph now. As for running disk tests, what would you recommend?

The usual. Smartctl can run long tests on offline disks, and there are other options out there if you do some research.
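A minimal sketch of what that looks like, assuming smartmontools and e2fsprogs (for badblocks) are installed - the device name is just a placeholder, and the badblocks write test wipes the disk, so only run it before the disk holds data:

smartctl -t long /dev/sdX       # kick off an extended SMART self-test (runs inside the drive, takes hours on an 8TB disk)
smartctl -l selftest /dev/sdX   # read the self-test log once it has finished
badblocks -wsv /dev/sdX         # optional destructive write/read surface scan - destroys all data on the disk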

They were in the cluster already, so they're technically in use even if the cluster itself isn't in production yet. You want to test the disks for errors right after getting them so you can return them to the seller for replacement/refund during the 15-30 day RMA period. Otherwise you're stuck going through Seagate/WD unless you have a contract with the seller.
 
Dell confirmed the failing drive via iDRAC. The replacement is on the way. Is the process for replacing a drive with an associated DB/WAL via the Proxmox VE 6 GUI documented somewhere?
 
Can anyone confirm whether destroying an OSD via the GUI will also destroy the associated DB/WAL I initially created on the NVMe? Then I'd just create the replacement OSD on the new drive referencing the NVMe as before??

TIA!
 
New drive installed. Since the OSD was already down and out I destroyed it, shut down the node, and replaced this non-hot-swappable drive in the mid-bay of the server. Booted it back up, tested the drive, recreated the OSD, and associated it with the NVMe for DB/WAL. Worked like a charm!
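For the record, the rough CLI equivalent on PVE 6 looks like the following - treat it as a sketch, since option names can vary between versions; the OSD id and sdo come from this thread, but the NVMe path is only an example:

pveceph osd destroy 28 --cleanup                    # remove the dead OSD (it must already be down/out); --cleanup wipes its volumes
pveceph osd create /dev/sdo --db_dev /dev/nvme0n1   # recreate the OSD on the new disk with its DB/WAL on the NVMe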

Thx for the help...
 
