Hello,
I have a cluster of 3 PCs, each with 3 OSDs.
It works well in general, but over time (after a few hours or days) the OSDs start to go down.
The cluster has been in use for about 4 weeks, and I lose roughly one OSD per day.
Once an OSD is down, it cannot be restarted via ceph commands.
Even a warm reboot of the PC doesn't fix it.
But if I shut the PC down and power it on again, all OSDs come back online and the "OSD goes down" cycle starts again.
I'm using consumer-grade HDDs. What I would like to know is whether this is a hardware problem or a software problem. The fact that everything works fine for a while after a cold start makes me doubt that the HDDs are defective. I also find it unlikely that all 9 HDDs have the same problem.
After an OSD goes down, its block device is still present in the system; I checked with lsblk and in the /dev directory.
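A device node showing up in lsblk or /dev only means the kernel still has an entry for the disk, not that the disk answers. As a rough sketch of a stronger check, the snippet below tries a small read from the device. The function name `device_answers` is made up for illustration, and `/dev/sdd` is simply the device named in the log further down. Two caveats I'm assuming here: the read needs root on a real block device, and on a hung disk it may block for a long time instead of failing quickly (the kernel may also serve it from cache).

```python
import os


def device_answers(path, nbytes=4096):
    """Return True if a small read from `path` succeeds, False on any OSError.

    Hypothetical helper: it only shows that "the node exists" and
    "the device responds to I/O" are two separate questions.
    """
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError:
        # Node missing, or not permitted to open it.
        return False
    try:
        # A successful read of at least one byte counts as "answers".
        return len(os.read(fd, nbytes)) > 0
    except OSError:
        # For example EIO, as reported in the kernel log for a dead disk.
        return False
    finally:
        os.close(fd)


if __name__ == "__main__":
    # /dev/sdd is the device from the dmesg output; adjust as needed.
    print(device_answers("/dev/sdd"))
```

If this returns False (or hangs) while lsblk still lists the disk, the node is stale and the problem is below the Ceph layer.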
pve-manager/5.1-43/bdb08029 (running kernel: 4.13.13-5-pve)
ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)
Here is the dmesg output after an OSD failed:
[15144.958084] device tap103i0 entered promiscuous mode
[15144.963120] vmbr0: port 3(tap103i0) entered blocking state
[15144.963121] vmbr0: port 3(tap103i0) entered disabled state
[15144.963175] vmbr0: port 3(tap103i0) entered blocking state
[15144.963176] vmbr0: port 3(tap103i0) entered forwarding state
[95238.849549] ata10.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[95238.849561] ata10.00: failed command: FLUSH CACHE EXT
[95238.849567] ata10.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 18
res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[95238.849573] ata10.00: status: { DRDY }
[95238.849578] ata10: hard resetting link
[95248.849796] ata10: softreset failed (1st FIS failed)
[95248.849808] ata10: hard resetting link
[95258.850199] ata10: softreset failed (1st FIS failed)
[95258.850210] ata10: hard resetting link
[95293.849896] ata10: softreset failed (1st FIS failed)
[95293.849910] ata10: limiting SATA link speed to 3.0 Gbps
[95293.849911] ata10: hard resetting link
[95299.025448] ata10: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[95299.025453] ata10.00: link online but device misclassified
[95304.117429] ata10.00: qc timeout (cmd 0xec)
[95304.117438] ata10.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[95304.117439] ata10.00: revalidation failed (errno=-5)
[95304.117448] ata10: hard resetting link
[95314.117507] ata10: softreset failed (1st FIS failed)
[95314.117516] ata10: hard resetting link
[95324.118412] ata10: softreset failed (1st FIS failed)
[95324.118421] ata10: hard resetting link
[95337.141381] INFO: task bstore_kv_sync:2185 blocked for more than 120 seconds.
[95337.141390] Tainted: P O 4.13.13-5-pve #1
[95337.141394] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[95337.141399] bstore_kv_sync D 0 2185 1 0x00000000
[95337.141401] Call Trace:
[95337.141407] __schedule+0x3cc/0x850
[95337.141409] schedule+0x36/0x80
[95337.141410] schedule_timeout+0x1da/0x350
[95337.141412] ? __blk_run_queue+0x3d/0x60
[95337.141414] ? blk_queue_bio+0x3d3/0x400
[95337.141415] io_schedule_timeout+0x1e/0x50
[95337.141416] ? io_schedule_timeout+0x1e/0x50
[95337.141417] wait_for_completion_io+0xb4/0x140
[95337.141418] ? wake_up_q+0x80/0x80
[95337.141420] submit_bio_wait+0x68/0x90
[95337.141421] blkdev_issue_flush+0x5c/0x90
[95337.141422] blkdev_fsync+0x35/0x50
[95337.141423] vfs_fsync_range+0x4b/0xb0
[95337.141424] do_fsync+0x3d/0x70
[95337.141425] SyS_fdatasync+0x13/0x20
[95337.141427] entry_SYSCALL_64_fastpath+0x33/0xa3
[95337.141428] RIP: 0033:0x7fa37558463d
[95337.141429] RSP: 002b:00007fa365129110 EFLAGS: 00000293 ORIG_RAX: 000000000000004b
[95337.141430] RAX: ffffffffffffffda RBX: 00005601c7ab0d80 RCX: 00007fa37558463d
[95337.141431] RDX: 0a20c08100000000 RSI: 00007fa3651290f0 RDI: 0000000000000016
[95337.141431] RBP: 0a20c0815a7d85da R08: 00005601c7ab0ed0 R09: 000000000002e21d
[95337.141432] R10: 00007fa3651290d0 R11: 0000000000000293 R12: 00007fa37650b020
[95337.141432] R13: 00005601c7ab0ed0 R14: 00005601c7dbe000 R15: 0000000000000000
[95337.141461] INFO: task vgs:9632 blocked for more than 120 seconds.
[95337.141465] Tainted: P O 4.13.13-5-pve #1
[95337.141469] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[95337.141473] vgs D 0 9632 1952 0x00000000
[95337.141474] Call Trace:
[95337.141476] __schedule+0x3cc/0x850
[95337.141478] schedule+0x36/0x80
[95337.141479] io_schedule+0x16/0x40
[95337.141480] __blkdev_direct_IO_simple+0x1e7/0x320
[95337.141481] ? bdget+0x120/0x120
[95337.141482] blkdev_direct_IO+0x3e1/0x3f0
[95337.141483] ? blkdev_direct_IO+0x3e1/0x3f0
[95337.141485] ? __filemap_fdatawrite_range+0xd4/0x100
[95337.141486] generic_file_read_iter+0xcb/0x9e0
[95337.141487] ? generic_file_read_iter+0xcb/0x9e0
[95337.141489] ? do_filp_open+0xad/0x110
[95337.141490] ? _copy_to_user+0x2a/0x40
[95337.141491] ? cp_new_stat+0x152/0x180
[95337.141492] blkdev_read_iter+0x35/0x40
[95337.141494] new_sync_read+0xde/0x130
[95337.141495] __vfs_read+0x26/0x40
[95337.141496] vfs_read+0x96/0x130
[95337.141498] SyS_read+0x55/0xc0
[95337.141499] entry_SYSCALL_64_fastpath+0x33/0xa3
[95337.141500] RIP: 0033:0x7fb2ea2d7700
[95337.141500] RSP: 002b:00007fff10228278 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[95337.141501] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fb2ea2d7700
[95337.141502] RDX: 0000000000001000 RSI: 000055fcab943000 RDI: 0000000000000004
[95337.141502] RBP: 00007fff102282d0 R08: 00007fb2ea596248 R09: 0000000000001000
[95337.141502] R10: 0000000000000080 R11: 0000000000000246 R12: 000055fcab943000
[95337.141503] R13: 0000000000000000 R14: 0000000000000004 R15: 00007fff10228330
[95359.118207] ata10: softreset failed (1st FIS failed)
[95359.118220] ata10: limiting SATA link speed to 1.5 Gbps
[95359.118221] ata10: hard resetting link
[95364.285325] ata10: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[95364.285330] ata10.00: link online but device misclassified
[95374.517311] ata10.00: qc timeout (cmd 0xec)
[95374.517321] ata10.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[95374.517322] ata10.00: revalidation failed (errno=-5)
[95374.517334] ata10: hard resetting link
[95384.517482] ata10: softreset failed (1st FIS failed)
[95384.517492] ata10: hard resetting link
[95394.518306] ata10: softreset failed (1st FIS failed)
[95394.518315] ata10: hard resetting link
[95429.517872] ata10: softreset failed (1st FIS failed)
[95429.517883] ata10: hard resetting link
[95434.693187] ata10: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[95434.693192] ata10.00: link online but device misclassified
[95457.977159] INFO: task bstore_kv_sync:2185 blocked for more than 120 seconds.
[95457.977169] Tainted: P O 4.13.13-5-pve #1
[95457.977173] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[95457.977177] bstore_kv_sync D 0 2185 1 0x00000000
[95457.977180] Call Trace:
[95457.977186] __schedule+0x3cc/0x850
[95457.977188] schedule+0x36/0x80
[95457.977189] schedule_timeout+0x1da/0x350
[95457.977191] ? __blk_run_queue+0x3d/0x60
[95457.977192] ? blk_queue_bio+0x3d3/0x400
[95457.977193] io_schedule_timeout+0x1e/0x50
[95457.977194] ? io_schedule_timeout+0x1e/0x50
[95457.977195] wait_for_completion_io+0xb4/0x140
[95457.977197] ? wake_up_q+0x80/0x80
[95457.977198] submit_bio_wait+0x68/0x90
[95457.977200] blkdev_issue_flush+0x5c/0x90
[95457.977201] blkdev_fsync+0x35/0x50
[95457.977202] vfs_fsync_range+0x4b/0xb0
[95457.977203] do_fsync+0x3d/0x70
[95457.977204] SyS_fdatasync+0x13/0x20
[95457.977205] entry_SYSCALL_64_fastpath+0x33/0xa3
[95457.977207] RIP: 0033:0x7fa37558463d
[95457.977208] RSP: 002b:00007fa365129110 EFLAGS: 00000293 ORIG_RAX: 000000000000004b
[95457.977209] RAX: ffffffffffffffda RBX: 00005601c7ab0d80 RCX: 00007fa37558463d
[95457.977210] RDX: 0a20c08100000000 RSI: 00007fa3651290f0 RDI: 0000000000000016
[95457.977210] RBP: 0a20c0815a7d85da R08: 00005601c7ab0ed0 R09: 000000000002e21d
[95457.977211] R10: 00007fa3651290d0 R11: 0000000000000293 R12: 00007fa37650b020
[95457.977211] R13: 00005601c7ab0ed0 R14: 00005601c7dbe000 R15: 0000000000000000
[95457.977241] INFO: task vgs:9632 blocked for more than 120 seconds.
[95457.977245] Tainted: P O 4.13.13-5-pve #1
[95457.977248] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[95457.977252] vgs D 0 9632 1952 0x00000000
[95457.977253] Call Trace:
[95457.977258] __schedule+0x3cc/0x850
[95457.977260] schedule+0x36/0x80
[95457.977261] io_schedule+0x16/0x40
[95457.977262] __blkdev_direct_IO_simple+0x1e7/0x320
[95457.977263] ? bdget+0x120/0x120
[95457.977264] blkdev_direct_IO+0x3e1/0x3f0
[95457.977265] ? blkdev_direct_IO+0x3e1/0x3f0
[95457.977267] ? __filemap_fdatawrite_range+0xd4/0x100
[95457.977268] generic_file_read_iter+0xcb/0x9e0
[95457.977269] ? generic_file_read_iter+0xcb/0x9e0
[95457.977271] ? do_filp_open+0xad/0x110
[95457.977272] ? _copy_to_user+0x2a/0x40
[95457.977273] ? cp_new_stat+0x152/0x180
[95457.977274] blkdev_read_iter+0x35/0x40
[95457.977275] new_sync_read+0xde/0x130
[95457.977277] __vfs_read+0x26/0x40
[95457.977278] vfs_read+0x96/0x130
[95457.977279] SyS_read+0x55/0xc0
[95457.977281] entry_SYSCALL_64_fastpath+0x33/0xa3
[95457.977281] RIP: 0033:0x7fb2ea2d7700
[95457.977282] RSP: 002b:00007fff10228278 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[95457.977283] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fb2ea2d7700
[95457.977283] RDX: 0000000000001000 RSI: 000055fcab943000 RDI: 0000000000000004
[95457.977284] RBP: 00007fff102282d0 R08: 00007fb2ea596248 R09: 0000000000001000
[95457.977284] R10: 0000000000000080 R11: 0000000000000246 R12: 000055fcab943000
[95457.977285] R13: 0000000000000000 R14: 0000000000000004 R15: 00007fff10228330
[95466.165227] ata10.00: qc timeout (cmd 0xec)
[95466.165237] ata10.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[95466.165238] ata10.00: revalidation failed (errno=-5)
[95466.165246] ata10.00: disabled
[95466.165258] ata10: hard resetting link
[95476.165476] ata10: softreset failed (1st FIS failed)
[95476.165486] ata10: hard resetting link
[95486.165106] ata10: softreset failed (1st FIS failed)
[95486.165115] ata10: hard resetting link
[95521.165018] ata10: softreset failed (1st FIS failed)
[95521.165031] ata10: hard resetting link
[95526.381011] ata10: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[95526.381015] ata10.00: link online but device misclassified
[95526.381026] ata10: EH complete
[95526.381042] sd 9:0:0:0: [sdd] tag#19 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95526.381044] sd 9:0:0:0: [sdd] tag#19 CDB: Synchronize Cache(10) 35 00 00 00 00 00 00 00 00 00
[95526.381048] print_req_error: I/O error, dev sdd, sector 0
[95526.381085] sd 9:0:0:0: [sdd] tag#20 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95526.381086] sd 9:0:0:0: [sdd] tag#20 CDB: Read(16) 88 00 00 00 00 00 00 00 10 00 00 00 00 08 00 00
[95526.381087] print_req_error: I/O error, dev sdd, sector 4096
[95526.381091] sd 9:0:0:0: [sdd] tag#21 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95526.381092] sd 9:0:0:0: [sdd] tag#21 CDB: ATA command pass through(16) 85 06 2c 00 00 00 00 00 00 00 00 00 00 00 e5 00
[95526.381198] sd 9:0:0:0: [sdd] tag#23 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95526.381200] sd 9:0:0:0: [sdd] tag#23 CDB: Read(16) 88 00 00 00 00 01 d1 c0 af 80 00 00 00 08 00 00
[95526.381202] print_req_error: I/O error, dev sdd, sector 7814033280
[95526.381258] sd 9:0:0:0: [sdd] tag#24 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95526.381260] sd 9:0:0:0: [sdd] tag#24 CDB: Read(16) 88 00 00 00 00 01 d1 c0 af f0 00 00 00 08 00 00
[95526.381261] print_req_error: I/O error, dev sdd, sector 7814033392
[95526.381328] sd 9:0:0:0: [sdd] tag#25 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95526.381330] sd 9:0:0:0: [sdd] tag#25 CDB: Read(16) 88 00 00 00 00 00 00 00 10 00 00 00 00 08 00 00
[95526.381331] print_req_error: I/O error, dev sdd, sector 4096
[95526.381979] sd 9:0:0:0: [sdd] tag#26 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95526.381980] sd 9:0:0:0: [sdd] tag#26 CDB: Read(16) 88 00 00 00 00 00 00 00 10 08 00 00 00 08 00 00
[95526.381981] print_req_error: I/O error, dev sdd, sector 4104
[95526.382459] sd 9:0:0:0: [sdd] tag#27 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95526.382460] sd 9:0:0:0: [sdd] tag#27 CDB: Read(16) 88 00 00 00 00 00 00 00 10 00 00 00 00 08 00 00
[95526.382461] print_req_error: I/O error, dev sdd, sector 4096
[95526.393092] sd 9:0:0:0: [sdd] tag#29 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95526.393094] sd 9:0:0:0: [sdd] tag#29 CDB: Read(16) 88 00 00 00 00 00 00 00 00 00 00 00 00 08 00 00
[95526.393095] print_req_error: I/O error, dev sdd, sector 0
[95526.393548] sd 9:0:0:0: [sdd] tag#0 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95526.393549] sd 9:0:0:0: [sdd] tag#0 CDB: Read(16) 88 00 00 00 00 01 d1 c0 be 00 00 00 00 08 00 00
[95526.393550] print_req_error: I/O error, dev sdd, sector 7814036992
[95526.393964] print_req_error: I/O error, dev sdd, sector 7814037152
[95526.524404] Buffer I/O error on dev dm-0, logical block 976753648, async page read
[95526.525126] Buffer I/O error on dev dm-0, logical block 976753648, async page read
[95536.559431] scsi_io_completion: 64 callbacks suppressed
[95536.559434] sd 9:0:0:0: [sdd] tag#28 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95536.559437] sd 9:0:0:0: [sdd] tag#28 CDB: Read(16) 88 00 00 00 00 00 00 00 10 00 00 00 00 08 00 00
[95536.559437] print_req_error: 63 callbacks suppressed
[95536.559438] print_req_error: I/O error, dev sdd, sector 4096
[95536.559881] sd 9:0:0:0: [sdd] tag#29 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95536.559882] sd 9:0:0:0: [sdd] tag#29 CDB: Read(16) 88 00 00 00 00 01 d1 c0 af 80 00 00 00 08 00 00
[95536.559883] print_req_error: I/O error, dev sdd, sector 7814033280
[95536.560297] sd 9:0:0:0: [sdd] tag#30 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95536.560299] sd 9:0:0:0: [sdd] tag#30 CDB: Read(16) 88 00 00 00 00 01 d1 c0 af f0 00 00 00 08 00 00
[95536.560300] print_req_error: I/O error, dev sdd, sector 7814033392
[95536.560701] sd 9:0:0:0: [sdd] tag#0 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95536.560702] sd 9:0:0:0: [sdd] tag#0 CDB: Read(16) 88 00 00 00 00 00 00 00 10 00 00 00 00 08 00 00
[95536.560703] print_req_error: I/O error, dev sdd, sector 4096
[95536.561130] sd 9:0:0:0: [sdd] tag#1 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95536.561131] sd 9:0:0:0: [sdd] tag#1 CDB: Read(16) 88 00 00 00 00 00 00 00 10 08 00 00 00 08 00 00
[95536.561132] print_req_error: I/O error, dev sdd, sector 4104
[95536.561511] sd 9:0:0:0: [sdd] tag#2 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95536.561513] sd 9:0:0:0: [sdd] tag#2 CDB: Read(16) 88 00 00 00 00 00 00 00 10 00 00 00 00 08 00 00
[95536.561513] print_req_error: I/O error, dev sdd, sector 4096
[95536.579212] sd 9:0:0:0: [sdd] tag#4 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95536.579214] sd 9:0:0:0: [sdd] tag#4 CDB: Read(16) 88 00 00 00 00 00 00 00 00 00 00 00 00 08 00 00
[95536.579215] print_req_error: I/O error, dev sdd, sector 0
[95536.579585] sd 9:0:0:0: [sdd] tag#6 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95536.579587] sd 9:0:0:0: [sdd] tag#6 CDB: Read(16) 88 00 00 00 00 01 d1 c0 be 00 00 00 00 08 00 00
[95536.579588] print_req_error: I/O error, dev sdd, sector 7814036992
[95536.579938] sd 9:0:0:0: [sdd] tag#7 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95536.579939] sd 9:0:0:0: [sdd] tag#7 CDB: Read(16) 88 00 00 00 00 01 d1 c0 be a0 00 00 00 08 00 00
[95536.579940] print_req_error: I/O error, dev sdd, sector 7814037152
[95536.580283] sd 9:0:0:0: [sdd] tag#8 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95536.580285] sd 9:0:0:0: [sdd] tag#8 CDB: Read(16) 88 00 00 00 00 00 00 00 00 00 00 00 00 08 00 00
[95536.580286] print_req_error: I/O error, dev sdd, sector 0
[95546.578153] scsi_io_completion: 24 callbacks suppressed
[95546.578158] sd 9:0:0:0: [sdd] tag#13 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95546.578160] sd 9:0:0:0: [sdd] tag#13 CDB: Read(16) 88 00 00 00 00 00 00 00 10 00 00 00 00 20 00 00
[95546.578160] print_req_error: 24 callbacks suppressed
[95546.578161] print_req_error: I/O error, dev sdd, sector 4096
[95546.578520] sd 9:0:0:0: [sdd] tag#14 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95546.578522] sd 9:0:0:0: [sdd] tag#14 CDB: Read(16) 88 00 00 00 00 00 00 00 10 00 00 00 00 08 00 00
[95546.578523] print_req_error: I/O error, dev sdd, sector 4096
[95546.578894] Buffer I/O error on dev dm-0, logical block 0, async page read
[95546.719591] sd 9:0:0:0: [sdd] tag#15 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95546.719593] sd 9:0:0:0: [sdd] tag#15 CDB: Read(16) 88 00 00 00 00 00 00 00 10 00 00 00 00 08 00 00
[95546.719594] print_req_error: I/O error, dev sdd, sector 4096
[95546.719973] sd 9:0:0:0: [sdd] tag#16 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95546.719975] sd 9:0:0:0: [sdd] tag#16 CDB: Read(16) 88 00 00 00 00 01 d1 c0 af 80 00 00 00 08 00 00
[95546.719975] print_req_error: I/O error, dev sdd, sector 7814033280
[95546.720338] sd 9:0:0:0: [sdd] tag#17 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95546.720340] sd 9:0:0:0: [sdd] tag#17 CDB: Read(16) 88 00 00 00 00 01 d1 c0 af f0 00 00 00 08 00 00
[95546.720341] print_req_error: I/O error, dev sdd, sector 7814033392
[95546.720748] sd 9:0:0:0: [sdd] tag#18 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95546.720749] sd 9:0:0:0: [sdd] tag#18 CDB: Read(16) 88 00 00 00 00 00 00 00 10 00 00 00 00 08 00 00
[95546.720750] print_req_error: I/O error, dev sdd, sector 4096
[95546.721076] sd 9:0:0:0: [sdd] tag#19 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95546.721078] sd 9:0:0:0: [sdd] tag#19 CDB: Read(16) 88 00 00 00 00 00 00 00 10 08 00 00 00 08 00 00
[95546.721079] print_req_error: I/O error, dev sdd, sector 4104
[95546.721399] sd 9:0:0:0: [sdd] tag#20 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95546.721401] sd 9:0:0:0: [sdd] tag#20 CDB: Read(16) 88 00 00 00 00 00 00 00 10 00 00 00 00 08 00 00
[95546.721401] print_req_error: I/O error, dev sdd, sector 4096
[95546.731599] sd 9:0:0:0: [sdd] tag#21 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95546.731601] sd 9:0:0:0: [sdd] tag#21 CDB: Read(16) 88 00 00 00 00 00 00 00 00 00 00 00 00 08 00 00
[95546.731602] print_req_error: I/O error, dev sdd, sector 0
[95546.731918] sd 9:0:0:0: [sdd] tag#23 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95546.731920] sd 9:0:0:0: [sdd] tag#23 CDB: Read(16) 88 00 00 00 00 01 d1 c0 be 00 00 00 00 08 00 00
[95546.731921] print_req_error: I/O error, dev sdd, sector 7814036992
[95556.834306] scsi_io_completion: 26 callbacks suppressed
[95556.834309] sd 9:0:0:0: [sdd] tag#0 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95556.834312] sd 9:0:0:0: [sdd] tag#0 CDB: Read(16) 88 00 00 00 00 00 00 00 10 00 00 00 00 08 00 00
[95556.834312] print_req_error: 26 callbacks suppressed
[95556.834313] print_req_error: I/O error, dev sdd, sector 4096
[95556.834614] sd 9:0:0:0: [sdd] tag#1 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95556.834615] sd 9:0:0:0: [sdd] tag#1 CDB: Read(16) 88 00 00 00 00 01 d1 c0 af 80 00 00 00 08 00 00
[95556.834616] print_req_error: I/O error, dev sdd, sector 7814033280
[95556.834916] sd 9:0:0:0: [sdd] tag#2 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95556.834918] sd 9:0:0:0: [sdd] tag#2 CDB: Read(16) 88 00 00 00 00 01 d1 c0 af f0 00 00 00 08 00 00
[95556.834919] print_req_error: I/O error, dev sdd, sector 7814033392
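Since the log is long and repetitive, here is a minimal sketch for condensing output like the above into counts per ATA port and per block device. The regular expressions and the name `summarize` are my own assumptions, modeled only on the message formats visible in this log; they are not part of any kernel or Ceph tool. The sample lines are copied from the output above.

```python
import re
from collections import Counter

# Matches e.g. "[95248.849796] ata10: softreset failed (1st FIS failed)"
# and "[95304.117438] ata10.00: failed to IDENTIFY (...)".
ATA_RE = re.compile(r"\]\s+(ata\d+)(?:\.\d+)?: (.+)$")
# Matches e.g. "print_req_error: I/O error, dev sdd, sector 4096".
IO_RE = re.compile(r"I/O error, dev (\w+), sector (\d+)")

# Link-level messages treated as "trouble" for this sketch.
ATA_KEYWORDS = ("hard resetting link", "softreset failed", "failed to IDENTIFY")


def summarize(lines):
    """Count link-level ATA events per port and I/O errors per device."""
    ata_events = Counter()
    io_errors = Counter()
    for line in lines:
        m = ATA_RE.search(line)
        if m and any(k in m.group(2) for k in ATA_KEYWORDS):
            ata_events[m.group(1)] += 1
        m = IO_RE.search(line)
        if m:
            io_errors[m.group(1)] += 1
    return ata_events, io_errors


SAMPLE = """\
[95238.849578] ata10: hard resetting link
[95248.849796] ata10: softreset failed (1st FIS failed)
[95304.117438] ata10.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[95526.381048] print_req_error: I/O error, dev sdd, sector 0
[95526.381087] print_req_error: I/O error, dev sdd, sector 4096
""".splitlines()

if __name__ == "__main__":
    ata_events, io_errors = summarize(SAMPLE)
    print(dict(ata_events), dict(io_errors))
```

Running this against the full dmesg output of each node after a failure should show whether it is always the same port or disk that drops out, or a different one each time.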
[95536.560701] sd 9:0:0:0: [sdd] tag#0 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95536.560702] sd 9:0:0:0: [sdd] tag#0 CDB: Read(16) 88 00 00 00 00 00 00 00 10 00 00 00 00 08 00 00
[95536.560703] print_req_error: I/O error, dev sdd, sector 4096
[95536.561130] sd 9:0:0:0: [sdd] tag#1 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95536.561131] sd 9:0:0:0: [sdd] tag#1 CDB: Read(16) 88 00 00 00 00 00 00 00 10 08 00 00 00 08 00 00
[95536.561132] print_req_error: I/O error, dev sdd, sector 4104
[95536.561511] sd 9:0:0:0: [sdd] tag#2 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95536.561513] sd 9:0:0:0: [sdd] tag#2 CDB: Read(16) 88 00 00 00 00 00 00 00 10 00 00 00 00 08 00 00
[95536.561513] print_req_error: I/O error, dev sdd, sector 4096
[95536.579212] sd 9:0:0:0: [sdd] tag#4 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95536.579214] sd 9:0:0:0: [sdd] tag#4 CDB: Read(16) 88 00 00 00 00 00 00 00 00 00 00 00 00 08 00 00
[95536.579215] print_req_error: I/O error, dev sdd, sector 0
[95536.579585] sd 9:0:0:0: [sdd] tag#6 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95536.579587] sd 9:0:0:0: [sdd] tag#6 CDB: Read(16) 88 00 00 00 00 01 d1 c0 be 00 00 00 00 08 00 00
[95536.579588] print_req_error: I/O error, dev sdd, sector 7814036992
[95536.579938] sd 9:0:0:0: [sdd] tag#7 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95536.579939] sd 9:0:0:0: [sdd] tag#7 CDB: Read(16) 88 00 00 00 00 01 d1 c0 be a0 00 00 00 08 00 00
[95536.579940] print_req_error: I/O error, dev sdd, sector 7814037152
[95536.580283] sd 9:0:0:0: [sdd] tag#8 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95536.580285] sd 9:0:0:0: [sdd] tag#8 CDB: Read(16) 88 00 00 00 00 00 00 00 00 00 00 00 00 08 00 00
[95536.580286] print_req_error: I/O error, dev sdd, sector 0
[95546.578153] scsi_io_completion: 24 callbacks suppressed
[95546.578158] sd 9:0:0:0: [sdd] tag#13 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95546.578160] sd 9:0:0:0: [sdd] tag#13 CDB: Read(16) 88 00 00 00 00 00 00 00 10 00 00 00 00 20 00 00
[95546.578160] print_req_error: 24 callbacks suppressed
[95546.578161] print_req_error: I/O error, dev sdd, sector 4096
[95546.578520] sd 9:0:0:0: [sdd] tag#14 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95546.578522] sd 9:0:0:0: [sdd] tag#14 CDB: Read(16) 88 00 00 00 00 00 00 00 10 00 00 00 00 08 00 00
[95546.578523] print_req_error: I/O error, dev sdd, sector 4096
[95546.578894] Buffer I/O error on dev dm-0, logical block 0, async page read
[95546.719591] sd 9:0:0:0: [sdd] tag#15 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95546.719593] sd 9:0:0:0: [sdd] tag#15 CDB: Read(16) 88 00 00 00 00 00 00 00 10 00 00 00 00 08 00 00
[95546.719594] print_req_error: I/O error, dev sdd, sector 4096
[95546.719973] sd 9:0:0:0: [sdd] tag#16 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95546.719975] sd 9:0:0:0: [sdd] tag#16 CDB: Read(16) 88 00 00 00 00 01 d1 c0 af 80 00 00 00 08 00 00
[95546.719975] print_req_error: I/O error, dev sdd, sector 7814033280
[95546.720338] sd 9:0:0:0: [sdd] tag#17 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95546.720340] sd 9:0:0:0: [sdd] tag#17 CDB: Read(16) 88 00 00 00 00 01 d1 c0 af f0 00 00 00 08 00 00
[95546.720341] print_req_error: I/O error, dev sdd, sector 7814033392
[95546.720748] sd 9:0:0:0: [sdd] tag#18 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95546.720749] sd 9:0:0:0: [sdd] tag#18 CDB: Read(16) 88 00 00 00 00 00 00 00 10 00 00 00 00 08 00 00
[95546.720750] print_req_error: I/O error, dev sdd, sector 4096
[95546.721076] sd 9:0:0:0: [sdd] tag#19 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95546.721078] sd 9:0:0:0: [sdd] tag#19 CDB: Read(16) 88 00 00 00 00 00 00 00 10 08 00 00 00 08 00 00
[95546.721079] print_req_error: I/O error, dev sdd, sector 4104
[95546.721399] sd 9:0:0:0: [sdd] tag#20 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95546.721401] sd 9:0:0:0: [sdd] tag#20 CDB: Read(16) 88 00 00 00 00 00 00 00 10 00 00 00 00 08 00 00
[95546.721401] print_req_error: I/O error, dev sdd, sector 4096
[95546.731599] sd 9:0:0:0: [sdd] tag#21 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95546.731601] sd 9:0:0:0: [sdd] tag#21 CDB: Read(16) 88 00 00 00 00 00 00 00 00 00 00 00 00 08 00 00
[95546.731602] print_req_error: I/O error, dev sdd, sector 0
[95546.731918] sd 9:0:0:0: [sdd] tag#23 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95546.731920] sd 9:0:0:0: [sdd] tag#23 CDB: Read(16) 88 00 00 00 00 01 d1 c0 be 00 00 00 00 08 00 00
[95546.731921] print_req_error: I/O error, dev sdd, sector 7814036992
[95556.834306] scsi_io_completion: 26 callbacks suppressed
[95556.834309] sd 9:0:0:0: [sdd] tag#0 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95556.834312] sd 9:0:0:0: [sdd] tag#0 CDB: Read(16) 88 00 00 00 00 00 00 00 10 00 00 00 00 08 00 00
[95556.834312] print_req_error: 26 callbacks suppressed
[95556.834313] print_req_error: I/O error, dev sdd, sector 4096
[95556.834614] sd 9:0:0:0: [sdd] tag#1 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95556.834615] sd 9:0:0:0: [sdd] tag#1 CDB: Read(16) 88 00 00 00 00 01 d1 c0 af 80 00 00 00 08 00 00
[95556.834616] print_req_error: I/O error, dev sdd, sector 7814033280
[95556.834916] sd 9:0:0:0: [sdd] tag#2 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[95556.834918] sd 9:0:0:0: [sdd] tag#2 CDB: Read(16) 88 00 00 00 00 01 d1 c0 af f0 00 00 00 08 00 00
[95556.834919] print_req_error: I/O error, dev sdd, sector 7814033392