Hello,
I have a problem with my RAID controller (LSISAS1064E) in a Fujitsu Primergy TX120 S3p server running Proxmox 7.4. After some time writing data to an array configured as RAID1, the array switches to read-only mode and no further data can be written. At first I thought one of the drives was at fault (both drives in the array have since been replaced), then I replaced the RAID card itself, the cable, and the disk basket. Nothing helped. Now I'm wondering whether this is a software problem rather than a hardware problem: the RAID card worked in this configuration for several years without problems, until one of the SATA drives had to be replaced. Below are the logs from the dmesg command.
Code:
[188636.417002] mptscsih: ioc0: attempting task abort! (sc=00000000be174a8d)
[188636.417013] sd 6:1:2:0: [sdd] tag#60 CDB: Write(10) 2a 00 3a 19 ca f8 00 00 28 00
[188636.417023] mptscsih: ioc0: task abort: FAILED (rv=2003) (sc=00000000be174a8d)
[188636.417027] mptscsih: ioc0: attempting task abort! (sc=00000000a920994f)
[188636.417028] sd 6:1:2:0: [sdd] tag#59 CDB: Write(10) 2a 00 34 da 45 f8 00 02 08 00
[188636.417029] mptscsih: ioc0: task abort: FAILED (rv=2003) (sc=00000000a920994f)
[188636.417031] mptscsih: ioc0: attempting task abort! (sc=000000009741dbee)
[188636.417032] sd 6:1:2:0: [sdd] tag#58 CDB: Write(10) 2a 00 00 00 0a 20 00 00 10 00
[188636.417033] mptscsih: ioc0: task abort: FAILED (rv=2003) (sc=000000009741dbee)
[188636.417034] mptscsih: ioc0: attempting task abort! (sc=0000000028e56cf1)
[188636.417036] sd 6:1:2:0: [sdd] tag#57 CDB: Write(10) 2a 00 34 c0 08 28 00 00 08 00
[188636.417037] mptscsih: ioc0: task abort: FAILED (rv=2003) (sc=0000000028e56cf1)
[188636.417038] mptscsih: ioc0: attempting task abort! (sc=00000000e725f042)
[188636.417039] sd 6:1:2:0: [sdd] tag#56 CDB: Write(10) 2a 00 43 00 08 08 00 00 08 00
[188636.417040] mptscsih: ioc0: task abort: FAILED (rv=2003) (sc=00000000e725f042)
[188636.417041] mptscsih: ioc0: attempting task abort! (sc=00000000904b1efa)
[188636.417043] sd 6:1:2:0: [sdd] tag#55 CDB: Write(10) 2a 00 43 00 0b 68 00 00 08 00
[188636.417043] mptscsih: ioc0: task abort: FAILED (rv=2003) (sc=00000000904b1efa)
[188636.417045] mptscsih: ioc0: attempting task abort! (sc=000000006297d2dc)
[188636.417046] sd 6:1:2:0: [sdd] tag#54 CDB: Write(10) 2a 00 44 00 08 70 00 00 10 00
[188636.417047] mptscsih: ioc0: task abort: FAILED (rv=2003) (sc=000000006297d2dc)
[188636.417048] mptscsih: ioc0: attempting task abort! (sc=0000000084da4846)
[188636.417049] sd 6:1:2:0: [sdd] tag#53 CDB: Write(10) 2a 00 34 da 33 88 00 00 30 00
[188636.417050] mptscsih: ioc0: task abort: FAILED (rv=2003) (sc=0000000084da4846)
[188636.417051] mptscsih: ioc0: attempting task abort! (sc=00000000704e2881)
[188636.417052] sd 6:1:2:0: [sdd] tag#52 CDB: Write(10) 2a 00 34 da 33 b8 00 04 c0 00
[188636.417053] mptscsih: ioc0: task abort: FAILED (rv=2003) (sc=00000000704e2881)
[188636.417054] mptscsih: ioc0: attempting task abort! (sc=000000001927a8ee)
[188636.417056] sd 6:1:2:0: [sdd] tag#51 CDB: Write(10) 2a 00 00 00 08 00 00 00 08 00
[188636.417056] mptscsih: ioc0: task abort: FAILED (rv=2003) (sc=000000001927a8ee)
[188636.417058] mptscsih: ioc0: attempting task abort! (sc=000000004605c810)
[188636.417059] sd 6:1:2:0: [sdd] tag#50 CDB: Write(10) 2a 00 34 da 38 78 00 04 00 00
[188636.417060] mptscsih: ioc0: task abort: FAILED (rv=2003) (sc=000000004605c810)
[188636.417061] mptscsih: ioc0: attempting task abort! (sc=00000000e6c7e723)
[188636.417062] sd 6:1:2:0: [sdd] tag#49 CDB: Write(10) 2a 00 34 da 3c 78 00 04 00 00
[188636.417063] mptscsih: ioc0: task abort: FAILED (rv=2003) (sc=00000000e6c7e723)
[188636.417064] mptscsih: ioc0: attempting task abort! (sc=0000000077fc3a04)
[188636.417065] sd 6:1:2:0: [sdd] tag#48 CDB: Write(10) 2a 00 34 da 40 78 00 05 80 00
[188636.417066] mptscsih: ioc0: task abort: FAILED (rv=2003) (sc=0000000077fc3a04)
[188651.265003] mptscsih: ioc0: attempting task abort! (sc=00000000e427cbf2)
[188651.265015] sd 6:1:2:0: [sdd] tag#119 CDB: Inquiry 12 00 00 00 24 00
[188651.265017] mptscsih: ioc0: task abort: FAILED (rv=2003) (sc=00000000e427cbf2)
[188651.289012] mptscsih: ioc0: attempting target reset! (sc=00000000be174a8d)
[188651.289017] sd 6:1:2:0: [sdd] tag#60 CDB: Write(10) 2a 00 3a 19 ca f8 00 00 28 00
[188651.767350] mptscsih: ioc0: target reset: SUCCESS (sc=00000000be174a8d)
[188662.005024] mptscsih: ioc0: attempting task abort! (sc=00000000be174a8d)
[188662.005037] sd 6:1:2:0: [sdd] tag#60 CDB: Test Unit Ready 00 00 00 00 00 00
[188662.005039] mptscsih: ioc0: task abort: FAILED (rv=2003) (sc=00000000be174a8d)
[188662.005041] mptscsih: ioc0: attempting target reset! (sc=00000000be174a8d)
[188662.005043] sd 6:1:2:0: [sdd] tag#60 CDB: Test Unit Ready 00 00 00 00 00 00
[188662.517324] mptscsih: ioc0: target reset: SUCCESS (sc=00000000be174a8d)
[188662.517337] mptscsih: ioc0: attempting host reset! (sc=00000000e427cbf2)
[188678.689019] mptscsih: ioc0: host reset: SUCCESS (sc=00000000e427cbf2)
[188693.749028] mptbase: ioc0: WARNING - Issuing Reset from mpt_config!!, doorbell=0x24000000
[188693.893050] sd 6:1:2:0: Device offlined - not ready after error recovery
[188693.893054] sd 6:1:2:0: Device offlined - not ready after error recovery
[188693.893055] sd 6:1:2:0: Device offlined - not ready after error recovery
[188693.893056] sd 6:1:2:0: Device offlined - not ready after error recovery
[188693.893057] sd 6:1:2:0: Device offlined - not ready after error recovery
[188693.893059] sd 6:1:2:0: Device offlined - not ready after error recovery
[188693.893059] sd 6:1:2:0: Device offlined - not ready after error recovery
[188693.893060] sd 6:1:2:0: Device offlined - not ready after error recovery
[188693.893061] sd 6:1:2:0: Device offlined - not ready after error recovery
[188693.893062] sd 6:1:2:0: Device offlined - not ready after error recovery
[188693.893063] sd 6:1:2:0: Device offlined - not ready after error recovery
[188693.893064] sd 6:1:2:0: Device offlined - not ready after error recovery
[188693.893065] sd 6:1:2:0: Device offlined - not ready after error recovery
[188693.893066] sd 6:1:2:0: Device offlined - not ready after error recovery
[188693.893593] mptctldrivers/message/fusion/mptctl.c::mptctl_ioctl() @646 - Controller disabled.
[188693.895119] mptctldrivers/message/fusion/mptctl.c::mptctl_ioctl() @646 - Controller disabled.
[188693.895123] mptctldrivers/message/fusion/mptctl.c::mptctl_ioctl() @646 - Controller disabled.
[188693.895126] mptctldrivers/message/fusion/mptctl.c::mptctl_ioctl() @646 - Controller disabled.
[188693.895128] mptctldrivers/message/fusion/mptctl.c::mptctl_ioctl() @646 - Controller disabled.
[188693.895131] mptctldrivers/message/fusion/mptctl.c::mptctl_ioctl() @646 - Controller disabled.
[188693.895134] mptctldrivers/message/fusion/mptctl.c::mptctl_ioctl() @646 - Controller disabled.
[188693.895137] mptctldrivers/message/fusion/mptctl.c::mptctl_ioctl() @646 - Controller disabled.
[188693.895140] mptctldrivers/message/fusion/mptctl.c::mptctl_ioctl() @646 - Controller disabled.
[188693.895143] mptctldrivers/message/fusion/mptctl.c::mptctl_ioctl() @646 - Controller disabled.
[188693.895145] mptctldrivers/message/fusion/mptctl.c::mptctl_ioctl() @646 - Controller disabled.
[188693.895147] mptctldrivers/message/fusion/mptctl.c::mptctl_ioctl() @646 - Controller disabled.
[188693.909009] sd 6:1:2:0: rejecting I/O to offline device
[188693.909126] blk_update_request: I/O error, dev sdd, sector 974768888 op 0x1:(WRITE) flags 0x800 phys_seg 5 prio class 0
[188693.909368] blk_update_request: I/O error, dev sdd, sector 886719992 op 0x1:(WRITE) flags 0x0 phys_seg 65 prio class 0
[188693.909371] Aborting journal on device sdd1-8.
[188693.909585] blk_update_request: I/O error, dev sdd, sector 886720512 op 0x1:(WRITE) flags 0x4000 phys_seg 128 prio class 0
[188693.909593] EXT4-fs warning (device sdd1): ext4_end_bio:344: I/O error 10 writing to inode 12 starting block 110840064)
[188693.909594] blk_update_request: I/O error, dev sdd, sector 886721536 op 0x1:(WRITE) flags 0x4000 phys_seg 128 prio class 0
[188693.909598] blk_update_request: I/O error, dev sdd, sector 2592 op 0x1:(WRITE) flags 0x103000 phys_seg 2 prio class 0
[188693.909602] Buffer I/O error on dev sdd1, logical block 68, lost async page write
[188693.909609] Buffer I/O error on dev sdd1, logical block 69, lost async page write
[188693.909612] blk_update_request: I/O error, dev sdd, sector 885000232 op 0x1:(WRITE) flags 0x103000 phys_seg 1 prio class 0
[188693.909615] Buffer I/O error on dev sdd1, logical block 110624773, lost async page write
[188693.909619] blk_update_request: I/O error, dev sdd, sector 1124075528 op 0x1:(WRITE) flags 0x103000 phys_seg 1 prio class 0
[188693.909622] Buffer I/O error on dev sdd1, logical block 140509185, lost async page write
[188693.909625] blk_update_request: I/O error, dev sdd, sector 1124076392 op 0x1:(WRITE) flags 0x103000 phys_seg 1 prio class 0
[188693.909628] Buffer I/O error on dev sdd1, logical block 140509293, lost async page write
[188693.909638] blk_update_request: I/O error, dev sdd, sector 1140852848 op 0x1:(WRITE) flags 0x103000 phys_seg 2 prio class 0
[188693.909641] Buffer I/O error on dev sdd1, logical block 142606350, lost async page write
[188693.909644] Buffer I/O error on dev sdd1, logical block 142606351, lost async page write
[188693.909648] blk_update_request: I/O error, dev sdd, sector 886715272 op 0x1:(WRITE) flags 0x0 phys_seg 4 prio class 0
[188693.909651] EXT4-fs warning (device sdd1): ext4_end_bio:344: I/O error 10 writing to inode 12 starting block 110839415)
[188693.909656] Buffer I/O error on dev sdd1, logical block 0, lost async page write
[188693.909661] EXT4-fs warning (device sdd1): ext4_end_bio:344: I/O error 10 writing to inode 12 starting block 110839695)
[188693.909666] EXT4-fs warning (device sdd1): ext4_end_bio:344: I/O error 10 writing to inode 12 starting block 110840002)
[188693.909704] EXT4-fs error (device sdd1): ext4_journal_check_start:83: comm dd: Detected aborted journal
[188693.909704] Buffer I/O error on dev sdd1, logical block 121667584, lost sync page write
[188693.909711] JBD2: Error -5 detected when updating journal superblock for sdd1-8.
[188693.909714] EXT4-fs error (device sdd1): ext4_journal_check_start:83: comm nfsd: Detected aborted journal
[188693.909739] EXT4-fs (sdd1): previous I/O error to superblock detected
[188693.909823] Buffer I/O error on dev sdd1, logical block 0, lost sync page write
[188693.909858] EXT4-fs (sdd1): I/O error while writing superblock
[188693.909860] EXT4-fs (sdd1): Remounting filesystem read-only
[188693.910038] EXT4-fs (sdd1): failed to convert unwritten extents to written extents -- potential data loss! (inode 12, error -30)
[188693.910172] EXT4-fs (sdd1): I/O error while writing superblock
[188693.910341] EXT4-fs error (device sdd1): ext4_check_bdev_write_error:217: comm kworker/u16:6: Error while async write back metadata
[188693.910350] EXT4-fs error (device sdd1) in ext4_reserve_inode_write:5789: Journal has aborted
[188693.910352] EXT4-fs error (device sdd1): mpage_map_and_submit_extent:2510: inode #12: comm kworker/u16:6: mark_inode_dirty error
[188693.910355] EXT4-fs error (device sdd1): mpage_map_and_submit_extent:2512: comm kworker/u16:6: Failed to mark inode 12 dirty
[188693.910370] EXT4-fs warning (device sdd1): ext4_end_bio:344: I/O error 10 writing to inode 12 starting block 110840465)
[188693.910381] EXT4-fs warning (device sdd1): ext4_end_bio:344: I/O error 10 writing to inode 12 starting block 110841088)
[188693.910740] EXT4-fs (sdd1): failed to convert unwritten extents to written extents -- potential data loss! (inode 12, error -30)
[188693.910967] EXT4-fs (sdd1): I/O error while writing superblock
[188693.911099] Buffer I/O error on device sdd1, logical block 110839439
[188693.915898] Buffer I/O error on device sdd1, logical block 110839440
[188693.916029] Buffer I/O error on device sdd1, logical block 110839441
[188693.916161] Buffer I/O error on device sdd1, logical block 110839442
[188693.916293] Buffer I/O error on device sdd1, logical block 110839443
[188693.916427] Buffer I/O error on device sdd1, logical block 110839444
[188693.916565] Buffer I/O error on device sdd1, logical block 110839445
[188693.916698] Buffer I/O error on device sdd1, logical block 110839446
[188693.916831] Buffer I/O error on device sdd1, logical block 110839447
[188693.916963] Buffer I/O error on device sdd1, logical block 110839448
[188693.917528] EXT4-fs (sdd1): failed to convert unwritten extents to written extents -- potential data loss! (inode 12, error -30)
[188861.685144] INFO: task kworker/3:0:49927 blocked for more than 120 seconds.
[188861.685299] Tainted: P O 5.15.152-1-pve #1
[188861.685423] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[188861.685586] task:kworker/3:0 state:D stack: 0 pid:49927 ppid: 2 flags:0x00004000
[188861.685591] Workqueue: mpt/0 mptsas_firmware_event_work [mptsas]
[188861.685599] Call Trace:
[188861.685601] <TASK>
[188861.685604] __schedule+0x34e/0x1740
[188861.685609] ? dequeue_entity+0xd8/0x4a0
[188861.685614] ? update_load_avg+0x82/0x670
[188861.685617] schedule+0x69/0x110
[188861.685619] schedule_timeout+0x105/0x140
[188861.685622] ? __schedule+0x356/0x1740
[188861.685624] __wait_for_common+0xae/0x150
[188861.685627] ? usleep_range_state+0x90/0x90
[188861.685630] wait_for_completion+0x24/0x30
[188861.685633] __flush_work.isra.0+0x72/0x90
[188861.685636] ? worker_detach_from_pool+0xc0/0xc0
[188861.685638] __cancel_work_timer+0x12b/0x1b0
[188861.685640] ? msleep+0x2d/0x40
[188861.685642] cancel_delayed_work_sync+0x13/0x20
[188861.685645] mptsas_cleanup_fw_event_q+0x166/0x1c0 [mptsas]
[188861.685648] mptsas_ioc_reset+0xce/0x100 [mptsas]
[188861.685651] mpt_signal_reset.isra.0+0x78/0x160 [mptbase]
[188861.685656] mpt_Soft_Hard_ResetHandler+0x264/0x410 [mptbase]
[188861.685660] mpt_config.cold+0x13/0x13d [mptbase]
[188861.685665] mpt_findImVolumes+0x123/0x350 [mptbase]
[188861.685668] ? psi_task_switch+0x1eb/0x220
[188861.685671] mptsas_firmware_event_work+0x2f3/0xe49 [mptsas]
[188861.685675] ? __schedule+0x356/0x1740
[188861.685677] ? psi_avgs_work+0x64/0xd0
[188861.685679] process_one_work+0x22b/0x3d0
[188861.685682] worker_thread+0x53/0x420
[188861.685684] ? process_one_work+0x3d0/0x3d0
[188861.685686] kthread+0x12a/0x150
[188861.685687] ? set_kthread_struct+0x50/0x50
[188861.685689] ret_from_fork+0x22/0x30
[188861.685693] </TASK>
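
To make it easier to line up the aborted commands with the later I/O errors, here is a minimal Python sketch that decodes the Write(10) CDB bytes from the log. It relies on the standard SCSI Write(10) layout (opcode 0x2a, bytes 2-5 as a big-endian LBA, bytes 7-8 as the transfer length in blocks). The function name `decode_write10` is only an illustrative helper, not part of any tool, and the sketch is not a diagnosis of the fault.

```python
from typing import Tuple


def decode_write10(cdb_hex: str) -> Tuple[int, int]:
    """Decode a Write(10) CDB given as space-separated hex bytes.

    Returns (lba, blocks). Hypothetical helper for reading the log above.
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a 10-byte Write(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")      # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")   # bytes 7-8: transfer length in blocks
    return lba, blocks


if __name__ == "__main__":
    # CDB of tag#60, the first aborted write in the log
    lba, blocks = decode_write10("2a 00 3a 19 ca f8 00 00 28 00")
    print(f"LBA {lba}, {blocks} blocks")  # prints "LBA 974768888, 40 blocks"
```

The decoded LBA for tag#60 (974768888) is the same sector reported in the first blk_update_request I/O error, and tag#59 decodes to 886719992, the sector in the second one. So the writes that could not be aborted appear to be the same ones that fail once the device is offlined.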