Hi guys!
This forum has been extremely helpful multiple times, and I hope you can point me in the right direction this time too.
My setup (Proxmox host):
- Motherboard: Z390 AORUS ELITE (it has 6 SATA ports, all of which are in use)
- CPU: Intel i9-9900K @ 3.60 GHz
- RAM: G.Skill RipjawsV DDR4 32GB (4x8GB) 3200MHz CL16 rev2 XMP2 Black (F4-3200C16D-16GVKB)
- Storage: 5x Seagate IronWolf Pro 16 TB 256MB 3.5" SATA (RAID V) + 1x SSD Crucial MX500 (for system)
My problem is with the ZFS pool that I created on the IronWolfs (RAID V). They are all connected to the mainboard and PSU via standard cables. I use the pool as NAS storage for a container, but that shouldn't really matter. What matters is that zed reports deadman events when I try to copy files there:
Code:
Oct 12 16:33:26 washington zed[119688]: eid=150 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALAG-part1 size=196608 offset=1838406885376 priority=2 err=0 flags=0x40080480 delay=23542441ms
Oct 12 16:33:26 washington zed[119692]: eid=151 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALBK-part1 size=196608 offset=1838409179136 priority=2 err=0 flags=0x40080480 delay=23542441ms
Oct 12 16:33:26 washington zed[119696]: eid=152 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALNC-part1 size=1048576 offset=1838409572352 priority=2 err=0 flags=0x40080480 delay=23542434ms
Oct 12 16:33:26 washington zed[119698]: eid=153 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIAL29-part1 size=1048576 offset=1838412587008 priority=2 err=0 flags=0x40080480 delay=23542438ms
Oct 12 16:33:26 washington zed[119700]: eid=154 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALQ3-part1 size=1048576 offset=1838412587008 priority=2 err=0 flags=0x40080480 delay=23542436ms
Oct 12 16:34:28 washington zed[119945]: eid=155 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALAG-part1 size=196608 offset=1838406885376 priority=2 err=0 flags=0x40080480 delay=23542441ms
Oct 12 16:34:28 washington zed[119949]: eid=156 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALBK-part1 size=196608 offset=1838409179136 priority=2 err=0 flags=0x40080480 delay=23542441ms
Oct 12 16:34:28 washington zed[119952]: eid=157 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALNC-part1 size=1048576 offset=1838409572352 priority=2 err=0 flags=0x40080480 delay=23542434ms
Oct 12 16:34:28 washington zed[119955]: eid=158 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIAL29-part1 size=1048576 offset=1838412587008 priority=2 err=0 flags=0x40080480 delay=23542438ms
Oct 12 16:34:28 washington zed[119957]: eid=159 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALQ3-part1 size=1048576 offset=1838412587008 priority=2 err=0 flags=0x40080480 delay=23542436ms
Oct 12 16:35:29 washington zed[120174]: eid=160 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALAG-part1 size=196608 offset=1838406885376 priority=2 err=0 flags=0x40080480 delay=23542441ms
Oct 12 16:35:29 washington zed[120178]: eid=161 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALBK-part1 size=196608 offset=1838409179136 priority=2 err=0 flags=0x40080480 delay=23542441ms
Oct 12 16:35:29 washington zed[120181]: eid=162 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALNC-part1 size=1048576 offset=1838409572352 priority=2 err=0 flags=0x40080480 delay=23542434ms
Oct 12 16:35:29 washington zed[120184]: eid=163 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIAL29-part1 size=1048576 offset=1838412587008 priority=2 err=0 flags=0x40080480 delay=23542438ms
Oct 12 16:35:29 washington zed[120186]: eid=164 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALQ3-part1 size=1048576 offset=1838412587008 priority=2 err=0 flags=0x40080480 delay=23542436ms
Oct 12 16:36:31 washington zed[120405]: eid=165 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALAG-part1 size=196608 offset=1838406885376 priority=2 err=0 flags=0x40080480 delay=23542441ms
Oct 12 16:36:31 washington zed[120409]: eid=166 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALBK-part1 size=196608 offset=1838409179136 priority=2 err=0 flags=0x40080480 delay=23542441ms
Oct 12 16:36:31 washington zed[120413]: eid=167 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALNC-part1 size=1048576 offset=1838409572352 priority=2 err=0 flags=0x40080480 delay=23542434ms
Oct 12 16:36:31 washington zed[120415]: eid=168 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIAL29-part1 size=1048576 offset=1838412587008 priority=2 err=0 flags=0x40080480 delay=23542438ms
Oct 12 16:36:31 washington zed[120417]: eid=169 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALQ3-part1 size=1048576 offset=1838412587008 priority=2 err=0 flags=0x40080480 delay=23542436ms
Oct 12 16:37:20 washington pmxcfs[1535]: [status] notice: received log
Oct 12 16:37:32 washington zed[120633]: eid=170 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALAG-part1 size=196608 offset=1838406885376 priority=2 err=0 flags=0x40080480 delay=23542441ms
Oct 12 16:37:32 washington zed[120637]: eid=171 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALBK-part1 size=196608 offset=1838409179136 priority=2 err=0 flags=0x40080480 delay=23542441ms
Oct 12 16:37:32 washington zed[120641]: eid=172 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALNC-part1 size=1048576 offset=1838409572352 priority=2 err=0 flags=0x40080480 delay=23542434ms
Oct 12 16:37:32 washington zed[120643]: eid=173 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIAL29-part1 size=1048576 offset=1838412587008 priority=2 err=0 flags=0x40080480 delay=23542438ms
Oct 12 16:37:32 washington zed[120645]: eid=174 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALQ3-part1 size=1048576 offset=1838412587008 priority=2 err=0 flags=0x40080480 delay=23542436ms
Oct 12 16:38:33 washington zed[120863]: eid=175 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALAG-part1 size=196608 offset=1838406885376 priority=2 err=0 flags=0x40080480 delay=23542441ms
Oct 12 16:38:33 washington zed[120867]: eid=176 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALBK-part1 size=196608 offset=1838409179136 priority=2 err=0 flags=0x40080480 delay=23542441ms
Oct 12 16:38:33 washington zed[120870]: eid=177 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALNC-part1 size=1048576 offset=1838409572352 priority=2 err=0 flags=0x40080480 delay=23542434ms
Oct 12 16:38:33 washington zed[120873]: eid=178 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIAL29-part1 size=1048576 offset=1838412587008 priority=2 err=0 flags=0x40080480 delay=23542438ms
Oct 12 16:38:33 washington zed[120875]: eid=179 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALQ3-part1 size=1048576 offset=1838412587008 priority=2 err=0 flags=0x40080480 delay=23542436ms
Oct 12 16:39:35 washington zed[121092]: eid=180 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALAG-part1 size=196608 offset=1838406885376 priority=2 err=0 flags=0x40080480 delay=23542441ms
Oct 12 16:39:35 washington zed[121096]: eid=181 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALBK-part1 size=196608 offset=1838409179136 priority=2 err=0 flags=0x40080480 delay=23542441ms
Oct 12 16:39:35 washington zed[121099]: eid=182 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALNC-part1 size=1048576 offset=1838409572352 priority=2 err=0 flags=0x40080480 delay=23542434ms
Oct 12 16:39:35 washington zed[121102]: eid=183 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIAL29-part1 size=1048576 offset=1838412587008 priority=2 err=0 flags=0x40080480 delay=23542438ms
Oct 12 16:39:35 washington zed[121104]: eid=184 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALQ3-part1 size=1048576 offset=1838412587008 priority=2 err=0 flags=0x40080480 delay=23542436ms
Oct 12 16:40:36 washington zed[121323]: eid=185 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALAG-part1 size=196608 offset=1838406885376 priority=2 err=0 flags=0x40080480 delay=23542441ms
Oct 12 16:40:36 washington zed[121327]: eid=186 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALBK-part1 size=196608 offset=1838409179136 priority=2 err=0 flags=0x40080480 delay=23542441ms
Oct 12 16:40:36 washington zed[121330]: eid=187 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALNC-part1 size=1048576 offset=1838409572352 priority=2 err=0 flags=0x40080480 delay=23542434ms
Oct 12 16:40:36 washington zed[121333]: eid=188 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIAL29-part1 size=1048576 offset=1838412587008 priority=2 err=0 flags=0x40080480 delay=23542438ms
Oct 12 16:40:36 washington zed[121335]: eid=189 class=deadman pool='tank' vdev=ata-ST16000NT001-3LV101_SERIALQ3-part1 size=1048576 offset=1838412587008 priority=2 err=0 flags=0x40080480 delay=23542436ms
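In case it helps anyone reading the log above, here is a rough sketch of how I would summarize it per drive. It only assumes the `key=value` layout visible in the lines above; the function names `parse_zed_line` and `summarize_deadman` are my own and not part of any ZFS tooling. It counts deadman events per vdev, collects the distinct offsets, and converts the reported delay from milliseconds to hours.

```python
import re
from collections import defaultdict

# Matches key=value pairs in a zed event line; values may be single-quoted.
PAIR = re.compile(r"(\w+)=('[^']*'|\S+)")


def parse_zed_line(line):
    """Return the key=value fields of a zed event line, or None for other lines."""
    if " zed[" not in line:
        return None
    fields = {key: value.strip("'") for key, value in PAIR.findall(line)}
    return fields if "class" in fields else None


def summarize_deadman(lines):
    """Group deadman events by vdev: event count, distinct offsets, max delay in hours."""
    summary = defaultdict(lambda: {"events": 0, "offsets": set(), "max_delay_h": 0.0})
    for line in lines:
        event = parse_zed_line(line)
        if not event or event.get("class") != "deadman":
            continue
        entry = summary[event["vdev"]]
        entry["events"] += 1
        entry["offsets"].add(int(event["offset"]))
        # The delay field looks like "23542441ms"; strip the unit and convert to hours.
        delay_hours = int(event["delay"].rstrip("ms")) / 3_600_000
        entry["max_delay_h"] = max(entry["max_delay_h"], delay_hours)
    return dict(summary)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python summarize_zed.py < zed.log
    for vdev, info in summarize_deadman(sys.stdin).items():
        print(vdev, info["events"], sorted(info["offsets"]), round(info["max_delay_h"], 2))
```

Run against the excerpt above, this should show one offset per drive repeating every minute with the same delay value (23542441 ms is about 6.5 hours), so it looks like the same few I/Os are being reported over and over rather than new ones failing each time.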
I tried scrubbing multiple times, but to no avail: each scrub finds some checksum errors or damaged files, which I then replace, and after a reboot it happens again. I checked the cables, and nothing looks obviously bad. Also, all of the drives end up in a deadman state, not just one, so the chance that every cable is faulty is very low.
Could you please point me in some direction on how to start troubleshooting and fixing this? Something is clearly wrong, but I can't work out what.
Thanks in advance!