ZFS Pool keeps degrading

Wildness2881

New Member
Jul 18, 2023
Hi there,

I hope this is the right place to ask since /r/zfs is no more.

For a few months now I have been trying to repair my degraded ZFS pool. There is a failed drive (too many checksum errors), which I replaced with a new drive using this command:

Code:
zpool replace tank2 ata-WDC_WD40EFZX-68AWUN0_WD-WX00DA1DAS7H /dev/disk/by-id/ata-WDC_WD40EFZX-68AWUN0_WD-WX00DA1C9FH8 -f
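
For reference, I just kept an eye on the resilver with plain zpool commands (nothing specific to my setup):

Code:
# show resilver progress and the per-disk error counters
zpool status -v tank2
# or refresh it once a minute
watch -n 60 zpool status -v tank2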

After scrubbing for about a day, the newly installed drive gets checksum errors again:

Code:
# zpool status -v
  pool: tank2
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: resilvered 1.40T in 20:48:28 with 0 errors on Mon Jul 17 08:47:18 2023
config:

        NAME                                          STATE     READ WRITE CKSUM
        tank2                                         DEGRADED     0     0     0
          raidz2-0                                    DEGRADED     0     0     0
            ata-ST4000VN008-2DR166_ZM00WREM           ONLINE       0     0     0
            ata-ST4000VN008-2DR166_ZM00X4LJ           ONLINE       0     0     0
            ata-ST4000VN008-2DR166_ZM000740           ONLINE       0     0     0
            ata-ST4000VN008-2DR166_ZG00VXEG           ONLINE       0     0     0
            ata-WDC_WD40EFZX-68AWUN0_WD-WX00DA1C9FH8  DEGRADED     0     0   463  too many errors
            ata-ST4000VN008-2DR166_ZM00WJRF           ONLINE       0     0     0

errors: No known data errors

I can repeat this over and over - the result is the same. This is maybe the third or fourth drive I have tried, and after every scrub it fails again.

Do you guys know any troubleshooting steps, additional log files, etc.?

As you can see, tank2 is not my first ZFS pool - I have replaced nearly everything in this server except the power supply.

Code:
# zfs version
zfs-2.1.11-pve1
zfs-kmod-2.1.11-pve1

Code:
# uname -a
Linux rlhv 5.15.108-1-pve #1 SMP PVE 5.15.108-1 (2023-06-17T09:41Z) x86_64 GNU/Linux

Kind regards!
 
Have you tried replacing the cable, using a different port on the SATA controller, or using a different SATA controller altogether?

Good thinking! I replaced the HBA controller (and the cables) - currently I'm running off the mainboard's SATA ports. I don't think that the cables are the issue.

I am currently using this controller from LSI:
Code:
# lspci | grep LSI
01:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2308 PCI-Express Fusion-MPT SAS-2 (rev 05)
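
To double-check which controller each pool member actually sits on, the by-path links show the physical path (rough sketch; adjust the grep to your drive model):

Code:
# controller/port for every disk
ls -l /dev/disk/by-path/
# or just the WD drives
ls -l /dev/disk/by-id/ | grep WD40EFZX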
 
The drive is not reporting any (read or write) errors, but the data is corrupt (checksums fail), which suggests a cable or controller port issue. If it's always the same physical path to the drive, it's probably not your RAM. But you already changed the controller and cable and are still getting the same errors? Maybe it's a firmware issue with the WD model (which is NAS compatible and not SMR)? Can you try any other brand (or if not, another model)?
EDIT: I'm not an expert and I would love it if someone more knowledgeable would correct me.
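
If you want more detail than the error counters, these are the places I would look (just a sketch with standard commands; swap in whichever disk is currently failing):

Code:
# ZFS event log: individual checksum/IO events with the vdev involved
zpool events -v tank2 | less
# SMART health and error counters for the suspect drive
smartctl -a /dev/disk/by-id/ata-WDC_WD40EFZX-68AWUN0_WD-WX00DA1C9FH8
# kernel messages about ATA errors or link resets on that port
dmesg | grep -iE 'ata|sd[a-z]'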
 
Maybe it's a firmware issue with the WD model (which is NAS compatible and not SMR)?
It's CMR.

It could also be the backplane, in case you don't connect a SATA cable directly to the new HDD. In that case I would swap two disks and see whether another disk then reports errors in the same slot.

Did you also check the power cable and not just the SATA cable? I once had disks failing when the power cable wasn't connected well.

And yes, if only a single disk/port is erroring all the time, it's usually the disk itself or the cabling. With bad RAM, CPU, disk controller, or PSU you would usually see errors on multiple disks when running a scrub.
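
After swapping the disk (or the slot/cables), I would clear the counters and scrub again to see whether the errors follow the disk or stay with the slot (plain zpool commands):

Code:
# reset the error counters, then run a fresh scrub
zpool clear tank2
zpool scrub tank2
# check the per-disk counters once it finishes
zpool status -v tank2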
 
Yes, it is CMR.

I swapped the power supply the other day and replaced the dead disk again (though with the same drive as before). Same effect. I will swap the dead disk for a different hard drive later.

Code:
ZFS has finished a scrub:

   eid: 1108
 class: scrub_finish
  host: rlhv
  time: 2023-07-23 12:48:58+0200
  pool: tank2
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
    attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
    using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: scrub repaired 10.8M in 10:48:55 with 0 errors on Sun Jul 23 12:48:58 2023
config:

    NAME                                          STATE     READ WRITE CKSUM
    tank2                                         DEGRADED     0     0     0
      raidz2-0                                    DEGRADED     0     0     0
        ata-ST4000VN008-2DR166_ZM00WREM           ONLINE       0     0     0
        ata-ST4000VN008-2DR166_ZM00X4LJ           ONLINE       0     0     0
        ata-ST4000VN008-2DR166_ZM000740           ONLINE       0     0     0
        ata-ST4000VN008-2DR166_ZG00VXEG           ONLINE       0     0     0
        ata-WDC_WD40EFZX-68AWUN0_WD-WX00DA1DAS7H  DEGRADED     0     0   481  too many errors
        ata-ST4000VN008-2DR166_ZM00WJRF           ONLINE       0     0     0

errors: No known data errors
 
So it looks like I did fix it, although it is a bit embarrassing for me. I never really turned off read/write activity on the pool, because I thought it could rebuild itself with a bit of load on it. That load was about ten VMs. This, and the fact that I do not use registered ECC memory, are probably the cause of the instant checksum errors after a rebuild.
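
For the record, quieting the pool during a resilver is straightforward on Proxmox; roughly like this (the VM IDs are just examples, and the disk names are placeholders):

Code:
# shut down the VMs living on tank2 (IDs 100-109 are examples)
for id in $(seq 100 109); do qm shutdown $id; done
# replace the disk and wait for the resilver to finish
zpool replace tank2 <old-disk> <new-disk>
zpool wait -t resilver tank2
# start the VMs again
for id in $(seq 100 109); do qm start $id; done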

Code:
zpool status -v
  pool: tank2
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: scrub repaired 0B in 10:22:32 with 0 errors on Sun Aug 13 12:22:34 2023
config:

        NAME                                          STATE     READ WRITE CKSUM
        tank2                                         DEGRADED     0     0     0
          raidz2-0                                    DEGRADED     0     0     0
            ata-ST4000VN008-2DR166_ZMxxWREM           ONLINE       0     0     0
            ata-ST4000VN008-2DR166_ZMxxX4LJ           ONLINE       0     0     0
            ata-ST4000VN008-2DR166_ZMxx0740           ONLINE       0     0     0
            ata-ST4000VN008-2DR166_ZGxxVXEG           ONLINE       0     0     0
            ata-WDC_WD40EFZX-68AWUN0_WD-WXxxDA1DAS7H  DEGRADED     0     0   482  too many errors
            ata-ST4000VN008-2DR166_ZMxxWJRF           ONLINE       0     0     0

zpool status -v
  pool: tank2
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Thu Aug 17 21:20:51 2023
        11.7G scanned at 184M/s, 12.4M issued at 195K/s, 8.32T total
        0B resilvered, 0.00% done, no estimated completion time
config:

        NAME                                            STATE     READ WRITE CKSUM
        tank2                                           DEGRADED     0     0     0
          raidz2-0                                      DEGRADED     0     0     0
            ata-ST4000VN008-2DR166_ZMxxWREM             ONLINE       0     0     0
            ata-ST4000VN008-2DR166_ZMxxX4LJ             ONLINE       0     0     0
            ata-ST4000VN008-2DR166_ZMxx0740             ONLINE       0     0     0
            ata-ST4000VN008-2DR166_ZGxxVXEG             ONLINE       0     0     0
            replacing-4                                 DEGRADED     0     0     0
              ata-WDC_WD40EFZX-68AWUN0_WD-WXxxDA1DAS7H  DEGRADED     0     0     0  too many errors
              ata-WDC_WD40EFZX-68AWUN0_WD-WXxxDA1DAAH4  ONLINE       0     0     0
            ata-ST4000VN008-2DR166_ZMxxWJRF             ONLINE       0     0     0


  pool: tank2
 state: ONLINE
  scan: scrub repaired 0B in 05:43:06 with 0 errors on Fri Aug 18 13:47:23 2023
config:

        NAME                                          STATE     READ WRITE CKSUM
        tank2                                         ONLINE       0     0     0
          raidz2-0                                    ONLINE       0     0     0
            ata-ST4000VN008-2DR166_ZMxxWREM           ONLINE       0     0     0
            ata-ST4000VN008-2DR166_ZMxxX4LJ           ONLINE       0     0     0
            ata-ST4000VN008-2DR166_ZMxx0740           ONLINE       0     0     0
            ata-ST4000VN008-2DR166_ZGxxVXEG           ONLINE       0     0     0
            ata-WDC_WD40EFZX-68AWUN0_WD-WXxxDA1DAAH4  ONLINE       0     0     0
            ata-ST4000VN008-2DR166_ZMxxWJRF           ONLINE       0     0     0

What bugs me is that there is no obvious way to see whether the HDDs or the pool are under stress. Do you guys know anything about monitoring or tuning the ZFS pool so that this does not happen again? I already run regular SMART checks on my HDDs and monitor the general pool health.


Cheers!
 
What bugs me is that there is no obvious way to see whether the HDDs or the pool are under stress.
You can run zpool iostat -v to see how much IO is hitting the disks/pool.
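
For example (the trailing number is the sampling interval in seconds):

Code:
# per-vdev bandwidth and IOPS, refreshed every 5 seconds
zpool iostat -v tank2 5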

Do you guys know anything about monitoring or tuning the ZFS pool so that this does not happen again?
To monitor your pool health you could use tools like Zabbix or ZED. It might also help to edit the systemd service + timer so the regular scrub runs more often than once a month, so you identify corrupted data earlier.
There isn't much you can do to prevent this in software. There is probably some problem with your hardware, and it won't stop until you identify and replace the problematic component.
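
As a minimal sketch of both ideas (the zed.rc path is the standard one; how the periodic scrub is scheduled depends on your packaging, so adjust the second part to wherever your system defines it):

Code:
# 1) let ZED e-mail you on degradation/scrub errors:
#    set ZED_EMAIL_ADDR="admin@example.com" in /etc/zfs/zed.d/zed.rc, then
systemctl restart zfs-zed

# 2) scrub more often than monthly, e.g. every Sunday at 03:00
#    (cron-style line; where the default schedule lives varies by distro)
# 0 3 * * 0   root   zpool scrub tank2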
 