How dire is this disk warning?

Proximate · Apr 16, 2022

I've got a 4-drive RAIDZ1 array and one of the disks is warning 'failure prediction threshold exceeded'.
It's a 4-hour drive to the server to replace the disk, and I'm waiting on some other hardware that also needs to be installed there.

I understand the consequences, but not completely how RAIDZ1 works. This is what I found:

Code:

RAID-Z (sometimes called RAID-Z1) will provide a record of each unique data block so that
it can recover from the failure of any single disk on vdev. In this case, the data is
automatically distributed across the disk in the most optimal way. RAID-Z1 is practically
an analogue of RAID 5, as it uses single parity.

RAID5 needs a minimum of three drives, but I have four. I see the write errors, but it's not clear to me just how dangerous this is.
It seems to me that if I'm looking at a RAID5-like setup, then I'm safe up to two drives going bad, which gives me time to get there in a week.

The status seems to show that things are ok, other than some write errors. Am I correct in my assumptions above?

Dunuin · Apr 16, 2022

Raidz1 is indeed like a raid5. That means an error on a single drive is fine, and that single drive might even fail completely. But once that drive has failed, even the tiniest error on another drive will cause uncorrectable data corruption. Not sure what you mean by "I'm safe up to two drives going bad". As soon as a second drive starts going bad, all data is lost. Two disks can only fail safely when using raidz2 (raid6) or a striped mirror (raid10), and with a striped mirror only if the two disks are in different mirrors. Also keep in mind that replacing a disk might take many days or even weeks, during which your pool basically isn't usable because it is working 24/7 at its absolute limits doing the resilvering.
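To make the comparison concrete, here is a rough Python sketch that counts which combinations of failed disks a 4-disk pool survives. It is a simplified model, not anything ZFS itself runs: a pool is a list of vdevs, each a set of disk names plus the number of failures that vdev tolerates, and the pool is lost as soon as any one vdev loses more disks than that. The disk names and the helper names `pool_survives` and `survivable` are made up for illustration.

```python
from itertools import combinations


def pool_survives(pool, failed):
    """True if no vdev has lost more disks than it tolerates."""
    return all(len(disks & failed) <= tolerance for disks, tolerance in pool)


def survivable(pool, n_failed):
    """Return (survivable cases, total cases) for n_failed simultaneous failures."""
    all_disks = sorted(set().union(*(disks for disks, _ in pool)))
    cases = [set(c) for c in combinations(all_disks, n_failed)]
    return sum(pool_survives(pool, c) for c in cases), len(cases)


# Hypothetical 4-disk layouts: (disks in the vdev, failures the vdev tolerates)
raidz1 = [({"a", "b", "c", "d"}, 1)]
raidz2 = [({"a", "b", "c", "d"}, 2)]
striped_mirror = [({"a", "b"}, 1), ({"c", "d"}, 1)]

for name, pool in [("raidz1", raidz1), ("raidz2", raidz2),
                   ("striped mirror", striped_mirror)]:
    print(name, "1 failed:", survivable(pool, 1), "2 failed:", survivable(pool, 2))
```

Under this model, raidz1 survives all 4 single-disk failures but 0 of the 6 possible two-disk failures, raidz2 survives all 6, and the striped mirror survives 4 of 6 (it is lost when both disks of the same mirror fail).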

Proximate · Apr 16, 2022

Hi,

I was thinking that RAID5 means 3 drives minimum and one can fail without losing data.
Therefore, since my RAIDZ1 is like RAID5, I figured I still had one drive to spare, so I had time to replace it.

Dunuin · Apr 16, 2022

Yup, 3 disks is the minimum. But you can create a raid5 of 3, 4, 5, 6, 7, 8, 9, and so on disks. Increasing the number of disks won't increase the reliability. No matter whether your raid5 consists of 3 or 30 disks, a second failing disk means everything is lost.

But at least your disk with the write errors hasn't completely failed yet.
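Why the disk count doesn't help can be seen in a small Python sketch of single parity. This is a simplified RAID5-style XOR model under my own assumptions, not ZFS's actual on-disk layout (RAID-Z uses variable-width stripes), and the function names are invented for the example. Each stripe holds one parity block no matter how many data blocks it has, so one missing block can be recomputed and two cannot.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Data blocks plus ONE parity block, however many data blocks there are."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild(stripe):
    """Recompute a single missing block (marked None) from the survivors."""
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) > 1:
        raise ValueError("more than one block missing: single parity cannot recover")
    repaired = list(stripe)
    if missing:
        survivors = [block for block in stripe if block is not None]
        repaired[missing[0]] = xor_blocks(survivors)
    return repaired


# A 4-disk stripe: three data blocks and one parity block
stripe = make_stripe([b"AAAA", b"BBBB", b"CCCC"])

one_lost = list(stripe)
one_lost[1] = None                   # one disk gone: recoverable
assert rebuild(one_lost) == stripe

two_lost = list(stripe)
two_lost[0] = two_lost[2] = None     # two disks gone: rebuild() raises ValueError
```

Adding more data blocks to `make_stripe` makes the stripe wider but still leaves exactly one parity block, which is why a 30-disk raid5 tolerates no more failures than a 3-disk one.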

Proximate · Apr 16, 2022

Well, I've got a cluster since I'm testing all this in production, so maybe it's time to learn a little about failover and to make sure nothing critical is on that host.

I appreciate your input a lot, thank you.
