How dire is this disk warning?

Proximate

Member
Feb 13, 2022
I've got a 4-drive RAIDZ1 array and one of the disks is reporting the warning 'failure prediction threshold exceeded'.
It's a 4-hour drive to the server to replace the disk, and I'm waiting on some other hardware to arrive that also needs to be installed there.

I already understand the consequences of a failure, but not completely how RAIDZ1 works. This is what I found:

Code:
RAID-Z (sometimes called RAID-Z1) will provide a record of each unique data block so that
it can recover from the failure of any single disk on vdev. In this case, the data is
automatically distributed across the disk in the most optimal way. RAID-Z1 is practically
an analogue of RAID 5, as it uses single parity.

RAID5 needs a minimum of three drives, but I have four drives. I see the write errors, but it's not clear to me just how dangerous this is.
It seems to me that if I'm looking at a RAID5-like setup, then I'm safe up to two drives going bad, which gives me time to get there in a week :).

The status seems to show that things are ok, other than some write errors. Am I correct in my assumptions above?

2022-04-15_153745.jpg
 
RAIDZ1 is indeed like RAID5. That means an error on one single drive is fine, and that single drive might even fail completely. But once that drive has failed, even the tiniest error on any remaining drive will cause uncorrectable data corruption. I'm not sure what you mean by "I'm safe up to two drives going bad". As soon as a second drive fails, all data is lost. Losing two disks is only survivable with raidz2 (raid6), or with a striped mirror (raid10) as long as the two failed disks are not in the same mirror pair. Also keep in mind that replacing a disk might take many days or even weeks, during which your pool is barely usable because it is running 24/7 at its absolute limits doing the resilvering.
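
To make the single-parity point concrete, here is a rough Python sketch of XOR parity across three data blocks plus one parity block. It is a simplified model of a RAID5-style stripe, not ZFS's actual on-disk layout; the block contents and the `xor_blocks` helper are made up for illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical stripe: three data blocks and one parity block (4 "disks").
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# One disk lost (data[1]): XOR the survivors with parity to rebuild it.
rebuilt = xor_blocks([data[0], data[2], parity])

# Two disks lost (data[1] and data[2]): the survivors only yield the XOR
# of the two missing blocks, which is not enough to recover either one.
combined = xor_blocks([data[0], parity])

print("one disk lost, rebuilt block:", rebuilt)
print("two disks lost, best we can get:", combined)
```

With one block missing the rebuild is exact. With two missing, all that is left is a single value that mixes both lost blocks together, which is why a second failure is fatal with single parity.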
 
Hi,

I was thinking that RAID5 means 3 drives minimum and one can fail without losing data.
Therefore, since my RAIDZ1 is like RAID5, I figured I still had one drive to spare and so had time to replace it.
 
Yup, 3 disks is the minimum. But you can create a raid5 of 3, 4, 5, 6, 7, 8, 9, and so on disks. Increasing the number of disks won't increase the reliability. It doesn't matter whether your raid5 consists of 3 or 30 disks: a second failing disk and everything is lost.

But at least your disk with the write errors hasn't completely failed yet.
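
On the point that more disks don't buy more safety, a back-of-the-envelope Python sketch: once one disk is gone, the pool dies if any of the remaining disks fails before the replacement is resilvered. The per-disk failure probability used here (1% during the repair window) is a purely hypothetical number, and the model assumes disks fail independently, which real disks from the same batch often don't.

```python
def second_failure_probability(total_disks: int, p_single: float) -> float:
    """Chance that at least one of the remaining disks fails during the
    repair window, assuming independent failures (a simplification)."""
    remaining = total_disks - 1
    return 1.0 - (1.0 - p_single) ** remaining

# Hypothetical 1% per-disk failure chance while waiting for the replacement.
for n in (3, 4, 8, 30):
    print(f"{n} disks: {second_failure_probability(n, 0.01):.4f}")
```

Under these assumptions the risk of losing the pool only goes up as the array gets wider, since single parity still covers exactly one failure.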
 
Well, I've got a cluster, since I'm testing all this in production, so maybe it's time to learn a little about failover and to make sure nothing critical is on that host.

I appreciate your input a lot, thank you.
 
