ZFS: Failed to replace failing disk

Hi,

one of my disks reported a SMART error, so I set out to replace the failing disk - but failed.

Here's what I did, as reported by zpool history, with some remarks indented:

2021-04-12.09:48:43 zpool offline rpool sdc2
I removed the failed disk and shut down the system, physically replacing the drive with a new disk of the same model.

2021-04-12.10:13:21 zpool import -N rpool
I guess this was done automatically during the reboot.
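
In hindsight, I think what I should have done at this point is a proper replace of the offlined partition with the new disk. Roughly something like this - purely a sketch, the sgdisk step is what I've seen recommended for copying the partition layout, and the device names are just how the disks show up on my box:
Code:
# copy the partition layout of a healthy member to the new disk,
# then randomize its GUIDs
sgdisk /dev/sda -R /dev/sdc
sgdisk -G /dev/sdc
# let ZFS resilver onto the new partition
zpool replace rpool sdc2 /dev/sdc2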

2021-04-12.10:17:20 zpool add rpool /dev/sdc -f
Instead, I tried to add the new disk; as there was some old data on it, I figured -f would be a good idea...

As it turns out, this was not a good idea. I ended up with this layout, as reported by zpool status:
Code:
  pool: rpool
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: scrub repaired 0B in 01:19:00 with 0 errors on Sun Mar 14 01:43:03 2021
config:

        NAME          STATE     READ WRITE CKSUM
        rpool         DEGRADED     0     0     0
          raidz1-0    DEGRADED     0     0     0
            sda2      ONLINE       0     0     0
            sdb2      ONLINE       0     0     0
            sdc2      OFFLINE      0     0     0
            sdd2      ONLINE       0     0     0
          sdc         DEGRADED     0     0     0  too many errors
To remove the new disk, I set it offline:
2021-04-12.10:21:35 zpool offline rpool sdc -f
Now zpool reports sdc as degraded, but I cannot remove it...
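
To be specific about "cannot remove it": this is the removal I tried, and zpool refuses it. As far as I understand, removing a top-level vdev is not supported in a pool that contains a raidz vdev, which is exactly what the accidental add created here:
Code:
# sdc was added as its own top-level vdev, striped next to raidz1-0;
# zpool remove refuses because device removal does not work in pools
# that have raidz top-level vdevs
zpool remove rpool sdc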

OK, so here I am. Stuck. Luckily the system is part of a cluster, so I moved all VMs off to other servers.

What options do I have, besides re-installing the failed node? Any idea is greatly appreciated.
 