On my Proxmox 7.4-3 system, I have an 8-wide raidz2 ZFS pool consisting of eight 6 TB HGST HDDs.
One of the drives was reporting state "FAULTED" due to too many errors, and according to the drive activity lights, it wasn't being used by ZFS anymore.
I replaced the drive with a cold spare that I had and issued this command:
Code:
# zpool replace export_pve ata-HGST_HDN726060ALE614_K8GZ103D ata-HGST_HUS726060ALE610_NCHANBUS -f
And it started working on replacing the disk.
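(To keep an eye on the resilver I have just been re-running the status command; a loop along these lines works, with the 60-second interval being an arbitrary choice:)
Code:
# Re-run zpool status every 60 seconds to watch the resilver progress
# (the interval is arbitrary; repeating the command by hand works just as well)
watch -n 60 zpool status export_pve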
After a while though, this is what it shows:
Code:
# zpool status export_pve
  pool: export_pve
 state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Thu May 25 03:13:56 2023
        315M scanned at 9.55M/s, 111M issued at 3.36M/s, 13.7T total
        0B resilvered, 0.00% done, no estimated completion time
config:

        NAME                                         STATE     READ WRITE CKSUM
        export_pve                                   DEGRADED     0     0     0
          raidz2-0                                   DEGRADED     0     0     0
            ata-HGST_HUS726060ALE610_NCHAJ3US        ONLINE       0     0     0
            ata-HGST_HDN726060ALE614_K8GYJNDD        ONLINE       0     0     0
            ata-HGST_HDN726060ALE614_K1KK32XD        ONLINE       0     0     0
            ata-HGST_HDN726060ALE614_K8GYTBEN        ONLINE       0     0     0
            replacing-4                              UNAVAIL      0   202     0  insufficient replicas
              ata-HGST_HDN726060ALE614_K8GZ103D      REMOVED      0     0     0
              ata-HGST_HUS726060ALE610_NCHANBUS      REMOVED      0     0     0
            ata-HGST_HDN726060ALE614_K1HZ8Y9D        ONLINE       0     0     0
            ata-HGST_HUS726060ALE610_NCHALTBS        ONLINE       0     0     0
            ata-HGST_HDN726060ALE610_NCGU5B6S        ONLINE       0     0     0
For a raidz2 array, which is supposed to tolerate two drive failures, why would I get an "insufficient replicas" error when I am trying to replace a failed disk?
I tried googling this issue, but most of the results were from people using ZFS mirrors or raidz1 pools; there weren't many where people were using raidz2.
In some of the results, people described rebooting or powering down their system, swapping the drives, and then powering back up, only for the ZFS pool import to fail during boot. I am deliberately avoiding rebooting or powering down my system, just in case that would trigger the same kind of failure.
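In case it is relevant, these are the kinds of non-disruptive checks I can run from the live system to confirm that the replacement drive is actually visible to the OS. The by-id name is taken from my replace command above; the rest is just a sketch, not verbatim output from my machine:
Code:
# Is the replacement disk's /dev/disk/by-id link present? (grep on the new drive's serial)
ls -l /dev/disk/by-id/ | grep NCHANBUS

# Did the kernel log the hot-swap? (recent ata/sd messages)
dmesg | grep -iE 'ata[0-9]|sd[a-z]' | tail -n 50

# SMART health of the replacement disk (requires the smartmontools package)
smartctl -H /dev/disk/by-id/ata-HGST_HUS726060ALE610_NCHANBUS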
I get the impression that either I did something wrong or the system is behaving differently from what I would have expected.
Any insights into this error would be greatly appreciated.
Thank you.