ZFS raidz2 "insufficient replicas"

alpha754293

Member
Jan 8, 2023
On my Proxmox 7.4-3 system, I have an 8-wide raidz2 ZFS pool consisting of eight 6 TB HGST HDDs.

One of the drives was reporting state "FAULTED" due to too many errors, and according to the drive activity lights, it wasn't being used by ZFS anymore.

I replaced the drive with a cold spare that I had on hand and issued this command:

Code:
# zpool replace export_pve ata-HGST_HDN726060ALE614_K8GZ103D ata-HGST_HUS726060ALE610_NCHANBUS -f

It started resilvering onto the new disk.
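For anyone following along, the overall sequence was roughly the following (the `ls` and `smartctl` checks are extra sanity steps I'd suggest before the replace, not something ZFS requires; serial numbers are the ones from my pool):

```shell
# Confirm the replacement drive is visible to the OS under /dev/disk/by-id
ls -l /dev/disk/by-id/ | grep NCHANBUS

# Optional sanity check on the new disk (assumes smartmontools is installed)
smartctl -H /dev/disk/by-id/ata-HGST_HUS726060ALE610_NCHANBUS

# Replace the FAULTED drive with the new one: zpool replace <pool> <old> <new>
zpool replace export_pve \
    ata-HGST_HDN726060ALE614_K8GZ103D \
    ata-HGST_HUS726060ALE610_NCHANBUS -f

# Watch resilver progress
zpool status export_pve
```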

After a while, though, this is what it shows:
Code:
# zpool status export_pve
  pool: export_pve
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Thu May 25 03:13:56 2023
        315M scanned at 9.55M/s, 111M issued at 3.36M/s, 13.7T total
        0B resilvered, 0.00% done, no estimated completion time
config:

        NAME                                     STATE     READ WRITE CKSUM
        export_pve                               DEGRADED     0     0     0
          raidz2-0                               DEGRADED     0     0     0
            ata-HGST_HUS726060ALE610_NCHAJ3US    ONLINE       0     0     0
            ata-HGST_HDN726060ALE614_K8GYJNDD    ONLINE       0     0     0
            ata-HGST_HDN726060ALE614_K1KK32XD    ONLINE       0     0     0
            ata-HGST_HDN726060ALE614_K8GYTBEN    ONLINE       0     0     0
            replacing-4                          UNAVAIL      0   202     0  insufficient replicas
              ata-HGST_HDN726060ALE614_K8GZ103D  REMOVED      0     0     0
              ata-HGST_HUS726060ALE610_NCHANBUS  REMOVED      0     0     0
            ata-HGST_HDN726060ALE614_K1HZ8Y9D    ONLINE       0     0     0
            ata-HGST_HUS726060ALE610_NCHALTBS    ONLINE       0     0     0
            ata-HGST_HDN726060ALE610_NCGU5B6S    ONLINE       0     0     0

For a raidz2 array, which is supposed to tolerate two drive failures, why would I get an "insufficient replicas" message when I am replacing only a single failed disk?

I tried googling this issue, but most of the results involved ZFS mirrors or raidz1 pools; very few involved raidz2.

In some of those results, people described rebooting or powering down their system, swapping the drives, and then powering back up, only to have the ZFS import fail during boot. I am deliberately avoiding rebooting or powering down my system, just in case that would trigger the same failure mode.

I get the impression that either I did something wrong or the system is behaving differently from what I expected.

Any insights into this error would be greatly appreciated.

Thank you.