Tried to replace a disk in a ZFS pool. After a reboot the whole pool is gone.

damarges

According to a tutorial I found on the internet, I first took the old HDD offline:

Code:
zpool offline tank ata-ST2000DM008-2FR102_ZFL688AZ

After shutting down the PC I physically removed the old HDD and connected the new one. On boot, ZFS reported a degraded state of my pool (as expected). Then I brought the new HDD online (instead of replacing it, my mistake):

Code:
zpool online tank ata-ST8000DM004-2CX188_ZCT0DCK7

The command was accepted and zpool status tank showed the new HDD as ONLINE, but it appeared outside the raidz2 vdev rather than as part of it:


Code:
        tank                                    DEGRADED
          raidz2-0                              DEGRADED
            ata-ST2000DM008-2FR102_ZFL688AZ     OFFLINE
            ata-ST2000DM001-1CH164_W1F4C8RS     ONLINE
            ata-ST2000DM001-9YN164_Z2F0GPQR     ONLINE
            ata-ST2000DM008-2UB102_WFL6G6CL     ONLINE
            ata-ST8000DM004-2U9188_ZR14YM54     ONLINE
            ata-ST2000DM008-2FR102_ZFL689C5     ONLINE
          ata-ST8000DM004-2CX188_ZCT0DCK7       ONLINE

No resilvering took place, no scrubbing. I then tried to replace the old HDD with the new one via zpool replace pool oldhdd newhdd, but ZFS gave me an error saying the (new) HDD was already in use.
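Spelled out with the actual device names, the command I tried was of roughly this form (from memory, not a verbatim transcript):

Code:
zpool replace tank ata-ST2000DM008-2FR102_ZFL688AZ ata-ST8000DM004-2CX188_ZCT0DCK7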

After another reboot everything was gone and no pool could be found. I reconnected the "old", still good HDD, but putting it back online failed because no pool was found.


But I am stuck:

Code:
root@pve:~# zpool status
no pools available
root@pve:~# zpool status tank
cannot open 'tank': no such pool
root@pve:~# zpool import tank
cannot import 'tank': no such pool or dataset
        Destroy and re-create the pool from
        a backup source.
root@pve:~# zpool import -f tank
cannot import 'tank': no such pool or dataset
        Destroy and re-create the pool from
        a backup source.
root@pve:~# zpool import
   pool: tank
     id: 14700457213808848307
  state: UNAVAIL
status: One or more devices contains corrupted data.
 action: The pool cannot be imported due to damaged devices or data.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-5E
 config:

        tank                                    UNAVAIL  insufficient replicas
          raidz2-0                              DEGRADED
            ata-ST2000DM008-2FR102_ZFL688AZ     OFFLINE
            ata-ST2000DM001-1CH164_W1F4C8RS     ONLINE
            ata-ST2000DM001-9YN164_Z2F0GPQR     ONLINE
            ata-ST2000DM008-2UB102_WFL6G6CL     ONLINE
            ata-ST8000DM004-2U9188_ZR14YM54     ONLINE
            ata-ST2000DM008-2FR102_ZFL689C5     ONLINE
          ata-ST8000DM004-2CX188_ZCT0DCK7       UNAVAIL
        logs
          ata-INTENSO_SSD_AA000000000000003108  ONLINE

What can I do now to get my data back?
Any help is highly appreciated.
 
You need to have all disks available (or at least the new disk plus enough of the raidz vdev to be importable), then do a full backup, recreate the pool from scratch, and restore your backup. You added your new disk as a (non-redundant!) top-level vdev, and it's not possible to undo that in a pool with raidz vdevs.
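roughly, the recreate step could look like this, assuming the new 8TB disk takes the old 2TB disk's slot and the log device is reused (ashift and the exact layout are just an example, adapt to your setup):

Code:
# only after the backup has been made and verified!
zpool destroy tank
zpool create -o ashift=12 tank raidz2 \
    /dev/disk/by-id/ata-ST8000DM004-2CX188_ZCT0DCK7 \
    /dev/disk/by-id/ata-ST2000DM001-1CH164_W1F4C8RS \
    /dev/disk/by-id/ata-ST2000DM001-9YN164_Z2F0GPQR \
    /dev/disk/by-id/ata-ST2000DM008-2UB102_WFL6G6CL \
    /dev/disk/by-id/ata-ST8000DM004-2U9188_ZR14YM54 \
    /dev/disk/by-id/ata-ST2000DM008-2FR102_ZFL689C5
zpool add tank log /dev/disk/by-id/ata-INTENSO_SSD_AA000000000000003108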

by the way - your commands seem to be incomplete:
- "zpool online" doesn't work unless the disk has been added to the pool first
- adding a non-redundant vdev to a redundant pool errors out with a prominent warning unless "-f" is passed, exactly to protect against this kind of situation (see the sketch below)
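to illustrate that second point - the exact warning text differs between OpenZFS versions, but without "-f" something like the following is refused because a single disk doesn't match the pool's raidz2 replication level:

Code:
# refused without -f ("mismatched replication level" style warning);
# -f overrides the check and creates a non-redundant top-level vdev
zpool add -f tank ata-ST8000DM004-2CX188_ZCT0DCK7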
 
then do a full backup, recreate the pool from scratch, and restore your backup
How exactly do I do that?
I have all my (previous) disks available and connected to my system.

I have not added, offlined, or onlined any disk with a -f flag.

What's next?

Thanks
 
Best case you have another ZFS pool with 6+ TB of free space in the same server or on another server. Then you could use a recursive "zfs send | zfs recv" to copy all data from one pool to the other, which will also keep all the snapshots intact.
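A minimal sketch of that, assuming the pool can be imported again and the second pool is called "backup" (that pool name and the snapshot name are just examples):

Code:
# recursive snapshot of everything on tank
zfs snapshot -r tank@migrate
# replicate the whole pool, including its snapshots, into the other pool
zfs send -R tank@migrate | zfs recv -F backup/tank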
 
Best case you have another ZFS pool with 6+ TB of free space in the same server or on another server. Then you could use a recursive "zfs send | zfs recv" to copy all data from one pool to the other, which will also keep all the snapshots intact.
How could I send from a pool that ZFS won't see at all in the first place?
 
If you have a 6+ TB non-ZFS disk/NAS, you could also pipe the output of "zfs send" into a file and later pipe it back into "zfs recv" after recreating the pool.
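A sketch of that file-based variant, with a hypothetical mount point /mnt/backup standing in for the non-ZFS disk/NAS (the snapshot name is again just an example):

Code:
# stream the whole pool, including snapshots, into a file
zfs send -R tank@migrate > /mnt/backup/tank.zfs
# later, after recreating the pool, play the stream back
zfs recv -F tank < /mnt/backup/tank.zfs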

Or you just back up the data. How that can be done depends on what you are storing on it, for example VZDump for VMs/LXCs and "rsync -avcr" for files/folders.
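For plain files that rsync call could look like this (source and destination paths are hypothetical):

Code:
rsync -avcr /tank/media/ /mnt/backup/media/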

How could I send from a pool that ZFS won't see at all in the first place?
It complains that "ata-ST8000DM004-2CX188_ZCT0DCK7" isn't available. Put that disk back in and the pool should be importable.
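Once the disk is reconnected, the import attempt would be along these lines, telling ZFS to scan the stable by-id device names:

Code:
zpool import -d /dev/disk/by-id tank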
 
What are zpool import, zpool status and ls -la /dev/disk/by-id reporting now?
Code:
root@pve:/# zpool import
   pool: tank
     id: 14700457213808848307
  state: UNAVAIL
status: One or more devices contains corrupted data.
 action: The pool cannot be imported due to damaged devices or data.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-5E
 config:

        tank                                    UNAVAIL  insufficient replicas
          raidz2-0                              DEGRADED
            ata-ST2000DM008-2FR102_ZFL688AZ     OFFLINE
            ata-ST2000DM001-1CH164_W1F4C8RS     ONLINE
            ata-ST2000DM001-9YN164_Z2F0GPQR     ONLINE
            ata-ST2000DM008-2UB102_WFL6G6CL     ONLINE
            ata-ST8000DM004-2U9188_ZR14YM54     ONLINE
            ata-ST2000DM008-2FR102_ZFL689C5     ONLINE
          ata-ST8000DM004-2CX188_ZCT0DCK7       UNAVAIL
        logs
          ata-INTENSO_SSD_AA000000000000003108  UNAVAIL

Code:
root@pve:/# zpool status
no pools available


Code:
root@pve:/# ls -la /dev/disk/by-id/
total 0
drwxr-xr-x 2 root root 1220 Nov 15 14:31 .
drwxr-xr-x 9 root root  180 Nov 15 14:30 ..
lrwxrwxrwx 1 root root    9 Nov 15 14:30 ata-ST2000DM001-1CH164_W1F4C8RS -> ../../sdb
lrwxrwxrwx 1 root root   10 Nov 15 14:30 ata-ST2000DM001-1CH164_W1F4C8RS-part1 -> ../../sdb1
lrwxrwxrwx 1 root root   10 Nov 15 14:30 ata-ST2000DM001-1CH164_W1F4C8RS-part9 -> ../../sdb9
lrwxrwxrwx 1 root root    9 Nov 15 14:30 ata-ST2000DM001-1CH164_W1F4DRZ7 -> ../../sdf
lrwxrwxrwx 1 root root   10 Nov 15 14:30 ata-ST2000DM001-1CH164_W1F4DRZ7-part1 -> ../../sdf1
lrwxrwxrwx 1 root root   10 Nov 15 14:30 ata-ST2000DM001-1CH164_W1F4DRZ7-part9 -> ../../sdf9
lrwxrwxrwx 1 root root    9 Nov 15 14:30 ata-ST2000DM001-9YN164_Z2F0GPQR -> ../../sdc
lrwxrwxrwx 1 root root   10 Nov 15 14:30 ata-ST2000DM001-9YN164_Z2F0GPQR-part1 -> ../../sdc1
lrwxrwxrwx 1 root root   10 Nov 15 14:30 ata-ST2000DM001-9YN164_Z2F0GPQR-part9 -> ../../sdc9
lrwxrwxrwx 1 root root    9 Nov 15 14:30 ata-ST2000DM008-2FR102_ZFL688AZ -> ../../sda
lrwxrwxrwx 1 root root   10 Nov 15 14:30 ata-ST2000DM008-2FR102_ZFL688AZ-part1 -> ../../sda1
lrwxrwxrwx 1 root root   10 Nov 15 14:30 ata-ST2000DM008-2FR102_ZFL688AZ-part9 -> ../../sda9
lrwxrwxrwx 1 root root    9 Nov 15 14:30 ata-ST2000DM008-2FR102_ZFL689C5 -> ../../sde
lrwxrwxrwx 1 root root   10 Nov 15 14:30 ata-ST2000DM008-2FR102_ZFL689C5-part1 -> ../../sde1
lrwxrwxrwx 1 root root   10 Nov 15 14:30 ata-ST2000DM008-2FR102_ZFL689C5-part9 -> ../../sde9
lrwxrwxrwx 1 root root    9 Nov 15 14:30 ata-ST2000DM008-2UB102_WFL6G6CL -> ../../sdd
lrwxrwxrwx 1 root root   10 Nov 15 14:30 ata-ST2000DM008-2UB102_WFL6G6CL-part1 -> ../../sdd1
lrwxrwxrwx 1 root root   10 Nov 15 14:30 ata-ST2000DM008-2UB102_WFL6G6CL-part9 -> ../../sdd9
lrwxrwxrwx 1 root root    9 Nov 15 14:31 ata-ST8000DM004-2CX188_ZCT0DCK7 -> ../../sdh
lrwxrwxrwx 1 root root    9 Nov 15 14:30 ata-ST8000DM004-2U9188_ZR14YM54 -> ../../sdg
lrwxrwxrwx 1 root root   10 Nov 15 14:30 ata-ST8000DM004-2U9188_ZR14YM54-part1 -> ../../sdg1
lrwxrwxrwx 1 root root   10 Nov 15 14:30 ata-ST8000DM004-2U9188_ZR14YM54-part9 -> ../../sdg9
lrwxrwxrwx 1 root root   10 Nov 15 14:30 dm-name-pve-root -> ../../dm-1
lrwxrwxrwx 1 root root   10 Nov 15 14:30 dm-name-pve-swap -> ../../dm-0
lrwxrwxrwx 1 root root   10 Nov 15 14:30 dm-uuid-LVM-k8uFdxgqkyh1qUBOym9wpSGGPfqth2yi6imBctMJeeID70KffQlfd6uYByPPO6QW -> ../../dm-0
lrwxrwxrwx 1 root root   10 Nov 15 14:30 dm-uuid-LVM-k8uFdxgqkyh1qUBOym9wpSGGPfqth2yiwiFI32gY9Z0sdhnqyZ768w9gMyQ92JWw -> ../../dm-1
lrwxrwxrwx 1 root root   10 Nov 15 14:30 lvm-pv-uuid-FSVXXq-BSjH-Xiim-R9t0-dEiP-8eFY-Z57pEc -> ../../sdi3
lrwxrwxrwx 1 root root   13 Nov 15 14:30 nvme-eui.e8238fa6bf530001001b448b45da7f71 -> ../../nvme0n1
lrwxrwxrwx 1 root root   15 Nov 15 14:30 nvme-eui.e8238fa6bf530001001b448b45da7f71-part1 -> ../../nvme0n1p1
lrwxrwxrwx 1 root root   13 Nov 15 14:30 nvme-WD_Blue_SN570_1TB_220950801779 -> ../../nvme0n1
lrwxrwxrwx 1 root root   13 Nov 15 14:30 nvme-WD_Blue_SN570_1TB_220950801779_1 -> ../../nvme0n1
lrwxrwxrwx 1 root root   15 Nov 15 14:30 nvme-WD_Blue_SN570_1TB_220950801779_1-part1 -> ../../nvme0n1p1
lrwxrwxrwx 1 root root   15 Nov 15 14:30 nvme-WD_Blue_SN570_1TB_220950801779-part1 -> ../../nvme0n1p1
lrwxrwxrwx 1 root root    9 Nov 15 14:30 usb-SanDisk_Extreme_Pro_57345678925A-0:0 -> ../../sdi
lrwxrwxrwx 1 root root   10 Nov 15 14:30 usb-SanDisk_Extreme_Pro_57345678925A-0:0-part1 -> ../../sdi1
lrwxrwxrwx 1 root root   10 Nov 15 14:30 usb-SanDisk_Extreme_Pro_57345678925A-0:0-part2 -> ../../sdi2
lrwxrwxrwx 1 root root   10 Nov 15 14:30 usb-SanDisk_Extreme_Pro_57345678925A-0:0-part3 -> ../../sdi3
lrwxrwxrwx 1 root root    9 Nov 15 14:30 wwn-0x5000c5004ec6de4f -> ../../sdc
lrwxrwxrwx 1 root root   10 Nov 15 14:30 wwn-0x5000c5004ec6de4f-part1 -> ../../sdc1
lrwxrwxrwx 1 root root   10 Nov 15 14:30 wwn-0x5000c5004ec6de4f-part9 -> ../../sdc9
lrwxrwxrwx 1 root root    9 Nov 15 14:30 wwn-0x5000c5006e7851b9 -> ../../sdb
lrwxrwxrwx 1 root root   10 Nov 15 14:30 wwn-0x5000c5006e7851b9-part1 -> ../../sdb1
lrwxrwxrwx 1 root root   10 Nov 15 14:30 wwn-0x5000c5006e7851b9-part9 -> ../../sdb9
lrwxrwxrwx 1 root root    9 Nov 15 14:30 wwn-0x5000c5006e83d004 -> ../../sdf
lrwxrwxrwx 1 root root   10 Nov 15 14:30 wwn-0x5000c5006e83d004-part1 -> ../../sdf1
lrwxrwxrwx 1 root root   10 Nov 15 14:30 wwn-0x5000c5006e83d004-part9 -> ../../sdf9
lrwxrwxrwx 1 root root    9 Nov 15 14:31 wwn-0x5000c500b332541a -> ../../sdh
lrwxrwxrwx 1 root root    9 Nov 15 14:30 wwn-0x5000c500e47eaf56 -> ../../sda
lrwxrwxrwx 1 root root   10 Nov 15 14:30 wwn-0x5000c500e47eaf56-part1 -> ../../sda1
lrwxrwxrwx 1 root root   10 Nov 15 14:30 wwn-0x5000c500e47eaf56-part9 -> ../../sda9
lrwxrwxrwx 1 root root    9 Nov 15 14:30 wwn-0x5000c500e47eb1e6 -> ../../sde
lrwxrwxrwx 1 root root   10 Nov 15 14:30 wwn-0x5000c500e47eb1e6-part1 -> ../../sde1
lrwxrwxrwx 1 root root   10 Nov 15 14:30 wwn-0x5000c500e47eb1e6-part9 -> ../../sde9
lrwxrwxrwx 1 root root    9 Nov 15 14:30 wwn-0x5000c500e737d46e -> ../../sdg
lrwxrwxrwx 1 root root   10 Nov 15 14:30 wwn-0x5000c500e737d46e-part1 -> ../../sdg1
lrwxrwxrwx 1 root root   10 Nov 15 14:30 wwn-0x5000c500e737d46e-part9 -> ../../sdg9
lrwxrwxrwx 1 root root    9 Nov 15 14:30 wwn-0x5000c500f16f068c -> ../../sdd
lrwxrwxrwx 1 root root   10 Nov 15 14:30 wwn-0x5000c500f16f068c-part1 -> ../../sdd1
lrwxrwxrwx 1 root root   10 Nov 15 14:30 wwn-0x5000c500f16f068c-part9 -> ../../sdd9
 
"ata-ST8000DM004-2CX188_ZCT0DCK7" doesn't got any partitions on it. There should be partitions 1+9 too. Did you wipe it after removing it from the machine?
 
"ata-ST8000DM004-2CX188_ZCT0DCK7" doesn't got any partitions on it. There should be partitions 1+9 too. Did you wipe it after removing it from the machine?
I cannot recall whether I wiped it. What I don't get is that ata-ST2000DM008-2FR102_ZFL688AZ is actually connected and available, but ZFS continues to ignore it. I can't put it back online as the pool seems to be gone.
 
I can't put it back online as the pool seems to be gone.
Yes. Instead of replacing one of the disks of the raidz2 with "zpool replace", you added it as a new single-disk vdev via "zpool add". That turned your pool into a stripe across the raidz2 vdev and that single disk (so basically a RAID0), and then you removed the disk again. As with any RAID0, once a single member is missing the whole pool stops working.

I think this is a case for data recovery if you don't have any backups.
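If you do go down that route, one possible first check is whether any ZFS labels survive on the removed 8 TB disk (run as root; the device path is the one from this thread):

Code:
zdb -l /dev/disk/by-id/ata-ST8000DM004-2CX188_ZCT0DCK7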
 
No backups. No really important data on it, only media collected over 3 years. But next time I should dig deeper into backups; that might help to recreate the pool at a later time if some kind of error like this happens again.

Thanks for all your efforts.
 
