zfs pool problem?

Kosh

Well-Known Member
Dec 24, 2019
Hi
I booted from a live CD and imported rpool to perform recovery operations on an already mounted partition.
After that, when I booted back into PVE, one of the drives in the pool showed up as FAULTED.

Code:
zpool status
  pool: rpool
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-4J
  scan: scrub repaired 0B in 06:20:20 with 0 errors on Sun Oct 12 06:44:21 2025
config:

        NAME                                                 STATE     READ WRITE CKSUM
        rpool                                                DEGRADED     0     0     0
          raidz2-0                                           DEGRADED     0     0     0
            nvme-eui.36344830583444940025384e00000001-part3  ONLINE       0     0     0
            nvme-eui.36344830583444920025384e00000001-part3  ONLINE       0     0     0
            nvme-eui.36344830583444880025384e00000001-part3  ONLINE       0     0     0
            nvme-eui.36344830583444970025384e00000001-part3  ONLINE       0     0     0
            nvme-eui.36344830583444960025384e00000001-part3  ONLINE       0     0     0
            nvme-eui.36344830583444950025384e00000001-part3  ONLINE       0     0     0
            nvme-eui.36344830583444840025384e00000001-part3  ONLINE       0     0     0
            4507037677091464003                              FAULTED      0     0     0  was /dev/nvme7n1p3

errors: No known data errors


Code:
zdb
rpool:
    version: 5000
    name: 'rpool'
    state: 0
    txg: 7680916
    pool_guid: 16987673894477835616
    errata: 0
    hostid: 4228784363
    hostname: 'cloud-v002'
    com.delphix:has_per_vdev_zaps
    vdev_children: 1
    vdev_tree:
        type: 'root'
        id: 0
        guid: 16987673894477835616
        create_txg: 4
        com.klarasystems:vdev_zap_root: 129
        children[0]:
            type: 'raidz'
            id: 0
            guid: 6309061694612082912
            nparity: 2
            metaslab_array: 139
            metaslab_shift: 34
            ashift: 12
            asize: 30717411065856
            is_log: 0
            create_txg: 4
            com.delphix:vdev_zap_top: 130
            children[0]:
                type: 'disk'
                id: 0
                guid: 9808093091410395546
                path: '/dev/disk/by-id/nvme-eui.36344830583444940025384e00000001-part3'
                vdev_enc_sysfs_path: '/sys/bus/pci/slots/4'
                whole_disk: 0
                DTL: 109016
                create_txg: 4
                com.delphix:vdev_zap_leaf: 131
            children[1]:
                type: 'disk'
                id: 1
                guid: 12412313369067703708
                path: '/dev/disk/by-id/nvme-eui.36344830583444920025384e00000001-part3'
                vdev_enc_sysfs_path: '/sys/bus/pci/slots/12'
                whole_disk: 0
                DTL: 109015
                create_txg: 4
                com.delphix:vdev_zap_leaf: 132
            children[2]:
                type: 'disk'
                id: 2
                guid: 14627112648788585688
                path: '/dev/disk/by-id/nvme-eui.36344830583444880025384e00000001-part3'
                vdev_enc_sysfs_path: '/sys/bus/pci/slots/3'
                whole_disk: 0
                DTL: 109014
                create_txg: 4
                com.delphix:vdev_zap_leaf: 133
            children[3]:
                type: 'disk'
                id: 3
                guid: 5227534298990620931
                path: '/dev/disk/by-id/nvme-eui.36344830583444970025384e00000001-part3'
                vdev_enc_sysfs_path: '/sys/bus/pci/slots/11'
                whole_disk: 0
                DTL: 109013
                create_txg: 4
                com.delphix:vdev_zap_leaf: 134
            children[4]:
                type: 'disk'
                id: 4
                guid: 1011566563601879184
                path: '/dev/disk/by-id/nvme-eui.36344830583444960025384e00000001-part3'
                vdev_enc_sysfs_path: '/sys/bus/pci/slots/2'
                whole_disk: 0
                DTL: 109012
                create_txg: 4
                com.delphix:vdev_zap_leaf: 135
            children[5]:
                type: 'disk'
                id: 5
                guid: 7182381524902485433
                path: '/dev/disk/by-id/nvme-eui.36344830583444950025384e00000001-part3'
                vdev_enc_sysfs_path: '/sys/bus/pci/slots/10'
                whole_disk: 0
                DTL: 109011
                create_txg: 4
                com.delphix:vdev_zap_leaf: 136
            children[6]:
                type: 'disk'
                id: 6
                guid: 6474831926635961944
                path: '/dev/disk/by-id/nvme-eui.36344830583444840025384e00000001-part3'
                vdev_enc_sysfs_path: '/sys/bus/pci/slots/1'
                whole_disk: 0
                DTL: 109010
                create_txg: 4
                com.delphix:vdev_zap_leaf: 137
            children[7]:
                type: 'disk'
                id: 7
                guid: 4507037677091464003
                path: '/dev/nvme7n1p3'
                vdev_enc_sysfs_path: '/sys/bus/pci/slots/2'
                whole_disk: 0
                not_present: 1
                DTL: 90305
                create_txg: 4
                expansion_time: 1761130666
                com.delphix:vdev_zap_leaf: 87501
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
        com.klarasystems:vdev_zaps_v2

ZFS_DBGMSG(zdb) START:
metaslab.c:1703:spa_set_allocator(): spa allocator: dynamic
ZFS_DBGMSG(zdb) END


Is the drive OK? How do I correctly re-enable it in rpool so that it shows up with a by-id name like the other drives?
From what I've read, the recommendation is to boot from a live CD again and then export and re-import the pool. Is there an easier way?
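For reference, the export/re-import approach I read about would look roughly like this (just a sketch, assuming the pool is imported from a live environment, since the running root pool cannot be exported while it is in use):

Code:
# from a live CD / rescue shell, not from the running PVE host
zpool export rpool
# re-import while scanning /dev/disk/by-id so the vdevs pick up stable by-id names
# (-N skips mounting any datasets)
zpool import -d /dev/disk/by-id -N rpool
# export again and reboot back into PVE
zpool export rpool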
 
What happens if you simply try to replace the device with the /dev/disk/by-id path?

zpool replace rpool /dev/nvme7n1p3 /dev/disk/by-id/nvme-eui.<rest-of-ID-here>-part3
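If the old /dev path no longer resolves, the faulted vdev can also be addressed by its GUID; zpool status -g prints GUIDs instead of device names. A rough sketch (fill in the actual by-id name from your system):

Code:
# show vdev GUIDs instead of device names
zpool status -g rpool
# replace the faulted vdev by GUID instead of the stale /dev path
zpool replace rpool 4507037677091464003 /dev/disk/by-id/nvme-eui.<rest-of-ID-here>-part3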
 
Code:
 nvme-eui.34595030529002970025384700000002 -> ../../nvme0n1
 nvme-eui.34595030529002970025384700000002-part1 -> ../../nvme0n1p1
 nvme-eui.34595030529002970025384700000002-part2 -> ../../nvme0n1p2
 nvme-eui.34595030529002970025384700000002-part3 -> ../../nvme0n1p3
 nvme-eui.36344830583444840025384e00000001 -> ../../nvme4n1
 nvme-eui.36344830583444840025384e00000001-part1 -> ../../nvme4n1p1
 nvme-eui.36344830583444840025384e00000001-part2 -> ../../nvme4n1p2
 nvme-eui.36344830583444840025384e00000001-part3 -> ../../nvme4n1p3
 nvme-eui.36344830583444880025384e00000001 -> ../../nvme2n1
 nvme-eui.36344830583444880025384e00000001-part1 -> ../../nvme2n1p1
 nvme-eui.36344830583444880025384e00000001-part2 -> ../../nvme2n1p2
 nvme-eui.36344830583444880025384e00000001-part3 -> ../../nvme2n1p3
 nvme-eui.36344830583444920025384e00000001 -> ../../nvme6n1
 nvme-eui.36344830583444920025384e00000001-part1 -> ../../nvme6n1p1
 nvme-eui.36344830583444920025384e00000001-part2 -> ../../nvme6n1p2
 nvme-eui.36344830583444920025384e00000001-part3 -> ../../nvme6n1p3
 nvme-eui.36344830583444940025384e00000001 -> ../../nvme1n1
 nvme-eui.36344830583444940025384e00000001-part1 -> ../../nvme1n1p1
 nvme-eui.36344830583444940025384e00000001-part2 -> ../../nvme1n1p2
 nvme-eui.36344830583444940025384e00000001-part3 -> ../../nvme1n1p3
 nvme-eui.36344830583444950025384e00000001 -> ../../nvme5n1
 nvme-eui.36344830583444950025384e00000001-part1 -> ../../nvme5n1p1
 nvme-eui.36344830583444950025384e00000001-part2 -> ../../nvme5n1p2
 nvme-eui.36344830583444950025384e00000001-part3 -> ../../nvme5n1p3
 nvme-eui.36344830583444960025384e00000001 -> ../../nvme7n1
 nvme-eui.36344830583444960025384e00000001-part1 -> ../../nvme7n1p1
 nvme-eui.36344830583444960025384e00000001-part2 -> ../../nvme7n1p2
 nvme-eui.36344830583444960025384e00000001-part3 -> ../../nvme7n1p3
 nvme-eui.36344830583444970025384e00000001 -> ../../nvme3n1
 nvme-eui.36344830583444970025384e00000001-part1 -> ../../nvme3n1p1
 nvme-eui.36344830583444970025384e00000001-part2 -> ../../nvme3n1p2
 nvme-eui.36344830583444970025384e00000001-part3 -> ../../nvme3n1p3


After comparing the unique EUIs against the devices listed in zpool status, I found that nvme0n1p3 is the only partition not referenced by the pool, so it must be the one that dropped out (the /dev/nvmeXn1 names have been reshuffled, and /dev/nvme7n1p3 now points to a disk that is already ONLINE):
nvme-eui.34595030529002970025384700000002-part3 -> ../../nvme0n1p3
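To automate that comparison, a small loop like this should do it (just a sketch; it only prints the by-id partitions that zpool status does not mention):

Code:
# print every nvme-eui *-part3 symlink that does not appear in the pool's status output
for d in /dev/disk/by-id/nvme-eui.*-part3; do
    zpool status rpool | grep -q "$(basename "$d")" || echo "$d is not referenced by rpool"
done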

I ran the replace command and got a suspicious error:

Code:
zpool replace rpool 4507037677091464003 /dev/disk/by-id/nvme-eui.34595030529002970025384700000002-part3
invalid vdev specification
use '-f' to override the following errors:
/dev/disk/by-id/nvme-eui.34595030529002970025384700000002-part3 is part of active pool 'rpool'
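Before forcing anything, it is worth checking which label is actually sitting on that partition; zdb -l dumps it, including the pool name and GUIDs it claims to belong to (a read-only check, it changes nothing):

Code:
# dump the ZFS label(s) on the partition and inspect the name, pool_guid and guid fields
zdb -l /dev/disk/by-id/nvme-eui.34595030529002970025384700000002-part3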
 
I solved the problem.
1- Found the disk that is not in use by the pool with ls -n /dev/disk/by-id/ | grep nvme-eui
2- Took the faulted vdev offline with zpool offline rpool 4507037677091464003
3- Tried to add the disk back as is, but zpool refused, complaining that it is already part of the pool.
4- Cleared the stale labels:
zpool labelclear -f /dev/disk/by-id/nvme-eui.34595030529002970025384700000002-part3
5- Replaced the faulted vdev with the new by-id path:
zpool replace rpool 4507037677091464003 /dev/disk/by-id/nvme-eui.34595030529002970025384700000002-part3
6- Checked zpool status until the replacement (resilver) finished; the full sequence is summarized below.
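Put together, the whole sequence looked roughly like this (a sketch of the steps above; adjust the vdev GUID and the EUI to your own pool before running anything):

Code:
# 1. find the by-id device that the pool does not reference
ls -n /dev/disk/by-id/ | grep nvme-eui
# 2. take the faulted vdev offline, addressed by its GUID from zpool status
zpool offline rpool 4507037677091464003
# 3./4. a plain re-add is refused, so clear the stale ZFS label on the replacement partition
zpool labelclear -f /dev/disk/by-id/nvme-eui.34595030529002970025384700000002-part3
# 5. replace the faulted vdev with the by-id path so the pool keeps a stable device name
zpool replace rpool 4507037677091464003 /dev/disk/by-id/nvme-eui.34595030529002970025384700000002-part3
# 6. watch the resilver finish
zpool status -v rpool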
 