ZFS pool unavailable after 7.1-5 upgrade

sharkadmiral

Member
Nov 17, 2021
Disks are showing in the GUI with no GPT; SMART values are all OK for the 4 disks in the raidz1.
I set a rootdelay of 60s for GRUB during my initial Google troubleshooting.
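
For reference, the rootdelay was set roughly like this (a sketch, assuming the host boots via GRUB rather than systemd-boot; the existing "quiet" default is an assumption, keep whatever is already on your command line):

Code:
# in /etc/default/grub -- append rootdelay to the kernel command line
GRUB_CMDLINE_LINUX_DEFAULT="quiet rootdelay=60"
# then regenerate the GRUB config so it applies on the next boot
root@R310A:~# update-grub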

Code:
root@R310A:~# zpool import
   pool: ZFS0
     id: 9075999533737999338
  state: ONLINE
status: One or more devices were being resilvered.
 action: The pool can be imported using its name or numeric identifier.
 config:

        ZFS0        ONLINE
          raidz1-0  ONLINE
            sda     ONLINE
            sdb     ONLINE
            sdc     ONLINE
            sdd     ONLINE

Code:
root@R310A:~# zpool import 9075999533737999338
cannot import 'ZFS0': one or more devices is currently unavailable

Code:
root@R310A:~# lsblk
NAME                         MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                            8:0    0 931.5G  0 disk
├─sda1                         8:1    0 931.5G  0 part
└─sda9                         8:9    0     8M  0 part
sdb                            8:16   0 931.5G  0 disk
├─sdb1                         8:17   0 931.5G  0 part
└─sdb9                         8:25   0     8M  0 part
sdc                            8:32   0 931.5G  0 disk
├─sdc1                         8:33   0 931.5G  0 part
└─sdc9                         8:41   0     8M  0 part
sdd                            8:48   0 931.5G  0 disk
├─sdd1                         8:49   0 931.5G  0 part
└─sdd9                         8:57   0     8M  0 part
sde                            8:64   0 223.6G  0 disk
├─sde1                         8:65   0  1007K  0 part
├─sde2                         8:66   0   512M  0 part
└─sde3                         8:67   0 223.1G  0 part
  ├─pve-swap                 253:0    0     8G  0 lvm  [SWAP]
  ├─pve-root                 253:1    0  55.8G  0 lvm  /
  ├─pve-data_tmeta           253:2    0   1.4G  0 lvm 
  │ └─pve-data-tpool         253:4    0 140.5G  0 lvm 
  │   ├─pve-data             253:5    0 140.5G  1 lvm 
  │   ├─pve-vm--100--disk--0 253:6    0    30G  0 lvm 
  │   └─pve-vm--101--disk--0 253:7    0    40G  0 lvm 
  └─pve-data_tdata           253:3    0 140.5G  0 lvm 
    └─pve-data-tpool         253:4    0 140.5G  0 lvm 
      ├─pve-data             253:5    0 140.5G  1 lvm 
      ├─pve-vm--100--disk--0 253:6    0    30G  0 lvm 
      └─pve-vm--101--disk--0 253:7    0    40G  0 lvm
 
Hi,
Does it work when you add -d /dev/disk/by-id to the import command? Does it work with an older kernel? What is the output of zdb -l /dev/sda1 and of the other devices with ZFS on them?
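
For the label check, a quick loop over all four raidz members would be something like this (partition names assumed from your lsblk output):

Code:
root@R310A:~# for dev in /dev/sd{a,b,c,d}1; do echo "== $dev =="; zdb -l "$dev"; done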
 
Hi,

Does it work when you add -d /dev/disk/by-id to the import command? Does it work with an older kernel? What is the output of zdb -l /dev/sda1 and of the other devices with ZFS on them?
Well, it worked with the previous kernel before I updated to 7.1 today. Looks like the commands are returning some flags for degraded and faulted devices. Maybe a hardware issue?

Code:
root@R310A:~# zpool import -d /dev/disk/by-id
   pool: ZFS0
     id: 9075999533737999338
  state: ONLINE
status: One or more devices were being resilvered.
 action: The pool can be imported using its name or numeric identifier.
 config:

        ZFS0                        ONLINE
          raidz1-0                  ONLINE
            wwn-0x5000c50056b1bb73  ONLINE
            wwn-0x5000c50057645d07  ONLINE
            wwn-0x5000c50057669f37  ONLINE
            wwn-0x5000c500576472c3  ONLINE





root@R310A:~# zpool import 9075999533737999338
cannot import 'ZFS0': one or more devices is currently unavailable





root@R310A:~# zdb -l /dev/sda1
failed to read label 2
failed to read label 3
------------------------------------
LABEL 0
------------------------------------
    version: 5000
    name: 'ZFS0'
    state: 0
    txg: 488358
    pool_guid: 9075999533737999338
    errata: 0
    hostid: 2524301996
    hostname: 'R310A'
    top_guid: 9666786271144064857
    guid: 1680326545989352587
    vdev_children: 1
    vdev_tree:
        type: 'raidz'
        id: 0
        guid: 9666786271144064857
        nparity: 1
        metaslab_array: 256
        metaslab_shift: 34
        ashift: 12
        asize: 4000759939072
        is_log: 0
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 1680326545989352587
            path: '/dev/sda1'
            devid: 'scsi-35000c50056b1bb73-part1'
            phys_path: 'pci-0000:05:00.0-sas-phy7-lun-0'
            whole_disk: 1
            DTL: 165
            create_txg: 4
        children[1]:
            type: 'disk'
            id: 1
            guid: 12068977034659978685
            path: '/dev/sdb1'
            devid: 'scsi-35000c50057645d07-part1'
            phys_path: 'pci-0000:05:00.0-sas-phy4-lun-0'
            whole_disk: 1
            DTL: 164
            create_txg: 4
            degraded: 1
        children[2]:
            type: 'disk'
            id: 2
            guid: 13674400842796361612
            path: '/dev/sdc1'
            devid: 'scsi-35000c50057669f37-part1'
            phys_path: 'pci-0000:05:00.0-sas-phy5-lun-0'
            whole_disk: 1
            DTL: 163
            create_txg: 4
        children[3]:
            type: 'disk'
            id: 3
            guid: 7963176826281990900
            path: '/dev/sdd1'
            devid: 'scsi-35000c500576472c3-part1'
            phys_path: 'pci-0000:05:00.0-sas-phy6-lun-0'
            whole_disk: 1
            DTL: 78947
            create_txg: 4
            faulted: 1
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
    labels = 0 1

root@R310A:~# zdb -l /dev/sdb1
failed to read label 2
failed to read label 3
------------------------------------
LABEL 0
------------------------------------
    version: 5000
    name: 'ZFS0'
    state: 0
    txg: 488358
    pool_guid: 9075999533737999338
    errata: 0
    hostid: 2524301996
    hostname: 'R310A'
    top_guid: 9666786271144064857
    guid: 12068977034659978685
    vdev_children: 1
    vdev_tree:
        type: 'raidz'
        id: 0
        guid: 9666786271144064857
        nparity: 1
        metaslab_array: 256
        metaslab_shift: 34
        ashift: 12
        asize: 4000759939072
        is_log: 0
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 1680326545989352587
            path: '/dev/sda1'
            devid: 'scsi-35000c50056b1bb73-part1'
            phys_path: 'pci-0000:05:00.0-sas-phy7-lun-0'
            whole_disk: 1
            DTL: 165
            create_txg: 4
        children[1]:
            type: 'disk'
            id: 1
            guid: 12068977034659978685
            path: '/dev/sdb1'
            devid: 'scsi-35000c50057645d07-part1'
            phys_path: 'pci-0000:05:00.0-sas-phy4-lun-0'
            whole_disk: 1
            DTL: 164
            create_txg: 4
            degraded: 1
        children[2]:
            type: 'disk'
            id: 2
            guid: 13674400842796361612
            path: '/dev/sdc1'
            devid: 'scsi-35000c50057669f37-part1'
            phys_path: 'pci-0000:05:00.0-sas-phy5-lun-0'
            whole_disk: 1
            DTL: 163
            create_txg: 4
        children[3]:
            type: 'disk'
            id: 3
            guid: 7963176826281990900
            path: '/dev/sdd1'
            devid: 'scsi-35000c500576472c3-part1'
            phys_path: 'pci-0000:05:00.0-sas-phy6-lun-0'
            whole_disk: 1
            DTL: 78947
            create_txg: 4
            faulted: 1
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
    labels = 0 1

root@R310A:~# zdb -l /dev/sdc1
failed to read label 2
failed to read label 3
------------------------------------
LABEL 0
------------------------------------
    version: 5000
    name: 'ZFS0'
    state: 0
    txg: 488358
    pool_guid: 9075999533737999338
    errata: 0
    hostid: 2524301996
    hostname: 'R310A'
    top_guid: 9666786271144064857
    guid: 13674400842796361612
    vdev_children: 1
    vdev_tree:
        type: 'raidz'
        id: 0
        guid: 9666786271144064857
        nparity: 1
        metaslab_array: 256
        metaslab_shift: 34
        ashift: 12
        asize: 4000759939072
        is_log: 0
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 1680326545989352587
            path: '/dev/sda1'
            devid: 'scsi-35000c50056b1bb73-part1'
            phys_path: 'pci-0000:05:00.0-sas-phy7-lun-0'
            whole_disk: 1
            DTL: 165
            create_txg: 4
        children[1]:
            type: 'disk'
            id: 1
            guid: 12068977034659978685
            path: '/dev/sdb1'
            devid: 'scsi-35000c50057645d07-part1'
            phys_path: 'pci-0000:05:00.0-sas-phy4-lun-0'
            whole_disk: 1
            DTL: 164
            create_txg: 4
            degraded: 1
        children[2]:
            type: 'disk'
            id: 2
            guid: 13674400842796361612
            path: '/dev/sdc1'
            devid: 'scsi-35000c50057669f37-part1'
            phys_path: 'pci-0000:05:00.0-sas-phy5-lun-0'
            whole_disk: 1
            DTL: 163
            create_txg: 4
        children[3]:
            type: 'disk'
            id: 3
            guid: 7963176826281990900
            path: '/dev/sdd1'
            devid: 'scsi-35000c500576472c3-part1'
            phys_path: 'pci-0000:05:00.0-sas-phy6-lun-0'
            whole_disk: 1
            DTL: 78947
            create_txg: 4
            faulted: 1
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
    labels = 0 1

root@R310A:~# zdb -l /dev/sdd1
failed to read label 2
failed to read label 3
------------------------------------
LABEL 0
------------------------------------
    version: 5000
    name: 'ZFS0'
    state: 0
    txg: 488354
    pool_guid: 9075999533737999338
    errata: 0
    hostid: 2524301996
    hostname: 'R310A'
    top_guid: 9666786271144064857
    guid: 7963176826281990900
    vdev_children: 1
    vdev_tree:
        type: 'raidz'
        id: 0
        guid: 9666786271144064857
        nparity: 1
        metaslab_array: 256
        metaslab_shift: 34
        ashift: 12
        asize: 4000759939072
        is_log: 0
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 1680326545989352587
            path: '/dev/sda1'
            devid: 'scsi-35000c50056b1bb73-part1'
            phys_path: 'pci-0000:05:00.0-sas-phy7-lun-0'
            whole_disk: 1
            DTL: 165
            create_txg: 4
        children[1]:
            type: 'disk'
            id: 1
            guid: 12068977034659978685
            path: '/dev/sdb1'
            devid: 'scsi-35000c50057645d07-part1'
            phys_path: 'pci-0000:05:00.0-sas-phy4-lun-0'
            whole_disk: 1
            DTL: 164
            create_txg: 4
            degraded: 1
        children[2]:
            type: 'disk'
            id: 2
            guid: 13674400842796361612
            path: '/dev/sdc1'
            devid: 'scsi-35000c50057669f37-part1'
            phys_path: 'pci-0000:05:00.0-sas-phy5-lun-0'
            whole_disk: 1
            DTL: 163
            create_txg: 4
        children[3]:
            type: 'disk'
            id: 3
            guid: 7963176826281990900
            path: '/dev/sdd1'
            devid: 'scsi-35000c500576472c3-part1'
            phys_path: 'pci-0000:05:00.0-sas-phy6-lun-0'
            whole_disk: 1
            DTL: 78947
            create_txg: 4
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
    labels = 0 1
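
Re-checking SMART on the drive flagged as faulted (sdd here) would be one way to rule hardware in or out; a rough sketch, assuming smartmontools is installed:

Code:
root@R310A:~# smartctl -a /dev/sdd | grep -iE 'health|reallocated|pending|error'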
 
Well, it worked with the previous kernel before I updated to 7.1 today. Looks like the commands are returning some flags for degraded and faulted devices. Maybe a hardware issue?
You should still be able to boot with an older kernel to test (assuming you haven't uninstalled all of them).
I'd guess there was some issue (hence the resilvering), but at least ZFS thinks it was able to fix it.

root@R310A:~# zpool import -d /dev/disk/by-id
   pool: ZFS0
     id: 9075999533737999338
  state: ONLINE
status: One or more devices were being resilvered.
 action: The pool can be imported using its name or numeric identifier.
 config:

        ZFS0                        ONLINE
          raidz1-0                  ONLINE
            wwn-0x5000c50056b1bb73  ONLINE
            wwn-0x5000c50057645d07  ONLINE
            wwn-0x5000c50057669f37  ONLINE
            wwn-0x5000c500576472c3  ONLINE

root@R310A:~# zpool import 9075999533737999338
cannot import 'ZFS0': one or more devices is currently unavailable

I was thinking of zpool import -d /dev/disk/by-id 9075999533737999338.
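
On the older-kernel point, you can check which kernels are still installed and then pick one of them from the "Advanced options" submenu in the GRUB boot menu; a sketch (package naming assumed for PVE 7):

Code:
root@R310A:~# dpkg -l | grep pve-kernel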
 
You should still be able to boot with an older kernel to test (assuming you haven't uninstalled all of them).
I'd guess there was some issue (hence the resilvering), but at least ZFS thinks it was able to fix it.



I was thinking of zpool import -d /dev/disk/by-id 9075999533737999338.
Code:
root@R310A:~# zpool import -d /dev/disk/by-id 9075999533737999338
cannot import 'ZFS0': one or more devices is currently unavailable