Repair ZFS pool

Partake5177

May 8, 2024
Hello everyone,

I have a Proxmox VE server running with 4x NVMe drives in a ZFS pool.
I physically added a fifth drive to the server (an older SAS HDD).

Suddenly I got warnings that my ZFS pool was degraded and the first NVMe drive was reported as "FAULTED". I have since removed the HDD again - can someone help me fix my ZFS pool?

When I run ls -l /dev/disk/by-id/ | grep nvme I can still see the drive.

Code:
root@pve1:~# ls -l /dev/disk/by-id/ | grep nvme
lrwxrwxrwx 1 root root 13 Oct 29 10:17 nvme-eui.01000000000000008ce38ee309d73306 -> ../../nvme0n1
lrwxrwxrwx 1 root root 15 Oct 29 10:17 nvme-eui.01000000000000008ce38ee309d73306-part1 -> ../../nvme0n1p1
lrwxrwxrwx 1 root root 15 Oct 29 10:17 nvme-eui.01000000000000008ce38ee309d73306-part9 -> ../../nvme0n1p9
lrwxrwxrwx 1 root root 13 Sep 23 12:14 nvme-eui.01000000000000008ce38ee309d73388 -> ../../nvme2n1
lrwxrwxrwx 1 root root 15 Sep 29 07:36 nvme-eui.01000000000000008ce38ee309d73388-part1 -> ../../nvme2n1p1
lrwxrwxrwx 1 root root 15 Sep 29 07:36 nvme-eui.01000000000000008ce38ee309d73388-part9 -> ../../nvme2n1p9
lrwxrwxrwx 1 root root 13 Sep 23 12:14 nvme-eui.01000000000000008ce38ee309d733c9 -> ../../nvme3n1
lrwxrwxrwx 1 root root 15 Sep 29 07:36 nvme-eui.01000000000000008ce38ee309d733c9-part1 -> ../../nvme3n1p1
lrwxrwxrwx 1 root root 15 Sep 29 07:36 nvme-eui.01000000000000008ce38ee309d733c9-part9 -> ../../nvme3n1p9
lrwxrwxrwx 1 root root 13 Sep 23 12:14 nvme-eui.01000000000000008ce38ee309d7348c -> ../../nvme1n1
lrwxrwxrwx 1 root root 15 Sep 29 07:36 nvme-eui.01000000000000008ce38ee309d7348c-part1 -> ../../nvme1n1p1
lrwxrwxrwx 1 root root 15 Sep 29 07:36 nvme-eui.01000000000000008ce38ee309d7348c-part9 -> ../../nvme1n1p9
lrwxrwxrwx 1 root root 13 Oct 29 10:17 nvme-KIOXIA_KCMYXVUG3T20_6FT0A0190L43 -> ../../nvme0n1
lrwxrwxrwx 1 root root 13 Oct 29 10:17 nvme-KIOXIA_KCMYXVUG3T20_6FT0A0190L43_1 -> ../../nvme0n1
lrwxrwxrwx 1 root root 15 Oct 29 10:17 nvme-KIOXIA_KCMYXVUG3T20_6FT0A0190L43_1-part1 -> ../../nvme0n1p1
lrwxrwxrwx 1 root root 15 Oct 29 10:17 nvme-KIOXIA_KCMYXVUG3T20_6FT0A0190L43_1-part9 -> ../../nvme0n1p9
lrwxrwxrwx 1 root root 15 Oct 29 10:17 nvme-KIOXIA_KCMYXVUG3T20_6FT0A0190L43-part1 -> ../../nvme0n1p1
lrwxrwxrwx 1 root root 15 Oct 29 10:17 nvme-KIOXIA_KCMYXVUG3T20_6FT0A0190L43-part9 -> ../../nvme0n1p9
lrwxrwxrwx 1 root root 13 Sep 23 12:14 nvme-KIOXIA_KCMYXVUG3T20_6FT0A01B0L43 -> ../../nvme2n1
lrwxrwxrwx 1 root root 13 Sep 23 12:14 nvme-KIOXIA_KCMYXVUG3T20_6FT0A01B0L43_1 -> ../../nvme2n1
lrwxrwxrwx 1 root root 15 Sep 29 07:36 nvme-KIOXIA_KCMYXVUG3T20_6FT0A01B0L43_1-part1 -> ../../nvme2n1p1
lrwxrwxrwx 1 root root 15 Sep 29 07:36 nvme-KIOXIA_KCMYXVUG3T20_6FT0A01B0L43_1-part9 -> ../../nvme2n1p9
lrwxrwxrwx 1 root root 15 Sep 29 07:36 nvme-KIOXIA_KCMYXVUG3T20_6FT0A01B0L43-part1 -> ../../nvme2n1p1
lrwxrwxrwx 1 root root 15 Sep 29 07:36 nvme-KIOXIA_KCMYXVUG3T20_6FT0A01B0L43-part9 -> ../../nvme2n1p9
lrwxrwxrwx 1 root root 13 Sep 23 12:14 nvme-KIOXIA_KCMYXVUG3T20_6FT0A01C0L43 -> ../../nvme3n1
lrwxrwxrwx 1 root root 13 Sep 23 12:14 nvme-KIOXIA_KCMYXVUG3T20_6FT0A01C0L43_1 -> ../../nvme3n1
lrwxrwxrwx 1 root root 15 Sep 29 07:36 nvme-KIOXIA_KCMYXVUG3T20_6FT0A01C0L43_1-part1 -> ../../nvme3n1p1
lrwxrwxrwx 1 root root 15 Sep 29 07:36 nvme-KIOXIA_KCMYXVUG3T20_6FT0A01C0L43_1-part9 -> ../../nvme3n1p9
lrwxrwxrwx 1 root root 15 Sep 29 07:36 nvme-KIOXIA_KCMYXVUG3T20_6FT0A01C0L43-part1 -> ../../nvme3n1p1
lrwxrwxrwx 1 root root 15 Sep 29 07:36 nvme-KIOXIA_KCMYXVUG3T20_6FT0A01C0L43-part9 -> ../../nvme3n1p9
lrwxrwxrwx 1 root root 13 Sep 23 12:14 nvme-KIOXIA_KCMYXVUG3T20_6FT0A01F0L43 -> ../../nvme1n1
lrwxrwxrwx 1 root root 13 Sep 23 12:14 nvme-KIOXIA_KCMYXVUG3T20_6FT0A01F0L43_1 -> ../../nvme1n1
lrwxrwxrwx 1 root root 15 Sep 29 07:36 nvme-KIOXIA_KCMYXVUG3T20_6FT0A01F0L43_1-part1 -> ../../nvme1n1p1
lrwxrwxrwx 1 root root 15 Sep 29 07:36 nvme-KIOXIA_KCMYXVUG3T20_6FT0A01F0L43_1-part9 -> ../../nvme1n1p9
lrwxrwxrwx 1 root root 15 Sep 29 07:36 nvme-KIOXIA_KCMYXVUG3T20_6FT0A01F0L43-part1 -> ../../nvme1n1p1
lrwxrwxrwx 1 root root 15 Sep 29 07:36 nvme-KIOXIA_KCMYXVUG3T20_6FT0A01F0L43-part9 -> ../../nvme1n1p9


I ran zpool online NVME-ZFS-pool nvme0n1p1 and the pool is healthy again. Does anyone know how this can happen? Adding another disk to the hardware should have no impact on the existing pool, should it?
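For anyone finding this later, the full recovery sequence looks roughly like this (pool and device names taken from above; the scrub at the end is optional but verifies the data afterwards):

Code:
# check which vdev is reported as FAULTED
zpool status NVME-ZFS-pool
# tell ZFS the device is available again; it will resilver any missed writes
zpool online NVME-ZFS-pool nvme0n1p1
# optionally read and verify all data in the pool afterwards
zpool scrub NVME-ZFS-pool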
 
If the drive shows up as "REMOVED" instead, more involved steps, such as booting the system with the NVMe SSD physically removed and then reinserted, may be required.
 
There are a few things you should be aware of:

Using /dev/nvme* device names for zpool vdevs is dangerous, since that nomenclature is positional (i.e. it is tied to the slot and enumeration order, not to the drive itself). Use persistent identifiers such as WWNs or the /dev/disk/by-id names instead. To do that, simply export your pool like so:

zpool export NVME-ZFS-pool

and re-import it using the -d /dev/disk/by-id switch like so:

zpool import -d /dev/disk/by-id NVME-ZFS-pool
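Afterwards, zpool status should list the vdevs under their stable by-id names instead of nvme0n1p1 etc.:

Code:
zpool status NVME-ZFS-pool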

Does anyone know how this can happen?
A flaky backplane and/or mechanical misalignment. Make sure there is no play when the disk is fully inserted, and that the cable connecting the midplane to your motherboard is properly seated on both ends. You can also try a different slot.
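It is also worth checking the kernel log for NVMe resets or PCIe link errors around the time of the fault, something along these lines:

Code:
# kernel messages mentioning the NVMe subsystem or PCIe errors
journalctl -k | grep -iE 'nvme|pcie'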