One or more devices could not be used because the label is missing or invalid

justjosh

Hello,

For some reason Proxmox seems to have lost two of the drives in my zpool even though they are still present. I noticed that the drive letters might have shifted (the disks currently exist as /dev/sdk and /dev/sdl).

Code:
# zpool status
  pool: HDD
state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-4J
  scan: scrub repaired 0B in 0 days 04:57:09 with 0 errors on Sun Mar  8 05:21:10 2020
config:

        NAME                      STATE     READ WRITE CKSUM
        HDD                       DEGRADED     0     0     0
          raidz2-0                DEGRADED     0     0     0
            sda                   ONLINE       0     0     0
            sdb                   ONLINE       0     0     0
            sdc                   ONLINE       0     0     0
            sdd                   ONLINE       0     0     0
            15673135534162611596  FAULTED      0     0     0  was /dev/sdl1
            820380648744883658    UNAVAIL      0     0     0  was /dev/sdm1

My theory is that ZFS is looking at /dev/sdl for the disk that is now /dev/sdk, but the labels don't match because /dev/sdl used to be /dev/sdm.

I'm thinking I should be doing:
Code:
# zpool replace HDD 15673135534162611596 /dev/sdk
# zpool replace HDD 820380648744883658 /dev/sdl

But I would like to be sure that these drives were indeed shifted. How do I investigate further?

Thanks!
 
ZFS usually does not blindly rely on drive letters for pool import - so I am not sure that everything is just due to a mix-up of drive letters.

You could try to check the drives and their ZFS labels with
Code:
zdb -lll /dev/sdl1
Replace /dev/sdl1 with the disk/partition you want to investigate.

Also, the output of `lsblk` should give you a hint as to which drive contains what.
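
For example (a sketch - the column names assume a reasonably recent util-linux lsblk), this prints the serial and model next to each /dev/sdX node:
Code:
lsblk -o NAME,SIZE,SERIAL,MODEL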

I hope this helps!
 
Hello,

Using zdb to look at the labels, the drives are indeed mapped wrongly.

Code:
        children[4]:
            type: 'disk'
            id: 4
            guid: 15673135534162611596
            path: '/dev/sdl1'
            devid: 'ata-<drive serial>-part1'
            phys_path: 'pci-0000:00:1f.2-ata-3'
            whole_disk: 1
            DTL: 3154
            create_txg: 4

Comparing the drive serial under devid for guid 15673135534162611596, I notice that the drive with this serial is now mapped to /dev/sdk instead of /dev/sdl, which is where the pool is looking for it. Am I right to say that I can safely replace the drive with the command zpool replace HDD 15673135534162611596 /dev/sdk ?
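
One way to cross-check that mapping (a sketch - the grep pattern is just illustrative) is to compare what udev reports for the device node with the /dev/disk/by-id links:
Code:
udevadm info --query=property --name=/dev/sdk | grep ID_SERIAL
ls -l /dev/disk/by-id/ | grep -E 'sdk|sdl'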

Any ideas how something like that could have happened?
 
hmm - as always in those cases - make sure you have a working and tested backup!

The device nodes /dev/sdX are not always stable - hence the recommendation to import zpools with the /dev/disk/by-id links instead of the /dev/sdX special files...
You could try exporting the pool and reimporting using the /dev/disk/by-id links:
Code:
zpool export HDD
zpool import -d /dev/disk/by-id HDD
should work - going by `man zpool`
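
After the re-import, you can verify that the vdevs now reference the by-id paths (assuming your ZFS version supports the -P flag for printing full vdev paths):
Code:
zpool status -P HDD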

If this fixes the issue, make sure to update the cache file of the pool and to regenerate the initramfs:
Code:
zpool set cachefile=/etc/zfs/zpool.cache HDD
update-initramfs -k all -u
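
To double-check that the cache file property stuck, you can query it afterwards:
Code:
zpool get cachefile HDD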

I hope this helps!
 
Hello,

It appears that importing with /dev/disk/by-id works. I am currently using an encrypted ZFS volume that is decrypted by a keyfile at boot. Is this the right output?

Code:
# update-initramfs -k all -u
update-initramfs: Generating /boot/initrd.img-5.3.10-1-pve
Running hook script 'zz-pve-efiboot'..
Re-executing '/etc/kernel/postinst.d/zz-pve-efiboot' in new private mount namespace..
No /etc/kernel/cmdline found - falling back to /proc/cmdline
Copying and configuring kernels on /dev/disk/by-uuid/CAEA-1C6B
        Copying kernel and creating boot-entry for 5.3.10-1-pve
Copying and configuring kernels on /dev/disk/by-uuid/CAEA-B366
        Copying kernel and creating boot-entry for 5.3.10-1-pve
 
If you've installed PVE with root on ZFS on a UEFI system with an installer newer than 6.0 then it looks about right :)
 
but definitely not UEFI.
That's odd, since PVE uses systemd-boot only if you install on a UEFI system (otherwise systemd-boot does not work).
You can check with:
Code:
find /sys/firmware/efi

(if there are files listed, the system is booted in UEFI mode)
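
Another quick check (assuming the efibootmgr package is installed): running `efibootmgr` lists the UEFI boot entries on a UEFI-booted system, and on a legacy BIOS boot it fails with an error about EFI variables not being supported.
Code:
efibootmgr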
 
Code:
# find /sys/firmware/efi
find: ‘/sys/firmware/efi’: No such file or directory
 
The initramfs gets updated with `update-initramfs -k all -u`.
The sync to the ESPs is only necessary when booting with systemd-boot.
If you boot using GRUB, GRUB takes the initramfs from /boot (where update-initramfs puts it).
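
(For completeness: on systemd-boot installs that ESP sync is what the zz-pve-efiboot hook in the output above takes care of; if it ever needs to be run by hand, the PVE 6.x helper for it should be - going by the packaged hook scripts - along the lines of:)
Code:
pve-efiboot-tool refresh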

I hope this explains it!
 
How should I check whether I'm using GRUB or systemd-boot? I'm fairly certain it is GRUB, but I just want to make sure.
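
For what it's worth, a rough way to check on a standard PVE install (a sketch, not an official procedure): since /sys/firmware/efi does not exist, the machine booted in legacy BIOS mode, and on legacy BIOS PVE boots via GRUB; a GRUB config under /boot is another hint.
Code:
find /sys/firmware/efi   # "No such file or directory" means legacy BIOS boot, i.e. GRUB
ls /boot/grub/grub.cfg   # present on a GRUB-booted install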