Need some help to recover degraded ZFS raid

Emil Makariev

Member
Jun 17, 2016
Hello guys! I have a bit of an issue with ZFS on Proxmox 4.4...
Yesterday I tried to expand my ZFS pool. Originally it had 4 HDDs in RAID 10.
What I did was:

zpool add rpool mirror /dev/sde /dev/sdf

Then I wanted to add one SSD as a cache device:
zpool add rpool cache /dev/nvme0n1

All the disks appeared online:

root@supermicro:~# zpool status
  pool: rpool
 state: ONLINE
  scan: scrub repaired 184K in 3h59m with 0 errors on Sun May 14 04:23:54 2017
config:

    NAME STATE READ WRITE CKSUM
    rpool ONLINE 0 0 0
      mirror-0 ONLINE 0 0 0
        ata-WDC_WD1003FZEX-00MK2A0_WD-WCC3F5HSPX2S-part2 ONLINE 0 0 0
        ata-WDC_WD10EZEX-21WN4A0_WCC6Y6UR4A5V-part2 ONLINE 0 0 0
      mirror-1 ONLINE 0 0 0
        ata-WDC_WD1003FZEX-00MK2A0_WD-WCC3F2JZ5LXR ONLINE 0 0 0
        ata-WDC_WD10EZEX-21WN4A0_WCC6Y0KA3X41 ONLINE 0 0 0
      mirror-2 ONLINE 0 0 0
        sde ONLINE 0 0 0
        sdf ONLINE 0 0 0
    cache
      nvme0n1 ONLINE 0 0 0

After that, I rebooted the system to be sure that everything would work, and received this error:

Loading, please wait...
PANIC: blkptr at ffff88100eca4848 DVA 1 has invalid VDEV 2

There is no console or command line that I can use, so I tried booting from USB as a rescue system:

root@USB:~# zpool status
no pools available
root@USB:~# zpool import rpool
cannot import 'rpool': one or more devices is currently unavailable

root@USB:~# zpool import -d /dev/disk/by-id/
   pool: rpool
     id: 17715479149632960048
  state: UNAVAIL
 status: One or more devices are missing from the system.
 action: The pool cannot be imported. Attach the missing
         devices and try again.
    see: http://zfsonlinux.org/msg/ZFS-8000-6X
 config:

    rpool UNAVAIL missing device
      mirror-0 ONLINE
        ata-WDC_WD1003FZEX-00MK2A0_WD-WCC3F5HSPX2S-part2 ONLINE
        ata-WDC_WD10EZEX-21WN4A0_WCC6Y6UR4A5V-part2 ONLINE
      mirror-1 ONLINE
        ata-WDC_WD1003FZEX-00MK2A0_WD-WCC3F2JZ5LXR ONLINE
        ata-WDC_WD10EZEX-21WN4A0_WCC6Y0KA3X41 ONLINE
    cache
      lvm-pv-uuid-Rabs6y-HGip-lnW2-6XMT-2kmd-08gQ-6UZWDe

Additional devices are known to be part of this pool, though their
exact configuration cannot be determined.

Any idea how I can boot the system, even with only my old 4 HDDs? Or how can I bring the pool back to ONLINE? I also tried physically removing the last pair. Same result...

Thanks a lot in advance!
 
I had a similar error when I had my cache and log devices on LVM volumes. I changed them to use simple partitions and the pool was imported at boot. It could have been fixed in other ways, but this solution was sufficient.

I recommend doing the same, though your situation is different as your mirror-2 is missing entirely. Are your sde, sdf devices visible in the system?
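
To answer the "are they visible" question from the rescue system, something like the following might help. It is only a sketch: `by_id_names` is a helper name made up for this post, and it assumes `readlink -f` is available (it is in GNU coreutils and BusyBox). It prints every persistent name under /dev/disk/by-id that points at a given kernel device such as /dev/sde.

```shell
#!/bin/sh
# Hypothetical helper: list the symlinks in a by-id style directory that
# resolve to the given device node. The directory defaults to
# /dev/disk/by-id and can be overridden with a second argument.
by_id_names() {
    dev=$(readlink -f "$1") || return 1
    dir=${2:-/dev/disk/by-id}
    for link in "$dir"/*; do
        # Skip anything that is not a symlink (including an unmatched glob).
        [ -L "$link" ] || continue
        if [ "$(readlink -f "$link")" = "$dev" ]; then
            printf '%s\n' "$link"
        fi
    done
}
```

Running `by_id_names /dev/sde` and `by_id_names /dev/sdf` shows whether the two new disks have persistent names at all. If nothing is printed, the rescue system probably does not see that disk under that kernel name.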
 
What does "zpool import" say?
 
Use the `zpool import -N -d /dev/disk/by-id/ rpool` trick to import the pool, then `exit` to continue the boot. Afterwards run `update-grub` or `update-initramfs -u` (I don't remember which, so do both :) to get /etc/zfs/zpool.cache into the boot image.
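
The steps above could be written down roughly like this. It is a sketch, not a tested recovery script: the `run`, `import_pool` and `refresh_boot_image` names and the `DRY_RUN` switch are made up for illustration, and the pool name `rpool` is taken from the first post. With `DRY_RUN=1` (the default here) the commands are only printed, so nothing touches the pool.

```shell
#!/bin/sh
# Print commands instead of running them unless DRY_RUN=0 is set.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Step 1: at the emergency prompt, import the pool by persistent device
# names without mounting any datasets (-N). Pool name defaults to rpool.
import_pool() {
    run zpool import -N -d /dev/disk/by-id/ "${1:-rpool}"
}

# Step 2: after typing `exit` and letting the boot finish, rebuild the
# boot image so it contains the current /etc/zfs/zpool.cache.
refresh_boot_image() {
    run update-initramfs -u
    run update-grub
}
```

Calling `import_pool` prints `+ zpool import -N -d /dev/disk/by-id/ rpool`. Setting `DRY_RUN=0` would execute the commands for real, which only makes sense as root on the affected machine.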
 
Hello, thanks for the response! I had no time, so I reinstalled and restored from backup.

That also works :D

Just consider getting familiar with the recovery process. You have now learned the hard way and had downtime, recovery time and maybe data loss. You can play around inside a VM with all aspects of running Proxmox VE, including ZFS disks. You can even hotplug them online to "play around". That comes in really handy for "simulating" stuff.
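
If setting up a VM is too much for a quick test, a throwaway pool on sparse files is another way to practise. This is a different technique from the VM approach above, and only a sketch: `make_vdev_files` is a made-up helper, the directory and the pool name `practice` are arbitrary, and it assumes GNU `truncate` is installed. The `zpool` lines are left as comments because they need root and a working ZFS installation.

```shell
#!/bin/sh
# Hypothetical helper: create COUNT sparse files named disk0.img, disk1.img,
# ... in DIR, each SIZE big (default 256M), to use as file-backed vdevs.
make_vdev_files() {
    dir=$1
    count=$2
    size=${3:-256M}
    mkdir -p "$dir"
    i=0
    while [ "$i" -lt "$count" ]; do
        truncate -s "$size" "$dir/disk$i.img"
        i=$((i + 1))
    done
}

# Example session (as root), mimicking the layout from this thread:
#   d=/var/tmp/zfs-practice
#   make_vdev_files "$d" 6
#   zpool create practice mirror "$d/disk0.img" "$d/disk1.img" \
#                         mirror "$d/disk2.img" "$d/disk3.img"
#   zpool add practice mirror "$d/disk4.img" "$d/disk5.img"
#   zpool status practice
#   zpool destroy practice
```

File-backed vdevs must be given as absolute paths, and the pool should be destroyed before the files are deleted.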
 
