Thin pool metadata repair

alter

Feb 22, 2019
Hi everybody!

We ran into trouble with our thin pool. The situation looks fairly common, but with a twist.

After a power failure the server boots, but no VM starts. A short investigation showed that the thin pool pve/data is inactive.
We found a similar case on the Red Hat wiki and tried the following steps:
Code:
lvremove pve/lvol0_pmspare
lvconvert --repair pve/data

But the last command's output was:
Code:
Repair of thin metadata volume of thin pool pve/data failed (status:1). Manual repair required!

And here we are stuck. All the manuals we found assume that the metadata volume is at least visible in /dev/pve, but only the root and swap volumes are there.
So we can't make a dump of the metadata. All we have are the configs in /etc/lvm/backup and /etc/lvm/archive.
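
For reference, the manual route commonly suggested for this case gets around the hidden metadata volume by swapping it out of the pool into an ordinary LV, where thin_check and thin_repair can reach it. Below is a minimal sketch of that sequence, not verified against this pool. It assumes the VG has a few GiB free and that thin-provisioning-tools is installed. The names meta_swap and meta_fixed are made up, and the 3g size is an assumption (it only has to be no smaller than the existing metadata LV). The script prints the commands and touches nothing unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: prints the plan by default; set DRY_RUN=0 to really execute.
# VG and pool names come from the post; META_SWAP, META_FIXED and SIZE
# are hypothetical values chosen for illustration.
set -eu

DRY_RUN="${DRY_RUN:-1}"
VG=pve
POOL=data
META_SWAP=meta_swap
META_FIXED=meta_fixed
SIZE=3g

# Echo the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

repair_plan() {
    # 1. Create an inactive placeholder LV and swap it with the pool's
    #    metadata; afterwards META_SWAP holds the damaged metadata.
    run lvcreate -an -L "$SIZE" -n "$META_SWAP" "$VG"
    run lvconvert -y --thinpool "$VG/$POOL" --poolmetadata "$VG/$META_SWAP"
    run lvchange -ay "$VG/$META_SWAP"

    # 2. Inspect the damaged metadata; a non-zero exit is expected here.
    run thin_check "/dev/$VG/$META_SWAP" || echo "thin_check reported errors"

    # 3. Write a repaired copy into a second LV.
    run lvcreate -L "$SIZE" -n "$META_FIXED" "$VG"
    run thin_repair -i "/dev/$VG/$META_SWAP" -o "/dev/$VG/$META_FIXED"

    # 4. Swap the repaired copy into the pool and try to activate it.
    run lvchange -an "$VG/$META_FIXED"
    run lvconvert -y --thinpool "$VG/$POOL" --poolmetadata "$VG/$META_FIXED"
    run lvchange -ay "$VG/$POOL"
}

repair_plan
```

Running it as is only lists the nine commands, so the plan can be reviewed (and the disk imaged) before anything is changed.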

Ran thin_check on /dev/pve/root:
Code:
thin_check /dev/pve/root
examining superblock
  superblock is corrupt
    bad checksum in superblock

Checked /dev/pve/root from a live USB. Nothing changed.
Should we try something like e2fsck -f -b 32768 /dev/sda3 (sda3 is pve/data)? We are not sure what that would do to an LVM-based partition.
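
Before pointing e2fsck at a device, it may help to check which signature the device actually carries, since e2fsck only makes sense on an ext2/3/4 filesystem. Here is a small sketch, assuming blkid from util-linux is available. The helper name fs_tool_for is made up for illustration, and it only maps the TYPE string that blkid reports to a suggestion.

```shell
#!/bin/sh
# Hypothetical helper: map a blkid TYPE string to the checker that fits it.
fs_tool_for() {
    case "$1" in
        ext2|ext3|ext4) echo "e2fsck" ;;
        LVM2_member)    echo "none: LVM physical volume, use LVM tools" ;;
        "")             echo "none: no known signature" ;;
        *)              echo "none: not an ext filesystem ($1)" ;;
    esac
}

# Usage (needs root), e.g. for the partition mentioned above:
#   fs_tool_for "$(blkid -o value -s TYPE /dev/sda3)"
```

A partition that serves as an LVM physical volume reports LVM2_member, in which case e2fsck has no ext superblock to work with there.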

We would appreciate any help! What could we try to bring pve/data back to life?

Thanks in advance!

-----
Code:
lvs -a
  LV              VG   Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  data            pve  twi---tz--   2.63t
  [data_tdata]    pve  Twi-------   2.63t
  [data_tmeta]    pve  ewi-------   2.23g
  [lvol0_pmspare] pve  ewi-------   2.23g
  repaired_01     pve  -wi-------   3.00g
  root            pve  -wi-ao----  96.00g
  swap            pve  -wi-a----- 500.00m
  vm-300-disk-1   pve  Vwi---tz--   5.00g data
  vm-300-disk-2   pve  Vwi---tz-- 800.00g data
  vm-301-disk-1   pve  Vwi---tz--  15.00g data
  vm-301-disk-2   pve  Vwi---tz--   1.81t data

lvscan
  ACTIVE            '/dev/pve/swap' [500.00 MiB] inherit
  ACTIVE            '/dev/pve/root' [96.00 GiB] inherit
  inactive          '/dev/pve/data' [2.63 TiB] inherit
  inactive          '/dev/pve/vm-301-disk-1' [15.00 GiB] inherit
  inactive          '/dev/pve/vm-301-disk-2' [1.81 TiB] inherit
  inactive          '/dev/pve/vm-300-disk-2' [800.00 GiB] inherit
  inactive          '/dev/pve/vm-300-disk-1' [5.00 GiB] inherit
  inactive          '/dev/pve/repaired_01' [3.00 GiB] inherit

lvchange -ay pve/data
  Check of pool pve/data failed (status:1). Manual repair required!

repaired_01 is left over from our experiments. It contains no data.
 
The only real "help" I got was someone saying "make backups". LVM thin is not as awesome as they say, at least for the uninitiated. Crying over my lost VM right now. I just thought "snapshots are cool, I'll use that one".

So many of these posts with no reply where someone is desperately trying to recover something that went horribly wrong.