Recovering from LVM thin metadata exhaustion

djzort

Member
Aug 8, 2013
OK, so this happened:
Code:
Jul  3 13:16:24 saito kernel: [131695.910332] device-mapper: space map metadata: unable to allocate new metadata block
Jul  3 13:16:24 saito kernel: [131695.910762] device-mapper: thin: 253:4: metadata operation 'dm_thin_remove_range' failed: error = -28
Jul  3 13:16:24 saito kernel: [131695.911019] device-mapper: thin: 253:4: aborting current metadata transaction
Jul  3 13:16:24 saito kernel: [131695.974977] device-mapper: thin: 253:4: switching pool to read-only mode
Jul  3 13:16:33 saito kernel: [131705.274889] device-mapper: thin: dm_thin_get_highest_mapped_block returned -61
Jul  3 13:16:43 saito kernel: [131715.351896] device-mapper: thin: dm_thin_get_highest_mapped_block returned -61
Jul  3 13:16:53 saito kernel: [131725.446482] device-mapper: thin: dm_thin_get_highest_mapped_block returned -61
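
Error -28 is ENOSPC: the pool's metadata device is full, and per the last lines the kernel has flipped the pool to read-only. Something like this should confirm the pool mode (a rough sketch; exact status fields vary by kernel version):
Code:
# the thin-pool status line reports metadata usage and the pool mode (rw/ro)
dmsetup status pve-data-tpool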

And sure enough:
Code:
root@saito:/var/log# lvs -a
  Failed to parse thin params: Error.
  LV              VG  Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  data            pve twi-cotzM- 500.00g             37.28  96.39                           
  [data_tdata]    pve Twi-ao---- 500.00g                                                   
  [data_tmeta]    pve ewi-ao---- 100.00m                                                   
  [lvol0_pmspare] pve ewi------- 100.00m                                                   
  root            pve -wi-ao----  93.13g                                                   
  swap            pve -wi-ao----  14.90g                                                   
  vm-100-disk-1   pve Vwi-XXtzX- 200.00g data                                               
  vm-100-disk-2   pve Vwi-a-tz-- 100.00g data        23.25
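
Meta% at 96.39 is the smoking gun. For future monitoring, the relevant columns can be queried directly (standard lvs report fields, as far as I know):
Code:
# watch data and metadata usage of the thin pool
lvs -o name,size,data_percent,metadata_percent pve/data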

So I added more metadata space:
Code:
root@saito:/var/log# lvextend --poolmetadatasize +1G pve/data
  Size of logical volume pve/data_tmeta changed from 100.00 MiB (25 extents) to 1.10 GiB (281 extents).
  Logical volume pve/data_tmeta successfully resized.
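
To stop this recurring, monitored thin pools can be auto-extended. A sketch for /etc/lvm/lvm.conf (recent LVM versions grow tmeta this way too, if I read lvm.conf(5) right; check your version):
Code:
# auto-extend a monitored thin pool by 20% whenever usage crosses 80%
activation {
    thin_pool_autoextend_threshold = 80
    thin_pool_autoextend_percent = 20
}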

Killed off the stuck QEMU processes, then deactivated the volumes:

Code:
root@saito:/var/log# lvchange -an -v /dev/pve/vm-100-disk-1
    Deactivating logical volume pve/vm-100-disk-1.
    Removing pve-vm--100--disk--1 (253:6)
root@saito:/var/log# lvchange -an -v /dev/pve/vm-100-disk-2
    Deactivating logical volume pve/vm-100-disk-2.
    Removing pve-vm--100--disk--2 (253:7)
root@saito:/var/log# lvchange -an -v /dev/pve/data
    Deactivating logical volume pve/data.
    Not monitoring pve/data with libdevmapper-event-lvm2thin.so
    Removing pve-data (253:5)
    Removing pve-data-tpool (253:4)
    Executing: /usr/sbin/thin_check -q --clear-needs-check-flag /dev/mapper/pve-data_tmeta
    /usr/sbin/thin_check failed: 1
  WARNING: Integrity check of metadata for pool pve/data failed.
    Removing pve-data_tdata (253:3)
    Removing pve-data_tmeta (253:2)
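
For the record, thin_check exiting 1 means real corruption, not just a stale needs_check flag. Had lvconvert --repair (next step) not been an option, lvmthin(7) describes a manual route: swap the damaged metadata into an ordinary LV so the thin tools can reach it. A rough sketch, with meta_fix being a made-up scratch LV name:
Code:
# scratch LV at least as big as data_tmeta (name is hypothetical)
lvcreate -an -L 1.2g -n meta_fix pve
# swap: the damaged metadata lands in pve/meta_fix, the scratch becomes tmeta
lvconvert --thinpool pve/data --poolmetadata pve/meta_fix
# examine or repair it directly, then swap back the same way
lvchange -ay pve/meta_fix
thin_check /dev/pve/meta_fix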

Then ran the repair:
Code:
root@saito:/var/log# lvconvert --repair pve/data
  Using default stripesize 64.00 KiB.
  WARNING: recovery of pools without pool metadata spare LV is not automated.
  WARNING: If everything works, remove pve/data_meta0 volume.
  WARNING: Use pvmove command to move pve/data_tmeta on the best fitting PV.
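
The first warning is presumably because the existing 100m pmspare was too small for the now ~1.1g metadata, so the repair wrote to a freshly allocated LV instead, and the old damaged metadata is kept as pve/data_meta0. Per the warnings, once everything checks out:
Code:
# only after verifying the pool and every thin volume is intact
lvremove pve/data_meta0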

Looks good; bring it back up and check the metadata state:
Code:
root@saito:/var/log# lvchange -ay -v /dev/pve/data
    Activating logical volume pve/data exclusively.
    activation/volume_list configuration setting not defined: Checking only host tags for pve/data.
    Creating pve-data_tmeta
    Loading pve-data_tmeta table (253:2)
    Resuming pve-data_tmeta (253:2)
    Creating pve-data_tdata
    Loading pve-data_tdata table (253:3)
    Resuming pve-data_tdata (253:3)
    Executing: /usr/sbin/thin_check -q --clear-needs-check-flag /dev/mapper/pve-data_tmeta
    Creating pve-data-tpool
    Loading pve-data-tpool table (253:4)
    Resuming pve-data-tpool (253:4)
    Creating pve-data
    Loading pve-data table (253:5)
    Resuming pve-data (253:5)
    Monitoring pve/data
root@saito:/var/log# lvs -a
  LV            VG  Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  data          pve twi-aotz-- 500.00g             4.65   1.19                           
  data_meta0    pve -wi-------   1.15g                                                   
  [data_tdata]  pve Twi-ao---- 500.00g                                                   
  [data_tmeta]  pve ewi-ao----   1.15g                                                   
  root          pve -wi-ao----  93.13g                                                   
  swap          pve -wi-ao----  14.90g                                                   
  vm-100-disk-1 pve Vwi---tz-- 200.00g data                                               
  vm-100-disk-2 pve Vwi---tz-- 100.00g data                                               
root@saito:/var/log# pvdisplay
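
One thing that worried me here: Data% fell from 37.28 before the repair to 4.65 after, which hints that the rebuilt metadata lost most of its mappings. A way to list which thin devices survived without taking the pool down again (the metadata-snapshot trick, assuming I have the dmsetup messages right):
Code:
# reserve a metadata snapshot so the live metadata can be dumped safely
dmsetup message pve-data-tpool 0 reserve_metadata_snap
thin_dump --metadata-snap /dev/mapper/pve-data_tmeta | grep '<device'
dmsetup message pve-data-tpool 0 release_metadata_snap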

Good news for disk 2...
Code:
root@saito:/var/log# lvchange -ay -v /dev/pve/vm-100-disk-2
    Activating logical volume pve/vm-100-disk-2 exclusively.
    activation/volume_list configuration setting not defined: Checking only host tags for pve/vm-100-disk-2.
    Loading pve-data_tdata table (253:3)
    Suppressed pve-data_tdata (253:3) identical table reload.
    Loading pve-data_tmeta table (253:2)
    Suppressed pve-data_tmeta (253:2) identical table reload.
    Loading pve-data-tpool table (253:4)
    Suppressed pve-data-tpool (253:4) identical table reload.
    Creating pve-vm--100--disk--2
    Loading pve-vm--100--disk--2 table (253:6)
    Resuming pve-vm--100--disk--2 (253:6)
    pve/data already monitored.

Now the bad news for disk 1...
Code:
root@saito:/var/log# lvchange -ay -v /dev/pve/vm-100-disk-1
    Activating logical volume pve/vm-100-disk-1 exclusively.
    activation/volume_list configuration setting not defined: Checking only host tags for pve/vm-100-disk-1.
    Loading pve-data_tdata table (253:3)
    Suppressed pve-data_tdata (253:3) identical table reload.
    Loading pve-data_tmeta table (253:2)
    Suppressed pve-data_tmeta (253:2) identical table reload.
    Loading pve-data-tpool table (253:4)
    Suppressed pve-data-tpool (253:4) identical table reload.
    Creating pve-vm--100--disk--1
    Loading pve-vm--100--disk--1 table (253:7)
  device-mapper: reload ioctl on (253:7) failed: No data available
    Removing pve-vm--100--disk--1 (253:7)

And from dmesg regarding disk 1:
Code:
[481216.385943] device-mapper: table: 253:7: thin: Couldn't open thin internal device
[481216.386433] device-mapper: ioctl: error adding target to table
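
"Couldn't open thin internal device" means the pool metadata no longer contains a device with this volume's ID. To confirm, compare the device ID LVM recorded for the LV against a metadata dump (thin_id is a standard lvs field, I believe):
Code:
# the thin device id LVM assigned to the volume
lvs -o name,thin_id pve/vm-100-disk-1
# then grep a metadata dump (snapshot trick above) for dev_id="<that id>"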

Any thoughts on how to bring this disk back?
 
I think I may be in serious trouble:
Code:
root@saito:/var/log# thin_dump /dev/pve/data_meta0 > /tmp/foo.txt
root@saito:/var/log# grep superblock /tmp/foo.txt
<superblock uuid="" time="0" transaction="6" data_block_size="128" nr_data_blocks="8192000">
</superblock>
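
So the saved copy of the old metadata dumps as an essentially empty superblock: transaction 6, no device entries at all. The last thing I can think to try is thin_dump's repair mode, which scans for usable metadata roots; if it turns up device entries, thin_restore can write the XML back onto a metadata LV:
Code:
thin_dump --repair /dev/pve/data_meta0 > /tmp/repaired.xml
grep '<device' /tmp/repaired.xml
# if devices reappear: thin_restore -i /tmp/repaired.xml -o <metadata LV>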
 
