All VM Disks inactive after power failure

lektech

New Member
Feb 17, 2022
Hi all, I've seen that a few other people have had a similar issue, but nothing I've tried so far seems to work.

Sometime during the previous night the system experienced a power failure and rebooted, but none of the virtual machines are able to start.

Run of qm start 103
Code:
root@pve:~# qm start 103
kvm: -drive file=/dev/pve/vm-103-disk-0,if=none,id=drive-sata1,format=raw,cache=none,aio=native,detect-zeroes=on: Could not open '/dev/pve/vm-103-disk-0': No such file or directory
start failed: QEMU exited with code 1

Run of lvs
Code:
root@pve:~# lvs
  LV            VG  Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  data          pve twi-aotz--   4.76t             0.00   0.22                           
  data_meta0    pve -wi-a-----  15.81g                                                   
  root          pve -wi-ao----  96.00g                                                   
  swap          pve -wi-ao----   8.00g                                                   
  vm-100-disk-0 pve Vwi---tz--  10.00g data                                              
  vm-101-disk-0 pve Vwi---tz-- 180.00g data                                              
  vm-102-disk-0 pve Vwi---tz-- 180.00g data                                              
  vm-103-disk-0 pve Vwi---tz-- 180.00g data                                              
  vm-104-disk-0 pve Vwi---tz-- 100.00g data

All of the VM disks are inactive.
When I try to run lvchange -ay:
Code:
root@pve:~# lvchange -ay /dev/pve/vm-103-disk-0
  device-mapper: reload ioctl on  (253:5) failed: No data available

When I do an lvscan
Code:
root@pve:~# lvscan
  ACTIVE            '/dev/pve/swap' [8.00 GiB] inherit
  ACTIVE            '/dev/pve/root' [96.00 GiB] inherit
  ACTIVE            '/dev/pve/data' [4.76 TiB] inherit
  inactive          '/dev/pve/vm-101-disk-0' [180.00 GiB] inherit
  inactive          '/dev/pve/vm-100-disk-0' [10.00 GiB] inherit
  inactive          '/dev/pve/vm-102-disk-0' [180.00 GiB] inherit
  inactive          '/dev/pve/vm-103-disk-0' [180.00 GiB] inherit
  inactive          '/dev/pve/vm-150-disk-0' [75.00 GiB] inherit
  inactive          '/dev/pve/vm-150-disk-1' [2.00 TiB] inherit
  inactive          '/dev/pve/vm-120-disk-0' [160.00 GiB] inherit
  ---
  ACTIVE            '/dev/pve/data_meta0' [15.81 GiB] inherit

I don't know what to do next. There are 34 virtual machines in total.
The hardware is 4x 2TB SAS disks in RAID 5 on a hardware RAID controller.
Proxmox is installed on the RAID 5 volume, and local-lvm, which contained the VM disks, was also on the RAID volume.
I doubt there is a hardware failure.

There is critical data on vm-150-disk-1
The rest of the data on all of the other disks is unimportant in comparison.

Please can anyone help? Do you think the data is recoverable?
Would a Proxmox VE Subscription make the chance of success any higher?
 
output of lvchange -ay data
Code:
root@pve:~# lvchange -ay data
  Volume group "data" not found
  Cannot process volume group data

output of vgs
Code:
root@pve:~# vgs
  VG  #PV #LV #SN Attr   VSize VFree 
  pve   1  39   0 wz--n- 4.91t 576.00m

If I try to run lvconvert --repair pve/data
Code:
root@pve:~# lvconvert --repair pve/data
  WARNING: Sum of all thin volume sizes (5.16 TiB) exceeds the size of thin pools and the size of whole volume group (4.91 TiB).
  WARNING: You have not turned on protection against thin pools running out of space.
  WARNING: Set activation/thin_pool_autoextend_threshold below 100 to trigger automatic extension of thin pools before they get full.
  WARNING: LV pve/data_meta0 holds a backup of the unrepaired metadata. Use lvremove when no longer required.
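
For reference, the manual repair route for damaged thin-pool metadata uses thin_check / thin_repair from thin-provisioning-tools. This is only a sketch of what I understand the procedure to be, not something I have verified here: pve/meta_repaired is an example name I made up, the metadata LV size should match your pool's, and you should take a full image of the array before attempting any of it.

```shell
# pve/data_meta0 holds a copy of the unrepaired metadata
# (see the lvconvert warning above). First see what is wrong with it:
thin_check /dev/pve/data_meta0

# thin_dump shows how much of the mapping tree is still readable:
thin_dump /dev/pve/data_meta0 > /root/thin_metadata_dump.xml

# If the damage is repairable, write a fixed copy into a fresh LV
# (pve/meta_repaired is a hypothetical name; size it like the original metadata):
lvcreate -L 16G -n meta_repaired pve
thin_repair -i /dev/pve/data_meta0 -o /dev/pve/meta_repaired

# Swap the repaired metadata into the (inactive) pool, then reactivate:
lvchange -an pve/data
lvconvert --thinpool pve/data --poolmetadata pve/meta_repaired
lvchange -ay pve/data
```

These commands need root and a pool in exactly this state, so treat them as a starting point for reading the lvmthin(7) man page, not a recipe.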


Thanks,
 
output of lvchange -ay data/vm-103-disk-0
Code:
root@pve:~# lvchange -ay data/vm-103-disk-0
  Volume group "data" not found
  Cannot process volume group data

but the output of lvchange -ay pve/vm-103-disk-0
Code:
root@pve:~# lvchange -ay pve/vm-103-disk-0
  device-mapper: reload ioctl on  (253:6) failed: No data available
 
after running vgchange -a y
Code:
root@pve:~# vgchange -a y
  device-mapper: reload ioctl on  (253:6) failed: No data available
  device-mapper: reload ioctl on  (253:6) failed: No data available
  device-mapper: reload ioctl on  (253:6) failed: No data available
  device-mapper: reload ioctl on  (253:6) failed: No data available
  device-mapper: reload ioctl on  (253:6) failed: No data available
  ------
  device-mapper: reload ioctl on  (253:6) failed: No data available
  device-mapper: reload ioctl on  (253:6) failed: No data available
  device-mapper: reload ioctl on  (253:6) failed: No data available
  device-mapper: reload ioctl on  (253:6) failed: No data available
  device-mapper: reload ioctl on  (253:6) failed: No data available
  5 logical volume(s) in volume group "pve" now active

after running vchange -ay
Code:
root@pve:~# vchange -ay
-bash: vchange: command not found

pveversion
Code:
root@pve:~# pveversion
pve-manager/6.3-2/22f57405 (running kernel: 5.4.73-1-pve)
 
output of lvchange -ay
Code:
root@pve:~# lvchange -ay
  No command with matching syntax recognised.  Run 'lvchange --help' for more information.
  Nearest similar command has syntax:
  lvchange -a|--activate y|n|ay VG|LV|Tag|Select ...
  Activate or deactivate an LV.
 
I'm starting to feel like all the data is lost and nothing can be done.
The original metadata backup from when I initially tried lvconvert --repair pve/data is gone.

I don't know what caused this. I'm not sure if it was the metadata filling up or external forces.
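
One lesson for anyone who ends up in the same spot: save everything you can before running any repair commands. Something like this (assuming the metadata copy LV is still active, as data_meta0 was in my lvs output above):

```shell
# vgcfgbackup saves the VG layout as a text file
# (it does NOT include the thin-pool mappings themselves):
vgcfgbackup -f /root/pve-vg-backup.txt pve

# Image the unrepaired metadata copy before any further repair attempts:
dd if=/dev/pve/data_meta0 of=/root/data_meta0.img bs=1M
```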
 
Hi Lektech, did you find a solution for this? I am having a similar problem.

Hi, no I did not.
I did reply to your DM as well, but for anybody else who stumbles across this thread in the future: I did not solve it.
We bit the bullet, did a fresh install, and rebuilt what we could.

Another power failure occurred one week later and the same thing happened.
This Proxmox install had suffered countless power failures over more than a year, all of them due to national rotational load shedding (South Africa).
Only after a lightning strike (which seemed to have damaged only some networking equipment) did this whole issue with data loss occur, on the first and second power failures that followed. The physical server is completely undamaged.

All I can say now is that we are not using Proxmox any more- we have switched to another system.

I still have, use and somewhat trust Proxmox in my homelab.
Unfortunately, I will never trust Proxmox to securely store data again.
So now, all of my data is stored on hard drives that have been passed through to my TrueNAS virtual machine.
 
