Advice for handling a failed backup/snapshot.

Cyberco Ltd

Mar 1, 2012
Some words of advice would be very much appreciated....

Background:

A recent vzdump backup failed, but I didn't notice for a couple of weeks.

I can no longer generate any backups, apparently because an existing snapshot is blocking the system.

The failure appears to have left a snapshot behind on the server.

lvdisplay shows the usual listing plus one extra entry:

LV Path /dev/pve/vzsnapshot-proxmox5-0
LV Name vz-snapshot-proxmox5-0

Also, I think vz-snapshot-proxmox5-0 is out of disk space.

lvs shows:

/dev/pve/vz-snap-proxmox5-0: read failed after 0 of 4096 at 1876326023168 : Input/output error
/dev/pve/vz-snap-proxmox5-0: read failed after 0 of 4096 at 1876326080512 : Input/output error
/dev/pve/vz-snap-proxmox5-0: read failed after 0 of 4096 at 0 : Input/output error
/dev/pve/vz-snap-proxmox5-0: read failed after 0 of 4096 at 4096 : Input/output error

It also shows one line beyond the usual output:

LV                  VG   ATTR      LSIZE  Origin  Data%
vz-snap-proxmox5-0  pve  Swi-Ios-  1.00g  data    100.00
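
For readers unfamiliar with the attribute string, here is a minimal sketch of how the row above can be read. It assumes the column order shown in this thread (LV, VG, Attr, LSize, Origin, Data%) and relies on the fifth attribute character being the LV state, where "I" marks an invalid snapshot. The function name snapshot_state is made up for illustration and is not part of LVM.

```shell
#!/bin/sh
# Sketch: classify one data row of `lvs` output by its attribute string.
# Assumes the column order from this thread: LV VG Attr LSize Origin Data%.
snapshot_state() {
    # Split the row into positional parameters (relies on word splitting).
    set -- $1
    lv=$1
    attr=$3
    used=$6
    # The 5th attribute character is the state; "I" means invalid snapshot.
    state=$(printf '%s' "$attr" | cut -c5)
    if [ "$state" = "I" ]; then
        echo "$lv: invalid snapshot (${used}% used)"
    else
        echo "$lv: state '$state' (${used}% used)"
    fi
}

# Row copied from the lvs output above.
snapshot_state "vz-snap-proxmox5-0 pve Swi-Ios- 1.00g data 100.00"
```

A snapshot that has filled up to 100% is invalidated by LVM, which would also explain the Input/output errors when lvs tries to read it.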


My Question:

It appears that "vz-snap-proxmox5-0" needs removing - however because of the crash I haven't got any backups for a few weeks and this is hosting an important VM.

? Is it safe to "lvremove -f vz-snap-proxmox5-0" without bringing down the working system and losing any data?

? Is there any other way to back up the machine before taking any action that could potentially cause losses?

? Any other advice on how to move forward?


I am sure many others would benefit from this advice too and I'm very willing to write up a more detailed guide on how to diagnose/repair for the community once I have solved this.

Thanks.

Mark.
 
A quick fix that would let you create a backup is to increase the free space in your VG and then run the backup manually from the GUI. Every VM or CT has a tab called Backup, which contains a button named "Backup now". The backup can be done without bringing the server down. Once the backup is done, you can proceed with fixing your current situation.
 
Thank you for the advice Mir,

However, I think the live data disk has plenty of space and it is just the snapshot image that ran out of space, as the VMs use only about 250GB of a 4TB RAID disk array.

Also, when I try and do a backup from the GUI it fails every time with the message:

INFO: trying to get global lock - waiting....
 
The latest version stops the backup if there is a problem (e.g. a full snapshot), which should prevent such a situation.

Upgrade to the latest version.
 
You can also stop backups via the GUI,

and remove unneeded snapshots via the console (lvremove).

If a VM is locked due to a failed backup, unlock it manually (qm unlock VMID).
 
Backups are stopped, but "qm unlock" does not unlock anything: the GUI still says "INFO: trying to get global lock - waiting..."

I think removing the unneeded snapshot is the answer, is it safe to do this?

I very much appreciate your help.
 
Stop the job via the GUI. Does this work?
 
You wrote: the GUI still says "INFO: trying to get global lock - waiting..."?

What is the problem? Please describe it again in detail.
 
Sorry Tom, I will try to explain more clearly.

The problem is that the system generated the logical volume "vz-snap-proxmox5-0" as part of a backup, which failed.

The backup failed; it is no longer running.

I cannot make any more backups of any VMs on the server.

I believe the problem is that the extra snapshot LV still exists.

Is it safe to delete the "vz-snap-proxmox5-0" volume that was created during the failed backup process without affecting any live VMs on the server?


Thank you.
 
Yes, that's what I suggested earlier. Remove this snapshot with:

> lvremove ...
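
For anyone landing on this thread later, the order of steps discussed above might be sketched as follows. This is not the exact command from the reply, only an illustration under assumptions: the volume group pve and the snapshot name are taken from the lvs output pasted earlier (use whatever name lvs reports on your own host), VMID 100 is a placeholder, and the helper names run and cleanup_failed_snapshot are made up. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch of the cleanup order discussed in this thread.
# Prints the commands by default; set DRY_RUN=0 to execute them as root.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

cleanup_failed_snapshot() {
    vg=$1
    snap=$2
    vmid=$3
    run lvs "$vg"               # confirm the snapshot is listed and which LV is its origin
    run lvremove "$vg/$snap"    # removes only the snapshot LV; asks for confirmation
    run qm unlock "$vmid"       # clears a stale backup lock on the VM, if one is set
}

# Assumed names: VG and snapshot as pasted in this thread, VMID 100 as a placeholder.
cleanup_failed_snapshot pve vz-snap-proxmox5-0 100
```

Leaving out -f keeps the confirmation prompt from lvremove, which gives one last chance to check that the name is the snapshot and not the origin volume.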
 
