Backups are failing

David_Ingram

New Member
Feb 2, 2012
5
0
1
My scheduled backups have failed on me the last couple of nights. I don't really don't know where to start on troubleshooting this one. Any help would be appreciated. The only thing unique about these guests is they are the only Windows guests in my cluster.

Code:
vzdump 100 101 102 103 104 105 --quiet 1 --mailto [EMAIL="david@clevelandlibrary.org"]x[/EMAIL][EMAIL="xx@xxxxxxx.xxx"]xx@xxxxxxx.xxx[/EMAIL] --mode snapshot --compress 1 --maxfiles 1 --storage backups
100: Feb 21 00:00:02 INFO: Starting Backup of VM 100 (qemu)
100: Feb 21 00:00:02 INFO: status = running
100: Feb 21 00:00:03 INFO: backup mode: snapshot
100: Feb 21 00:00:03 INFO: ionice priority: 7
100: Feb 21 00:00:03 INFO: trying to remove stale snapshot '/dev/pve/vzsnap-cassiusclay-0'
100: Feb 21 00:00:03 INFO: umount: /mnt/vzsnap0: not mounted
100: [COLOR=red]Feb 21 00:00:03 ERROR: command ''umount /mnt/vzsnap0'' failed: exit code 1
[/COLOR]100: Feb 21 00:00:10 INFO: lvremove failed - trying again in 8 seconds
100: Feb 21 00:00:18 INFO: lvremove failed - trying again in 16 seconds
100: Feb 21 00:00:34 INFO: lvremove failed - trying again in 32 seconds
100: [COLOR=red]Feb 21 00:01:06 ERROR: command 'lvremove -f /dev/pve/vzsnap-cassiusclay-0' failed: Not a CODE reference at /usr/share/perl5/PVE/Tools.pm line 352.
[/COLOR]100: Feb 21 00:01:06 INFO:   /dev/pve/vzsnap-cassiusclay-0: read failed after 0 of 4096 at 375541137408: Input/output error
100: Feb 21 00:01:06 INFO:   /dev/pve/vzsnap-cassiusclay-0: read failed after 0 of 4096 at 375541194752: Input/output error
100: Feb 21 00:01:06 INFO:   /dev/pve/vzsnap-cassiusclay-0: read failed after 0 of 4096 at 0: Input/output error
100: Feb 21 00:01:06 INFO:   /dev/pve/vzsnap-cassiusclay-0: read failed after 0 of 4096 at 4096: Input/output error
100: Feb 21 00:01:06 INFO:   Logical volume "vzsnap-cassiusclay-0" already exists in volume group "pve"
100: Feb 21 00:01:13 INFO: lvremove failed - trying again in 8 seconds
100: Feb 21 00:01:21 INFO: lvremove failed - trying again in 16 seconds
100: Feb 21 00:01:37 INFO: lvremove failed - trying again in 32 seconds
100: [COLOR=red]Feb 21 00:02:09 ERROR: command 'lvremove -f /dev/pve/vzsnap-cassiusclay-0' failed: Not a CODE reference at /usr/share/perl5/PVE/Tools.pm line 352.
[/COLOR]100: [COLOR=red]Feb 21 00:02:09 ERROR: Backup of VM 100 failed - command ''lvcreate --size 1024M --snapshot --name '\''vzsnap-cassiusclay-0'\'' '\''/dev/pve/data'\'' failed: exit code 5
[/COLOR]102: Feb 21 00:02:09 INFO: Starting Backup of VM 102 (qemu)
102: Feb 21 00:02:09 INFO: status = running
102: Feb 21 00:02:10 INFO: backup mode: snapshot
102: Feb 21 00:02:10 INFO: ionice priority: 7
102: Feb 21 00:02:10 INFO: trying to remove stale snapshot '/dev/pve/vzsnap-cassiusclay-0'
102: Feb 21 00:02:10 INFO: umount: /mnt/vzsnap0: not mounted
102: [COLOR=red]Feb 21 00:02:10 ERROR: command ''umount /mnt/vzsnap0'' failed: exit code 1
[/COLOR]102: Feb 21 00:02:17 INFO: lvremove failed - trying again in 8 seconds
102: Feb 21 00:02:25 INFO: lvremove failed - trying again in 16 seconds
102: Feb 21 00:02:41 INFO: lvremove failed - trying again in 32 seconds
102: [COLOR=red]Feb 21 00:03:13 ERROR: command 'lvremove -f /dev/pve/vzsnap-cassiusclay-0' failed: Not a CODE reference at /usr/share/perl5/PVE/Tools.pm line 352.
[/COLOR]102: Feb 21 00:03:13 INFO:   /dev/pve/vzsnap-cassiusclay-0: read failed after 0 of 4096 at 375541137408: Input/output error
102: Feb 21 00:03:13 INFO:   /dev/pve/vzsnap-cassiusclay-0: read failed after 0 of 4096 at 375541194752: Input/output error
102: Feb 21 00:03:13 INFO:   /dev/pve/vzsnap-cassiusclay-0: read failed after 0 of 4096 at 0: Input/output error
102: Feb 21 00:03:13 INFO:   /dev/pve/vzsnap-cassiusclay-0: read failed after 0 of 4096 at 4096: Input/output error
102: Feb 21 00:03:13 INFO:   Logical volume "vzsnap-cassiusclay-0" already exists in volume group "pve"
102: Feb 21 00:03:21 INFO: lvremove failed - trying again in 8 seconds
102: Feb 21 00:03:29 INFO: lvremove failed - trying again in 16 seconds
102: Feb 21 00:03:45 INFO: lvremove failed - trying again in 32 seconds
102: [COLOR=red]Feb 21 00:04:17 ERROR: command 'lvremove -f /dev/pve/vzsnap-cassiusclay-0' failed: Not a CODE reference at /usr/share/perl5/PVE/Tools.pm line 352.
[/COLOR]102: [COLOR=red]Feb 21 00:04:17 ERROR: Backup of VM 102 failed - command ''lvcreate --size 1024M --snapshot --name '\''vzsnap-cassiusclay-0'\'' '\''/dev/pve/data'\'' failed: exit code 5
[/COLOR]
 
Last edited:
My scheduled backups have failed on me the last couple of nights. I don't really don't know where to start on troubleshooting this one. Any help would be appreciated. The only thing unique about these guests is they are the only Windows guests in my cluster.

Code:
vzdump 100 101 102 103 104 105 --quiet 1 --mailto [EMAIL="david@clevelandlibrary.org"]x[/EMAIL][EMAIL="xx@xxxxxxx.xxx"]xx@xxxxxxx.xxx[/EMAIL] --mode snapshot --compress 1 --maxfiles 1 --storage backups
100: Feb 21 00:00:02 INFO: Starting Backup of VM 100 (qemu)
100: Feb 21 00:00:02 INFO: status = running
100: Feb 21 00:00:03 INFO: backup mode: snapshot
100: Feb 21 00:00:03 INFO: ionice priority: 7
100: Feb 21 00:00:03 INFO: trying to remove stale snapshot '/dev/pve/vzsnap-cassiusclay-0'
100: Feb 21 00:00:03 INFO: umount: /mnt/vzsnap0: not mounted
100: [COLOR=red]Feb 21 00:00:03 ERROR: command ''umount /mnt/vzsnap0'' failed: exit code 1
[/COLOR]100: Feb 21 00:00:10 INFO: lvremove failed - trying again in 8 seconds
100: Feb 21 00:00:18 INFO: lvremove failed - trying again in 16 seconds
100: Feb 21 00:00:34 INFO: lvremove failed - trying again in 32 seconds
100: [COLOR=red]Feb 21 00:01:06 ERROR: command 'lvremove -f /dev/pve/vzsnap-cassiusclay-0' failed: Not a CODE reference at /usr/share/perl5/PVE/Tools.pm line 352.
[/COLOR]100: Feb 21 00:01:06 INFO:   /dev/pve/vzsnap-cassiusclay-0: read failed after 0 of 4096 at 375541137408: Input/output error
100: Feb 21 00:01:06 INFO:   /dev/pve/vzsnap-cassiusclay-0: read failed after 0 of 4096 at 375541194752: Input/output error
100: Feb 21 00:01:06 INFO:   /dev/pve/vzsnap-cassiusclay-0: read failed after 0 of 4096 at 0: Input/output error
100: Feb 21 00:01:06 INFO:   /dev/pve/vzsnap-cassiusclay-0: read failed after 0 of 4096 at 4096: Input/output error
100: Feb 21 00:01:06 INFO:   Logical volume "vzsnap-cassiusclay-0" already exists in volume group "pve"
100: Feb 21 00:01:13 INFO: lvremove failed - trying again in 8 seconds
100: Feb 21 00:01:21 INFO: lvremove failed - trying again in 16 seconds
100: Feb 21 00:01:37 INFO: lvremove failed - trying again in 32 seconds
100: [COLOR=red]Feb 21 00:02:09 ERROR: command 'lvremove -f /dev/pve/vzsnap-cassiusclay-0' failed: Not a CODE reference at /usr/share/perl5/PVE/Tools.pm line 352.
[/COLOR]100: [COLOR=red]Feb 21 00:02:09 ERROR: Backup of VM 100 failed - command ''lvcreate --size 1024M --snapshot --name '\''vzsnap-cassiusclay-0'\'' '\''/dev/pve/data'\'' failed: exit code 5
[/COLOR]102: Feb 21 00:02:09 INFO: Starting Backup of VM 102 (qemu)
102: Feb 21 00:02:09 INFO: status = running
102: Feb 21 00:02:10 INFO: backup mode: snapshot
102: Feb 21 00:02:10 INFO: ionice priority: 7
102: Feb 21 00:02:10 INFO: trying to remove stale snapshot '/dev/pve/vzsnap-cassiusclay-0'
102: Feb 21 00:02:10 INFO: umount: /mnt/vzsnap0: not mounted
102: [COLOR=red]Feb 21 00:02:10 ERROR: command ''umount /mnt/vzsnap0'' failed: exit code 1
[/COLOR]102: Feb 21 00:02:17 INFO: lvremove failed - trying again in 8 seconds
102: Feb 21 00:02:25 INFO: lvremove failed - trying again in 16 seconds
102: Feb 21 00:02:41 INFO: lvremove failed - trying again in 32 seconds
102: [COLOR=red]Feb 21 00:03:13 ERROR: command 'lvremove -f /dev/pve/vzsnap-cassiusclay-0' failed: Not a CODE reference at /usr/share/perl5/PVE/Tools.pm line 352.
[/COLOR]102: Feb 21 00:03:13 INFO:   /dev/pve/vzsnap-cassiusclay-0: read failed after 0 of 4096 at 375541137408: Input/output error
102: Feb 21 00:03:13 INFO:   /dev/pve/vzsnap-cassiusclay-0: read failed after 0 of 4096 at 375541194752: Input/output error
102: Feb 21 00:03:13 INFO:   /dev/pve/vzsnap-cassiusclay-0: read failed after 0 of 4096 at 0: Input/output error
102: Feb 21 00:03:13 INFO:   /dev/pve/vzsnap-cassiusclay-0: read failed after 0 of 4096 at 4096: Input/output error
102: Feb 21 00:03:13 INFO:   Logical volume "vzsnap-cassiusclay-0" already exists in volume group "pve"
102: Feb 21 00:03:21 INFO: lvremove failed - trying again in 8 seconds
102: Feb 21 00:03:29 INFO: lvremove failed - trying again in 16 seconds
102: Feb 21 00:03:45 INFO: lvremove failed - trying again in 32 seconds
102: [COLOR=red]Feb 21 00:04:17 ERROR: command 'lvremove -f /dev/pve/vzsnap-cassiusclay-0' failed: Not a CODE reference at /usr/share/perl5/PVE/Tools.pm line 352.
[/COLOR]102: [COLOR=red]Feb 21 00:04:17 ERROR: Backup of VM 102 failed - command ''lvcreate --size 1024M --snapshot --name '\''vzsnap-cassiusclay-0'\'' '\''/dev/pve/data'\'' failed: exit code 5
[/COLOR]

Try to remove the snapshot manuall

# lvremove /dev/pve/vzsnap-cassiusclay-0

If that fails you need to find out who is using the snapshot (fuser, lsof).

And please update your installation.
 
So I tried to find an open process that was associated with /dev/pve/vzsnap-cassiusclay-0 but neither fuser or lsof lists anything. So I stopped both VMs, then I was able to remove /dev/pve/vzsnap-cassiusclay-0. I then did a snapshot and both VMs backed up fine. We'll just have to wait and see if the scheduled backups work properly tonight. Thanks for the help.

PS. I will be updating to rc1 this weekend.
 
I am having the exact same problem. I run the very latest release.
Deleting the snapshot manually as suggested by Dietmar works but why the system fails to remove these snapshots automatically?

root@nd02-cl01:~# pveversion -v
pve-manager: 2.0-33 (pve-manager/2.0/c598d9e1)
running kernel: 2.6.32-7-pve
proxmox-ve-2.6.32: 2.0-60
pve-kernel-2.6.32-6-pve: 2.6.32-55
pve-kernel-2.6.32-7-pve: 2.6.32-60
lvm2: 2.02.88-2pve1
clvm: 2.02.88-2pve1
corosync-pve: 1.4.1-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.8-3
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.7-1
pve-cluster: 1.0-23
qemu-server: 2.0-19
pve-firmware: 1.0-15
libpve-common-perl: 1.0-14
libpve-access-control: 1.0-16
libpve-storage-perl: 2.0-12
vncterm: 1.0-2
vzctl: 3.0.30-2pve1
vzprocps: 2.0.11-2
vzquota: 3.0.12-3
pve-qemu-kvm: 1.0-4
ksm-control-daemon: 1.1-1
 
Deleting the snapshot manually as suggested by Dietmar works but why the system fails to remove these snapshots automatically?

It usually fails when some other software access files on the mounted snapshot (try lsof or fuser to find out).
 
I will try lsof or fuser next time it happens. Very strange however because:
1 - I installed from a barebone cd... there is no other software working at the hypervisor level
2 - i could delete it right away (using lvremove) ... did not have to kill any process
Anyway.. I will keep testing and i will let you know if it happens again. Thanks
 
I am having problem with scheduled backups:
 
Last edited by a moderator:

About

The Proxmox community has been around for many years and offers help and support for Proxmox VE, Proxmox Backup Server, and Proxmox Mail Gateway.
We think our community is one of the best thanks to people like you!

Get your subscription!

The Proxmox team works very hard to make sure you are running the best software and getting stable updates and security enhancements, as well as quick enterprise support. Tens of thousands of happy customers have a Proxmox subscription. Get yours easily in our online shop.

Buy now!