sruffilli
Guest
Hello,
Two days after a scheduled backup of a virtual machine, I noticed I hadn't received the usual "vzdump backup status" email.
I checked /var/log/vzdump/qemu-VMID.log and, indeed:
Code:
May 03 00:01:01 INFO: Starting Backup of VM 126 (qemu)
May 03 00:01:01 INFO: status = running
May 03 00:01:03 INFO: backup mode: snapshot
May 03 00:01:03 INFO: ionice priority: 7
May 03 00:01:04 INFO: Logical volume "vzsnap-proxmox02-0" created
May 03 00:01:04 INFO: creating archive '/mnt/pve/BUP/dump/vzdump-qemu-126-2012_05_03-00_01_01.tar.gz'
May 03 00:01:04 INFO: adding '/mnt/pve/BUP/dump/vzdump-qemu-126-2012_05_03-00_01_01.tmp/qemu-server.conf' to archive ('qemu-server.conf')
May 03 00:01:04 INFO: adding '/dev/cise-san-disk-2/vzsnap-proxmox02-0' to archive ('vm-disk-virtio0.raw')
May 04 16:15:12 INFO: lvremove failed - trying again in 8 seconds
May 04 16:15:20 INFO: lvremove failed - trying again in 16 seconds
May 04 16:15:36 INFO: lvremove failed - trying again in 32 seconds
May 04 16:16:08 ERROR: command 'lvremove -f /dev/cise-san-disk-2/vzsnap-proxmox02-0' failed: interrupted by signal
May 04 16:16:08 ERROR: Backup of VM 126 failed - command '/usr/lib/qemu-server/vmtar '/mnt/pve/BUP/dump/vzdump-qemu-126-2012_05_03-00_01_01.tmp/qemu-server.conf' 'qemu-server.conf' '/dev/cise-san-disk-2/vzsnap-proxmox02-0' 'vm-disk-virtio0.raw'|gzip >/mnt/pve/BUP/dump/vzdump-qemu-126-2012_05_03-00_01_01.tar.dat' failed: interrupted by signal
At 16:15:12 I fired a "kill vzdump_pid", and after a couple of minutes the process exited.
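For reference, this is roughly what I ran; the PID is whatever ps reports for the stuck worker (shown as a placeholder here):
Code:
# find the PID of the stuck vzdump worker
root@proxmox02:~# ps aux | grep [v]zdump
# plain kill sends SIGTERM; the process exited a couple of minutes later
root@proxmox02:~# kill vzdump_pid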
(Please note that 40 hours had passed since the beginning of the backup process.)
Alarmed by the lvremove error, I ran lvscan, which greeted me with:
Code:
root@proxmox02:~# lvscan
  /dev/cise-san-disk-2/vzsnap-proxmox02-0: read failed after 0 of 4096 at 210453331968: Input/output error
  /dev/cise-san-disk-2/vzsnap-proxmox02-0: read failed after 0 of 4096 at 210453389312: Input/output error
  /dev/cise-san-disk-2/vzsnap-proxmox02-0: read failed after 0 of 4096 at 0: Input/output error
  /dev/cise-san-disk-2/vzsnap-proxmox02-0: read failed after 0 of 4096 at 4096: Input/output error
  [...]
  inactive Original '/dev/cise-san-disk-2/vm-126-disk-1' [196.00 GiB] inherit
  inactive Snapshot '/dev/cise-san-disk-2/vzsnap-proxmox02-0' [1.00 GiB] inherit
[...]
The LVM snapshot is clearly messed up, but what makes me nervous is the VM disk's status, "inactive Original".
Several mailing-list threads (e.g. http://lists.debian.org/debian-user/2006/09/msg02538.html) suggest that lvremove-ing the snapshot should solve the problem. However, since the LVM volume sits on a clustered-LVM SAN over iSCSI, I'm not at all sure my scenario allows me to safely fire an lvremove of the snapshot.
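For clarity, this is the sequence those threads point at, as far as I understand it. I have not run any of it yet, and I don't know whether it is safe with clustered LVM over iSCSI:
Code:
# deactivate the broken snapshot first, then remove it
root@proxmox02:~# lvchange -an /dev/cise-san-disk-2/vzsnap-proxmox02-0
root@proxmox02:~# lvremove -f /dev/cise-san-disk-2/vzsnap-proxmox02-0
# if lvremove keeps failing, some threads suggest checking for stale
# device-mapper entries (the snapshot's -cow/-real mappings) first
root@proxmox02:~# dmsetup ls | grep vzsnap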
Furthermore, the VM is running fine at the moment (or it seems to be; what does "inactive Original" mean, then?), and I need it to stay up no matter what.
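To check whether the kernel still has the origin volume mapped (which would explain the VM running even though lvscan reports it inactive), I was thinking of something like the following; the dm name is my guess at how device-mapper escapes the dashes in the VG/LV names:
Code:
# LVM's view, including hidden/internal volumes
root@proxmox02:~# lvs -a cise-san-disk-2
# kernel device-mapper view; "Open count" > 0 would mean the VM still uses it
root@proxmox02:~# dmsetup info cise--san--disk--2-vm--126--disk--1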
Does anyone have a suggestion? Thank you in advance.
My configuration:
4-node cluster
LVM volume on a SAN over iSCSI
Version 2.0-57/ff6cd700 (upgrading to the latest stable in two weeks)