Backup fails. Here is log.

ozgurerdogan

Renowned Member
May 2, 2010
622
6
83
Bursa, Turkey, Turkey
One kvm is successful but rest of them are not when backing them up.


Detailed backup logs:
vzdump --quiet --snapshot --storage YEDEKLER --mailto mail@domain.com 101 102 103 104 105

101: Feb 06 03:00:01 INFO: Starting Backup of VM 101 (qemu)
101: Feb 06 03:00:01 INFO: running
101: Feb 06 03:00:01 INFO: status = running
101: Feb 06 03:00:02 INFO: backup mode: snapshot
101: Feb 06 03:00:02 INFO: ionice priority: 7
101: Feb 06 03:00:02 INFO: trying to remove stale snapshot '/dev/pve/vzsnap-vrtz1-0'
101: Feb 06 03:00:02 INFO: umount: /mnt/vzsnap0: device is busy
101: Feb 06 03:00:02 INFO: umount: /mnt/vzsnap0: device is busy
101: Feb 06 03:00:02 ERROR: command 'umount /mnt/vzsnap0' failed with exit code 1
101: Feb 06 03:00:02 INFO: Can't remove open logical volume "vzsnap-vrtz1-0"
101: Feb 06 03:00:02 ERROR: command 'lvremove -f '/dev/pve/vzsnap-vrtz1-0'' failed with exit code 5
101: Feb 06 03:00:02 INFO: Logical volume "vzsnap-vrtz1-0" already exists in volume group "pve"
101: Feb 06 03:00:02 INFO: Can't remove open logical volume "vzsnap-vrtz1-0"
101: Feb 06 03:00:02 ERROR: command 'lvremove -f '/dev/pve/vzsnap-vrtz1-0'' failed with exit code 5
101: Feb 06 03:00:02 ERROR: Backup of VM 101 failed - command 'lvcreate --size 1024M --snapshot --name 'vzsnap-vrtz1-0' '/dev/pve/data'' failed with exit code 5

102: Feb 06 03:00:02 INFO: Starting Backup of VM 102 (qemu)
102: Feb 06 03:00:02 INFO: running
102: Feb 06 03:00:02 INFO: status = running
102: Feb 06 03:00:03 INFO: backup mode: snapshot
102: Feb 06 03:00:03 INFO: ionice priority: 7
102: Feb 06 03:00:03 INFO: trying to remove stale snapshot '/dev/pve/vzsnap-vrtz1-0'
102: Feb 06 03:00:03 INFO: umount: /mnt/vzsnap0: device is busy
102: Feb 06 03:00:03 INFO: umount: /mnt/vzsnap0: device is busy
102: Feb 06 03:00:03 ERROR: command 'umount /mnt/vzsnap0' failed with exit code 1
102: Feb 06 03:00:03 INFO: Can't remove open logical volume "vzsnap-vrtz1-0"
102: Feb 06 03:00:03 ERROR: command 'lvremove -f '/dev/pve/vzsnap-vrtz1-0'' failed with exit code 5
102: Feb 06 03:00:03 INFO: Logical volume "vzsnap-vrtz1-0" already exists in volume group "pve"
102: Feb 06 03:00:03 INFO: Can't remove open logical volume "vzsnap-vrtz1-0"
102: Feb 06 03:00:03 ERROR: command 'lvremove -f '/dev/pve/vzsnap-vrtz1-0'' failed with exit code 5
102: Feb 06 03:00:03 ERROR: Backup of VM 102 failed - command 'lvcreate --size 1024M --snapshot --name 'vzsnap-vrtz1-0' '/dev/pve/data'' failed with exit code 5

103: Feb 06 03:00:03 INFO: Starting Backup of VM 103 (qemu)
103: Feb 06 03:00:04 INFO: stopped
103: Feb 06 03:00:04 INFO: status = stopped
103: Feb 06 03:00:04 INFO: backup mode: stop
103: Feb 06 03:00:04 INFO: ionice priority: 7
103: Feb 06 03:00:04 INFO: creating archive '/YEDEKLER/vzdump-qemu-103-2011_02_06-03_00_03.tar'
103: Feb 06 03:00:04 INFO: adding '/YEDEKLER/vzdump-qemu-103-2011_02_06-03_00_03.tmp/qemu-server.conf' to archive ('qemu-server.conf')
103: Feb 06 03:00:04 INFO: adding '/var/lib/vz/images/103/vm-103-disk-1.raw' to archive ('vm-disk-virtio0.raw')
103: Feb 06 03:22:39 INFO: Total bytes written: 30455870976 (21.43 MiB/s)
103: Feb 06 03:22:39 INFO: archive file size: 28.36GB
103: Feb 06 03:22:40 INFO: Finished Backup of VM 103 (00:22:37)

104: Feb 06 03:22:40 INFO: Starting Backup of VM 104 (qemu)
104: Feb 06 03:22:40 INFO: running
104: Feb 06 03:22:40 INFO: status = running
104: Feb 06 03:22:41 INFO: backup mode: snapshot
104: Feb 06 03:22:41 INFO: ionice priority: 7
104: Feb 06 03:22:41 INFO: trying to remove stale snapshot '/dev/pve/vzsnap-vrtz1-0'
104: Feb 06 03:22:41 INFO: umount: /mnt/vzsnap0: device is busy
104: Feb 06 03:22:41 INFO: umount: /mnt/vzsnap0: device is busy
104: Feb 06 03:22:41 ERROR: command 'umount /mnt/vzsnap0' failed with exit code 1
104: Feb 06 03:22:41 INFO: Can't remove open logical volume "vzsnap-vrtz1-0"
104: Feb 06 03:22:41 ERROR: command 'lvremove -f '/dev/pve/vzsnap-vrtz1-0'' failed with exit code 5
104: Feb 06 03:22:41 INFO: Logical volume "vzsnap-vrtz1-0" already exists in volume group "pve"
104: Feb 06 03:22:42 INFO: Can't remove open logical volume "vzsnap-vrtz1-0"
104: Feb 06 03:22:42 ERROR: command 'lvremove -f '/dev/pve/vzsnap-vrtz1-0'' failed with exit code 5
104: Feb 06 03:22:42 ERROR: Backup of VM 104 failed - command 'lvcreate --size 1024M --snapshot --name 'vzsnap-vrtz1-0' '/dev/pve/data'' failed with exit code 5

105: Feb 06 03:22:42 INFO: Starting Backup of VM 105 (qemu)
105: Feb 06 03:22:43 INFO: running
105: Feb 06 03:22:43 INFO: status = running
105: Feb 06 03:22:43 INFO: backup mode: snapshot
105: Feb 06 03:22:43 INFO: ionice priority: 7
105: Feb 06 03:22:43 INFO: trying to remove stale snapshot '/dev/pve/vzsnap-vrtz1-0'
105: Feb 06 03:22:43 INFO: umount: /mnt/vzsnap0: device is busy
105: Feb 06 03:22:43 INFO: umount: /mnt/vzsnap0: device is busy
105: Feb 06 03:22:43 ERROR: command 'umount /mnt/vzsnap0' failed with exit code 1
105: Feb 06 03:22:43 INFO: Can't remove open logical volume "vzsnap-vrtz1-0"
105: Feb 06 03:22:43 ERROR: command 'lvremove -f '/dev/pve/vzsnap-vrtz1-0'' failed with exit code 5
105: Feb 06 03:22:43 INFO: Logical volume "vzsnap-vrtz1-0" already exists in volume group "pve"
105: Feb 06 03:22:43 INFO: Can't remove open logical volume "vzsnap-vrtz1-0"
105: Feb 06 03:22:43 ERROR: command 'lvremove -f '/dev/pve/vzsnap-vrtz1-0'' failed with exit code 5
105: Feb 06 03:22:43 ERROR: Backup of VM 105 failed - command 'lvcreate --size 1024M --snapshot --name 'vzsnap-vrtz1-0' '/dev/pve/data'' failed with exit code 5
 
I set last backup mode as stop and it was all successful. Today I will try again with suspend mode and if it fails again, I will do what you suggessted and post the result. Thank you.

My current updatedb.conf is as: PRUNEPATHS="/tmp /var/spool /media"
 
It seems to be backing up now. I will post if I get same errors again.

But during one vm's backup, in syslog I see:

Feb 8 12:35:23 kernel ata3.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x0
Feb 8 12:35:23 kernel ata3.00: irq_stat 0x40000008
Feb 8 12:35:23 kernel ata3.00: failed command: READ FPDMA QUEUED
Feb 8 12:35:23 kernel ata3.00: cmd 60/08:08:30:1c:0c/00:00:38:00:00/40 tag 1 ncq 4096 in
Feb 8 12:35:23 kernel res 41/40:00:35:1c:0c/00:00:38:00:00/40 Emask 0x409 (media error) <F>
Feb 8 12:35:23 kernel ata3.00: status: { DRDY ERR }
Feb 8 12:35:23 kernel ata3.00: error: { UNC }
Feb 8 12:35:23 kernel ata3.00: configured for UDMA/133
Feb 8 12:35:23 kernel sd 2:0:0:0: [sda] Unhandled sense code
Feb 8 12:35:23 kernel sd 2:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Feb 8 12:35:23 kernel sd 2:0:0:0: [sda] Sense Key : Medium Error [current] [descriptor]
Feb 8 12:35:23 kernel Descriptor sense data with sense descriptors (in hex):
Feb 8 12:35:23 kernel 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Feb 8 12:35:23 kernel 38 0c 1c 35
Feb 8 12:35:23 kernel sd 2:0:0:0: [sda] Add. Sense: Unrecovered read error - auto reallocate failed
Feb 8 12:35:23 kernel sd 2:0:0:0: [sda] CDB: Read(10): 28 00 38 0c 1c 30 00 00 08 00
Feb 8 12:35:23 kernel end_request: I/O error, dev sda, sector 940317749
Feb 8 12:35:23 kernel ata3: EH complete
Feb 8 12:35:25 kernel ata3.00: exception Emask 0x0 SAct 0x2 SErr 0x0 action 0x0
Feb 8 12:35:25 kernel ata3.00: irq_stat 0x40000008
Feb 8 12:35:25 kernel ata3.00: failed command: READ FPDMA QUEUED
Feb 8 12:35:25 kernel ata3.00: cmd 60/08:08:30:1c:0c/00:00:38:00:00/40 tag 1 ncq 4096 in
Feb 8 12:35:25 kernel res 41/40:00:35:1c:0c/00:00:38:00:00/40 Emask 0x409 (media error) <F>
Feb 8 12:35:25 kernel ata3.00: status: { DRDY ERR }
Feb 8 12:35:25 kernel ata3.00: error: { UNC }
Feb 8 12:35:25 kernel ata3.00: configured for UDMA/133
Feb 8 12:35:25 kernel ata3: EH complete
Feb 8 12:35:27 kernel ata3.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
Feb 8 12:35:27 kernel ata3.00: irq_stat 0x40000008
Feb 8 12:35:27 kernel ata3.00: failed command: READ FPDMA QUEUED
Feb 8 12:35:27 kernel ata3.00: cmd 60/08:00:30:1c:0c/00:00:38:00:00/40 tag 0 ncq 4096 in
Feb 8 12:35:27 kernel res 41/40:00:35:1c:0c/00:00:38:00:00/40 Emask 0x409 (media error) <F>
Feb 8 12:35:27 kernel ata3.00: status: { DRDY ERR }
Feb 8 12:35:27 kernel ata3.00: error: { UNC }
Feb 8 12:35:27 kernel ata3.00: configured for UDMA/133
Feb 8 12:35:27 kernel ata3: EH complete
Feb 8 12:35:29 kernel ata3.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
Feb 8 12:35:29 kernel ata3.00: irq_stat 0x40000008
Feb 8 12:35:29 kernel ata3.00: failed command: READ FPDMA QUEUED
Feb 8 12:35:29 kernel ata3.00: cmd 60/08:00:30:1c:0c/00:00:38:00:00/40 tag 0 ncq 4096 in
Feb 8 12:35:29 kernel res 41/40:00:35:1c:0c/00:00:38:00:00/40 Emask 0x409 (media error) <F>
Feb 8 12:35:29 kernel ata3.00: status: { DRDY ERR }
Feb 8 12:35:29 kernel ata3.00: error: { UNC }
Feb 8 12:35:29 kernel ata3.00: configured for UDMA/133
Feb 8 12:35:29 kernel ata3: EH complete
Feb 8 12:35:31 kernel ata3.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
Feb 8 12:35:31 kernel ata3.00: irq_stat 0x40000008
Feb 8 12:35:31 kernel ata3.00: failed command: READ FPDMA QUEUED
Feb 8 12:35:31 kernel ata3.00: cmd 60/08:00:30:1c:0c/00:00:38:00:00/40 tag 0 ncq 4096 in
Feb 8 12:35:31 kernel res 41/40:00:35:1c:0c/00:00:38:00:00/40 Emask 0x409 (media error) <F>
Feb 8 12:35:31 kernel ata3.00: status: { DRDY ERR }
Feb 8 12:35:31 kernel ata3.00: error: { UNC }
Feb 8 12:35:31 kernel ata3.00: configured for UDMA/133
Feb 8 12:35:31 kernel ata3: EH complete
Feb 8 12:35:33 kernel ata3.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
Feb 8 12:35:33 kernel ata3.00: irq_stat 0x40000008
Feb 8 12:35:33 kernel ata3.00: failed command: READ FPDMA QUEUED
Feb 8 12:35:33 kernel ata3.00: cmd 60/08:00:30:1c:0c/00:00:38:00:00/40 tag 0 ncq 4096 in
Feb 8 12:35:33 kernel res 41/40:00:35:1c:0c/00:00:38:00:00/40 Emask 0x409 (media error) <F>
Feb 8 12:35:33 kernel ata3.00: status: { DRDY ERR }
Feb 8 12:35:33 kernel ata3.00: error: { UNC }
Feb 8 12:35:33 kernel ata3.00: configured for UDMA/133
Feb 8 12:35:33 kernel ata3: EH complete
Feb 8 12:35:35 kernel ata3.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
Feb 8 12:35:35 kernel ata3.00: irq_stat 0x40000008
Feb 8 12:35:35 kernel ata3.00: failed command: READ FPDMA QUEUED
Feb 8 12:35:35 kernel ata3.00: cmd 60/08:00:30:1c:0c/00:00:38:00:00/40 tag 0 ncq 4096 in
Feb 8 12:35:35 kernel res 41/40:00:35:1c:0c/00:00:38:00:00/40 Emask 0x409 (media error) <F>
Feb 8 12:35:35 kernel ata3.00: status: { DRDY ERR }
Feb 8 12:35:35 kernel ata3.00: error: { UNC }
Feb 8 12:35:35 kernel ata3.00: configured for UDMA/133
Feb 8 12:35:35 kernel sd 2:0:0:0: [sda] Unhandled sense code
Feb 8 12:35:35 kernel sd 2:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Feb 8 12:35:35 kernel sd 2:0:0:0: [sda] Sense Key : Medium Error [current] [descriptor]
Feb 8 12:35:35 kernel Descriptor sense data with sense descriptors (in hex):
Feb 8 12:35:35 kernel 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Feb 8 12:35:35 kernel 38 0c 1c 35
Feb 8 12:35:35 kernel sd 2:0:0:0: [sda] Add. Sense: Unrecovered read error - auto reallocate failed
Feb 8 12:35:35 kernel sd 2:0:0:0: [sda] CDB: Read(10): 28 00 38 0c 1c 30 00 00 08 00
Feb 8 12:35:35 kernel end_request: I/O error, dev sda, sector 940317749
Feb 8 12:35:35 kernel ata3: EH complete
Feb 8 12:35:36 kernel ata3.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
Feb 8 12:35:36 kernel ata3.00: irq_stat 0x40000008
Feb 8 12:35:36 kernel ata3.00: failed command: READ FPDMA QUEUED
Feb 8 12:35:36 kernel ata3.00: cmd 60/08:00:30:1c:0c/00:00:38:00:00/40 tag 0 ncq 4096 in
Feb 8 12:35:36 kernel res 41/40:00:35:1c:0c/00:00:38:00:00/40 Emask 0x409 (media error) <F>
Feb 8 12:35:36 kernel ata3.00: status: { DRDY ERR }
Feb 8 12:35:36 kernel ata3.00: error: { UNC }
Feb 8 12:35:36 kernel ata3.00: configured for UDMA/133
Feb 8 12:35:36 kernel ata3: EH complete
Feb 8 12:37:19 kernel device-mapper: snapshots: Invalidating snapshot: Unable to allocate exception.
Feb 8 12:40:01 cron 27684 (root) CMD (test -x /usr/lib/atsar/atsa1 && /usr/lib/atsar/atsa1)
Feb 8 12:50:01 cron 27837 (root) CMD (test -x /usr/lib/atsar/atsa1 && /usr/lib/atsar/atsa1)

This is very brandnew server and fresh installed. It was running fine for couple months.
Do you think, I should consider re-installing system and restore vms from backups?

Thanks in advance.
 
It is brandnew 3 months old disk. Ok I think I should backup vms and re-install system and restore it. But I am wondering, if those errors based on vms disk image or harddisk itself.
 
I have found this like some other google searches states the same.

It's Kernel bug on ata acpi
Put "options libata noacpi=1" on /etc/modprobe.d/options
solved the problem

But I can not find /etc/modprobe.d/options there is no file options for proxmox. Do you think I can use above fix. But where must I put that option? And Can I put that for proxmox?

Thanks
 
I guess I am in trouble :(. I am trying to backup vm and restore to different node. But when I try to restore backup I am getting:

INFO: restore QemuServer backup 'vzdump-qemu-104-2011_02_08-22_00_07.tar' using ID 115
INFO: extracting 'qemu-server.conf' from archive
INFO: extracting 'vm-disk-ide0.raw' from archive
INFO: detected size difference for 'vm-disk-ide0.raw' (161061273593 != 161061273600)
INFO: tar: vm-disk-ide0.raw: Cannot write: Broken pipe
INFO: tar: 2365: Child returned status 25
INFO: tar: Skipping to next header
INFO: tar: Error exit delayed from previous errors
INFO: starting cleanup
ERROR: restore QemuServer backup 'vzdump-qemu-104-2011_02_08-22_00_07.tar' failed - command 'tar xf 'vzdump-qemu-104-2011_02_08-22_00_07.tar' '--to-command=/usr/sbin/qmrestore vzdump-qemu-104-2011_02_08-22_00_07.tar --extract 115'' failed with exit code 2


Also during backup I am see such errors:

Feb 8 12:35:23 kernel ata3.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x0
Feb 8 12:35:23 kernel ata3.00: irq_stat 0x40000008
Feb 8 12:35:23 kernel ata3.00: failed command: READ FPDMA QUEUED
Feb 8 12:35:23 kernel ata3.00: cmd 60/08:08:30:1c:0c/00:00:38:00:00/40 tag 1 ncq 4096 in
Feb 8 12:35:23 kernel res 41/40:00:35:1c:0c/00:00:38:00:00/40 Emask 0x409 (media error) <F>
Feb 8 12:35:23 kernel ata3.00: status: { DRDY ERR }
Feb 8 12:35:23 kernel ata3.00: error: { UNC }
Feb 8 12:35:23 kernel ata3.00: configured for UDMA/133
Feb 8 12:35:23 kernel sd 2:0:0:0: [sda] Unhandled sense code
Feb 8 12:35:23 kernel sd 2:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Feb 8 12:35:23 kernel sd 2:0:0:0: [sda] Sense Key : Medium Error [current] [descriptor]
Feb 8 12:35:23 kernel Descriptor sense data with sense descriptors (in hex):
Feb 8 12:35:23 kernel 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Feb 8 12:35:23 kernel 38 0c 1c 35

I am really stuck. That vm is running fine. But becaue of those kernel errors I wanted to get vm to different node. But it seems to be not possible cos fo these info errors.

Last thing I will try is do online migration. Do you guys have any idea ? Do you think I should try check disk or something? This vm has very important data and I must run it on a safe node.