Backup failed on VM with error "backup_complete_cb -5"

chojin

Nov 21, 2014
Hello,

First of all, I am a newbie with Proxmox (and VMs in general), taking over from someone who handed me the baby and left the country (literally).

I tried to understand how it all worked, and think I have the layout figured out.

We run Proxmox 2.3.13 on a cluster of 2 servers; all VMs are on Server1. The server uptime is over 600 days (!)

We have 2 backup jobs:

- one doing a daily lzo snapshot at midnight of the essential VMs, retaining only 2 backups
- one doing a weekly gzip snapshot of the others, on Sunday at noon.

The backups are stored on an nfs mount with 2TB in use and 1.4TB free.

For the past 4 days, our Active Directory VM couldn't be backed up. I checked the qemu log file and I see this:

Nov 21 00:00:02 INFO: Starting Backup of VM 100 (qemu)
Nov 21 00:00:02 INFO: status = running
Nov 21 00:00:02 INFO: backup mode: snapshot
Nov 21 00:00:02 INFO: bandwidth limit: 1000000 KB/s
Nov 21 00:00:02 INFO: ionice priority: 7
Nov 21 00:00:02 INFO: skip unused drive 'local:100/vm-100-disk-4.raw' (not included into backup)
Nov 21 00:00:02 INFO: creating archive '/mnt/pve/backup/dump/vzdump-qemu-100-2014_11_21-00_00_02.vma.lzo'
Nov 21 00:00:02 INFO: started backup task '976dd1ea-74e2-4b39-8dde-f475dc663196'
Nov 21 00:00:05 INFO: status: 0% (33751040/1039382085632), sparse 0% (10899456), duration 3, 11/7 MB/s
Nov 21 00:00:13 INFO: status: 0% (33882112/1039382085632), sparse 0% (10899456), duration 11, 0/0 MB/s
Nov 21 00:00:13 ERROR: backup_complete_cb -5
Nov 21 00:00:13 INFO: aborting backup job
Nov 21 00:00:14 ERROR: Backup of VM 100 failed - backup_complete_cb -5
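
A note on the number itself: the -5 at the end of the error looks like a negated errno value, which is how QEMU callbacks commonly report failures. That reading is an assumption; the log itself does not say so. Under that assumption, a minimal Python sketch to decode it (the helper name decode_backup_error is made up for illustration):

```python
import errno
import os
import re

# Error line copied from the vzdump log above.
LOG_LINE = "Nov 21 00:00:13 ERROR: backup_complete_cb -5"


def decode_backup_error(line):
    """Return (code, errno name, message) for a 'backup_complete_cb -N' line, or None."""
    match = re.search(r"backup_complete_cb (-\d+)", line)
    if match is None:
        return None
    code = int(match.group(1))
    # Assumption: the value is a negated errno (not confirmed by the log).
    name = errno.errorcode.get(-code, "UNKNOWN")
    return code, name, os.strerror(-code)


print(decode_backup_error(LOG_LINE))
```

Errno 5 is EIO, a generic input/output error. If the assumption holds, that would point at a failed read from the source image or a failed write to the backup target rather than at a problem with the backup job settings.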

I am a newbie, and this VM is the DC and holds the network shares, so it is in use most of the day. I don't know how to investigate without interrupting or breaking anything.

This error only occurs on this particular VM, and only for the past 4 days. All other VMs back up fine.

I looked around on the forums, on Google and so on, but I can't seem to find anything that clears this up for me.

Can anyone help me out? I'm sorry if I sound like a beginner, which I am in this particular field, but I'd really like your input.

Thanks in advance.
 
Hello,

It looks like there is an issue with the vm-100-disk-4.raw file which is preventing the backup from completing.
Does it exist on your local storage?
Is it included in your backup job?
 
Thank you for your reply.

I see the disk listed as "unused disk 0": local:100/vm-100-disk-4.raw

The "no backup" option is not ticked.

I don't really know why it is there. I see it in Windows as a small unallocated partition of Disk 2. But the log says that it was skipped anyway, so maybe that is not the problem; I'm not sure.
 
If it was there before, then it is surely not the problem.
Can you try ticking the "no backup" box on this particular (unused) disk and then start a manual backup?
 
I can do that, but being overly paranoid, I just need to be sure, will it maintain 100% functionality of the VM while the backup is in progress?

It's probably an obvious question, but I'm really beginning with live snapshots and all that, and I can't take any chance to disrupt people's work.

By the way the VM is huge too; the network share being stored in here is stupid. I'm having a NAS replace this nonsense, but in the meantime, the VM data is colossal. Can I run it and let it be for the couple hours it's supposed to take?

And what is the best way to initiate a manual backup that does not stop the VM?
 
"I can do that, but being overly paranoid, I just need to be sure, will it maintain 100% functionality of the VM while the backup is in progress?"

Yes. Performance will be degraded during the backup, which is normal, but you can stop it if you want. In any case, try to test it at the end of the day.



"By the way the VM is huge too; the network share being stored in here is stupid. I'm having a NAS replace this nonsense, but in the meantime, the VM data is colossal. Can I run it and let it be for the couple hours it's supposed to take?"

You can try adding an external HDD as an additional backup device on the node where the VMs are running. I think this will be much better (for your test). From your previous message I understand that this error happens immediately after the backup job is initiated. So if the problem is the "unused disk", the backup will simply continue until it is finished (you can stop the job if you wish, too). If the problem is not the "unused disk", it will stop immediately, so there won't be any pressure on your VM.

"And what is the best way to initiate a manual backup that does not stop the VM?"

It depends on the storage type and disk format of your VM. I assume you use local storage with a .raw VM disk?
Then simply select the VM in the left pane and choose "Backup" in the upper right. Select "Backup now", then set the storage location (e.g. an external disk or your NAS), mode=snapshot, compression=lzo.
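
The same manual backup can also be started from a shell on the node with vzdump. A hedged sketch of how the command line could be assembled in Python: the storage ID "backup" is only a guess based on the /mnt/pve/backup path in the log, the helper name build_vzdump_command is made up, and the option spelling should be checked against `man vzdump` on your version. The script only prints the command; it does not run it.

```python
import shlex


def build_vzdump_command(vmid, storage, mode="snapshot", compress="lzo"):
    """Build the argument list for a one-off vzdump run of a single VM."""
    return [
        "vzdump", str(vmid),
        "--mode", mode,          # snapshot mode keeps the VM running
        "--compress", compress,  # same compression as the nightly job
        "--storage", storage,    # assumption: storage ID as defined in Proxmox
    ]


# VM 100 is the failing VM from the log; "backup" is an assumed storage ID.
cmd = build_vzdump_command(100, "backup")
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Pointing the storage argument at a different target than the usual NFS share (for example the external disk) would also show whether the failure follows the VM or the backup destination.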

You said that you have a two-node cluster but the VMs are running on the first node. What is the role of the second one? Is it just waiting for the first one to die so that you can restore the backups there? (That will take too long.) If this is the case, wouldn't it be better to have some kind of shared storage to host your VMs so you can utilize both nodes (like DRBD or a SAN)?
 
Hello,

Sorry for my late reply, and thank you very much for your help.

I ticked the "no backup" box on the "unused disk" earlier tonight.

It activated the disk (I did not expect that), adding the "no backup" option to it.

I let the midnight backup job do its thing, but it did not change anything.

Here is the log:

Nov 26 00:00:02 INFO: Starting Backup of VM 100 (qemu)
Nov 26 00:00:02 INFO: status = running
Nov 26 00:00:02 INFO: exclude disk 'virtio0' (backup=no)
Nov 26 00:00:02 INFO: backup mode: snapshot
Nov 26 00:00:02 INFO: bandwidth limit: 1000000 KB/s
Nov 26 00:00:02 INFO: ionice priority: 7
Nov 26 00:00:02 INFO: creating archive '/mnt/pve/backup/dump/vzdump-qemu-100-2014_11_26-00_00_02.vma.lzo'
Nov 26 00:00:02 INFO: started backup task 'a0ccac60-585e-46e0-b4b9-30a9f92bb40c'
Nov 26 00:00:05 INFO: status: 0% (33882112/1039382085632), sparse 0% (9957376), duration 3, 11/7 MB/s
Nov 26 00:00:13 INFO: status: 0% (33947648/1039382085632), sparse 0% (9957376), duration 11, 0/0 MB/s
Nov 26 00:00:13 ERROR: backup_complete_cb -5
Nov 26 00:00:13 INFO: aborting backup job
Nov 26 00:00:14 ERROR: Backup of VM 100 failed - backup_complete_cb -5
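
Comparing this log with the one from Nov 21, both runs stop making progress at almost the same byte count. A small Python sketch that pulls the last status line out of each log and compares them (the status lines are copied from the two logs; the helper name last_progress is made up, and treating the counter as a position on the disk image is an assumption):

```python
import re

# Matches the progress part of a vzdump status line.
STATUS_RE = re.compile(
    r"status: (\d+)% \((\d+)/(\d+)\), sparse \d+% \((\d+)\), duration (\d+)"
)

NOV_21 = "Nov 21 00:00:13 INFO: status: 0% (33882112/1039382085632), sparse 0% (10899456), duration 11, 0/0 MB/s"
NOV_26 = "Nov 26 00:00:13 INFO: status: 0% (33947648/1039382085632), sparse 0% (9957376), duration 11, 0/0 MB/s"


def last_progress(log_text):
    """Return (bytes_processed, total_bytes) from the last status line, or None."""
    matches = STATUS_RE.findall(log_text)
    if not matches:
        return None
    _percent, done, total, _sparse, _duration = matches[-1]
    return int(done), int(total)


done_21, total = last_progress(NOV_21)
done_26, _ = last_progress(NOV_26)
print("Nov 21 stopped at", done_21, "bytes")
print("Nov 26 stopped at", done_26, "bytes")
print("difference:", done_26 - done_21, "bytes")
print("total size: %.0f GiB" % (total / 2**30))
```

The two runs stop 65536 bytes (64 KiB) apart, roughly 32 MiB into a 968 GiB image. That would be consistent with a read problem at a fixed spot near the start of one of the disks, although it does not prove it; a problem on the backup target cannot be ruled out from these lines alone.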

About the shared storage: there is an NFS share visible on both nodes that stores the backups, but you are right, the images are on the first node's local storage. This is not the best way to work. I'll have to sort this out after I figure out this backup issue.

Do you have another idea? And by the way, can I safely "remove" the former "unused disk" from the VM, without breaking anything?

The disk was apparently added as virtio0, but the VM does not see it yet. I suppose it would after a reboot, but I don't know what it was there for in the first place, so I just do not want to mess anything up (and I am afraid to reboot the VM now that this disk is enabled).

Thanks again for your precious insight, I really appreciate it.
 
