Backup of VM failed (with exit code 5)

Discussion in 'Proxmox VE 1.x: Installation and configuration' started by gkovacs, Aug 15, 2010.

  1. gkovacs

    gkovacs Active Member

    Joined:
    Dec 22, 2008
    Messages:
    500
    Likes Received:
    43
    I get this error quite often, and have no idea what might cause it. Sometimes it gives that error for a couple of days and it goes away (backup working again), but sometimes the LVM snapshot gets stuck so bad only a reboot can solve it (df -h stops, doesn't return to shell).

    Here is the vzdump log:
    Code:
    Aug 13 06:49:34 INFO: Starting Backup of VM 112 (openvz)
    Aug 13 06:49:34 INFO: CTID 112 exist mounted running
    Aug 13 06:49:34 INFO: status = CTID 112 exist mounted running
    Aug 13 06:49:43 INFO: backup mode: snapshot
    Aug 13 06:49:43 INFO: bandwidth limit: 10240 KB/s
    Aug 13 06:49:43 INFO: trying to remove stale snapshot '/dev/pve/vzsnap-proxmox2-0'
    Aug 13 06:49:43 INFO: umount: /mnt/vzsnap0: device is busy
    Aug 13 06:49:43 INFO: umount: /mnt/vzsnap0: device is busy
    Aug 13 06:49:43 ERROR: command 'umount /mnt/vzsnap0' failed with exit code 1
    Aug 13 06:49:44 INFO:   Can't remove open logical volume "vzsnap-proxmox2-0"
    Aug 13 06:49:44 ERROR: command 'lvremove -f /dev/pve/vzsnap-proxmox2-0' failed with exit code 5
    Aug 13 06:49:44 INFO: creating lvm snapshot of /dev/mapper/pve-data ('/dev/pve/vzsnap-proxmox2-0')
    Aug 13 06:49:44 INFO:   Logical volume "vzsnap-proxmox2-0" already exists in volume group "pve"
    Aug 13 06:49:45 INFO:   Can't remove open logical volume "vzsnap-proxmox2-0"
    Aug 13 06:49:45 ERROR: command 'lvremove -f /dev/pve/vzsnap-proxmox2-0' failed with exit code 5
    Aug 13 06:49:45 ERROR: Backup of VM 112 failed - command 'lvcreate --size 1024M --snapshot --name vzsnap-proxmox2-0 /dev/pve/data' failed with exit code 5
    The system is an average PVE 1.5 installation, with a separate disk mounted for backup. Two folders on it were added via the web interface for daily and weekly backups. There is enough free space on both drives.

    Code:
    proxmox2:~# lvscan
    File descriptor 7 left open
      ACTIVE            '/dev/pve/swap' [8.00 GB] inherit
      ACTIVE            '/dev/pve/root' [20.00 GB] inherit
      ACTIVE            '/dev/pve/data' [116.55 GB] inherit
    
    proxmox2:~# pvscan
    File descriptor 7 left open
      PV /dev/sda2   VG pve   lvm2 [148.55 GB / 4.00 GB free]
      Total: 1 [148.55 GB] / in use: 1 [148.55 GB] / in no VG: 0 [0   ]
    
    proxmox2:~# mount
    /dev/mapper/pve-root on / type ext3 (rw,noatime,nodiratime,errors=remount-ro)
    tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
    proc on /proc type proc (rw,noexec,nosuid,nodev)
    sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
    procbususb on /proc/bus/usb type usbfs (rw)
    udev on /dev type tmpfs (rw,mode=0755)
    tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
    devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620)
    /dev/mapper/pve-data on /var/lib/vz type ext3 (rw,noatime,nodiratime)
    /dev/sda1 on /boot type ext3 (rw,noatime,nodiratime)
    /dev/sdb1 on /backup type ext3 (rw,noatime,nodiratime)
    Any ideas?
     
    #1 gkovacs, Aug 15, 2010
    Last edited: Aug 15, 2010
  2. dietmar

    dietmar Proxmox Staff Member
    Staff Member

    Joined:
    Apr 28, 2005
    Messages:
    16,400
    Likes Received:
    294
    How long does a vzdump backup run?
     
    Stop hovering to collapse... Click to collapse... Hover to expand... Click to expand...
  3. gkovacs

    gkovacs Active Member

    Joined:
    Dec 22, 2008
    Messages:
    500
    Likes Received:
    43
    I have several large OpenVZ VE's (around 50 GB), backup takes a couple of hours each.
    The job starts at midnight, finishes in the morning.
     
  4. dietmar

    dietmar Proxmox Staff Member
    Staff Member

    Joined:
    Apr 28, 2005
    Messages:
    16,400
    Likes Received:
    294
    But does the last backup successfully finish if you get above error?
     
    Stop hovering to collapse... Click to collapse... Hover to expand... Click to expand...
  5. dsat

    dsat New Member

    Joined:
    Aug 16, 2010
    Messages:
    2
    Likes Received:
    0
    Hi Dietmar,
    I have the same problem - the last backup didn't finished with "Finished Backup". The backup image was not stored on target system also.
     
  6. gkovacs

    gkovacs Active Member

    Joined:
    Dec 22, 2008
    Messages:
    500
    Likes Received:
    43
    If I recall correctly, it has happened this way: one backup was unsuccessful, while others were completed. In these cases there is a snapshot that won't get removed.

    Sometimes the problem disappears after a couple of days (snapshot disappears), but sometimes the system eventually gets to a state where no backup is ever finished due to the snapshot being stuck (and df -h won't exit back to bash). In these cases only a reboot helps.
     
  7. dietmar

    dietmar Proxmox Staff Member
    Staff Member

    Joined:
    Apr 28, 2005
    Messages:
    16,400
    Likes Received:
    294
    and what error do you get in that specific case?
     
    Stop hovering to collapse... Click to collapse... Hover to expand... Click to expand...
  8. dietmar

    dietmar Proxmox Staff Member
    Staff Member

    Joined:
    Apr 28, 2005
    Messages:
    16,400
    Likes Received:
    294
    Please can you post the backup log?
     
    Stop hovering to collapse... Click to collapse... Hover to expand... Click to expand...
  9. gkovacs

    gkovacs Active Member

    Joined:
    Dec 22, 2008
    Messages:
    500
    Likes Received:
    43
    The same error I posted above - unable to remove snapshot.
     
  10. dietmar

    dietmar Proxmox Staff Member
    Staff Member

    Joined:
    Apr 28, 2005
    Messages:
    16,400
    Likes Received:
    294
    Please can you try to manually remove that snapshot - maybe you get a more elaborate error message?
     
    Stop hovering to collapse... Click to collapse... Hover to expand... Click to expand...
  11. dsat

    dsat New Member

    Joined:
    Aug 16, 2010
    Messages:
    2
    Likes Received:
    0
    Hi Dietmar,

    here is the log:

    Code:
    Detailed backup logs:
    
    vzdump --quiet --node 5 --snapshot --storage BACKUPSERVER --mailto eMailAdress 121 122 123 124
    
    121: Aug 16 02:00:14 INFO: Starting Backup of VM 121 (openvz)
    ...
    121: Aug 16 02:16:56 INFO: Finished Backup of VM 121 (00:16:43)
    
    122: Aug 16 02:16:57 INFO: Starting Backup of VM 122 (openvz)
    ...
    122: Aug 16 04:34:12 INFO: Finished Backup of VM 122 (02:17:16)
    
    123: Aug 16 04:34:13 INFO: Starting Backup of VM 123 (openvz)
    123: Aug 16 04:34:13 INFO: CTID 123 exist mounted running
    123: Aug 16 04:34:13 INFO: status = CTID 123 exist mounted running
    123: Aug 16 04:34:13 INFO: backup mode: snapshot
    123: Aug 16 04:34:13 INFO: bandwidth limit: 10240 KB/s
    123: Aug 16 04:34:13 INFO: creating lvm snapshot of /dev/mapper/pve-data ('/dev/pve/vzsnap-VSERVER-0')
    123: Aug 16 04:34:14 INFO:   Logical volume "vzsnap-VSERVER-0" created
    123: Aug 16 04:34:17 INFO: creating archive '/mnt/pve/BACKUPSERVER/vzdump-openvz-123-2010_08_16-04_34_13.tar'
    123: Aug 16 07:36:34 INFO: Total bytes written: 33008609280 (31GiB, 3.0MiB/s)
    123: Aug 16 07:36:35 INFO: archive file size: 30.74GB
    123: Aug 16 07:36:35 INFO: delete old backup '/mnt/pve/BACKUPSERVER/vzdump-openvz-123-2010_08_09-03_59_34.tar'
    123: Aug 16 07:36:39 INFO: umount: /mnt/vzsnap0: device is busy
    123: Aug 16 07:36:39 INFO: umount: /mnt/vzsnap0: device is busy
    123: Aug 16 07:36:39 ERROR: command 'umount /mnt/vzsnap0' failed with exit code 1
    123: Aug 16 07:36:43 INFO:   Can't remove open logical volume "vzsnap-VSERVER-0"
    123: Aug 16 07:36:43 ERROR: command 'lvremove -f /dev/pve/vzsnap-VSERVER-0' failed with exit code 5
    123: Aug 16 07:36:43 INFO: Finished Backup of VM 123 (03:02:30)
    
    124: Aug 16 07:36:56 INFO: Starting Backup of VM 124 (openvz)
    124: Aug 16 07:36:58 INFO: CTID 124 exist mounted running
    124: Aug 16 07:36:58 INFO: status = CTID 124 exist mounted running
    124: Aug 16 07:36:59 INFO: backup mode: snapshot
    124: Aug 16 07:36:59 INFO: bandwidth limit: 10240 KB/s
    124: Aug 16 07:36:59 INFO: trying to remove stale snapshot '/dev/pve/vzsnap-VSERVER-0'
    124: Aug 16 07:36:59 INFO: umount: /mnt/vzsnap0: device is busy
    124: Aug 16 07:36:59 INFO: umount: /mnt/vzsnap0: device is busy
    124: Aug 16 07:36:59 ERROR: command 'umount /mnt/vzsnap0' failed with exit code 1
    124: Aug 16 07:36:59 INFO:   Can't remove open logical volume "vzsnap-VSERVER-0"
    124: Aug 16 07:36:59 ERROR: command 'lvremove -f /dev/pve/vzsnap-VSERVER-0' failed with exit code 5
    124: Aug 16 07:36:59 INFO: creating lvm snapshot of /dev/mapper/pve-data ('/dev/pve/vzsnap-VSERVER-0')
    124: Aug 16 07:37:00 INFO:   Logical volume "vzsnap-VSERVER-0" already exists in volume group "pve"
    124: Aug 16 07:37:01 INFO:   Can't remove open logical volume "vzsnap-VSERVER-0"
    124: Aug 16 07:37:01 ERROR: command 'lvremove -f /dev/pve/vzsnap-VSERVER-0' failed with exit code 5
    124: Aug 16 07:37:01 ERROR: Backup of VM 124 failed - command 'lvcreate --size 1024M --snapshot --name vzsnap-VSERVER-0 /dev/pve/data' failed with exit code 5
    A other backup from another server was ok today.
     
  12. dietmar

    dietmar Proxmox Staff Member
    Staff Member

    Joined:
    Apr 28, 2005
    Messages:
    16,400
    Likes Received:
    294
    seems something is using the snapshot - we already had that bug.

    maybe mlocate generates and index with /usr/bin/updatedb.mlocate ?

    (started from /etc/cron.daily/mlocate)

    Please try to configure /etc/updatedb.conf to exlude /mnt/
     
    Stop hovering to collapse... Click to collapse... Hover to expand... Click to expand...
  13. gkovacs

    gkovacs Active Member

    Joined:
    Dec 22, 2008
    Messages:
    500
    Likes Received:
    43
    My /etc/updatedb.conf doesn't seem to include /mnt:

    Code:
    PRUNE_BIND_MOUNTS="yes"
    # PRUNENAMES=".git .bzr .hg .svn"
    PRUNEPATHS="/tmp /var/spool /media"
    PRUNEFS="NFS nfs nfs4 rpc_pipefs afs binfmt_misc proc smbfs autofs iso9660 ncpfs coda devpts ftpfs devfs mfs shfs sysfs cifs lustre_lite tmpfs usbfs udf"
    Any other ideas? It seems to only happen after 1-2 weeks of the server running fine.
     
  14. dietmar

    dietmar Proxmox Staff Member
    Staff Member

    Joined:
    Apr 28, 2005
    Messages:
    16,400
    Likes Received:
    294
    Maybe you can test with 'lsof' what process access those files?
     
    Stop hovering to collapse... Click to collapse... Hover to expand... Click to expand...
  15. JimBeam

    JimBeam Member

    Joined:
    Aug 26, 2010
    Messages:
    74
    Likes Received:
    0
    Hi,

    man updatedb.conf suggests you can use PRUNEPATHS to EXCLUDE certain directories, so you might add /mnt to that line:

    PRUNEPATHS
    A whitespace-separated list of path names of directories which
    should not be scanned by updatedb(8).
     
  16. dusti

    dusti New Member

    Joined:
    May 12, 2010
    Messages:
    2
    Likes Received:
    0
    To contribute to this:
    we had the same problem, /mnt was also not in the exclude list of updatedb.conf
    After reading this thread I added it and now all goes well.

    /etc/updatedb.conf (witch changes - see #3th line )
    PRUNE_BIND_MOUNTS="yes"
    # PRUNENAMES=".git .bzr .hg .svn"
    PRUNEPATHS="/tmp /var/spool /media /mnt"
    PRUNEFS="NFS nfs nfs4 rpc_pipefs afs binfmt_misc proc smbfs autofs iso9660 ncpfs coda devpts ftpfs devfs mfs shfs sysfs cifs lustre_lite tmpfs usbfs udf"

    /var/log/vzdump/openvz-1001.log BEFORE changing updatedb

    Aug 26 16:00:06 INFO: Starting Backup of VM 1001 (openvz)
    Aug 26 16:00:07 INFO: CTID 1001 exist mounted running
    Aug 26 16:00:07 INFO: status = CTID 1001 exist mounted running
    Aug 26 16:00:09 INFO: backup mode: snapshot
    Aug 26 16:00:09 INFO: bandwidth limit: 10240 KB/s
    Aug 26 16:00:09 INFO: creating lvm snapshot of /dev/mapper/pve-data ('/dev/pve/vzsnap-openvz2-0')
    Aug 26 16:00:12 INFO: Logical volume "vzsnap-openvz2-0" created
    Aug 26 16:00:14 INFO: creating archive '/mnt/backup/vzdump-openvz-1001-2010_08_26-16_00_06.tgz'
    Aug 26 16:06:24 INFO: Total bytes written: 862095360 (823MiB, 3.1MiB/s)
    Aug 26 16:06:24 INFO: archive file size: 366MB
    Aug 26 16:06:24 INFO: delete old backup '/mnt/backup/vzdump-openvz-1001-2010_08_26-05_00_02.tgz'
    Aug 26 16:06:25 INFO: umount: /mnt/vzsnap0: device is busy
    Aug 26 16:06:25 INFO: umount: /mnt/vzsnap0: device is busy
    Aug 26 16:06:25 ERROR: command 'umount /mnt/vzsnap0' failed with exit code 1
    Aug 26 16:06:27 INFO: Can't remove open logical volume "vzsnap-openvz2-0"
    Aug 26 16:06:27 ERROR: command 'lvremove -f /dev/pve/vzsnap-openvz2-0' failed with exit code 5
    Aug 26 16:06:27 INFO: Finished Backup of VM 1001 (00:06:21)

    /var/log/vzdump/openvz-1001.log AFTER changing updatedb

    Aug 26 16:20:08 INFO: Starting Backup of VM 1001 (openvz)
    Aug 26 16:20:08 INFO: CTID 1001 exist mounted running
    Aug 26 16:20:08 INFO: status = CTID 1001 exist mounted running
    Aug 26 16:20:10 INFO: backup mode: snapshot
    Aug 26 16:20:10 INFO: bandwidth limit: 10240 KB/s
    Aug 26 16:20:10 INFO: creating lvm snapshot of /dev/mapper/pve-data ('/dev/pve/vzsnap-openvz2-0')
    Aug 26 16:20:12 INFO: Logical volume "vzsnap-openvz2-0" created
    Aug 26 16:20:14 INFO: creating archive '/mnt/backup/vzdump-openvz-1001-2010_08_26-16_20_08.tgz'
    Aug 26 16:25:47 INFO: Total bytes written: 862105600 (823MiB, 3.3MiB/s)
    Aug 26 16:25:47 INFO: archive file size: 366MB
    Aug 26 16:25:47 INFO: delete old backup '/mnt/backup/vzdump-openvz-1001-2010_08_26-16_00_06.tgz'
    Aug 26 16:25:50 INFO: Logical volume "vzsnap-openvz2-0" successfully removed
    Aug 26 16:25:50 INFO: Finished Backup of VM 1001 (00:05:42)

     
  17. gkovacs

    gkovacs Active Member

    Joined:
    Dec 22, 2008
    Messages:
    500
    Likes Received:
    43
    I have added /mnt to the PRUNEPATHS line as well.
    But only time will tell (a couple of weeks, at least) if this problem pops up its ugly head again.

    Thanks for your input.
     
  18. tcindia123

    tcindia123 New Member

    Joined:
    Aug 28, 2010
    Messages:
    4
    Likes Received:
    0
    yes try to remove the snapshot manually
     
    Stop hovering to collapse... Click to collapse... Hover to expand... Click to expand...
  19. plewka

    plewka Member

    Joined:
    Sep 28, 2009
    Messages:
    49
    Likes Received:
    1
    I have had issues with backup too. In past I was backing up two nodes to an enterprise NAS and
    have had backup failures like the above, too. NAS is an ugly SMB share.
    Sometimes a Snapshot was left off, too. Sometimes targz worked forever.

    Probably it's not a good idea to use the NAS für storing the *.dat file for *.tgz.
    I recently noticed there is a parameter for the dump which enables to use local storage
    for the *.dat before compression takes place. Will have a look at that possibility.

    Now I backup'ed the machines to the other node and vise versa. It's much faster an
    easier but again there were failures where the targz worked much longer than
    usual. At least I _thought_ these targz were stalled and usually killed them an skipped
    the individual VM.

    Now, at weekend, they first time used the NFS4 interconnect and all but one VMs took two minutes
    per GB. One machine took 13 hours for 9GB !?
    Maybe I killed them too early last times. But what could be the reason for such extended
    times? Where to look for?

    Many thanks in advance
     
  20. langloispy

    langloispy Member

    Joined:
    Jun 16, 2010
    Messages:
    57
    Likes Received:
    1
    I have the same problem. The tried to backup a VM but their was no space left on the specific storage. The vzdump continue to run but nothing happens... All other scheduled backup are halted waiting the first vzdump to be completed (which never happen)...

    After reading this thread, I have "killall vzdump" to allow the other backup to work. But The backup script is not able to "umount /mnt/vsnap0", and cannot "lvremove
    /dev/pve/vzsnap-pve02-fl-0"...

    Code:
    vzdump --quiet --snapshot --storage remotebackupsshfs 101
    
    101: Dec 27 02:00:02 INFO: Starting Backup of VM 101 (qemu)
    101: Dec 27 02:00:02 INFO: running
    101: Dec 27 02:00:02 INFO: status = running
    101: Dec 27 02:00:21 INFO: backup mode: snapshot
    101: Dec 27 02:00:21 INFO: ionice priority: 7
    101: Dec 27 02:00:21 INFO: trying to remove stale snapshot '/dev/pve/vzsnap-pve02-fl-0'
    101: Dec 27 02:00:21 INFO: umount: /mnt/vzsnap0: device is busy
    101: Dec 27 02:00:21 INFO: umount: /mnt/vzsnap0: device is busy
    101: [COLOR=red]Dec 27 02:00:21 ERROR: command 'umount /mnt/vzsnap0' failed with exit code 1
    [/COLOR]101: Dec 27 02:00:21 INFO:   Can't remove open logical volume "vzsnap-pve02-fl-0"
    101: [COLOR=red]Dec 27 02:00:21 ERROR: command 'lvremove -f '/dev/pve/vzsnap-pve02-fl-0'' failed with exit code 5
    [/COLOR]101: Dec 27 02:00:21 INFO:   Logical volume "vzsnap-pve02-fl-0" already exists in volume group "pve"
    101: Dec 27 02:00:22 INFO:   Can't remove open logical volume "vzsnap-pve02-fl-0"
    101: [COLOR=red]Dec 27 02:00:22 ERROR: command 'lvremove -f '/dev/pve/vzsnap-pve02-fl-0'' failed with exit code 5
    [/COLOR]101: [COLOR=red]Dec 27 02:00:22 ERROR: Backup of VM 101 failed - command 'lvcreate --size 1024M --snapshot --name 'vzsnap-pve02-fl-0' '/dev/pve/data'' failed with exit code 5
    [/COLOR]


    I tried to manually issue the umount and the lvremove command but it does not worked. For now the only way to get rid of this problem is to reboot the host :( Is their a cleaner way to cancel a running vzdump?

    Thanks!
     
  1. This site uses cookies to help personalise content, tailor your experience and to keep you logged in if you register.
    By continuing to use this site, you are consenting to our use of cookies.
    Dismiss Notice