LVM snapshot problems during backup (umount / lvremove fails)

gkovacs

Renowned Member
Dec 22, 2008
Budapest, Hungary
This is an up-to-date Proxmox VE 3.4 server using an Adaptec hardware RAID controller with LVM and ext4. It hosts about 20 containers and 2-3 VMs.

During the nightly backups, the first couple of containers back up without error. Then one container (always the same one) starts producing these errors:

Code:
Jun 01 03:50:53 INFO: Starting Backup of VM 215 (openvz)
Jun 01 03:50:53 INFO: CTID 215 exist mounted running
Jun 01 03:50:53 INFO: status = running
Jun 01 03:50:53 INFO: backup mode: snapshot
Jun 01 03:50:53 INFO: bandwidth limit: 131072 KB/s
Jun 01 03:50:53 INFO: ionice priority: 7
Jun 01 03:50:53 INFO: creating lvm snapshot of /dev/mapper/pve-data ('/dev/pve/vzsnap-proxmox2-0')
Jun 01 03:50:54 INFO:   Logical volume "vzsnap-proxmox2-0" created
Jun 01 03:50:55 INFO: creating archive '/mnt/pve/Backups-Weekly/dump/vzdump-openvz-215-2015_06_01-03_50_53.tar.lzo'
Jun 01 06:44:04 INFO: Total bytes written: 68104570880 (64GiB, 6.3MiB/s)
Jun 01 06:44:04 INFO: archive file size: 53.66GB
Jun 01 06:44:04 INFO: delete old backup '/mnt/pve/Backups-Weekly/dump/vzdump-openvz-215-2015_04_20-01_51_51.tar.lzo'
Jun 01 06:44:07 INFO: umount: /mnt/vzsnap0: device is busy.
Jun 01 06:44:07 INFO:         (In some cases useful info about processes that use
Jun 01 06:44:07 INFO:          the device is found by lsof(8) or fuser(1))
Jun 01 06:44:07 ERROR: command 'umount /mnt/vzsnap0' failed: exit code 1
Jun 01 06:44:16 INFO: lvremove failed - trying again in 8 seconds
Jun 01 06:44:24 INFO: lvremove failed - trying again in 16 seconds
Jun 01 06:44:40 INFO: lvremove failed - trying again in 32 seconds
Jun 01 06:45:12 ERROR: command 'lvremove -f /dev/pve/vzsnap-proxmox2-0' failed: exit code 5
Jun 01 06:45:13 INFO: Finished Backup of VM 215 (02:54:20)

This is the last container that gets backed up successfully: its snapshot can't be released, and every container backup after it fails with this error:

Code:
Jun 01 06:53:50 INFO: Starting Backup of VM 237 (openvz)
Jun 01 06:53:50 INFO: CTID 237 exist mounted running
Jun 01 06:53:50 INFO: status = running
Jun 01 06:53:50 INFO: backup mode: snapshot
Jun 01 06:53:50 INFO: bandwidth limit: 131072 KB/s
Jun 01 06:53:50 INFO: ionice priority: 7
Jun 01 06:53:50 INFO: trying to remove stale snapshot '/dev/pve/vzsnap-proxmox2-0'
Jun 01 06:53:50 INFO: umount: /mnt/vzsnap0: device is busy.
Jun 01 06:53:50 INFO:         (In some cases useful info about processes that use
Jun 01 06:53:50 INFO:          the device is found by lsof(8) or fuser(1))
Jun 01 06:53:50 ERROR: command 'umount /mnt/vzsnap0' failed: exit code 1
Jun 01 06:53:50 INFO:   Logical volume pve/vzsnap-proxmox2-0 contains a filesystem in use.
Jun 01 06:53:50 ERROR: command 'lvremove -f /dev/pve/vzsnap-proxmox2-0' failed: exit code 5
Jun 01 06:53:50 INFO: creating lvm snapshot of /dev/mapper/pve-data ('/dev/pve/vzsnap-proxmox2-0')
Jun 01 06:53:50 INFO:   Logical volume "vzsnap-proxmox2-0" already exists in volume group "pve"
Jun 01 06:53:57 INFO: lvremove failed - trying again in 8 seconds
Jun 01 06:54:05 INFO: lvremove failed - trying again in 16 seconds
Jun 01 06:54:22 INFO: lvremove failed - trying again in 32 seconds
Jun 01 06:54:54 ERROR: command 'lvremove -f /dev/pve/vzsnap-proxmox2-0' failed: exit code 5
Jun 01 06:54:54 ERROR: Backup of VM 237 failed - command 'lvcreate --size 12288M --snapshot --name vzsnap-proxmox2-0 /dev/pve/data' failed: exit code 5

In the morning I check with lvs: the snapshot is usually around 75% used (it has never filled up), and I can manually umount and lvremove it, but the next night the same thing happens.
I tried migrating away the container in question, but then the next one in line fails with the same error.
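For reference, the manual cleanup I do each morning looks roughly like this (paths taken from the log above; the fuser/lsof calls are just to see what is still holding the mount before removing anything):

```shell
#!/bin/sh
# Sketch of the manual stale-snapshot cleanup (paths from the vzdump log above).
SNAP_MNT=/mnt/vzsnap0
SNAP_LV=/dev/pve/vzsnap-proxmox2-0

if grep -qs " $SNAP_MNT " /proc/mounts; then
    # Show which processes still hold the snapshot mount open
    fuser -vm "$SNAP_MNT"
    lsof "$SNAP_MNT" 2>/dev/null
    # Unmount first, then drop the snapshot LV
    umount "$SNAP_MNT" && lvremove -f "$SNAP_LV"
else
    echo "nothing mounted at $SNAP_MNT"
fi
```

By morning nothing holds the mount anymore, so this always succeeds; the question is what keeps it busy at 06:44 when vzdump tries the same umount.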

Any idea what to do next?
 
Forgot to add: lvs prints 'file descriptor leaked' warnings on the same host; not sure if that's related:

Code:
root@proxmox2:/etc# lvs
File descriptor 7 (pipe:[156027980]) leaked on lvs invocation. Parent PID 541254: bash
  LV   VG   Attr      LSize  Pool Origin Data%  Move Log Copy%  Convert
  data pve  -wi-ao---  1.30t
  root pve  -wi-ao--- 32.00g
  swap pve  -wi-ao--- 16.00g
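As far as I can tell this warning only means the parent process (bash, PID 541254 here) left an extra file descriptor open across the lvs invocation, and lvm(8) documents an environment variable to silence it, so it may well be unrelated to the snapshot problem:

```shell
# The 'leaked' warning means a parent process passed an extra open fd to lvs.
# lvm(8) documents this variable for suppressing it:
export LVM_SUPPRESS_FD_WARNINGS=1
# then run lvs as usual (guarded here in case lvm isn't installed):
command -v lvs >/dev/null && lvs || echo "lvs not installed"
```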
 
