LXC snapshot backup suspends the container and won't finish

Quasar90

Member
Nov 24, 2021
Hello,

We have a PVE cluster running version 8.2.7. We are trying to back up our LXC containers, but every time the backup hangs and never finishes. I noticed in the task log that the backup job suspends the container for the snapshot. What surprises me is this: why is a suspend needed for a snapshot?

Bash:
INFO: starting new backup job: vzdump 1145 --notification-mode auto --storage PBS-TEST-LXC --remove 0 --notes-template '{{cluster}}, {{guestname}}, {{node}}, {{vmid}}'  --node prod-01 --mode snapshot
INFO: Starting Backup of VM 1145 (lxc)
INFO: Backup started at 2024-09-30 11:35:58
INFO: status = running
INFO: CT Name: ldpgs
INFO: including mount point rootfs ('/') in backup
INFO: including mount point mp1 ('/home/tdas') in backup
INFO: including mount point mp2 ('/mnt/ldpgs/refdata') in backup
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: suspend vm to make snapshot
INFO: create storage snapshot 'vzdump'

We also have a slightly older PVE running version 8.2.5, and there the snapshot backup of an LXC container runs normally and doesn't suspend the container.

ZFS is used as storage in both cases.
 
Hi,

I have had the same issue twice with PVE 8.2.7. The difference is that my log stops at "suspend vm":
Code:
INFO: starting new backup job: vzdump 105 106 202 --notes-template '{{guestname}}' --fleecing '1,storage=local-lvm' --quiet 1 --storage pbs-xx-xxx --mailto xx@cxx --mode snapshot --mailnotification failure
INFO: Starting Backup of VM 105 (lxc)
INFO: Backup started at 2024-11-20 21:00:01
INFO: status = running
INFO: CT Name: mx1
INFO: including mount point rootfs ('/') in backup
INFO: including mount point mp0 ('/var/vmail') in backup
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: suspend vm to make snapshot

The first time, I tried unsuccessfully to recover the container from the suspended state, and even after a force stop it would not start again because a cgroup device was busy:
Code:
- Executing script 
lxc-start 105 20241106162338.437 DEBUG    utils - ../src/lxc/utils.c:run_buffer:560 - Script exec /usr/share/lxc/hooks/lxc-pve-prestart-hook 105 lxc pre-start produced output: failed to remove directory '/sys/fs/cgroup/hugetlb/lxc/105/ns': Device or resource busy

I tried to unmount it and to delete the file, as suggested in some posts, but nothing worked; the only solution was to reboot the host.

The latest occurrence was just an hour ago. I tried to resume the container with lxc-unfreeze, but it didn't work, so I took the opportunity to upgrade the host to PVE 8.2.9 and rebooted it.
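
For anyone hitting the same hang, here is a rough sketch of how the state could be checked before resorting to a reboot. It is not a fix. The helper names ct_state and check_ct are made up for illustration, the parsing assumes lxc-info prints a line of the form "State: FROZEN", and the snapshot name 'vzdump' is taken from the task log in the first post. If the freezer itself is stuck, lxc-unfreeze may hang just as described above.

```shell
#!/bin/sh
# Sketch only: inspect a container that a vzdump snapshot backup left suspended.
# ct_state and check_ct are hypothetical helper names, not part of PVE or LXC.

# Read `lxc-info -n <ctid> -s` output on stdin and print the state word
# (e.g. RUNNING, FROZEN, STOPPED). Assumes a line of the form "State:   FROZEN".
ct_state() {
    awk '/^State:/ { print $2 }'
}

# Report the container state, try to thaw it if it is frozen, and list any
# leftover ZFS snapshot named 'vzdump'. Only meaningful on the PVE host itself.
check_ct() {
    ctid="$1"
    state=$(lxc-info -n "$ctid" -s | ct_state)
    echo "CT $ctid state: $state"

    if [ "$state" = "FROZEN" ]; then
        # Same tool as mentioned in the post; may block if the freezer is stuck.
        lxc-unfreeze -n "$ctid"
    fi

    # A snapshot left behind by an interrupted backup shows up as <dataset>@vzdump.
    zfs list -H -t snapshot -o name | grep '@vzdump$' \
        || echo "no leftover vzdump snapshot found"
}
```

Usage on the host would be something like `check_ct 105`, with 105 being the container ID from the log above.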

I hope someone has a clue about this.

Regards
 
