LXC container backup job in suspend mode started failing last night

Hello all,

Last night, when my automated backup jobs ran, the job for one of my LXC containers failed. Earlier that day I had updated all my VMs, containers, and Proxmox hosts and rebooted the hosts (which also rebooted the VMs and containers), but I made no configuration changes to this container or to its Proxmox host. All of my VMs still back up successfully; only this container is failing, and it is the only container running on this node.

The logs don't provide much information: the rsync command fails with exit code 23. Comparing the most recent successful log with the failed ones, the first sync completes but the final sync never finishes. Snapshot mode has apparently never worked for this container (the storage doesn't support snapshots, so vzdump falls back to suspend), but suspend mode worked until now. I have provided the logs below.
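For reference, exit code 23 is rsync's generic "partial transfer due to error". Rerunning the logged command by hand with verbose output should show which paths trip it; the PID and temp directory below are copied from the failure log and change on every run:

Code:
# Same command vzdump ran, plus -v/--itemize-changes so the files that
# cause the partial-transfer error get printed. /proc/<pid>/root and the
# temp dir are taken from the failure log and will differ on each run.
rsync --stats -h -X -A --numeric-ids -aH --delete --no-whole-file --inplace \
    --one-file-system --relative -v --itemize-changes \
    '--exclude=/tmp/?*' '--exclude=/var/tmp/?*' '--exclude=/var/run/?*.pid' \
    /proc/3893/root//./ /mnt/vzdump_tmp/vzdumptmp2585127_300/
echo "rsync exit code: $?"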

I have tried the following:
  • Manually ran the backup from the UI in suspend mode, both to a local storage volume on that Proxmox host and to the NFS share the backups normally go to; both fail.
  • Rebooted the container and reran the backup job; it still fails.
  • Copied the rsync command from the error message and ran it from an SSH session on the Proxmox host; it completes without error.
  • Confirmed that VMs on this node can be backed up to both the local and the NFS backup storage.
  • Edited the backup job and changed the mode from Snapshot to Stop; this completes successfully. (Rough CLI equivalents of these manual runs are sketched after this list.)
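For completeness, the manual backup tests above correspond roughly to these CLI invocations ('local' stands in for whichever local volume I tested against; 'nfs_backups' is the real NFS storage from the logs):

Code:
# Suspend mode to the local volume and to the NFS share - both fail:
vzdump 300 --mode suspend --storage local --compress zstd
vzdump 300 --mode suspend --storage nfs_backups --compress zstd
# Stop mode to the NFS share - completes successfully:
vzdump 300 --mode stop --storage nfs_backups --compress zstd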

Any ideas what the issue could be here? I didn't change anything on my end; the only things I did that day were run my weekly updates and reboot the node, and now snapshot/suspend backups of this container don't work at all. Happy to provide any other information that is requested.
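In case the update round is relevant, this is how I can list exactly what changed that day (standard apt history location on Debian/PVE, plus the stock Proxmox version tool):

Code:
# Show package changes from recent apt runs (rotated logs are gzipped,
# which zgrep handles transparently):
zgrep -h -A3 'Start-Date' /var/log/apt/history.log*
# Versions of the Proxmox packages currently on this node:
pveversion -v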

Here is the log from the first failure:

Code:
INFO: starting new backup job: vzdump 300 --mailto *** --mailnotification failure --storage nfs_backups --node prox --compress zstd --notes-template '{{guestname}}' --mode snapshot --quiet 1 --prune-backups 'keep-daily=7,keep-monthly=3,keep-weekly=3,keep-yearly=3'
INFO: Starting Backup of VM 300 (lxc)
INFO: Backup started at 2022-10-09 04:10:00
INFO: status = running
INFO: CT Name: pihole1
INFO: including mount point rootfs ('/') in backup
INFO: mode failure - some volumes do not support snapshots
INFO: trying 'suspend' mode instead
INFO: backup mode: suspend
INFO: ionice priority: 7
INFO: CT Name: pihole1
INFO: including mount point rootfs ('/') in backup
INFO: starting first sync /proc/3893/root/ to /mnt/vzdump_tmp/vzdumptmp2585127_300/
INFO: first sync finished - transferred 1.90G bytes in 13s
INFO: suspending guest
INFO: starting final sync /proc/3893/root/ to /mnt/vzdump_tmp/vzdumptmp2585127_300/
INFO: resume vm
INFO: guest is online again after 1 seconds
ERROR: Backup of VM 300 failed - command 'rsync --stats -h -X -A --numeric-ids -aH --delete --no-whole-file --inplace --one-file-system --relative '--exclude=/tmp/?*' '--exclude=/var/tmp/?*' '--exclude=/var/run/?*.pid' /proc/3893/root//./ /mnt/vzdump_tmp/vzdumptmp2585127_300/' failed: exit code 23
INFO: Failed at 2022-10-09 04:10:15
INFO: Backup job finished with errors
TASK ERROR: job errors

Here is the log from the last successful backup from the night before:

Code:
INFO: starting new backup job: vzdump 300 --node prox --prune-backups 'keep-daily=7,keep-monthly=3,keep-weekly=3,keep-yearly=3' --mailnotification failure --mode snapshot --compress zstd --mailto *** --quiet 1 --storage nfs_backups --notes-template '{{guestname}}'
INFO: Starting Backup of VM 300 (lxc)
INFO: Backup started at 2022-10-08 04:10:00
INFO: status = running
INFO: CT Name: pihole1
INFO: including mount point rootfs ('/') in backup
INFO: mode failure - some volumes do not support snapshots
INFO: trying 'suspend' mode instead
INFO: backup mode: suspend
INFO: ionice priority: 7
INFO: CT Name: pihole1
INFO: including mount point rootfs ('/') in backup
INFO: starting first sync /proc/3611/root/ to /mnt/vzdump_tmp/vzdumptmp23253_300/
INFO: first sync finished - transferred 1.92G bytes in 10s
INFO: suspending guest
INFO: starting final sync /proc/3611/root/ to /mnt/vzdump_tmp/vzdumptmp23253_300/
INFO: final sync finished - transferred 220 bytes in 1s
INFO: resuming guest
INFO: guest is online again after 1 seconds
INFO: creating vzdump archive '/mnt/pve/nfs_backups/dump/vzdump-lxc-300-2022_10_08-04_10_00.tar.zst'
INFO: Total bytes written: 1842319360 (1.8GiB, 136MiB/s)
INFO: archive file size: 419MB
INFO: adding notes to backup
INFO: prune older backups with retention: keep-daily=7, keep-monthly=3, keep-weekly=3, keep-yearly=3
INFO: removing backup 'nfs_backups:backup/vzdump-lxc-300-2022_10_01-04_10_00.tar.zst'
INFO: pruned 1 backup(s) not covered by keep-retention policy
INFO: Finished Backup of VM 300 (00:00:26)
INFO: Backup finished at 2022-10-08 04:10:26
INFO: Backup job finished successfully
TASK OK

Here is the output of pct config for this container:

Code:
arch: amd64
cores: 2
features: nesting=1
hostname: pihole1
memory: 1024
net0: name=eth0,bridge=vmbr1,firewall=1,gw=10.40.42.1,hwaddr=9A:B8:5C:98:02:C7,ip=10.40.42.20/24,tag=20,type=veth
onboot: 1
ostype: debian
rootfs: local:300/vm-300-disk-0.raw,size=10G,acl=0
swap: 1024
unprivileged: 1
 
