backup failed ...

PaulVM

Fresh PVE 7.x install.
2 LXC containers.
I could run standard backups until yesterday; now the backup fails every time I try.
It fails on local storage, on a mounted dir and on PBS.

A couple of failed logs:

Code:
INFO: starting new backup job: vzdump 410722 --node srv2113 --mode snapshot --compress zstd --mailto staff@domain.tld --storage local --all 0 --mailnotification always
INFO: Starting Backup of VM 410722 (lxc)
INFO: Backup started at 2022-02-10 21:37:04
INFO: status = running
INFO: CT Name: d410722.domain.tld
INFO: including mount point rootfs ('/') in backup
INFO: excluding bind mount point mp0 ('/var/www') from backup (not a volume)
INFO: excluding bind mount point mp1 ('/tmp') from backup (not a volume)
INFO: mode failure - some volumes do not support snapshots
INFO: trying 'suspend' mode instead
INFO: backup mode: suspend
INFO: ionice priority: 7
INFO: CT Name: d410722.domain.tld
INFO: including mount point rootfs ('/') in backup
INFO: excluding bind mount point mp0 ('/var/www') from backup (not a volume)
INFO: excluding bind mount point mp1 ('/tmp') from backup (not a volume)
INFO: starting first sync /proc/3442/root/ to /var/lib/vz/dump/vzdump-lxc-410722-2022_02_10-21_37_04.tmp
INFO: first sync finished - transferred 6.42G bytes in 49s
INFO: suspending guest
INFO: starting final sync /proc/3442/root/ to /var/lib/vz/dump/vzdump-lxc-410722-2022_02_10-21_37_04.tmp
INFO: resume vm
INFO: guest is online again after 2 seconds
ERROR: Backup of VM 410722 failed - command 'rsync --stats -h -X -A --numeric-ids -aH --delete --no-whole-file --inplace --one-file-system --relative '--exclude=/tmp/?*' '--exclude=/var/tmp/?*' '--exclude=/var/run/?*.pid' '--exclude=/var/www' '--exclude=/tmp' /proc/3442/root//./ /var/lib/vz/dump/vzdump-lxc-410722-2022_02_10-21_37_04.tmp' failed: exit code 23
INFO: Failed at 2022-02-10 21:37:58
INFO: Backup job finished with errors
TASK ERROR: job errors

Code:
INFO: starting new backup job: vzdump 410722 --mode snapshot --node srv2113 --storage pbs2110 --all 0 --mailnotification always --mailto staff@domain.tld
INFO: Starting Backup of VM 410722 (lxc)
INFO: Backup started at 2022-02-10 22:08:26
INFO: status = running
INFO: CT Name: d410722.domain.tld
INFO: including mount point rootfs ('/') in backup
INFO: excluding bind mount point mp0 ('/var/www') from backup (not a volume)
INFO: excluding bind mount point mp1 ('/tmp') from backup (not a volume)
INFO: mode failure - some volumes do not support snapshots
INFO: trying 'suspend' mode instead
INFO: backup mode: suspend
INFO: ionice priority: 7
INFO: CT Name: d410722.domain.tld
INFO: including mount point rootfs ('/') in backup
INFO: excluding bind mount point mp0 ('/var/www') from backup (not a volume)
INFO: excluding bind mount point mp1 ('/tmp') from backup (not a volume)
INFO: starting first sync /proc/3442/root/ to /var/tmp/vzdumptmp19034_410722
INFO: first sync finished - transferred 6.42G bytes in 24s
INFO: suspending guest
INFO: starting final sync /proc/3442/root/ to /var/tmp/vzdumptmp19034_410722
INFO: resume vm
INFO: guest is online again after 2 seconds
ERROR: Backup of VM 410722 failed - command 'rsync --stats -h -X -A --numeric-ids -aH --delete --no-whole-file --inplace --one-file-system --relative '--exclude=/tmp/?*' '--exclude=/var/tmp/?*' '--exclude=/var/run/?*.pid' '--exclude=/var/www' '--exclude=/tmp' /proc/3442/root//./ /var/tmp/vzdumptmp19034_410722' failed: exit code 23
INFO: Failed at 2022-02-10 22:08:58
INFO: Backup job finished with errors
TASK ERROR: job errors

PVE updated (same problem before last updates):

Code:
# pveversion --verbose
proxmox-ve: 7.1-1 (running kernel: 5.13.19-4-pve)
pve-manager: 7.1-10 (running version: 7.1-10/6ddebafe)
pve-kernel-helper: 7.1-10
pve-kernel-5.13: 7.1-7
pve-kernel-5.4: 6.4-11
pve-kernel-5.13.19-4-pve: 5.13.19-9
pve-kernel-5.13.19-3-pve: 5.13.19-7
pve-kernel-5.13.19-2-pve: 5.13.19-4
pve-kernel-5.4.157-1-pve: 5.4.157-1
ceph-fuse: 14.2.21-1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: 0.8.36+pve1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.1
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-6
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.1-2
libpve-guest-common-perl: 4.0-3
libpve-http-server-perl: 4.1-1
libpve-storage-perl: 7.1-1
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.11-1
lxcfs: 4.0.11-pve1
novnc-pve: 1.3.0-1
proxmox-backup-client: 2.1.5-1
proxmox-backup-file-restore: 2.1.5-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.4-5
pve-cluster: 7.1-3
pve-container: 4.1-3
pve-docs: 7.1-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.3-5
pve-ha-manager: 3.3-3
pve-i18n: 2.6-2
pve-qemu-kvm: 6.1.1-1
pve-xtermjs: 4.16.0-1
pve-zsync: 2.2.1
qemu-server: 7.1-4
smartmontools: 7.2-pve2
spiceterm: 3.2-2
swtpm: 0.7.0~rc1+2
vncterm: 1.7-1
zfsutils-linux: 2.1.2-pve1

Container config (I tried changing RAM and cores; it made no difference):
Code:
arch: amd64
cores: 6
features: nesting=1
hostname: d410722.apf.it
memory: 16000
mp0: /var/lib/vz/WWW/410722,mp=/var/www
mp1: /var/lib/vz/WWW/410722TMP,mp=/tmp
net0: name=eth0,bridge=vmbr9,firewall=1,gw=192.168.109.1,hwaddr=4E:E0:43:F2:01:54,ip=192.168.109.154/24,type=veth
ostype: debian
rootfs: local:410722/vm-410722-disk-0.raw,size=15G
swap: 512

In PVE logs I have simply:

Code:
Feb 10 22:50:23 srv2113 pvedaemon[1596]: <root@pam> starting task UPID:srv2113:00008402:0008BDAF:6205889F:vzdump:410722:root@pam:
Feb 10 22:50:23 srv2113 pvedaemon[33794]: INFO: starting new backup job: vzdump 410722 --remove 0 --node srv2113 --compress zstd --mode snapshot --storage local
Feb 10 22:50:23 srv2113 pvedaemon[33794]: INFO: Starting Backup of VM 410722 (lxc)
Feb 10 22:51:05 srv2113 pvedaemon[33794]: ERROR: Backup of VM 410722 failed - command 'rsync --stats -h -X -A --numeric-ids -aH --delete --no-whole-file --inplace --one-file-system --relative '--exclude=/tmp/?*' '--exclude=/var/tmp/?*' '--exclude=/var/run/?*.pid' '--exclude=/var/www' '--exclude=/tmp' /proc/3442/root//./ /var/lib/vz/dump/vzdump-lxc-410722-2022_02_10-22_50_23.tmp' failed: exit code 23
Feb 10 22:51:05 srv2113 pvedaemon[33794]: INFO: Backup job finished with errors
Feb 10 22:51:05 srv2113 pvedaemon[33794]: job errors
Feb 10 22:51:05 srv2113 pvedaemon[1596]: <root@pam> end task UPID:srv2113:00008402:0008BDAF:6205889F:vzdump:410722:root@pam: job errors

What can I do to find the problem's cause?

Thanks, P.
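
Editor's note: the task logs report only rsync's exit status, not the per-file messages rsync itself prints. As a starting point, the status can be looked up in the EXIT VALUES section of rsync(1). The helper below is a minimal sketch covering a few of those codes; the function name `rsync_exit_meaning` is made up for illustration.

```shell
#!/bin/sh
# Map an rsync exit status to its description from the EXIT VALUES
# section of rsync(1). Only a handful of codes are listed here;
# everything else falls through to the default branch.
rsync_exit_meaning() {
    case "$1" in
        0)  echo "Success" ;;
        11) echo "Error in file I/O" ;;
        12) echo "Error in rsync protocol data stream" ;;
        23) echo "Partial transfer due to error" ;;
        24) echo "Partial transfer due to vanished source files" ;;
        *)  echo "See the EXIT VALUES section of rsync(1)" ;;
    esac
}

rsync_exit_meaning 23
```

Code 23 only says that some files or attributes could not be transferred; which ones, and why, has to come from rsync's own error lines.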
 
I noticed that if I run the backup after powering off the CT, it completes without problems.
If the CT is running, I tried both snapshot and suspend mode without success.

Thanks, P.
 
your container doesn't support snapshots, so snapshot and suspend mode are the same ;) some users in the past reported issues if the container uses ACLs, but the tmpdir used with suspend mode doesn't support them. maybe that is the case here as well?
 
I have many similar CTs (same config),
on PVE 6 and on PVE 7 (less up to date).
They never had problems.
What can I do to solve this?
Thanks, P.
 
don't use suspend mode, or use a tmpdir path that supports ACLs (if that is the cause).
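
Editor's note: whether ACLs are really the cause is only a hypothesis at this point in the thread. One way to test it is to probe a candidate tmpdir for ACL support. The sketch below assumes `setfacl` (Debian package `acl`) is available on the host; the function name `acl_support` is made up for illustration.

```shell
#!/bin/sh
# Report whether files created in DIR accept POSIX ACL entries.
# Prints "yes", "no", or "unknown" (setfacl missing or DIR not writable).
acl_support() {
    dir=$1
    command -v setfacl >/dev/null 2>&1 || { echo unknown; return; }
    probe=$(mktemp "$dir/acl-probe.XXXXXX" 2>/dev/null) || { echo unknown; return; }
    # Try to add a named-user entry for uid 0; this fails on
    # filesystems that do not support ACLs.
    if setfacl -m u:0:r "$probe" 2>/dev/null; then
        echo yes
    else
        echo no
    fi
    rm -f "$probe"
}

acl_support /var/tmp
```

The same call can be pointed at any directory considered for `--tmpdir`, for example `acl_support /var/lib/vz/dump`.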
 
OK, thanks for the hints.
I tried from the CLI: with the same syntax as the scheduled task, it also fails.
If I add --tmpdir /tmp/ it "finished successfully".
So I added tmpdir /tmp/ to jobs.cfg; it still fails.
Code:
vzdump: backup-58f1674c-5ebe
        schedule 1:40
        compress zstd
        enabled 1
        mailnotification always
        mailto staff@domain.tld
        mode snapshot
        storage local
        vmid 410722
        tmpdir /tmp/

storage.cfg is:
Code:
pbs: pbs2110
        datastore PMBKFABB
        server pbs2.domain.tld
        content backup
        fingerprint 8d:78:da:08:fe:33:c7:9d:8e:16:3a:c2:24:73:55:ed:5e:31:6d:4a:b0:09:ac:f5:86:71:3e:28:cc:30:90:c9
        prune-backups keep-all=1
        username bk2113@pbs

dir: local
        path /var/lib/vz
        content vztmpl,iso,rootdir,backup,images,snippets
        prune-backups keep-last=5
        shared 0

dir: BkOVH
        path /mnt/BkOVH
        content backup
        prune-backups keep-last=9
        shared 0

I can't understand ... :-(
In case it is useful, this is the mount output:

Code:
mount
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
udev on /dev type devtmpfs (rw,nosuid,relatime,size=16355624k,nr_inodes=4088906,mode=755,inode64)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,nodev,noexec,relatime,size=3277220k,mode=755,inode64)
/dev/md2 on / type ext4 (rw,relatime)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,inode64)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k,inode64)
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
efivarfs on /sys/firmware/efi/efivars type efivarfs (rw,nosuid,nodev,noexec,relatime)
none on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=30,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=14508)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M)
mqueue on /dev/mqueue type mqueue (rw,nosuid,nodev,noexec,relatime)
debugfs on /sys/kernel/debug type debugfs (rw,nosuid,nodev,noexec,relatime)
sunrpc on /run/rpc_pipefs type rpc_pipefs (rw,relatime)
tracefs on /sys/kernel/tracing type tracefs (rw,nosuid,nodev,noexec,relatime)
fusectl on /sys/fs/fuse/connections type fusectl (rw,nosuid,nodev,noexec,relatime)
configfs on /sys/kernel/config type configfs (rw,nosuid,nodev,noexec,relatime)
/dev/mapper/vg-data on /var/lib/vz type ext4 (rw,relatime,nobarrier)
/dev/sdb1 on /boot/efi type vfat (rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro)
lxcfs on /var/lib/lxcfs type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
/dev/fuse on /etc/pve type fuse (rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other)
tmpfs on /run/user/0 type tmpfs (rw,nosuid,nodev,relatime,size=3277216k,nr_inodes=819304,mode=700,inode64)
curlftpfs#ftp://ftpback-rbx7-151.ovh.net/ on /mnt/BkOVH type fuse (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)

Thanks, P.
 
please post the full backup task log..
 
Here:
Code:
INFO: starting new backup job: vzdump 410722 --node srv2113 --mode snapshot --compress gzip --storage local --mailnotification always --all 0 --mailto staff@domain.tld --tmpdir /tmp/
INFO: Starting Backup of VM 410722 (lxc)
INFO: Backup started at 2022-02-14 12:47:05
INFO: status = running
INFO: CT Name: d410722.domain.tld
INFO: including mount point rootfs ('/') in backup
INFO: excluding bind mount point mp0 ('/var/www') from backup (not a volume)
INFO: excluding bind mount point mp1 ('/tmp') from backup (not a volume)
INFO: mode failure - some volumes do not support snapshots
INFO: trying 'suspend' mode instead
INFO: backup mode: suspend
INFO: ionice priority: 7
INFO: CT Name: d410722.domain.tld
INFO: including mount point rootfs ('/') in backup
INFO: excluding bind mount point mp0 ('/var/www') from backup (not a volume)
INFO: excluding bind mount point mp1 ('/tmp') from backup (not a volume)
INFO: starting first sync /proc/52860/root/ to /tmp/vzdumptmp1802716_410722/
INFO: first sync finished - transferred 6.91G bytes in 59s
INFO: suspending guest
INFO: starting final sync /proc/52860/root/ to /tmp/vzdumptmp1802716_410722/
INFO: resume vm
INFO: guest is online again after 3 seconds
ERROR: Backup of VM 410722 failed - command 'rsync --stats -h -X -A --numeric-ids -aH --delete --no-whole-file --inplace --one-file-system --relative '--exclude=/tmp/?*' '--exclude=/var/tmp/?*' '--exclude=/var/run/?*.pid' '--exclude=/var/www' '--exclude=/tmp' /proc/52860/root//./ /tmp/vzdumptmp1802716_410722/' failed: exit code 23
INFO: Failed at 2022-02-14 12:48:11
INFO: Backup job finished with errors
TASK ERROR: job errors

in jobs.cfg I have:
Code:
vzdump: backup-58f1674c-5ebe
        schedule 1:40
        compress gzip
        enabled 1
        mailnotification always
        mailto staff@domain.tld
        mode snapshot
        storage local
        tmpdir /tmp/
        vmid 410722

Today, running the vzdump command from the CLI doesn't work either (the same command succeeds in 1 of 3 attempts!).
Thanks, P.
 
possibly the issue depends on changes since the first run. is it always the second rsync that fails and never the first?
 
I never saw the first sync fail in any of my manual tests.
The issue in my setup is only that a file cannot be rewritten ("permission denied").

But I'm not sure whether the error in the log would look different if the first rsync failed.
If it would, then it is always the second one, because I don't see any differences in the logs.
 
@Lokutos /var/lib/php/sessions IIRC has some special permission on debian based systems, making them unremovable by any user other than the owner (www-data?). probably between your first and second rsync run, the container cleans out some sessions, rsync attempts to remove the now removed session files (as user root within the container), but that is not possible.

@PaulVM are you also doing some PHP related stuff in the failing container?
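
Editor's note: the permissions mentioned above can be inspected with `stat`. The sketch below runs on a scratch directory so it works anywhere; the mode 1733 used in the demo is an assumption about what Debian's PHP packages set on /var/lib/php/sessions and should be verified inside the affected container. The function name `dir_mode` is made up for illustration, and the `-c` format option is GNU `stat` syntax.

```shell
#!/bin/sh
# Print octal mode, owner:group and path of a directory (GNU stat).
dir_mode() {
    stat -c '%a %U:%G %n' "$1"
}

# Demo on a scratch directory with the assumed session-dir mode:
# sticky bit set, write and search (but no read) for group and others.
demo=$(mktemp -d)
mkdir "$demo/sessions"
chmod 1733 "$demo/sessions"
dir_mode "$demo/sessions"
rm -r "$demo"
```

Inside the container, `dir_mode /var/lib/php/sessions` shows the real values. A leading `1` in the octal mode is the sticky bit, which restricts who may delete or rename entries in that directory.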
 
Yes, sessions are dynamic; it is possible that a file was changed or deleted ...
But how can I prevent this?
 
The CT is a Magento host on Debian 10.x.
I have similar CTs on other PVE (6.x) nodes that never had similar problems.
Thanks, P.
 
Same here: the Proxmox host is one of 4, with around 10 LXC containers that all follow exactly the same scenario:

a Debian 11 host with Debian 11 guests running Apache, PHP and MySQL for Nextcloud.
 
I manually added this to the job in jobs.cfg:
exclude-path /var/lib/php/sessions

This seems to solve the problem.
I tried some backups and all completed successfully.
Tonight is the live test; tomorrow I will confirm.

Thanks, P.
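
Editor's note: the same exclusion can be tried as a one-off run from the CLI before relying on the scheduled job; the `--exclude-path` flag appears in the vzdump command line logged further down in this thread. The helper below is a sketch that only prints the command so it can be reviewed first. The function name `vzdump_cmd` is made up, and the storage and compression values simply mirror the job shown earlier.

```shell
#!/bin/sh
# Print (do not run) a vzdump command that skips one path inside the guest.
# $1 = guest ID, $2 = path to exclude (anchored at the container's root).
vzdump_cmd() {
    vmid=$1
    exclude=$2
    printf 'vzdump %s --mode snapshot --storage local --compress zstd --exclude-path %s\n' \
        "$vmid" "$exclude"
}

vzdump_cmd 410722 /var/lib/php/sessions
```

Running the printed line on the PVE node performs the actual backup. Note that an excluded directory is absent from the backup, so the application must tolerate losing its sessions after a restore.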
 
It works, but the manually added entry gets deleted every time ANY backup job is changed.


INFO: starting new backup job: vzdump 19224249 --mailnotification failure --exclude-path /var/lib/php/sessions --storage pbs --mode snapshot --quiet 1
INFO: Starting Backup of VM 19224249 (lxc)
INFO: Backup started at 2022-02-15 08:12:02
INFO: status = running
INFO: CT Name: svcld249.netz-haring.local
INFO: including mount point rootfs ('/') in backup
INFO: mode failure - some volumes do not support snapshots
INFO: trying 'suspend' mode instead
INFO: backup mode: suspend
INFO: ionice priority: 7
INFO: CT Name: svcld249.netz-haring.local
INFO: including mount point rootfs ('/') in backup
INFO: starting first sync /proc/2102906/root/ to /var/tmp/vzdumptmp1438195_19224249
INFO: first sync finished - transferred 29.83G bytes in 297s
INFO: suspending guest
INFO: starting final sync /proc/2102906/root/ to /var/tmp/vzdumptmp1438195_19224249
INFO: final sync finished - transferred 126.14M bytes in 3s
INFO: resuming guest
INFO: guest is online again after 3 seconds
INFO: creating Proxmox Backup Server archive 'ct/19224249/2022-02-15T07:12:02Z'
INFO: run: lxc-usernsexec -m u:0:100000:65536 -m g:0:100000:65536 -- /usr/bin/proxmox-backup-client backup --crypt-mode=none pct.conf:/var/tmp/vzdumptmp1438195_19224249/etc/vzdump/pct.conf root.pxar:/var/tmp/vzdumptmp1438195_19224249 --include-dev /var/tmp/vzdumptmp1438195_19224249/. --skip-lost-and-found --exclude=/var/lib/php/sessions --exclude=/tmp/?* --exclude=/var/tmp/?* --exclude=/var/run/?*.pid --backup-type ct --backup-id 19224249 --backup-time 1644909122 --repository backup_user@pbs@135.1.0.249:nashcp-3-200
INFO: Starting backup: ct/19224249/2022-02-15T07:12:02Z
INFO: Client name: SVHCP-3-249
INFO: Starting backup protocol: Tue Feb 15 08:17:02 2022
INFO: Downloading previous manifest (Sat Feb 5 21:00:02 2022)
INFO: Upload config file '/var/tmp/vzdumptmp1438195_19224249/etc/vzdump/pct.conf' to 'backup_user@pbs@135.1.0.249:8007:nashcp-3-200' as pct.conf.blob
INFO: Upload directory '/var/tmp/vzdumptmp1438195_19224249' to 'backup_user@pbs@135.1.0.249:8007:nashcp-3-200' as root.pxar.didx
INFO: root.pxar: had to backup 894.008 MiB of 27.786 GiB (compressed 279.006 MiB) in 182.99s
INFO: root.pxar: average backup speed: 4.885 MiB/s
INFO: root.pxar: backup was done incrementally, reused 26.913 GiB (96.9%)
INFO: Uploaded backup catalog (1.428 MiB)
INFO: Duration: 184.04s
INFO: End Time: Tue Feb 15 08:20:06 2022
INFO: Finished Backup of VM 19224249 (00:08:10)
INFO: Backup finished at 2022-02-15 08:20:12
INFO: Backup job finished successfully
TASK OK
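
Editor's note: since the manual jobs.cfg entry is reported to get lost, one option to evaluate is a node-wide default in /etc/vzdump.conf, which accepts an `exclude-path:` line. This is a sketch under assumptions not confirmed in this thread: that the file is left alone when jobs are edited, and that a node-wide exclusion is acceptable for all guests on the node. The function name `set_default_exclude` is made up for illustration.

```shell
#!/bin/sh
# Append "exclude-path: PATH" to a vzdump.conf-style file, unless the
# file already has an exclude-path line (an existing line is left untouched).
# $1 = config file, $2 = path to exclude.
set_default_exclude() {
    conf=$1
    path=$2
    if grep -q '^exclude-path:' "$conf" 2>/dev/null; then
        return 0
    fi
    printf 'exclude-path: %s\n' "$path" >> "$conf"
}

# Demo on a scratch file; on a real node the first argument
# would be /etc/vzdump.conf.
demo_conf=$(mktemp)
set_default_exclude "$demo_conf" /var/lib/php/sessions
cat "$demo_conf"
rm -f "$demo_conf"
```

Calling the function a second time on the same file changes nothing, so it is safe to run from a provisioning script.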
 
Based on the provided logs, it appears that the backup of the LXC container with ID 410722 is failing due to an rsync command error with exit code 23. This error code generally indicates a file-related issue, such as a permission problem or a file not found.

Here are some steps you can take to investigate and troubleshoot the issue:

  1. Check the file system permissions: Ensure that the relevant directories and files in the LXC container have the correct permissions set. Make sure that the Proxmox VE user running the backup process has appropriate read/write access to the container's files.
  2. Verify the mount points: Confirm that the mount points /var/www and /tmp in the LXC container are correctly set up. Check if these directories exist and are accessible from within the container.
  3. Check for any unusual file locks: It's possible that certain files or directories within the LXC container are locked or in use by other processes, preventing the backup process from accessing them. Use tools like lsof or fuser to check for any open file handles within the container.
  4. Test a manual rsync command: Attempt to perform a manual rsync command from the Proxmox VE host to the LXC container, targeting the directories included in the backup. This can help identify any specific errors or issues related to the rsync operation.
  5. Review Proxmox VE and LXC container logs: Check the system logs on the Proxmox VE host for any relevant error messages related to the backup process or the LXC container. Additionally, inspect the logs within the LXC container itself to see if there are any issues reported.
  6. Update Proxmox VE and related packages: While you mentioned that the problem persists even after updating Proxmox VE, it's generally recommended to keep the system up to date with the latest stable releases. Regular updates can address known issues and provide bug fixes.
  7. Reach out to Proxmox VE support: If the issue persists and you are unable to identify the cause or resolve it, it may be beneficial to contact Proxmox VE support for further assistance. They can provide more specific guidance and help diagnose the problem based on the system's configuration and logs.
Remember to create backups before making any changes. If you need any help, try to reach out to Magento developers https://plumrocket.com/magento-services, they help roll out magento stores on different systems.
 
