LXC backup routine not working anymore

intellq

I have some backup routines for LXC containers. They ran fine for days, but since yesterday they have been failing.

Code:
INFO: starting new backup job: vzdump 105 --storage gluster-2tbraid5 --mode snapshot --compress zstd --remove 0 --notes-template '{{guestname}}' --node servidor2
INFO: filesystem type on dumpdir is 'fuse.glusterfs' -using /var/tmp/vzdumptmp7217_105 for temporary files
INFO: Starting Backup of VM 105 (lxc)
INFO: Backup started at 2023-04-20 08:19:02
INFO: status = running
INFO: CT Name: pm3
INFO: including mount point rootfs ('/') in backup
INFO: excluding bind mount point mp0 ('/mnt/servidor1') from backup (not a volume)
INFO: mode failure - some volumes do not support snapshots
INFO: trying 'suspend' mode instead
INFO: backup mode: suspend
INFO: ionice priority: 7
INFO: CT Name: pm3
INFO: including mount point rootfs ('/') in backup
INFO: excluding bind mount point mp0 ('/mnt/servidor1') from backup (not a volume)
INFO: starting first sync /proc/3398/root/ to /var/tmp/vzdumptmp7217_105
ERROR: Backup of VM 105 failed - command 'rsync -v --progress --stats -h -X -A --numeric-ids -aH --delete --no-whole-file --sparse --one-file-system --relative '--exclude=/tmp/?*' '--exclude=/var/tmp/?*' '--exclude=/var/run/?*.pid' '--exclude=/mnt/servidor1' /proc/3398/root//./ /var/tmp/vzdumptmp7217_105' failed: exit code 23
INFO: Failed at 2023-04-20 08:22:53
INFO: Backup job finished with errors
TASK ERROR: job errors

Based on what I found in some related threads about rsync exit code 23, I added "-v --progress" to the rsync command line and ran it manually.
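For reference, the rsync(1) man page lists exit code 23 as "Partial transfer due to error", meaning rsync copied some items but failed on others. A minimal lookup sketch follows; the function name rsync_exit_meaning is hypothetical and only a few codes are covered.

```shell
# rsync_exit_meaning CODE: print the rsync(1) meaning of a few exit codes.
# Hypothetical helper for illustration; see EXIT VALUES in rsync(1) for the full list.
rsync_exit_meaning() {
    case "$1" in
        0)  echo "success" ;;
        23) echo "partial transfer due to error" ;;
        24) echo "partial transfer due to vanished source files" ;;
        *)  echo "see the EXIT VALUES section of rsync(1)" ;;
    esac
}
```

Calling `rsync_exit_meaning 23` prints the meaning for the code seen in the vzdump log above.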

Command line / log error:

rsync -v --progress --stats -h -X -A --numeric-ids -aH --no-whole-file --sparse --one-file-system --relative '--exclude=/tmp/?*' '--exclude=/var/tmp/?*' '--exclude=/var/run/?*.pid' '--exclude=/mnt/servidor1' /proc/3398/root//./ /mnt/GFS-2tb_RAID5/images/pm3

Code:
rsync: [sender] get_xattr_data: lgetxattr("/proc/3398/root/var/lib/docker/overlay2/e2da0a0363bf2876d2a127759abeee205e4f59ffdb7cfcb0ce52517682e5d8fd/diff/etc/_localtime","user.overlay.origin",0) failed: No data available (61)
rsync: [sender] get_xattr_data: lgetxattr("/proc/3398/root/var/lib/docker/overlay2/f216019cb902d0d80dd1f5d78b1bb1b487326667281c973b978b63afb05bf084/diff/etc/_localtime","user.overlay.origin",0) failed: No data available (61)
rsync: [generator] set_acl: sys_acl_set_file(var/log/journal, ACL_TYPE_ACCESS): Operation not supported (95)
rsync: [generator] set_acl: sys_acl_set_file(var/log/journal/0a6dce15bc014770acac5c45e21973e5, ACL_TYPE_ACCESS): Operation not supported (95)
rsync: [generator] set_acl: sys_acl_set_file(var/log/journal/fff51684a5d343d08b117628c12cdfe6, ACL_TYPE_ACCESS): Operation not supported (95)
rsync: [receiver] set_acl: sys_acl_set_file(var/log/journal/0a6dce15bc014770acac5c45e21973e5/.system.journal.aj67xZ, ACL_TYPE_ACCESS): Operation not supported (95)
rsync: [receiver] set_acl: sys_acl_set_file(var/log/journal/0a6dce15bc014770acac5c45e21973e5/.system@0005f9740b0ea0b9-f00650da29f72ab2.journal~.1l2Ni0, ACL_TYPE_ACCESS): Operation not supported (95)
rsync: [receiver] set_acl: sys_acl_set_file(var/log/journal/0a6dce15bc014770acac5c45e21973e5/.system@03ea82ca24e7449a901321e2e7b5d085-0000000000000001-0005f956bdbb169e.journal.WNQA63, ACL_TYPE_ACCESS): Operation not supported (95)
rsync: [receiver] set_acl: sys_acl_set_file(var/log/journal/fff51684a5d343d08b117628c12cdfe6/.system.journal.MCuUS0, ACL_TYPE_ACCESS): Operation not supported (95)
rsync: [receiver] set_acl: sys_acl_set_file(var/log/journal/fff51684a5d343d08b117628c12cdfe6/.system@0005f9066d2f2e89-7026df5c0045da22.journal~.EDeAl3, ACL_TYPE_ACCESS): Operation not supported (95)
rsync: [receiver] set_acl: sys_acl_set_file(var/log/journal/fff51684a5d343d08b117628c12cdfe6/.system@0005f907abc27967-5e5c245ad0915580.journal~.fVN5j0, ACL_TYPE_ACCESS): Operation not supported (95)
rsync: [receiver] set_acl: sys_acl_set_file(var/log/journal/fff51684a5d343d08b117628c12cdfe6/.system@0005f908490d401d-00373b4bd794fe08.journal~.QWcs3Z, ACL_TYPE_ACCESS): Operation not supported (95)
rsync: [receiver] set_acl: sys_acl_set_file(var/log/journal/fff51684a5d343d08b117628c12cdfe6/.system@0005f90a72d0fab1-e4e61d51537d5403.journal~.KSsiEZ, ACL_TYPE_ACCESS): Operation not supported (95)
rsync: [receiver] set_acl: sys_acl_set_file(var/log/journal/fff51684a5d343d08b117628c12cdfe6/.system@0005f90ca2591671-9285aba4de92d8e0.journal~.pqWMT0, ACL_TYPE_ACCESS): Operation not supported (95)
rsync: [receiver] set_acl: sys_acl_set_file(var/log/journal/fff51684a5d343d08b117628c12cdfe6/.system@0005f90cf29b4a20-fdc2553221ddd663.journal~.cjV7D2, ACL_TYPE_ACCESS): Operation not supported (95)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1333) [sender=3.2.3]
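The log above mixes two different failures: xattr reads on the sender side (errno 61) and ACL writes on the generator/receiver side (errno 95). A rough sketch for grouping such lines is shown here; the function name summarize_rsync_errors is made up, and the pattern is only matched against the message format seen in this thread.

```shell
# summarize_rsync_errors: read rsync output on stdin and count error lines
# grouped by role (sender/generator/receiver), rsync function and errno.
# The regex is an assumption based on the log lines in this thread.
summarize_rsync_errors() {
    sed -n 's/^rsync: \[\([a-z]*\)\] \([a-z_]*\): .*: \(.*\) (\([0-9]*\))$/\1 \2 errno=\4 (\3)/p' |
    awk '{ count[$0]++ } END { for (k in count) print count[k] "x " k }' |
    LC_ALL=C sort
}
```

Usage would be something like `summarize_rsync_errors < rsync.log`, where rsync.log is a file holding the captured output.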

These LXC containers are managed as HA resources. If I change their backup mode to stop, it doesn't work, because HA prevents the status change:

Code:
INFO: starting new backup job: vzdump 105 104 --mode stop --quiet 1 --notes-template '{{guestname}}' --compress zstd --mailnotification always --storage gluster-2tbraid5 --prune-backups 'keep-daily=3,keep-monthly=1,keep-weekly=1'
ERROR: Backup of VM 104 failed - Cannot execute a backup with stop mode on a HA managed and enabled Service. Use snapshot mode or disable the Service.
INFO: Failed at 2023-04-20 02:00:05
ERROR: Backup of VM 105 failed - Cannot execute a backup with stop mode on a HA managed and enabled Service. Use snapshot mode or disable the Service.
INFO: Failed at 2023-04-20 02:00:05
INFO: Backup job finished with errors
TASK ERROR: job errors

If I run the backups with the LXC containers stopped, they complete. In snapshot mode they no longer do. Nothing was changed.

Any help would be appreciated :)
 
INFO: filesystem type on dumpdir is 'fuse.glusterfs' -using /var/tmp/vzdumptmp7217_105 for temporary files
ACL_TYPE_ACCESS): Operation not supported (95)
Hi,
It seems your dumpdir does not support ACLs. Please check whether that is the case and, if so, switch to a dumpdir with ACL support.

Edit: Please post the output of mount
 
Well, I'm not sure if it does:

Code:
root@servidor2:/mnt/GFS-2tb_RAID5/images# df -h /var/tmp
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1       228G   27G  190G  13% /

/var/tmp is on the same filesystem as /

Here is the output of mount:

Code:
root@servidor2:/mnt/GFS-2tb_RAID5/images# mount
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
udev on /dev type devtmpfs (rw,nosuid,relatime,size=24677112k,nr_inodes=6169278,mode=755,inode64)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,nodev,noexec,relatime,size=4941580k,mode=755,inode64)
/dev/sdb1 on / type ext4 (rw,relatime,errors=remount-ro)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,inode64)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k,inode64)
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=30,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=23128)
mqueue on /dev/mqueue type mqueue (rw,nosuid,nodev,noexec,relatime)
debugfs on /sys/kernel/debug type debugfs (rw,nosuid,nodev,noexec,relatime)
tracefs on /sys/kernel/tracing type tracefs (rw,nosuid,nodev,noexec,relatime)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M)
fusectl on /sys/fs/fuse/connections type fusectl (rw,nosuid,nodev,noexec,relatime)
configfs on /sys/kernel/config type configfs (rw,nosuid,nodev,noexec,relatime)
sunrpc on /run/rpc_pipefs type rpc_pipefs (rw,relatime)
systemd-1 on /mnt/GFS-2tb_RAID5 type autofs (rw,relatime,fd=59,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=37998)
/dev/sdc1 on /mnt/pve/gluster type xfs (rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota)
lxcfs on /var/lib/lxcfs type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
/dev/fuse on /etc/pve type fuse (rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other)
gluster2:CMM on /mnt/GFS-2tb_RAID5 type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072,_netdev,x-systemd.automount)
//10.3.150.1/f$/backup on /mnt/servidor1 type cifs (rw,relatime,vers=2.1,cache=strict,username=backup,domain=cmm.local,uid=0,noforceuid,gid=0,noforcegid,addr=10.3.150.1,file_mode=0777,dir_mode=0777,soft,nounix,serverino,mapposix,rsize=1048576,wsize=1048576,bsize=1048576,echo_interval=60,actimeo=1,closetimeo=1)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,nosuid,nodev,noexec,relatime)
tracefs on /sys/kernel/debug/tracing type tracefs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /run/user/0 type tmpfs (rw,nosuid,nodev,relatime,size=4941576k,nr_inodes=1235394,mode=700,inode64)
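The mount output shows / (and therefore /var/tmp, which vzdump uses for temporary files) on ext4, while the manual rsync earlier in the thread wrote to /mnt/GFS-2tb_RAID5, which is fuse.glusterfs. It may be worth probing both directories. Below is a minimal sketch that tries to set an ACL on a throwaway file; the function name check_acl_support is hypothetical, and a failing setfacl can have causes other than missing ACL support, so treat "unsupported" as a hint only.

```shell
# check_acl_support DIR: print "supported", "unsupported" or "unknown".
# Creates a temporary probe file in DIR and tries to set an ACL entry on it.
check_acl_support() {
    dir=$1
    if ! command -v setfacl >/dev/null 2>&1; then
        echo "unknown"          # setfacl (acl package) is not installed
        return 0
    fi
    probe=$(mktemp "$dir/acl-probe.XXXXXX" 2>/dev/null) || { echo "unknown"; return 0; }
    if setfacl -m u:root:r "$probe" 2>/dev/null; then
        echo "supported"
    else
        echo "unsupported"
    fi
    rm -f "$probe"
}
```

For this thread the calls would be `check_acl_support /var/tmp` and `check_acl_support /mnt/GFS-2tb_RAID5/images`.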
 
rsync: [sender] get_xattr_data: lgetxattr("/proc/3398/root/var/lib/docker/overlay2/e2da0a0363bf2876d2a127759abeee205e4f59ffdb7cfcb0ce52517682e5d8fd/diff/etc/_localtime","user.overlay.origin",0) failed: No data available (61)
rsync: [sender] get_xattr_data: lgetxattr("/proc/3398/root/var/lib/docker/overlay2/f216019cb902d0d80dd1f5d78b1bb1b487326667281c973b978b63afb05bf084/diff/etc/_localtime","user.overlay.origin",0) failed: No data available (61)
It might also be related to the missing xattr data. Can you check whether the rsync works without the `-X` (xattrs) flag, and/or without the `-A` (ACLs) flag?
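To keep the three test runs consistent, the variants can be generated rather than edited by hand. This is only a sketch: the function name print_rsync_variants is made up, the flag set mirrors the manual command posted earlier in the thread, and the commands are printed, not executed.

```shell
# print_rsync_variants SRC DST: print three rsync command lines to try by hand:
# 1) without -X (keeps -A), 2) without -A (keeps -X), 3) without both.
# Paths are printed unquoted, so quote them yourself if they contain spaces.
print_rsync_variants() {
    src=$1
    dst=$2
    common="-v --progress --stats -h --numeric-ids -aH --no-whole-file --sparse --one-file-system --relative '--exclude=/tmp/?*' '--exclude=/var/tmp/?*' '--exclude=/var/run/?*.pid' '--exclude=/mnt/servidor1'"
    for extra in "-A" "-X" ""; do
        # ${extra:+ $extra} expands to " -A" or " -X", or to nothing when empty
        printf 'rsync%s %s %s %s\n' "${extra:+ $extra}" "$common" "$src" "$dst"
    done
}
```

For example, `print_rsync_variants /proc/3398/root//./ /var/tmp/rsync-test` prints the three commands for the container in this thread; /var/tmp/rsync-test is a placeholder destination.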
 
What if I change my fstab

from:
Code:
UUID=21dd8f6a-048c-437a-bb91-85ba5b694175 /               ext4    errors=remount-ro 0       1

to:
Code:
UUID=21dd8f6a-048c-437a-bb91-85ba5b694175 /               ext4    errors=remount-ro,acl 0       1

Do you think it's worth it?
No, ext4 ACL support should be enabled by default. I suspect it is rather the missing xattr data from your Docker overlay that is causing the issue.
 
Ok, I checked and ACLs are indeed working.

I'll try rsync without -X arg.

Thanks Chris.
 
If it turns out to be caused by the missing xattrs data, try to find out why the data is not there anymore. Check the corresponding docker image layer.
 
I'm no docker expert... can you point me in the right direction on how to check this?
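One possible starting point for finding out which container or image a given overlay2 directory belongs to is sketched below. It assumes Docker inside the container uses the overlay2 storage driver and that `docker inspect` exposes `.GraphDriver.Data` (setups using the containerd image store may not); the function name find_overlay_owner is made up for this sketch. It would need to run inside CT 105, where Docker lives.

```shell
# find_overlay_owner LAYER_ID: print the IDs of containers and images whose
# overlay2 GraphDriver data (LowerDir, UpperDir, ...) mentions LAYER_ID.
# Assumes the overlay2 storage driver; hypothetical helper for illustration.
find_overlay_owner() {
    layer=$1
    for id in $(docker ps -aq) $(docker images -q); do
        if docker inspect --format '{{json .GraphDriver.Data}}' "$id" 2>/dev/null |
            grep -q "$layer"; then
            printf '%s\n' "$id"
        fi
    done
}
```

The argument would be the long directory name from the rsync error, for example the one starting with e2da0a03 in the log above. The printed IDs can then be looked up with `docker ps -a` and `docker images`.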
 
update:

Without -X: same errors, minus the get_xattr_data ones.
Without -X and -A: the backup now works.

edit:
OK, I gave up and deleted all the affected files: the journal files and some /etc/_localtime files.

now backup is working again.
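Before deleting anything, it can help to list exactly which paths rsync complained about. The following is a sketch only: the function name affected_paths is hypothetical, and the two patterns cover just the lgetxattr and sys_acl_set_file messages seen in this thread.

```shell
# affected_paths: read rsync output on stdin and print the unique paths named
# in lgetxattr(...) and sys_acl_set_file(...) error lines, one per line.
# Patterns are assumptions derived from the messages in this thread.
affected_paths() {
    sed -n \
        -e 's/^rsync: .*lgetxattr("\(.*\)","[^"]*",[0-9]*) failed: .*$/\1/p' \
        -e 's/^rsync: .*sys_acl_set_file(\(.*\), ACL_TYPE_[A-Z]*): .*$/\1/p' |
    LC_ALL=C sort -u
}
```

Usage would be along the lines of `affected_paths < rsync.log`, with rsync.log being a file that holds the saved rsync output. Note that the xattr paths are absolute (under /proc/PID/root) while the ACL paths are relative to the rsync destination.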
 