[SOLVED] 1 in 12 lxc backups fail on average

RobFantini

Hello,
We have about 12 LXC containers that are backed up both to PBS and with vzdump to an ext4 directory. On average, one backup per day fails. The failures happen with both PBS and the old-style vzdump backups.
This has been going on for some weeks. Here are some of the errors:
Code:
602: 2020-09-14 00:04:40 INFO: Starting Backup of VM 602 (lxc)
602: 2020-09-14 00:04:40 INFO: status = running
602: 2020-09-14 00:04:40 INFO: CT Name: bc-sys2-buster
602: 2020-09-14 00:04:40 INFO: including mount point rootfs ('/') in backup
602: 2020-09-14 00:04:40 INFO: excluding volume mount point mp1 ('/var/lib/bluecherry/recordings/') from backup (disabled)
602: 2020-09-14 00:04:40 INFO: backup mode: snapshot
602: 2020-09-14 00:04:40 INFO: ionice priority: 5
602: 2020-09-14 00:04:40 INFO: create storage snapshot 'vzdump'
602: 2020-09-14 00:04:41 INFO: creating vzdump archive '/nvme-ext4/dump/vzdump-lxc-602-2020_09_14-00_04_40.tar.zst'
602: 2020-09-14 00:04:49 INFO: tar: ./var/lib/php/sessions/sess_ec6p8vhh2e7fv1mkhu5p29j743: Cannot stat: Structure needs cleaning
602: 2020-09-14 00:04:49 INFO: tar: ./var/lib/php/sessions/sess_offhlc0vk1pu8lrp1b72fv0o7r: Cannot stat: Structure needs cleaning
602: 2020-09-14 00:04:49 INFO: tar: ./var/lib/php/sessions/sess_g9d04ddt1kkhlkpat2efbqefqg: Cannot stat: Structure needs cleaning
602: 2020-09-14 00:05:55 INFO: Total bytes written: 2334740480 (2.2GiB, 31MiB/s)
602: 2020-09-14 00:05:55 INFO: tar: Exiting with failure status due to previous errors
602: 2020-09-14 00:05:55 INFO: remove vzdump snapshot
602: 2020-09-14 00:05:56 ERROR: Backup of VM 602 failed - command 'set -o pipefail && tar cpf - --totals --one-file-system -p --sparse --numeric-owner --acls --xattrs '--xattrs-include=user.*' '--xattrs-include=security.capability' '--warning=no-file-ignored' '--warning=no-xattr-write' --one-file-system '--warning=no-file-ignored' '--directory=/nvme-ext4/dump/vzdump-lxc-602-2020_09_14-00_04_40.tmp' ./etc/vzdump/pct.conf ./etc/vzdump/pct.fw '--directory=/mnt/vzsnap0' --no-anchored '--exclude=lost+found' --anchored '--exclude=./tmp/?*' '--exclude=./var/tmp/?*' '--exclude=./var/run/?*.pid' ./ | zstd --rsyncable '--threads=1' >/nvme-ext4/dump/vzdump-lxc-602-2020_09_14-00_04_40.tar.dat' failed: exit code 2

# Changing the rsnapshot cron job so that it does not run at the same time as the backups prevented these. We had 1-2 per day.
606: 2020-09-02 17:06:26 INFO: Error: error at "rsnapshot/hourly.8/localhost/fbc/bin": stat failed on "rsnapshot/hourly.8/localhost/fbc/bin/sip": EUCLEAN: Structure needs cleaning
605: 2020-08-22 18:06:25 INFO: Error: error at "rsnapshot/hourly.1/localhost/fbc": stat failed on "rsnapshot/hourly.1/localhost/fbc/config": EUCLEAN: Structure needs cleaning


# All 7 nodes were being backed up at the same time.
606: 2020-09-04 20:01:50 INFO: Error: error at "var/lib/php/sessions": stat failed on "var/lib/php/sessions/sess_r156l49bg0trth8qaos3ff4bmp": EUCLEAN: Structure needs cleaning
602: 2020-09-04 20:04:49 INFO: Error: error at "var/lib/php/sessions": stat failed on "var/lib/php/sessions/sess_51a9djj1j75atoc8php32emum4": EUCLEAN: Structure needs cleaning
603: 2020-09-05 20:04:23 INFO: Error: error at "var/lib/php/sessions": stat failed on "var/lib/php/sessions/sess_gjeuvl93r1td8gpv423jl4qk6e": EUCLEAN: Structure needs cleaning

# Tried spreading out the backup times per node, 30 min apart.
# That helped: 3 days with no errors. Today we have one, I think due to the remote data lung sync at s022.

603: 2020-09-08 19:17:09 INFO: Error: error at "var/lib/php/sessions": stat failed on "var/lib/php/sessions/sess_r4ommfnd9obhaoaqehp8d21qrp": EUCLEAN: Structure needs cleaning
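To keep track of which containers hit this and on which days, the "Structure needs cleaning" lines can be tallied with a small script. This is only a sketch in Python: it assumes the task-log layout shown above ("VMID: date time INFO: message"), and the names `LINE_RE` and `count_euclean` are made up for this example.

```python
import re
from collections import Counter

# Matches task-log lines such as:
#   602: 2020-09-14 00:04:49 INFO: tar: ./some/file: Cannot stat: Structure needs cleaning
LINE_RE = re.compile(
    r"^(?P<vmid>\d+): (?P<date>\d{4}-\d{2}-\d{2}) \d{2}:\d{2}:\d{2} INFO: (?P<msg>.*)$"
)


def count_euclean(lines):
    """Count 'Structure needs cleaning' lines per (vmid, date)."""
    hits = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and "Structure needs cleaning" in m.group("msg"):
            hits[(m.group("vmid"), m.group("date"))] += 1
    return hits


# A few lines copied from the logs above.
sample = [
    "602: 2020-09-14 00:04:40 INFO: status = running",
    "602: 2020-09-14 00:04:49 INFO: tar: ./var/lib/php/sessions/sess_ec6p8vhh2e7fv1mkhu5p29j743: Cannot stat: Structure needs cleaning",
    "602: 2020-09-14 00:04:49 INFO: tar: ./var/lib/php/sessions/sess_offhlc0vk1pu8lrp1b72fv0o7r: Cannot stat: Structure needs cleaning",
    '606: 2020-09-04 20:01:50 INFO: Error: error at "var/lib/php/sessions": stat failed on "var/lib/php/sessions/sess_r156l49bg0trth8qaos3ff4bmp": EUCLEAN: Structure needs cleaning',
]

for (vmid, date), n in sorted(count_euclean(sample).items()):
    print(vmid, date, n)
```

It matches both the tar wording ("Cannot stat: Structure needs cleaning") and the PBS client wording ("EUCLEAN: Structure needs cleaning"), since both end with the same message.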
 
versions:
Code:
proxmox-ve: 6.2-1 (running kernel: 5.4.60-1-pve)
pve-manager: 6.2-11 (running version: 6.2-11/22fb4983)
pve-kernel-5.4: 6.2-6
pve-kernel-helper: 6.2-6
pve-kernel-5.3: 6.1-6
pve-kernel-5.4.60-1-pve: 5.4.60-2
pve-kernel-5.4.55-1-pve: 5.4.55-1
pve-kernel-5.3.18-3-pve: 5.3.18-3
ceph: 14.2.11-pve1
ceph-fuse: 14.2.11-pve1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.5
libpve-access-control: 6.1-2
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.2-1
libpve-guest-common-perl: 3.1-3
libpve-http-server-perl: 3.0-6
libpve-storage-perl: 6.2-6
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.2-12
pve-cluster: 6.1-8
pve-container: 3.1-13
pve-docs: 6.2-5
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-2
pve-firmware: 3.1-3
pve-ha-manager: 3.1-1
pve-i18n: 2.2-1
pve-qemu-kvm: 5.0.0-13
pve-xtermjs: 4.7.0-2
qemu-server: 6.2-14
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.4-pve1
 
Here is what rsyslog reported for the 9/14 failure:
Code:
Sep 14 00:05:56 sys8 vzdump[2074744]: ERROR: Backup of VM 602 failed - command 'set -o pipefail && tar cpf - --totals --one-file-system -p --sparse --numeric-owner --acls --xattrs '--xattrs-include=user.*' '--xattrs-include=security.capability' '--warning=no-file-ignored' '--warning=no-xattr-write' --one-file-system '--warning=no-file-ignored' '--directory=/nvme-ext4/dump/vzdump-lxc-602-2020_09_14-00_04_40.tmp' ./etc/vzdump/pct.conf ./etc/vzdump/pct.fw '--directory=/mnt/vzsnap0' --no-anchored '--exclude=lost+found' --anchored '--exclude=./tmp/?*' '--exclude=./var/tmp/?*' '--exclude=./var/run/?*.pid' ./ | zstd --rsyncable '--threads=1' >/nvme-ext4/dump/vzdump-lxc-602-2020_09_14-00_04_40.tar.dat' failed: exit code 2
Sep 14 00:09:12 pve15 vzdump[1902577]: ERROR: Backup of VM 607 failed - command 'set -o pipefail && tar cpf - --totals --one-file-system -p --sparse --numeric-owner --acls --xattrs '--xattrs-include=user.*' '--xattrs-include=security.capability' '--warning=no-file-ignored' '--warning=no-xattr-write' --one-file-system '--warning=no-file-ignored' '--directory=/nvme-ext4/dump/vzdump-lxc-607-2020_09_14-00_07_56.tmp' ./etc/vzdump/pct.conf ./etc/vzdump/pct.fw '--directory=/mnt/vzsnap0' --no-anchored '--exclude=lost+found' --anchored '--exclude=./tmp/?*' '--exclude=./var/tmp/?*' '--exclude=./var/run/?*.pid' ./ | zstd --rsyncable '--threads=1' >/nvme-ext4/dump/vzdump-lxc-607-2020_09_14-00_07_56.tar.dat' failed: exit code 2
 
Spreading out the load on PBS and the network, by running backups at a different time than the remote syncs, has helped with this issue.

There were also updates to the PBS software. In any case, we have not had an issue for the last 3-4 days.
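The staggering mentioned earlier (backup start times 30 minutes apart per node) can be sketched like this. It is only an illustration in Python: the node names, the 19:00 first start, and the function name `staggered_starts` are assumptions for the example, not our actual schedule.

```python
from datetime import datetime, timedelta


def staggered_starts(nodes, first_start, gap_minutes=30):
    """Give each node its own start time, gap_minutes after the previous one.

    first_start is "HH:MM"; times wrap around past midnight.
    """
    first = datetime.strptime(first_start, "%H:%M")
    return {
        node: (first + timedelta(minutes=i * gap_minutes)).strftime("%H:%M")
        for i, node in enumerate(nodes)
    }


# Hypothetical names for the 7 nodes.
nodes = [f"node{i}" for i in range(1, 8)]
for node, start in staggered_starts(nodes, "19:00").items():
    print(node, start)
```

The resulting times would then be entered by hand as the start time of each node's backup job, and the rsnapshot and remote sync jobs moved outside that window.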
 
Just to complete this: a week or so later we had 1-2 failures per day, for both LXC and KVM.

That seems to have been fixed by software updates for PBS and PVE/KVM.
 
Naturally, after marking this solved, there were more failures like these in the last two days:

- to local storage:
INFO: starting new backup job: vzdump --all 1 --mode snapshot --mailnotification failure --compress zstd --quiet 1 --mailto fbcadmin --storage z-local-nvme
INFO: skip external VMs: 108, 446, 604, 4121, 109, 607, 4120, 5103, 106, 110, 113, 114, 216, 605, 88903, 88951, 100, 102, 105, 115, 701, 801, 2165, 103, 104, 112, 121, 122, 126, 128, 603, 727, 978, 7211, 66741, 66780, 223, 606, 902, 10784, 66791, 88691, 88950
INFO: Starting Backup of VM 101 (lxc)
INFO: Backup started at 2020-10-05 00:00:02
INFO: status = running
INFO: CT Name: dhcp-primary
INFO: including mount point rootfs ('/') in backup
INFO: backup mode: snapshot
INFO: ionice priority: 5
INFO: create storage snapshot 'vzdump'
/dev/rbd5
INFO: creating vzdump archive '/nvme-ext4/dump/vzdump-lxc-101-2020_10_05-00_00_02.tar.zst'
INFO: tar: ./var/backups/dpkg.status.0: Cannot stat: Structure needs cleaning
INFO: Total bytes written: 2239569920 (2.1GiB, 23MiB/s)
INFO: tar: Exiting with failure status due to previous errors
INFO: remove vzdump snapshot
2020-10-05 00:01:39.401 7fb21b1df700 -1 librbd::object_map::InvalidateRequest: 0x7fb214027660 should_complete: r=0
Removing snap: 100% complete...done.
ERROR: Backup of VM 101 failed - command 'set -o pipefail && tar cpf - --totals --one-file-system -p --sparse --numeric-owner --acls --xattrs '--xattrs-include=user.*' '--xattrs-include=security.capability' '--warning=no-file-ignored' '--warning=no-xattr-write' --one-file-system '--warning=no-file-ignored' '--directory=/nvme-ext4/dump/vzdump-lxc-101-2020_10_05-00_00_02.tmp' ./etc/vzdump/pct.conf ./etc/vzdump/pct.fw '--directory=/mnt/vzsnap0' --no-anchored '--exclude=lost+found' --anchored '--exclude=./tmp/?*' '--exclude=./var/tmp/?*' '--exclude=./var/run/?*.pid' ./ | zstd --rsyncable '--threads=1' >/nvme-ext4/dump/vzdump-lxc-101-2020_10_05-00_00_02.tar.dat' failed: exit code 2
INFO: Failed at 2020-10-05 00:01:40

- and to pbs:
INFO: CT Name: rsnapshot-atkinson
INFO: including mount point rootfs ('/') in backup
INFO: including mount point mp0 ('/bkup') in backup
INFO: backup mode: snapshot
INFO: ionice priority: 5
INFO: suspend vm to make snapshot
INFO: create storage snapshot 'vzdump'
INFO: resume vm
INFO: guest is online again after <1 seconds
INFO: creating Proxmox Backup Server archive 'ct/88888/2020-10-04T12:44:46Z'
INFO: run: /usr/bin/proxmox-backup-client backup --crypt-mode=none pct.conf:/var/tmp/vzdumptmp11381/etc/vzdump/pct.conf root.pxar:/mnt/vzsnap0 --include-dev /mnt/vzsnap0/./ --include-dev /mnt/vzsnap0/./bkup --skip-lost-and-found --backup-type ct --backup-id 88888 --backup-time 1601815486 --repository pbs-user@pbs@127.0.0.1:sys36-store1
INFO: Starting backup: ct/88888/2020-10-04T12:44:46Z
INFO: Client name: sys36
INFO: Starting backup protocol: Sun Oct 4 08:44:46 2020
INFO: Upload config file '/var/tmp/vzdumptmp11381/etc/vzdump/pct.conf' to 'pbs-user@pbs@127.0.0.1:8007:sys36-store1' as pct.conf.blob
INFO: Upload directory '/mnt/vzsnap0' to 'pbs-user@pbs@127.0.0.1:8007:sys36-store1' as root.pxar.didx
INFO: catalog upload error - channel closed
INFO: Error: error at "bkup/rsnapshot-fbc-offsite/daily.0/server-backup/bkup/server-backup/pro4old-etc-kvm": catalog_encode_u64 failed - value >= 2^63
INFO: remove vzdump snapshot
ERROR: Backup of VM 88888 failed - command '/usr/bin/proxmox-backup-client backup '--crypt-mode=none' pct.conf:/var/tmp/vzdumptmp11381/etc/vzdump/pct.conf root.pxar:/mnt/vzsnap0 --include-dev /mnt/vzsnap0/./ --include-dev /mnt/vzsnap0/./bkup --skip-lost-and-found --backup-type ct --backup-id 88888 --backup-time 1601815486 --repository pbs-user@pbs@127.0.0.1:sys36-store1' failed: exit code 255
INFO: Failed at 2020-10-04 09:17:31
INFO: Backup job finished with errors
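The "catalog_encode_u64 failed - value >= 2^63" error above is a different failure from the "Structure needs cleaning" ones. I do not know which field tripped it. As a guess only, assuming it is a file size or modification time that does not fit in the range 0 to 2^63 - 1 (for example a timestamp before 1970, which is negative), a sketch like the following could list candidate entries under a directory. The names `fits_catalog_u64` and `suspect_entries` are made up here and are not part of PBS.

```python
import os

LIMIT = 2 ** 63  # the bound quoted in the error message


def fits_catalog_u64(value):
    """True if value lies in 0 .. 2^63 - 1 (assumed valid range)."""
    return 0 <= value < LIMIT


def suspect_entries(root):
    """Walk root and return (path, field, value) for out-of-range size or mtime."""
    suspects = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            # Floor division keeps pre-1970 timestamps negative.
            mtime_seconds = st.st_mtime_ns // 10 ** 9
            for field, value in (("size", st.st_size), ("mtime", mtime_seconds)):
                if not fits_catalog_u64(value):
                    suspects.append((path, field, value))
    return suspects
```

Run inside the container, it would be pointed at the directory named in the error, under the /bkup mount point. An empty result would mean this guess about the cause is wrong.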

KVM backups are not failing.
LXC backups have about one failure out of 20 every other day.
 