backup finish failed: command error: error during syncfs: Invalid argument (os error 22)

pallingerpeter

Over the last few weeks I have gotten this error message many times; roughly one in every ten backups fails with it.
I understand that it means there was a server-side error during the final _sync_ call when the backup finishes.
However, I do not know how to debug this:
  • I tried looking at the logs in /var/log/proxmox-backup/tasks/ but found no clues
  • I looked at the journald logs, and found
    • Code:
      backup ended and finish failed: backup ended but finished flag is not set.
      removing unfinished backup
    • which is not really helpful
Does anyone have a clue what error 22 may mean? Or where else to search in the logs?
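For reference, the numeric code in the message can be decoded with a one-liner. Error 22 is EINVAL ("Invalid argument"); notably, EINVAL is not among the errors the syncfs(2) man page documents (those are EBADF, EIO, ENOSPC, EDQUOT), which hints that the underlying filesystem is the one returning it:

```shell
# Decode the numeric errno from the log line "os error 22":
python3 -c 'import errno, os; print(errno.errorcode[22], "->", os.strerror(22))'
# EINVAL -> Invalid argument
```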
 
Hi,
this error stems from the call to syncfs which ensures all contents are written to disk. What filesystem are you using?

There is the option to adjust the sync level in the datastore tuning options, see https://pbs.proxmox.com/docs/storage.html#datastore-tuning-options
Switching to a different level might help, but at the cost of potential data loss in case of power outage.
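For anyone finding this later, the tuning option can also be set from the CLI; a sketch, where the datastore name `store1` is hypothetical and the exact syntax should be checked against the linked docs:

```shell
# Hypothetical datastore name "store1"; sync-level can be
# "none", "file" or "filesystem" (the default).
proxmox-backup-manager datastore update store1 --tuning 'sync-level=none'
# Show the resulting configuration:
proxmox-backup-manager datastore show store1
```

(No test included, as this requires a live Proxmox Backup Server instance.)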
 
Hi,
this error stems from the call to syncfs which ensures all contents are written to disk. What filesystem are you using?
I am using CephFS. Maybe I will try to downgrade kernels to see if this behaviour is a regression of the cephfs kernel module (I had some (different) problems with that before).

There is the option to adjust the sync level in the datastore tuning options, see https://pbs.proxmox.com/docs/storage.html#datastore-tuning-options
Switching to a different level might help, but at the cost of potential data loss in case of power outage.
I have seen that option, and thought about using it.
Power outages are not much of a concern, as the server is on a UPS. Moreover, losing some backups (e.g. the last 10 minutes' worth) in case of a power failure is acceptable, as long as verification catches them (having to run verification after a power outage is fine, since outages are very rare thanks to the UPS).
The question is whether setting it to none will introduce a _lot_ of silent errors during normal operation that would only be detected by verification later and would have to be dealt with by hand regularly.
I think I will try both (sync level and kernel version downgrade) approaches eventually, if no better suggestions arise.
Thank you for your response!
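If it helps, a verification run can also be triggered on demand from the CLI to spot-check after relaxing the sync level; a sketch with a hypothetical datastore name:

```shell
# Start an on-demand verification of datastore "store1" (name is hypothetical):
proxmox-backup-manager verify store1
```

(No test included, as this requires a live Proxmox Backup Server instance.)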
 
Okay,
I am using CephFS. Maybe I will try to downgrade kernels to see if this behaviour is a regression of the cephfs kernel module (I had some (different) problems with that before).
yes, I do remember that there was the in-kernel cephfs bug [0], which we were able to pinpoint and fix thanks to your feedback and that of the rest of the community.

In addition to using an older kernel version, you might also want to check whether the fuse ceph client shows the same error behavior. As always for such issues, please post the output of pveversion -v and proxmox-backup-manager version --verbose, so the issue can be pinpointed more easily.

[0] https://bugzilla.proxmox.com/show_bug.cgi?id=5683
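To try the FUSE client, one might remount the datastore path with ceph-fuse instead of the kernel mount; a sketch where the mount point, client ID, and monitor address are all hypothetical:

```shell
# All names below are hypothetical; adapt to your cluster.
umount /mnt/cephfs-datastore
ceph-fuse --id pbs -m 192.0.2.10:6789 /mnt/cephfs-datastore
```

(No test included, as this requires a running Ceph cluster.)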
 
proxmox-ve: 8.3.0 (running kernel: 6.8.12-8-pve)
pve-manager: 8.3.3 (running version: 8.3.3/f157a38b211595d6)
proxmox-kernel-helper: 8.1.0
pve-kernel-6.2: 8.0.5
proxmox-kernel-6.8: 6.8.12-8
proxmox-kernel-6.8.12-8-pve-signed: 6.8.12-8
proxmox-kernel-6.8.12-7-pve-signed: 6.8.12-7
proxmox-kernel-6.5.13-6-pve-signed: 6.5.13-6
proxmox-kernel-6.5: 6.5.13-6
proxmox-kernel-6.2.16-20-pve: 6.2.16-20
proxmox-kernel-6.2: 6.2.16-20
pve-kernel-6.2.16-3-pve: 6.2.16-3
ceph: 17.2.7-pve3
ceph-fuse: 17.2.7-pve3
corosync: 3.1.7-pve3
criu: 3.17.1-2+deb12u1
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx11
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-5
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.1
libproxmox-backup-qemu0: 1.5.1
libproxmox-rs-perl: 0.3.4
libpve-access-control: 8.2.0
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.0.10
libpve-cluster-perl: 8.0.10
libpve-common-perl: 8.2.9
libpve-guest-common-perl: 5.1.6
libpve-http-server-perl: 5.2.0
libpve-network-perl: 0.10.0
libpve-rs-perl: 0.9.1
libpve-storage-perl: 8.3.3
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.5.0-1
proxmox-backup-client: 3.3.2-1
proxmox-backup-file-restore: 3.3.2-2
proxmox-firewall: 0.6.0
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.3.1
proxmox-mini-journalreader: 1.4.0
proxmox-widget-toolkit: 4.3.4
pve-cluster: 8.0.10
pve-container: 5.2.3
pve-docs: 8.3.1
pve-edk2-firmware: 4.2023.08-4
pve-esxi-import-tools: 0.7.2
pve-firewall: 5.1.0
pve-firmware: 3.14-3
pve-ha-manager: 4.0.6
pve-i18n: 3.3.3
pve-qemu-kvm: 9.0.2-5
pve-xtermjs: 5.3.0-3
qemu-server: 8.3.7
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.7-pve1
proxmox-backup-manager version --verbose
proxmox-backup 3.3.0 running kernel: 6.8.12-7-pve
proxmox-backup-server 3.3.2-1 running version: 3.3.2
proxmox-kernel-helper 8.1.0
proxmox-kernel-6.8 6.8.12-8
proxmox-kernel-6.8.12-8-pve-signed 6.8.12-8
proxmox-kernel-6.8.12-7-pve-signed 6.8.12-7
pve-kernel-6.2.16-3-pve 6.2.16-3
ifupdown2 3.2.0-1+pmx11
libjs-extjs 7.0.0-5
proxmox-backup-docs 3.3.2-1
proxmox-backup-client 3.3.2-1
proxmox-mail-forward 0.3.1
proxmox-mini-journalreader 1.4.0
proxmox-offline-mirror-helper unknown
proxmox-widget-toolkit 4.3.4
pve-xtermjs 5.3.0-3
smartmontools 7.3-pve1
zfsutils-linux 2.2.7-pve1
 
Update: I set the sync level to none, and no backups have failed since then. No verification failures have occurred during the weekly verify runs either.
I will consider this problem solved for now.