Unexpected backup failures

vaschthestampede

I have a PBS with 18 servers connected. The PBS server is a PowerEdge R740xd2 with two Intel(R) Xeon(R) Gold 6130 CPUs @ 2.10GHz and 128GB RAM.

Code:
proxmox-backup: 4.2.0 (running kernel: 7.0.0-3-pve)
proxmox-backup-server: 4.2.0-1 (running version: 4.2.0)
proxmox-kernel-helper: 9.0.4
proxmox-kernel-7.0: 7.0.0-3
proxmox-kernel-7.0.0-3-pve-signed: 7.0.0-3
proxmox-kernel-6.17: 6.17.13-6
proxmox-kernel-6.17.13-6-pve-signed: 6.17.13-6
proxmox-kernel-6.17.2-1-pve-signed: 6.17.2-1
ifupdown2: 3.3.0-1+pmx12
libjs-extjs: 7.0.0-5
proxmox-backup-docs: 4.2.0-1
proxmox-backup-client: 4.2.0-1
proxmox-mail-forward: 1.0.3
proxmox-mini-journalreader: 1.6
proxmox-offline-mirror-helper: 0.7.3
proxmox-widget-toolkit: 5.1.9
pve-xtermjs: 5.5.0-3
smartmontools: 7.4-pve1
zfsutils-linux: 2.4.1-pve1

All these servers have a scheduled backup at 6:30 PM.

The problem is that some backups fail, but not every time, not always for the same VMs, and not always on the same servers, and I can't work out why.

I attached the logs as files because the post wouldn't let me include them all inline.

Let me know what other information might be helpful.
 

I assume your PBS datastore resides on a filesystem provided by an iSCSI device. Is this filesystem shared with other datastores or with other, unrelated data? Note that by default PBS performs a syncfs at the end of a backup to ensure the data is persisted to disk. It might be that this takes a long time on your filesystem and the PVE client side runs into a timeout.
 
I assume your PBS datastore resides on a filesystem provided by an iSCSI device
Yes.

Is this filesystem shared with other datastores or other, unrelated data?
No, it is connected via two-meter P2P fiber, without even a switch in between.

It might be that this takes a long time on your filesystem and the PVE client side runs into a timeout.
How can I check this?
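
One rough way to check, as a sketch only: time a filesystem-level flush on the datastore's mount point while backups are running, and compare it with an idle run. This uses `sync -f` from coreutils, which flushes the filesystem containing the given path. `DATASTORE_PATH` and the `time_syncfs` helper are placeholder names, not anything from PBS; point the variable at the real mount point of the datastore. It only shows how long a flush takes at that moment, not what the PVE client's timeout is.

```shell
#!/bin/sh
# Placeholder: replace with the real mount point of the datastore.
# Defaults to the current directory so the script runs anywhere.
DATASTORE_PATH="${DATASTORE_PATH:-.}"

# Print the number of whole seconds a filesystem-level flush takes
# for the filesystem that contains the given path.
time_syncfs() {
    start=$(date +%s)
    sync -f "$1" || return 1
    end=$(date +%s)
    echo "$((end - start))"
}

elapsed=$(time_syncfs "$DATASTORE_PATH")
echo "flush of filesystem holding $DATASTORE_PATH took ${elapsed}s"
```

If the flush regularly takes a long time around 6:30 PM, when all servers back up at once, that would support the timeout theory.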
 
I set:
Code:
proxmox-backup-manager datastore update iSCSI --tuning 'sync-level=none'

I'll let it run to see whether the errors persist.
Unfortunately, since they're very random, it could take weeks to know whether it's resolved.
 
Make sure to restart the PBS services as well; there is currently a bug that prevents the sync level from taking effect immediately. Also, if performance is acceptable, I would recommend using file rather than none, to be on the safe side.
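
As a sketch of what that could look like, assuming the datastore is named iSCSI as in the command above and that the two standard PBS units (`proxmox-backup-proxy.service` and `proxmox-backup.service`) are the services meant here: the `run` wrapper and the `APPLY` variable are hypothetical helpers that only print the commands unless `APPLY=1` is set, so the steps can be reviewed first. Picking a moment when no backup is running is probably wise.

```shell
#!/bin/sh
DATASTORE="iSCSI"   # datastore name taken from the earlier command

# Hypothetical dry-run wrapper: prints each command unless APPLY=1 is set.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Switch from 'none' to the safer 'file' sync level.
run proxmox-backup-manager datastore update "$DATASTORE" --tuning 'sync-level=file'

# Restart the PBS services so the new sync level is picked up.
run systemctl restart proxmox-backup-proxy.service proxmox-backup.service

# Show the datastore configuration to confirm the tuning option is set.
run proxmox-backup-manager datastore show "$DATASTORE"
```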
 