"Unable to acquire lock" flags remote sync jobs as failed ❌

Dec 8, 2021
OK, I know that this is not an issue that most people are dealing with.

Code:
2023-02-02T09:27:20-06:00: sync group vm/105 failed - unable to acquire lock on snapshot directory "/mnt/datastore/PBS-01_ZFS-1/vm/105/2023-02-02T15:30:03Z" - locked by another operation

We take backups of our logging and monitoring solution every 20 minutes, offset by 10 (*:10, *:30, *:50). These backups are perfect, but we have a problem when syncing to our off-site backup in AWS. More often than not, the remote sync runs at the same time as the local backup (or while the backup is being verified). I expect this to happen, but the entire sync job is marked as failed, which is not really true: on the next run that backup will be synced, while the then-current one will be flagged as a failure, and the cycle continues.


I tried:
  • Adding an offset to the backups to minimize conflicts. It reduced the conflicts from every hour to every few hours, but 13 failures in 24 hours is the best I can get.
  • Making a separate sync task for vm/105, but that solution would still generate errors.
  • Disabling verification after backup, which is a setting I prefer to have on. When I scheduled verification at specific times, I tended to get two failed sync errors per day, which is better, but then almost every backup fails to sync for 2-3 hours.

I have some thoughts on solutions.
  1. Retry syncing all locked backups at the end of the sync job. This could give those backups time to be released.
  2. Allow some sort of timeout for locked items. Maybe this would be paired with #1: start a job that attempts to sync the locked items again after 90 seconds or so, before throwing an error.
  3. The ability to flag a backup or sync job as "skippable". If any one job gets skipped more than once, throw an error; otherwise throw a warning.
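
To make ideas #1 and #2 a bit more concrete, here is a rough Python sketch of the behaviour I have in mind. This is not how PBS is implemented, and every name in it (`SnapshotLocked`, `sync_with_deferred_retry`, `sync_one`) is made up for illustration. It only shows the control flow: sync everything, set locked snapshots aside, wait, then try those again before reporting an error.

```python
import time
from typing import Callable, Iterable, List, Tuple


class SnapshotLocked(Exception):
    """Hypothetical error: the snapshot is locked by another operation."""


def sync_with_deferred_retry(
    snapshots: Iterable[str],
    sync_one: Callable[[str], None],
    retry_delay: float = 90.0,
    max_retries: int = 1,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[List[str], List[str]]:
    """Sync every snapshot, retrying locked ones at the end of the job.

    Returns (synced, still_locked). Only snapshots left in still_locked
    after all retries would need to be reported as failures.
    """
    synced: List[str] = []
    deferred: List[str] = []

    # First pass: do not fail the job on a lock, just remember the snapshot.
    for snap in snapshots:
        try:
            sync_one(snap)
            synced.append(snap)
        except SnapshotLocked:
            deferred.append(snap)

    # Retry passes: wait, then try only the snapshots that were locked.
    for _ in range(max_retries):
        if not deferred:
            break
        sleep(retry_delay)
        still_locked: List[str] = []
        for snap in deferred:
            try:
                sync_one(snap)
                synced.append(snap)
            except SnapshotLocked:
                still_locked.append(snap)
        deferred = still_locked

    return synced, deferred
```

With something like this, a snapshot that is still being written or verified during the first pass would only count as a failure if it is still locked after the delay (90 seconds here, as in #2).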

Anyway, these are just my ideas, and I am open to other suggestions.
 
Hi,
feel free to open up a feature request on our bugtracker for this: https://bugzilla.proxmox.com/
Personally, I like the first suggestion. Should cover many cases already.
 
@rwithd Did you ever create a feature request for this? I've been running into this issue almost as long as I've been running PBS. It makes it hard to know whether a PBS sync has a more serious problem when you get several emails about this every day.
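
As a stopgap for the email noise, one option might be to run the sync task log through a small filter that separates lock conflicts from everything else. Below is a minimal Python sketch. The lock pattern is taken from the log line quoted in the first post. The assumption that other failures also contain "sync group ... failed" is mine, and `split_failures` is just an illustrative name, not a PBS tool.

```python
import re
from typing import Iterable, List, Tuple

# Pattern based on the log line quoted earlier in this thread.
LOCK_ERROR = re.compile(
    r'sync group (?P<group>\S+) failed - unable to acquire lock on snapshot directory '
    r'"(?P<path>[^"]+)" - locked by another operation'
)

# Assumption: other group failures also look like "sync group <name> failed ...".
ANY_GROUP_FAILURE = re.compile(r"sync group \S+ failed")


def split_failures(log_lines: Iterable[str]) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Split log lines into (lock_conflicts, other_failures).

    lock_conflicts holds (group, snapshot_path) pairs; other_failures holds
    the raw lines of failures that are not lock conflicts.
    """
    lock_conflicts: List[Tuple[str, str]] = []
    other_failures: List[str] = []
    for line in log_lines:
        match = LOCK_ERROR.search(line)
        if match:
            lock_conflicts.append((match.group("group"), match.group("path")))
        elif ANY_GROUP_FAILURE.search(line):
            other_failures.append(line)
    return lock_conflicts, other_failures
```

If `other_failures` comes back empty, the job only hit lock conflicts and the snapshot should be picked up on the next run. Anything else probably deserves a closer look.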
 