Backup failed: Can't acquire lock

proxwolfe

Hi,

I am transitioning from backing up to a NAS to backing up to a PBS.

Since yesterday, I have two backup jobs scheduled every night. Both back up the same VMs/LXCs - but start at different times.

- Starting at 1am to the NAS
- Starting at 2am to the PBS

The backup job to the NAS went through without issues (ended close to 9am).

The backup job to the PBS failed with the following log entries:

Code:
INFO: trying to get global lock - waiting...
ERROR: can't acquire lock '/var/run/vzdump.lock' - got timeout
TASK ERROR: got unexpected control message:

Can only one backup job run at a time? Or is something else going on?

Thanks!
 
Hi,
Can only one backup job run at a time? Or is something else going on?
Yes, there is a global lock for vzdump. The default wait time for the lock is 180 minutes, but it can be controlled with --lockwait.
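For example, when calling vzdump manually from the shell, the wait time (in minutes) can be passed directly; the VMID 100 and storage name "pbs1" below are just placeholders:

Code:
# wait up to 10 hours for the global lock instead of the default 3 hours
vzdump 100 --storage pbs1 --mode snapshot --lockwait 600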

 
Yes, there is a global lock for vzdump. The default wait time for the lock is 180 minutes, but it can be controlled with --lockwait
That explains it.

So, is this necessary? I mean, is there a technical reason why you shouldn't run two backup jobs at the same time? What is to be protected by the lock? The VM, the PVE host and/or the backup target?

Does it matter if the backup jobs back up different VMs/LXCs? (In my case they don't, I'm just trying to understand the mechanics.)

Does it matter if the backup jobs back up the same VMs/LXCs but at overlapping times? (Like in my case with a one hour offset.)

Does it matter if the backup jobs back up to different backup targets? (Like in my case to a NAS and a PBS.)

Is it safe to reduce the lock time in my case? Can I use "--lockwait" from the GUI or do I need to modify some config file from the terminal?

Thanks!
 
That explains it.

So, is this necessary? I mean, is there a technical reason why you shouldn't run two backup jobs at the same time? What is to be protected by the lock? The VM, the PVE host and/or the backup target?
I think it's because a backup can put quite a bit of load on the host/network.

Does it matter if the backup jobs back up different VMs/LXCs? (In my case they don't, I'm just trying to understand the mechanics.)

Does it matter if the backup jobs back up the same VMs/LXCs but at overlapping times? (Like in my case with a one hour offset.)

Does it matter if the backup jobs back up to different backup targets? (Like in my case to a NAS and a PBS.)
No, it's at most one active vzdump process at a time.

Is it safe to reduce the lock time in my case?
Then the backup will fail earlier. If you want to be sure the second backup runs as well, you'd need to increase the lock wait, so that it is still waiting when the first backup finishes.

Can I use "--lockwait" from the GUI or do I need to modify some config file from the terminal?
This is not possible via the GUI AFAICT. The file with the vzdump jobs is /etc/pve/vzdump.cron.
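A job entry in that file is a regular cron line calling vzdump, so --lockwait can be appended to it. Roughly like this (the schedule, VMIDs and storage name are only placeholders):

Code:
# /etc/pve/vzdump.cron (excerpt)
PATH="/usr/sbin:/usr/bin:/sbin:/bin"

# backup of VMs 100 and 101 at 2am, waiting up to 10 hours for the global lock
0 2 * * *           root vzdump 100 101 --quiet 1 --mode snapshot --storage pbs1 --lockwait 600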

 
No, it's at most one active vzdump process at a time.
I'm not sure I fully understand this yet: Is this another limitation, i.e. technically only one vzdump process can be active at a time? Or is it because of the locking mechanism which prevents more than one vzdump process from becoming active? In other words: If there were no locking, could there be two vzdump processes running at the same time?


Then the backup will fail earlier.
So then it would not fail because of the lock but because of some other mechanism (like only one vzdump process can be active at a time), right?


because a backup can put quite a bit of load on the host/network.
During the night, reduced performance/responsiveness would be acceptable to me (but of course that depends on the use case). Can the backup be sped up by allocating more resources (CPU) to the VM?


Maybe the easiest solution would be to just deactivate the old backup job and see if the new one works alright. I would just be more comfortable keeping both for a transition period. Or maybe I can schedule them one after the other, provided that, from the second run onwards, it doesn't take so long anymore.
 
I'm not sure I fully understand this yet: Is this another limitation, i.e. technically only one vzdump process can be active at a time? Or is it because of the locking mechanism which prevents more than one vzdump process from becoming active? In other words: If there were no locking, could there be two vzdump processes running at the same time?
Yes, it's the locking mechanism. I don't think there's anything in general that would prevent it (but I can't guarantee it either; there might be some corner case/assumption I'm missing). But if both jobs reached the same machine at the same time, one of them would fail, which is also not ideal.

So then it would not fail because of the lock but because of some other mechanism (like only one vzdump process can be active at a time), right?
It would fail because the time for waiting for the lock has run out. When vzdump starts, it tries to acquire the lock and waits for the configured lockwait time. If it can acquire the lock within the time, it will execute. If it cannot acquire the lock within the time, it aborts. If there is no other instance, it will get the lock immediately and start executing immediately.
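Just to illustrate the wait-then-give-up behaviour (this is only a rough sketch with flock on a dummy file, not how vzdump is implemented internally):

Code:
# conceptual illustration only - /tmp/demo.lock is a dummy file, not vzdump's real lock
touch /tmp/demo.lock
# first "job": takes the lock and holds it for 30 seconds in the background
flock --exclusive /tmp/demo.lock sleep 30 &
sleep 1
# second "job": waits up to 10 seconds for the lock, then gives up
# (vzdump waits up to the configured lockwait, 180 minutes by default)
flock --exclusive --wait 10 /tmp/demo.lock echo "got the lock - backup would run now" \
    || echo "got timeout - could not acquire the lock"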

During the night, reduced performance/responsiveness would be acceptable to me (but of course that depends on the use case). Can the backup be sped up by allocating more resources (CPU) to the VM?
I don't think this will make a difference. While it's true that the VM is started for the backup, it's started into a paused state and reading the data happens in a separate thread, which is not bound by the CPU limit configured for the VM AFAIK.

Maybe the easiest solution would be to just deactivate the old backup job and see if the new one works alright. I would just be more comfortable keeping both for a transition period. Or maybe I can schedule them one after the other, provided that, from the second run onwards, it doesn't take so long anymore.
If you can't finish both backups within the night, that might be better. If you want to make sure both backups run regardless, you should increase lockwait, so that the second backup will not abort after 180 minutes, but wait longer until the first one is finished.
 
If you can't finish both backups within the night, that might be better. If you want to make sure both backups run regardless, you should increase lockwait, so that the second backup will not abort after 180 minutes, but wait longer until the first one is finished.
Understood - thank you!
 
Yes, it's the locking mechanism. I don't think there's anything in general that would prevent it (but I can't guarantee it either; there might be some corner case/assumption I'm missing). But if both jobs reached the same machine at the same time, one of them would fail, which is also not ideal.
One more question please:

We established above that it is not advisable (and, therefore, prevented) that two vzdump processes run at the same time on the same PVE host.

Are there any limitations as to how many backup jobs (from different PVE hosts or different proxmox backup clients) can target one PBS at the same time?

Thanks!
 
One more question please:

We established above that it is not advisable (and, therefore, prevented) that two vzdump processes run at the same time on the same PVE host.

Are there any limitations as to how many backup jobs (from different PVE hosts or different proxmox backup clients) can target one PBS at the same time?
I'm not aware of such a limit, as long as the IDs are different (you should use one datastore for each cluster/standalone node anyways).
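As a rough example (host, user and datastore names are placeholders), two different nodes/clusters can each back up into their own datastore on the same PBS at the same time:

Code:
# from cluster/node A
proxmox-backup-client backup etc.pxar:/etc --repository backup@pbs@192.168.1.50:cluster-a
# from cluster/node B
proxmox-backup-client backup etc.pxar:/etc --repository backup@pbs@192.168.1.50:node-b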
 
Hi guys, to solve this problem just kill all vzdump processes and after that delete the file /var/run/vzdump.lock. When a new run starts, it will create the file again and everything will be solved.
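Roughly, the steps would be something like this (be careful: killing vzdump aborts any backup that is still running, so check first that nothing legitimate is active):

Code:
# list running vzdump tasks
ps aux | grep [v]zdump
# stop the stuck process(es) - this aborts any running backup
kill <PID>
# remove the stale lock file; it will be recreated on the next run
rm /var/run/vzdump.lock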
 
