Proxmox 9 Veeam 12.3.2 Worker start failed

Oct 4, 2025
Hello!
We have the problem that very often Veeam fails to start the worker.
Now the problem could be with Veeam or with Proxmox. I then always have to manually abort the worker start in Proxmox.
When I manually repeat the backup job, it usually works directly and the worker starts without problems.
In Proxmox itself I only see the following error message:
trying to acquire lock...TASK ERROR: can't lock file '/var/lock/qemu-server/lock-115.conf' - got timeout.
Which logs in Proxmox can I check to get a more detailed error message?
Greetings from Hamburg, Thorsten
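For the log question: the two usual places with more detail than the task error are the journals of the PVE daemons and the per-task logs under /var/log/pve/tasks/. A hedged sketch — `grep_task_logs` is just an illustrative helper, and the demo runs against a throwaway directory so it works anywhere:

```shell
#!/usr/bin/env bash
# Hedged sketch: where to dig for more detail than the one-line task error.
set -euo pipefail

# On the node itself, around the failure window:
#   journalctl -u pvedaemon -u pveproxy -u qmeventd --since "1 hour ago"

# Every task (including the failed qmstart) writes its own log file
# under /var/log/pve/tasks/; this helper lists files mentioning the lock:
grep_task_logs() {
    local taskdir="$1" vmid="$2"
    grep -rl "lock-${vmid}.conf" "$taskdir" 2>/dev/null || true
}

# On the node you would run: grep_task_logs /var/log/pve/tasks 115
# Portable demo against a temp directory:
d=$(mktemp -d)
echo "TASK ERROR: can't lock file lock-115.conf - got timeout" > "$d/UPID.log"
grep_task_logs "$d" 115
```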
 
We have the same problem.

rm /var/lock/qemu-server/lock-<vmid>.conf
qm unlock <vmid>

solves the problem for the next start.

From my point of view the issue is that after the shutdown of the Veeam worker machine a lock in /var/lock/qemu-server still exists. I have a support case open with the Proxmox team as well as with Veeam - no luck so far.
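The manual cleanup above can be made a little safer by first checking that no QEMU process for that VMID is still alive. A hedged sketch — the `pgrep` pattern assumes PVE's kvm processes carry `-id <vmid>` on their command line, and the lock directory is a parameter so the logic can be tried outside a real node (where it would be /var/lock/qemu-server, followed by `qm unlock`):

```shell
#!/usr/bin/env bash
# Hedged sketch of a guarded version of the rm-the-lock workaround.
set -euo pipefail

clear_stale_lock() {
    local lockdir="$1" vmid="$2"
    local lockfile="${lockdir}/lock-${vmid}.conf"
    [ -e "$lockfile" ] || { echo "no lock file for VM ${vmid}"; return 0; }
    # Assumption: PVE's kvm processes carry "-id <vmid>" on the command
    # line; adjust the pattern if yours look different.
    if pgrep -f "kvm.*-id ${vmid} " >/dev/null 2>&1; then
        echo "VM ${vmid} still has a QEMU process - leaving lock alone" >&2
        return 1
    fi
    rm -f "$lockfile"
    echo "removed stale ${lockfile}"
}

# Demo against a throwaway directory instead of /var/lock/qemu-server:
demo=$(mktemp -d)
touch "${demo}/lock-115.conf"
clear_stale_lock "$demo" 115
```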

Do you have a working solution/workaround in the meantime?

Best,
Frederik
 
The backup jobs run successfully and the worker is started when the number of VMs to be backed up is rather small (in my case the job works with 11 VMs; backup jobs with more than 18 VMs fail). This behavior is consistent and I'm able to reproduce the issue...
 
That looks like a root-disk issue: consumer SSDs/NVMe drives can become floppy-slow under write load, and applications such as Veeam may then hit a timeout and fail to remove the lock after a backup run. The lock then stays in place until the next run, which in turn prevents that backup run.
 
With "failing" I mean that Veeam is not able to start the Proxmox worker VM, which is the reason the backup job fails. We have enterprise-grade SSDs in the server, and the system is responsive and performs well. Proxmox support told us it could be some sort of race condition when the VM starts. One suggestion was to let Veeam process only one VM at a time, but then the worker is shut down and restarted for every VM to back up, and this consumes too much time - not an option in our case.
 
Veeam Backup & Replication works like this: you deploy a so-called PVE worker on each Proxmox host. This worker performs the backups of the VMs running on the corresponding Proxmox node. To back up a regular VM, Veeam starts this PVE worker - and that is where the problem lies. In certain cases Veeam is not able to start this worker.
 
Your process is clear, but not the symptom. As an example: you start a VM that runs a service; over time you apply updates inside it, and thanks to live migration the PVE hosts can be updated in between as well. So the VM runs for a while, and at its last state change - local boot or the last migration from one PVE node to another - it gets a local /var/lock/qemu-server/lock-<vmid>.conf file. But afterwards this file isn't "locked" against other processes such as a Veeam worker. It was just created and released, so you can write to it, e.g.:
root@pve4:/var/lock/qemu-server# ll /var/lock/qemu-server/lock-180.conf
-rw-r--r-- 1 root root 4 Nov 4 20:22 /var/lock/qemu-server/lock-180.conf
root@pve4:/var/lock/qemu-server# echo nix > lock-180.conf
root@pve4:/var/lock/qemu-server# cat lock-180.conf
nix
root@pve4:/var/lock/qemu-server# ll /var/lock/qemu-server/lock-180.conf
-rw-r--r-- 1 root root 4 Nov 4 20:22 /var/lock/qemu-server/lock-180.conf
--> So in the end your VM is running and your lock file could be months old (!!). The "lock" file isn't locked against other processes, so normally it cannot produce a file-lock error at all!! If the VM is shut down, PVE removes the lock, but as long as the VM is running or frozen the file is there, writable, and not locked. Understand?
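The point above can be demonstrated with `flock` on any Linux box: the lock on a PVE-style lock file only exists while a process holds an flock(2) on it, and the file itself stays writable the whole time because flock is advisory. A minimal sketch using a temp file instead of /var/lock/qemu-server:

```shell
#!/usr/bin/env bash
# Demonstration: a lock file's existence does not mean it is locked.
set -euo pipefail

lock=$(mktemp)

# No one holds the lock: a non-blocking attempt succeeds immediately.
flock -n "$lock" true && echo "file exists but is NOT locked"

# Hold an exclusive flock in a background subshell for a moment...
( flock -x 9; sleep 2 ) 9>"$lock" &
sleep 0.5

# ...now a non-blocking attempt fails, just like qm's "got timeout":
flock -n "$lock" true || echo "file IS locked right now"

# Yet writing to the file still works - flock is advisory only:
echo scribble > "$lock" && echo "still writable while locked"

wait
rm -f "$lock"
```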
 
Hello everyone,


we experienced very similar behavior to what is described in this thread – with the difference that in our case the Veeam PVE worker VM could no longer be stopped.


The issue occurred during a Veeam 12.3.2 backup run on Proxmox VE 9.
During the backup, Proxmox reported lock and QMP timeouts, including messages like:

can't lock file '/var/lock/qemu-server/lock-<vmid>.conf' - got timeout

The worker VM was still shown as running in Proxmox, although no QEMU process was actually running anymore.
A regular stop operation was no longer possible; only a manual unlock (qm unlock) or removal of the lock file resolved the inconsistent state.

At the same time, the underlying LUN ran out of free space, which caused:
  • the Veeam worker VM to effectively crash, and
  • one or more VMs that were being backed up to become unstable or crash as well.
After manually cleaning up the locks and restarting the backup job, the job usually completed successfully again. This strongly suggests a race condition or cleanup issue under high I/O load.

Important note:
So far, this has only happened once, and during the weekend, but if this behavior becomes reproducible or permanent, it is business critical for us, as it affects both production VMs and the backup infrastructure.

From our perspective, this appears to be a combination of:
  • Veeam backup and snapshot I/O pressure
  • a worker VM that is not properly stopped under load
  • lock handling in Proxmox with an inconsistent VM state
  • and a low-space condition on the storage/LUN
Current status / workaround:
We are currently testing the following mitigations:
  • reducing Veeam parallelism (VMs per job = 1, threads = 1)
  • keeping the worker VM running permanently instead of shutting it down after each job
  • optionally setting the worker VM hardware version to 9.2, which seems to be an older but known workaround
At this point, it is too early to say whether these measures fully resolve the issue.
We are monitoring the situation closely and will open a support case with Veeam if the problem occurs again.
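For the monitoring part, one option is a small cron-able check that reports lock files with no matching QEMU process, so a stuck worker is noticed before the next backup run. A hedged sketch — the directory is a parameter (on a PVE node it would be /var/lock/qemu-server), and the kvm `-id <vmid>` command-line pattern is an assumption:

```shell
#!/usr/bin/env bash
# Hedged monitoring sketch: flag lock files whose VM has no QEMU process.
set -euo pipefail

report_orphan_locks() {
    local lockdir="$1" f vmid
    for f in "$lockdir"/lock-*.conf; do
        [ -e "$f" ] || continue                # handle an empty glob
        vmid=${f##*/lock-}
        vmid=${vmid%.conf}
        # Assumption: a running PVE guest has "-id <vmid>" on its kvm
        # command line; adjust the pattern for your environment.
        if ! pgrep -f "kvm.*-id ${vmid} " >/dev/null 2>&1; then
            echo "orphan lock: $f (no QEMU process for VM ${vmid})"
        fi
    done
}

# Demo on a throwaway directory; point cron at the real path instead:
d=$(mktemp -d)
touch "$d/lock-9001.conf"
report_orphan_locks "$d"
```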
 
I've updated Veeam to 13.0.x and the problem is gone. Veeam 13 no longer stops the worker (or my changes were not overwritten by the update). I opened tickets with both Proxmox and Veeam. Proxmox also thought that some sort of race condition could occur... I really like Proxmox and Veeam; they just need another couple of months to get to know each other ;-)