I woke up this morning to discover that the daily backup of one of my VMs had failed because the backup destination filled up. I freed up space, but PVE was already exhibiting weird behavior. The VM would not fully start (there was an error mounting a drive), which I assumed was due to the backup lock still present on the VM (according to the web UI). There was also still a backup task shown in the Tasks section. The Stop button was greyed out, so I started looking into how to stop the backup and/or remove the lock.
I found no vzdump processes running, so I proceeded with trying to unlock the VM. The `qm unlock` command did nothing; there was no error, but it had no effect. I then tried removing lock files from /etc/pve/qemu-server and /var/lock/qemu-server, which also had no effect. I then found references on how to clear tasks using the API and tried that, to no avail. I removed files in the /var/log/pve/tasks subdirectories (only the ones related to the VM in question), also to no avail. I then discovered how to remove entries from the database, and that had no effect either. I rebooted the node itself a number of times during this process.
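One way to tell whether the lock is really still recorded, rather than only displayed by the web UI, is to look for a `lock:` line in the VM's config file under /etc/pve/qemu-server. The following is a minimal sketch, assuming the lock is stored as a top-level `lock: <type>` line; the helper name `vm_lock_state` and the VMID 100 are placeholders for illustration, not anything from PVE or from my setup.

```shell
#!/bin/sh
# Sketch: report the lock recorded in a Proxmox VE VM config file.
# Assumption: the lock is stored as a top-level "lock: <type>" line
# (for example "lock: backup") before any "[snapshot]" section.
# vm_lock_state is a hypothetical helper name, not a PVE command.
vm_lock_state() {
    conf="$1"
    if [ ! -r "$conf" ]; then
        echo "cannot read $conf" >&2
        return 1
    fi
    awk '
        /^\[/    { exit }                       # stop at the first snapshot section
        /^lock:/ { print $2; found = 1; exit }  # print the lock type
        END      { if (!found) print "none" }   # no top-level lock line seen
    ' "$conf"
}

# Example (100 is a placeholder VMID):
#   vm_lock_state /etc/pve/qemu-server/100.conf
```

If this still prints `backup` after running `qm unlock`, the lock is genuinely in the config. If it prints `none` while the web UI keeps showing a lock, the config itself is probably not where the stale state lives.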
Next, I decided to try restoring the VM from the last good backup. This worked in that it resulted in a functional VM, but the PVE web UI still claimed it was locked and that a backup was running (although, presumably due to one of the files I removed, the task details now just showed an error about a missing file). It also showed the VM as running even when it was stopped, meaning I couldn't completely remove the VM via the UI (it didn't give me the option since it thought the VM was running).
Does anyone know:
A) How to fix this, short of reinstalling PVE and restoring all my VMs?
B) Why a failed backup would cause such problems? Shouldn't the task just die if something goes wrong?
Let me know if there are any specific logs or other info I can provide.
UPDATE: The backup task and the VM lock both cleared at some point, with everything seemingly back to normal. Still interested in why nothing I did resolved it, and why it happened in the first place, if anyone knows.