[SOLVED] VM unresponsive after failed backup

bly

New Member
Mar 15, 2024
Hello, I need some help configuring the backup properly.
When the scheduled backup of all my VMs failed because the backup disk ran out of space, the VM that was being backed up at the moment of failure became unresponsive, and every attempt to shut it down or restart it failed.

I'd like to know how to handle this issue properly and which backup settings to use so it doesn't happen again. Any help or hint is appreciated, thank you!
I have now changed the backup settings to keep only the last backup instead of 2, but I'd like to configure it so that the VM doesn't become unresponsive if a backup fails for any reason.
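
Since the first failure came from the backup disk filling up, one precaution is to check the free space on the backup target before the job runs. Below is a minimal Python sketch of such a check. The function name, the path, and the size estimate are hypothetical and do not come from this thread, and the check by itself does not explain or prevent the VM hang.

```python
import shutil


def has_enough_space(target_dir: str, required_bytes: int) -> bool:
    """Return True if the filesystem holding target_dir has at least
    required_bytes of free space."""
    usage = shutil.disk_usage(target_dir)  # named tuple: total, used, free (bytes)
    return usage.free >= required_bytes


# Example call (assumed values, adjust to your own setup):
#   has_enough_space("/mnt/backup", 200 * 1024**3)  # 200 GiB estimate
```

A scheduled job could call this first and skip the backup (or send an alert) when it returns False.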
 
Hello,

How did the shutdown/restart fail? Could you please send us the task's log? You can find the logs at the bottom of our web UI, in the Logs section, by clicking on the failed task.
 
Hi, I have attached all the logs related to my attempts and to the failed backup.
In the end I had to shut down and restart the node to get the VM up again.
Thank you for your help!
 

Attachments

  • task-rsthost5-vzdump-2024-03-15T00_00_00Z.log (36.6 KB)
  • task-rsthost5-qmreboot-2024-03-15T08_02_01Z.log (37 bytes)
  • task-rsthost5-qmstop-2024-03-15T08_03_02Z.log (106 bytes)
  • task-rsthost5-qmreset-2024-03-15T08_03_57Z.log (106 bytes)
  • task-rsthost5-qmshutdown-2024-03-15T08_06_42Z.log (106 bytes)
It looks like many tasks were blocked due to

Code:
TASK ERROR: can't lock file '/var/lock/qemu-server/lock-412.conf' - got timeout

In most cases, it means there is a running task which might have hung. In such cases you can manually stop the task from the web UI.
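
To see which VM a blocked task was waiting on, the VMID can be read from the lock file name in the error line. Below is a minimal Python sketch, assuming the task log lines have exactly the format quoted above. The function and pattern names are hypothetical.

```python
import re
from typing import Optional

# Pattern modelled on the error line quoted above; the number in the
# lock file name is taken to be the VMID.
LOCK_TIMEOUT_RE = re.compile(
    r"can't lock file '/var/lock/qemu-server/lock-(\d+)\.conf' - got timeout"
)


def locked_vmid(log_line: str) -> Optional[int]:
    """Return the VMID from a lock-timeout error line, or None if the
    line is not a lock-timeout error."""
    match = LOCK_TIMEOUT_RE.search(log_line)
    return int(match.group(1)) if match else None
```

Running this over each attached task log would show whether the reboot, stop, reset and shutdown tasks were all waiting on the same VM's lock.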
 
Thank you for the reply.
The backup hung again last night, and trying to stop it from the UI didn't help: the backup process was still hung, and the UI shows that the affected VM's settings are locked because of the ongoing backup.
From the UI I got the PID of the backup process on the node, but kill -9 pid didn't help and the process was not killed. I also tried to kill zstd, but no luck there either.

Can you please tell me how to correctly abort the hung backup and unlock the VM settings?
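
One possible reason for kill -9 having no effect is that the process is in uninterruptible sleep (state D on Linux), typically while blocked on I/O. This is not confirmed in this thread. A process in that state generally does not act on signals, including SIGKILL, until the blocking kernel call returns. Below is a minimal Python sketch that reads the state from /proc. The helper names are hypothetical.

```python
from pathlib import Path


def parse_state(stat_text: str) -> str:
    """Extract the one-letter process state from /proc/<pid>/stat contents."""
    # Format is "pid (comm) state ..."; comm may contain spaces or
    # parentheses, so split on the last ')' instead of on whitespace.
    _, _, rest = stat_text.rpartition(")")
    return rest.split()[0]


def process_state(pid: int) -> str:
    """Read and parse the state of a live process (Linux only)."""
    return parse_state(Path(f"/proc/{pid}/stat").read_text())


def is_uninterruptible(pid: int) -> bool:
    """True if the process is currently in state D (uninterruptible sleep)."""
    return process_state(pid) == "D"
```

If is_uninterruptible() returns True for the backup or zstd PID, the process is most likely waiting on storage, and further kill attempts are unlikely to help until that I/O completes or fails.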
 
How did you try to stop the task?

At the bottom of the web UI there is a list of all tasks; click on one and press the "Stop" button.

For backups to hang in this reproducible fashion, I suspect there is a bottleneck somewhere in your setup. We recommend at least a 10G network for backups and enterprise-grade SSDs given directly to the Backup server.
 
Yes, I used the Stop button first, then tried to locate and kill the PID. When I restarted the node as a last resort, the shutdown hung waiting for the zstd process. It rebooted after a few minutes.

Last night I moved the backup target to faster storage, and the backup finished without issues. Thanks for the help!
 