Some VMs get stuck during backup (full)

Matteo Calorio

Hello,

For a few weeks now, some VMs (5 out of 250) have had problems during backup to PBS.

We see the same problem on whichever cluster node we move the VMs to:

1689862395701.png

What we can see, but can't fully understand, is that these VMs get stuck at a certain point (always the same one). Both the backup and the VM itself hang: the VM shows high CPU and RAM usage and it's no longer accessible via the Proxmox interface.

If we move the disk files to another Ceph storage (slower, because it is HDD instead of SSD), the backup works and the VM keeps working even after the backup.

If we move the disks back, the problem occurs again.

The main storage (SSD) works with all other VMs. Even a VM restored from a backup taken months ago, before the problem first occurred, still shows the problem.

Moving the disks to a SAN and checking them with qemu-img check shows no errors. Checking the disks with chkdsk shows no errors either.

All the VMs with problems are Windows machines running SQL Server (but we have many more of those without this problem).

If we shut down the VMs and run a "Stop" type backup, it works; only the "Snapshot" type fails.

It seems so mysterious... Any ideas?

PS: this is a typical CPU/RAM graph:

1689865164573.png

Matteo
 