The VM hangs if access to PBS is interrupted during backup

docent

Hello!
We have PVE 8.1.4 and PBS 3.1.
For two days in a row, our PBS has frozen while a backup was in progress. When that happens, the VM being backed up freezes as well.
If I try to restart this VM, I get the error message "TASK ERROR: VM is locked (backup)".
If I restart the PBS, the VM resumes, but the disk inside it is not accessible.
After the virtual machine is restarted, errors appear on the file system.
I reproduced this situation on PVE 7.4 and found exactly the same problem there.

Code:
INFO: starting new backup job: vzdump 100 --storage PBS --mailnotification always --mode snapshot --notes-template '{{guestname}}' --quiet 1
INFO: Starting Backup of VM 100 (qemu)
INFO: Backup started at 2024-03-27 00:15:01
INFO: status = running
INFO: VM Name: fs01
INFO: include disk 'scsi0' 'pool1:vm-100-disk-0' 16G
INFO: include disk 'scsi1' 'pool1:vm-100-disk-3' 1T
INFO: include disk 'efidisk0' 'pool1:vm-100-disk-2' 1M
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/100/2024-03-26T19:15:01Z'
INFO: started backup task 'ee316a35-3c9e-42fb-b52a-8a6942c3535e'
INFO: resuming VM again
INFO: efidisk0: dirty-bitmap status: OK (drive clean)
INFO: scsi0: dirty-bitmap status: OK (612.0 MiB of 16.0 GiB dirty)
INFO: scsi1: dirty-bitmap status: OK (7.7 GiB of 1.0 TiB dirty)
INFO: using fast incremental mode (dirty-bitmap), 8.3 GiB dirty of 1.0 TiB total
INFO:   6% (552.0 MiB of 8.3 GiB) in 3s, read: 184.0 MiB/s, write: 184.0 MiB/s
INFO:  17% (1.4 GiB of 8.3 GiB) in 6s, read: 302.7 MiB/s, write: 302.7 MiB/s
INFO:  20% (1.7 GiB of 8.3 GiB) in 9s, read: 88.0 MiB/s, write: 88.0 MiB/s
INFO:  23% (1.9 GiB of 8.3 GiB) in 12s, read: 86.7 MiB/s, write: 86.7 MiB/s
INFO:  26% (2.2 GiB of 8.3 GiB) in 15s, read: 80.0 MiB/s, write: 80.0 MiB/s
INFO:  29% (2.4 GiB of 8.3 GiB) in 18s, read: 84.0 MiB/s, write: 84.0 MiB/s
INFO:  32% (2.7 GiB of 8.3 GiB) in 21s, read: 88.0 MiB/s, write: 88.0 MiB/s
INFO:  40% (3.4 GiB of 8.3 GiB) in 24s, read: 230.7 MiB/s, write: 230.7 MiB/s
INFO:  42% (3.6 GiB of 8.3 GiB) in 27s, read: 68.0 MiB/s, write: 68.0 MiB/s
INFO:  45% (3.8 GiB of 8.3 GiB) in 30s, read: 73.3 MiB/s, write: 73.3 MiB/s
INFO:  48% (4.0 GiB of 8.3 GiB) in 33s, read: 80.0 MiB/s, write: 80.0 MiB/s
INFO:  50% (4.2 GiB of 8.3 GiB) in 36s, read: 77.3 MiB/s, write: 77.3 MiB/s
INFO:  53% (4.5 GiB of 8.3 GiB) in 39s, read: 86.7 MiB/s, write: 86.7 MiB/s
INFO:  60% (5.0 GiB of 8.3 GiB) in 42s, read: 176.0 MiB/s, write: 176.0 MiB/s
INFO:  66% (5.5 GiB of 8.3 GiB) in 45s, read: 186.7 MiB/s, write: 186.7 MiB/s
INFO:  71% (5.9 GiB of 8.3 GiB) in 48s, read: 128.0 MiB/s, write: 128.0 MiB/s
INFO:  73% (6.1 GiB of 8.3 GiB) in 51s, read: 60.0 MiB/s, write: 60.0 MiB/s
INFO:  79% (6.6 GiB of 8.3 GiB) in 54s, read: 180.0 MiB/s, write: 180.0 MiB/s
INFO:  82% (6.9 GiB of 8.3 GiB) in 57s, read: 80.0 MiB/s, write: 80.0 MiB/s
INFO:  82% (6.9 GiB of 8.3 GiB) in 8h 22m, read: 0 B/s, write: 0 B/s
ERROR: backup write data failed: command error: write_data upload error: pipelined request failed: connection reset
INFO: aborting backup job
INFO: resuming VM again
ERROR: Backup of VM 100 failed - backup write data failed: command error: write_data upload error: pipelined request failed: connection reset
INFO: Failed at 2024-03-27 08:37:02
INFO: Backup job finished with errors
INFO: notified via target `mail-to-root`
TASK ERROR: job errors
 
In the current architecture of PVE/PBS interaction, the stability and reliability of the PBS are crucial. It's literally in the direct write path of the VM while the backup runs.
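To illustrate why the backup target sits in the write path, here is a toy model of copy-before-write, the general technique behind snapshot-mode backups: before the guest overwrites a block that has not been backed up yet, the old contents have to reach the backup target first. This is only a sketch. The class and method names (`BackupTarget`, `DiskUnderBackup`, `guest_write`) are made up for the example and are not QEMU or Proxmox code, and a real hung server would make the upload block rather than raise an error right away.

```python
# Toy model of copy-before-write during a snapshot-mode backup.
# All names are hypothetical; this is not QEMU or Proxmox code.


class BackupTargetUnavailable(Exception):
    """Raised when the simulated backup target cannot accept data."""


class BackupTarget:
    def __init__(self):
        self.available = True
        self.chunks = {}

    def upload(self, block_no, data):
        if not self.available:
            # A real hung server would make this call block instead of failing fast.
            raise BackupTargetUnavailable(f"cannot upload block {block_no}")
        self.chunks[block_no] = data


class DiskUnderBackup:
    def __init__(self, blocks, target):
        self.blocks = list(blocks)
        self.target = target
        # Blocks whose point-in-time contents have not been saved yet.
        self.pending = set(range(len(self.blocks)))

    def guest_write(self, block_no, data):
        # The old contents must be saved before they are overwritten,
        # so this guest write has to wait for the backup target.
        if block_no in self.pending:
            self.target.upload(block_no, self.blocks[block_no])
            self.pending.discard(block_no)
        self.blocks[block_no] = data


target = BackupTarget()
disk = DiskUnderBackup(["a", "b", "c"], target)

disk.guest_write(0, "A")       # old "a" is saved first, then the write lands
target.available = False       # simulate the backup server going away
disk.guest_write(0, "AA")      # block 0 was already saved, no upload needed

try:
    disk.guest_write(1, "B")   # block 1 still pending, the write cannot proceed
    write_blocked = False
except BackupTargetUnavailable:
    write_blocked = True

print("guest write blocked:", write_blocked)
```

In the model, only writes to blocks that are still pending depend on the target, which matches the picture of a guest that stalls on disk I/O while the backup server is unresponsive.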
Your quickest way to resolution at this point is to stabilize your PBS, which means determining the cause of the hang.
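As a starting point for narrowing down when the hang began, the progress lines in the vzdump task log can be scanned for long gaps. The following is a minimal sketch, not a Proxmox tool: the function names and the 300-second threshold are arbitrary choices for the example, and the line and elapsed-time formats ("57s", "8h 22m") are assumed from the log posted above.

```python
import re

# Matches vzdump progress lines as they appear in the log above, e.g.
# "INFO:  82% (6.9 GiB of 8.3 GiB) in 57s, read: 80.0 MiB/s, write: 80.0 MiB/s"
PROGRESS = re.compile(r"INFO:\s+(\d+)% \(.*?\) in (.+?), read:")
UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def parse_elapsed(text):
    """Convert an elapsed string such as '8h 22m' or '57s' to seconds."""
    parts = re.findall(r"(\d+)\s*([dhms])", text)
    return sum(int(number) * UNIT_SECONDS[unit] for number, unit in parts)


def find_stalls(lines, max_gap_seconds=300):
    """Return (previous, current) pairs of (percent, elapsed_seconds)
    where the time between two progress lines exceeds the threshold."""
    stalls = []
    previous = None
    for line in lines:
        match = PROGRESS.search(line)
        if not match:
            continue  # not a progress line
        current = (int(match.group(1)), parse_elapsed(match.group(2)))
        if previous is not None and current[1] - previous[1] > max_gap_seconds:
            stalls.append((previous, current))
        previous = current
    return stalls


sample = [
    "INFO:  79% (6.6 GiB of 8.3 GiB) in 54s, read: 180.0 MiB/s, write: 180.0 MiB/s",
    "INFO:  82% (6.9 GiB of 8.3 GiB) in 57s, read: 80.0 MiB/s, write: 80.0 MiB/s",
    "INFO:  82% (6.9 GiB of 8.3 GiB) in 8h 22m, read: 0 B/s, write: 0 B/s",
]

stalls = find_stalls(sample)
for before, after in stalls:
    print(f"no progress after {before[0]}% between {before[1]}s and {after[1]}s")
```

Adding the stall offset to the "Backup started at" timestamp gives the approximate wall-clock time to look for in the PBS logs.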

https://forum.proxmox.com/threads/vms-freezing-and-unreachable-when-backup-server-is-slow.96521/
https://forum.proxmox.com/threads/pbs-incredibly-slow-guests-hang.132865/
https://forum.proxmox.com/threads/vm-freezes-on-when-backing-up-other-vm-on-the-node-to-pbs.140023/
https://forum.proxmox.com/threads/vms-freezez-during-backup-to-pbs.106020/


 
