Backups randomly fail with: qmp command 'backup' failed - got timeout

Jan 22, 2022
Hello,
I run a Proxmox Backup Server, and for the last few months I have been hitting this issue:

Code:
3004: 2025-09-13 00:53:00 INFO: Starting Backup of VM 3004 (qemu)
3004: 2025-09-13 00:53:00 INFO: status = running
3004: 2025-09-13 00:53:00 INFO: VM Name: TRUNC
3004: 2025-09-13 00:53:00 INFO: include disk 'scsi0' 'rbd_main:vm-3004-disk-1' 500G
3004: 2025-09-13 00:53:00 INFO: include disk 'efidisk0' 'rbd_main:vm-3004-disk-0' 528K
3004: 2025-09-13 00:53:00 INFO: include disk 'tpmstate0' 'rbd_main:vm-3004-disk-2' 4M
3004: 2025-09-13 00:53:00 INFO: backup mode: snapshot
3004: 2025-09-13 00:53:00 INFO: ionice priority: 7
3004: 2025-09-13 00:53:00 INFO: snapshots found (not included into backup)
3004: 2025-09-13 00:53:00 INFO: creating Proxmox Backup Server archive 'vm/3004/2025-09-12T22:53:00Z'
3004: 2025-09-13 00:53:00 INFO: enabling encryption
3004: 2025-09-13 00:53:00 INFO: attaching TPM drive to QEMU for backup
3004: 2025-09-13 00:53:00 INFO: drive-scsi0: attaching fleecing image rbd_main:vm-3004-fleece-0 to QEMU
3004: 2025-09-13 00:53:00 INFO: issuing guest-agent 'fs-freeze' command
3004: 2025-09-13 00:55:11 INFO: issuing guest-agent 'fs-thaw' command
3004: 2025-09-13 00:55:11 ERROR: VM 3004 qmp command 'backup' failed - got timeout
3004: 2025-09-13 00:55:11 INFO: aborting backup job
3004: 2025-09-13 00:56:27 INFO: resuming VM again
3004: 2025-09-13 00:56:27 INFO: removing (old) fleecing image 'rbd_main:vm-3004-fleece-0'
3004: 2025-09-13 00:56:28 ERROR: Backup of VM 3004 failed - VM 3004 qmp command 'backup' failed - got timeout

I tried upgrading PVE (now 8.4.8) and PBS (now 4.0.14), with no luck. I also tried splitting the backups from one cluster-wide job into per-node jobs.

This happens across nodes and across VMs (Debian, Alpine, Windows).
 
Hi,
what does the network/CPU/IO load on the backup server and on the Proxmox VE node look like around the time the issue happens? Please check the logs on the Proxmox Backup Server side to see if it's the same issue as described here: https://bugzilla.proxmox.com/show_bug.cgi?id=5080
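
To narrow down which time window to compare against the load graphs, a minimal Python sketch like the one below could pull the stalls out of the task log. It assumes the line format seen in the log posted above (`VMID: date time LEVEL: message`). The names `find_stalls` and `LINE_RE` and the 30-second threshold are illustrative choices, not part of any Proxmox tool.

```python
import re
from datetime import datetime

# Assumed line format, taken from the task log in this thread, e.g.:
# "3004: 2025-09-13 00:53:00 INFO: status = running"
LINE_RE = re.compile(
    r"^(\d+): (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (INFO|ERROR): (.*)$"
)


def find_stalls(log_text, min_gap_seconds=30):
    """Return (start, end, seconds, msg_before, msg_after) for every gap
    between consecutive log lines that lasts at least min_gap_seconds."""
    entries = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(2), "%Y-%m-%d %H:%M:%S")
            entries.append((stamp, match.group(4)))

    stalls = []
    # Compare each entry with the one that follows it.
    for (t0, msg0), (t1, msg1) in zip(entries, entries[1:]):
        gap = int((t1 - t0).total_seconds())
        if gap >= min_gap_seconds:
            stalls.append((t0, t1, gap, msg0, msg1))
    return stalls


# Excerpt copied from the log in the first post.
SAMPLE = """\
3004: 2025-09-13 00:53:00 INFO: issuing guest-agent 'fs-freeze' command
3004: 2025-09-13 00:55:11 INFO: issuing guest-agent 'fs-thaw' command
3004: 2025-09-13 00:55:11 ERROR: VM 3004 qmp command 'backup' failed - got timeout
3004: 2025-09-13 00:55:11 INFO: aborting backup job
3004: 2025-09-13 00:56:27 INFO: resuming VM again
"""

for start, end, secs, before, after in find_stalls(SAMPLE):
    print(f"{start:%H:%M:%S} -> {end:%H:%M:%S} ({secs}s): {before!r} -> {after!r}")
```

On that excerpt it reports a 131-second gap between the 'fs-freeze' and 'fs-thaw' lines and a 76-second gap before the VM is resumed, so those are the windows to line up with the load on both sides. Note that the timestamps in the task log are local time, while the archive name in the same log (`2025-09-12T22:53:00Z`) is in UTC.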