Experiencing consistent backup timeouts across multiple Proxmox nodes (VM 255 example):
Environment:
Questions:
Code:
255: 2025-03-24 00:46:35 INFO: Starting Backup of VM 255 (qemu)
255: 2025-03-24 00:46:35 INFO: status = running
255: 2025-03-24 00:46:35 INFO: VM Name: vmname255
255: 2025-03-24 00:46:35 INFO: include disk 'scsi0' 'ceph-nvme:vm-255-disk-0' 125G
255: 2025-03-24 00:46:35 INFO: backup mode: snapshot
255: 2025-03-24 00:46:35 INFO: bandwidth limit: 102400 KiB/s
255: 2025-03-24 00:46:35 INFO: ionice priority: 7
255: 2025-03-24 00:46:36 INFO: creating Proxmox Backup Server archive 'vm/255/2025-03-23T23:46:35Z'
255: 2025-03-24 00:46:36 INFO: enabling encryption
255: 2025-03-24 00:46:37 INFO: drive-scsi0: attaching fleecing image fleecing:vm-255-fleece-0 to QEMU
255: 2025-03-24 00:46:37 INFO: issuing guest-agent 'fs-freeze' command
255: 2025-03-24 00:48:42 INFO: issuing guest-agent 'fs-thaw' command
255: 2025-03-24 00:48:42 ERROR: VM 255 qmp command 'backup' failed - got timeout
255: 2025-03-24 00:48:42 INFO: aborting backup job
255: 2025-03-24 00:53:40 INFO: resuming VM again
255: 2025-03-24 00:53:41 ERROR: Backup of VM 255 failed - VM 255 qmp command 'backup' failed - got timeout
Environment:
- Proxmox VE 8.X clusters (10+ nodes)
- Ceph NVMe storage for VM disks
- Proxmox Backup Server (PBS) v3.3.2, separate hardware (HDD array with raid50)
- Backups via snapshot mode to PBS
- fleecing on fast local disk
- Timeouts occur randomly during backup QMP command, even on idle nodes.
- PBS shows no obvious resource saturation (CPU/RAM/network within limits).
- Retries sometimes work, but not reliably.
Questions:
- Anyone resolved similar "qmp command timeout" issues tied to PBS?
- Recommended PBS tweaks for large-scale environments?
- Known bugs in PBS v3.3.2 related to concurrent backups?