Backup Job Runs But Stops Midway — No Logs Generated for Most VMs (Happening on All 4 Servers)

Mayur Gamot

Apr 7, 2026
Hi everyone,

I am facing a serious backup issue and I need help understanding what is going wrong. The problem is happening on all 4 of our Proxmox servers — 2 servers in Delhi (India) and 2 servers in the USA. All four are showing the exact same behavior, so this does not seem to be a hardware or location-specific problem.


WHAT IS HAPPENING
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Every night our backup jobs start on schedule. The job runs, backs up some of the VMs (none confirmed on DelhiServer), and then it just stops. The remaining VMs on that server are never backed up, and the strange part is that there are no log entries for them at all: no failure message, no skip notice, nothing. They are simply absent from the backup log, as if the job never tried to back them up.

On our main Delhi server (DelhiServer), we did catch two actual error messages before the log went completely silent:

QMP command 'query-backup' timeout (VM 101 — Account)
QMP command 'query-backup' timeout (VM 104 — Window-Server-CAT)

After those two timeout errors, the remaining 16 VMs on that server have zero log output — no attempt, no error, nothing at all.


SCALE OF THE PROBLEM (Backup Date: 02 April 2026)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

DelhiServer (192.168.5.210)
→ Backed up successfully : 0 confirmed
→ Failed with error : 2 VMs (QMP timeout on VM 101 and VM 104)
→ No log entry at all : 16 VMs

Delhi Server-3 (192.168.5.212)
→ Backed up successfully : 3 VMs
→ Failed with error : 0
→ No log entry at all : 7 VMs

s4t1 (USA)
→ Backed up successfully : 8 out of 42 VMs
→ Failed with error : 0
→ No log entry at all : 34 VMs

s4t2 (USA — Hetzner PBS)
→ Backed up successfully : 14 out of 37 VMs
→ Failed with error : 0
→ No log entry at all : 23 VMs

In total across all 4 servers, 80 VMs have no backup and no log entry explaining why.
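
To double-check my own arithmetic, here is a small sketch that re-adds the per-server counts listed above. The names `ServerTally`, `TALLIES` and `summarize` are just mine for illustration. One assumption: for the two Delhi servers I did not state a total VM count, so the total is derived by assuming the three categories cover every VM on the server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerTally:
    """Per-server counts copied from the summary above (02 April 2026)."""
    name: str
    backed_up: int
    failed: int
    no_log: int

    @property
    def total(self) -> int:
        # Assumption: the three categories together cover every VM.
        return self.backed_up + self.failed + self.no_log


TALLIES = [
    ServerTally("DelhiServer", backed_up=0, failed=2, no_log=16),
    ServerTally("Delhi Server-3", backed_up=3, failed=0, no_log=7),
    ServerTally("s4t1", backed_up=8, failed=0, no_log=34),
    ServerTally("s4t2", backed_up=14, failed=0, no_log=23),
]


def summarize(tallies):
    """Add up each category across all servers."""
    return {
        "total": sum(t.total for t in tallies),
        "backed_up": sum(t.backed_up for t in tallies),
        "failed": sum(t.failed for t in tallies),
        "no_log": sum(t.no_log for t in tallies),
    }


if __name__ == "__main__":
    for t in TALLIES:
        print(f"{t.name}: {t.no_log} of {t.total} VMs have no log entry")
    print(summarize(TALLIES))
```

The per-server totals for s4t1 (42) and s4t2 (37) match the figures I quoted, and the "no log entry" column adds up to 80.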


SERVER INFORMATION — DelhiServer (192.168.5.210)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Note: I am sharing full details for DelhiServer right now. I will add the other 3 servers shortly.

Hardware
CPU : Intel Core i9-12900K (12th Gen)
RAM : 125 GB total | 99 GB used | 23 GB free
Swap : 8 GB total | 7.3 GB used (swap is nearly full)
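
In case it helps anyone reproduce the RAM and swap figures on their own host, this is a rough sketch of how I would read them from /proc/meminfo. The function names are my own, and I am assuming the standard `MemTotal`, `MemAvailable`, `SwapTotal` and `SwapFree` fields are present. It only reports percentages; it does not prove that memory pressure is the cause of the timeouts.

```python
import re

# Matches lines such as "MemTotal:       131072000 kB".
_FIELD = re.compile(r"^([A-Za-z0-9_()]+):\s+(\d+)")


def parse_meminfo(text: str) -> dict[str, int]:
    """Return the numeric fields of /proc/meminfo (mostly in kB)."""
    fields = {}
    for line in text.splitlines():
        match = _FIELD.match(line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def usage_percent(fields: dict[str, int]) -> tuple[float, float]:
    """Return (RAM used %, swap used %) based on MemAvailable and SwapFree."""
    mem_total = fields["MemTotal"]
    mem_used = mem_total - fields["MemAvailable"]
    swap_total = fields.get("SwapTotal", 0)
    swap_used = swap_total - fields.get("SwapFree", 0)
    mem_pct = 100 * mem_used / mem_total
    # A host without swap reports SwapTotal as 0; avoid dividing by zero.
    swap_pct = 100 * swap_used / swap_total if swap_total else 0.0
    return mem_pct, swap_pct


def host_usage(path: str = "/proc/meminfo") -> tuple[float, float]:
    """Read the given meminfo file and return (RAM used %, swap used %)."""
    with open(path, encoding="ascii") as handle:
        return usage_percent(parse_meminfo(handle.read()))
```

Calling `host_usage()` on the Proxmox host returns the two percentages. For comparison, the numbers I listed above work out to roughly 79% RAM used (99 of 125 GB) and 91% swap used (7.3 of 8 GB), although a "used" figure can differ from one based on `MemAvailable`.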

Software Versions
Proxmox VE : 9.1.0
PVE Manager : 9.1.6
Running Kernel : 6.17.13-2-pve
PBS Client : 4.1.5-1
QEMU/KVM : 10.1.2-7
OS : Debian GNU/Linux

Network
Primary IP : 192.168.5.210 (vmbr0 bridge)
Interface : enp6s0 (1500 MTU, UP)

Storage Overview
local (dir) : 94 GB total | 44 GB used | 46 GB free (49% used)
local-lvm : 1.71 TB total | 1.45 TB used | 261 GB free (84.75% used)
ct-storage : 1.86 TB total | 214 GB used | 1.65 TB free (11.51% used)
delhi-pbs : 1.86 TB total | 1.58 TB used | 280 GB free (84.99% used) ← Backup Target
delhi-pbs2 : 1.86 TB total | 1.58 TB used | 280 GB free (84.99% used) ← Backup Target



BACKUP CONFIGURATION
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Backup mode : Snapshot / Incremental
Backup schedule : Nightly at 21:00
Storage targets : Proxmox Backup Server (PBS) on-site for the Delhi servers; Hetzner PBS for the USA servers


WHAT WE HAVE ALREADY CHECKED
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

• All VMs are powered on and running at the time of backup — this is not a powered-off VM issue.
• Backup jobs are active and scheduled correctly in the Proxmox UI.
• No configuration changes were made before this issue started.
• The problem is identical across Delhi and USA locations, which makes a purely local network issue unlikely.
• Storage has space available, though PBS targets are running high at ~85%.


MY QUESTIONS
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

1. Why does the backup job not generate any log entry for the skipped VMs — not even a "skipped" or "failed" line?

2. Could the QMP timeout error on VM 101 and VM 104 be causing the entire job to abort silently for all remaining VMs in the queue?

3. Could high RAM usage (99/125 GB) and near-full swap (7.3/8 GB) on DelhiServer be causing the QMP timeout and backup failures?

4. Could the PBS storage being at ~85% capacity be contributing to the backup job stopping early?

5. Is there a known issue in PVE 9.1 or PBS 4.x where snapshot/incremental jobs exit early without logging all VMs?

6. How can I force Proxmox to log every single VM attempt — success or failure — so I can see exactly where it stops?
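
Related to question 6, until I know how to get more logging, this is a minimal sketch of how I could at least list which VM IDs never show up in a task log. Everything here is an assumption on my side: the helper names are hypothetical, the sample log lines are made up for illustration and are not real vzdump output, and I am assuming the log mentions each VM as "VM <id>". The list of expected IDs would have to come from somewhere else, for example from `qm list`.

```python
import re

# Assumption: the task log refers to guests as "VM <numeric id>".
VMID_PATTERN = re.compile(r"\bVM (\d+)\b")


def vmids_in_log(log_text: str) -> set[int]:
    """Collect every VM ID that is mentioned anywhere in the log text."""
    return {int(vmid) for vmid in VMID_PATTERN.findall(log_text)}


def missing_vmids(expected, log_text: str) -> list[int]:
    """Return the expected VM IDs that the log never mentions, sorted."""
    return sorted(set(expected) - vmids_in_log(log_text))


if __name__ == "__main__":
    # Hypothetical sample lines, only to show the idea.
    sample_log = (
        "ERROR: Backup of VM 101 failed\n"
        "ERROR: Backup of VM 104 failed\n"
    )
    expected = [101, 102, 103, 104]  # hypothetical list of VM IDs
    print(missing_vmids(expected, sample_log))
```

This would not explain why the job stops, but it would give an exact list of the VMs that are silently absent, instead of me counting them by hand.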


I am happy to share full vzdump logs, system logs (/var/log/syslog or journalctl output), or PBS task logs. Just let me know which ones would be most useful and I will attach them right away.

Thank you in advance for any help!
 