Backup job finished with errors

alex_31_ntl

New Member
Feb 28, 2024
Hello everyone,

For several days now, I've been experiencing problems backing up my virtual machines (VMs) on my Proxmox Backup Server (PBS).

To put this in context, I've deployed a dozen Proxmox Virtual Environment (PVE) nodes in my company, each hosting around 3 VMs. I've scheduled daily backups and made sure that the backups of the 3 VMs don't run simultaneously (with a one-hour interval between the VMs on each PVE node).

However, I encounter a random error affecting a few VMs every day. It's never the same VMs that fail: the next day, the backup of a previously failing VM runs correctly while another VM fails. Here are the logs I've observed on the PVE side:


1709126125195-png.63888




And on the PBS side:

1709126634733.png

For your information, the QEMU guest agent is enabled on the VMs. I've read on some forums that this can cause problems during backups.
Here is the I/O on the VMs during the backup phase; I see this peak on every server:


1709126716207.png
Could the problem be that the backups from the different PVE nodes run at the same time?

Example:


1709126764398.png

1709126813070.png


Do you have any idea what might be causing the problem?

I'm getting desperate... :)
Alexis
 
Hi,
Which Proxmox VE version are you using? Please post the output of pveversion -v. The error indicates that the client disconnected from the backup server without finishing the backup. Do you see any errors around that time in the systemd journal? journalctl --since <DATETIME> --until <DATETIME> will give you a paginated view of the journal for the selected timespan.
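As a rough sketch of how to pick the <DATETIME> values: the helper below (window_bounds is a hypothetical name, not a Proxmox tool) takes the time of the failed backup and prints start and end bounds a few minutes either side. It assumes GNU date, which is what a Debian-based PVE install normally ships.

```shell
#!/bin/sh
# Hypothetical helper, not part of Proxmox: print "start|end" bounds
# for a window of +/- N minutes around a timestamp (needs GNU date).
window_bounds() {
    ts="$1"
    minutes="${2:-10}"
    # Convert the local timestamp to epoch seconds; fail on bad input.
    epoch=$(date -d "$ts" +%s) || return 1
    start=$(date -d "@$((epoch - minutes * 60))" '+%Y-%m-%d %H:%M:%S')
    end=$(date -d "@$((epoch + minutes * 60))" '+%Y-%m-%d %H:%M:%S')
    printf '%s|%s\n' "$start" "$end"
}

# Example use on a PVE node (left commented out so the helper can be
# sourced without touching the journal):
#   bounds=$(window_bounds "2024-02-28 00:07:00" 7)
#   journalctl --since "${bounds%|*}" --until "${bounds#*|}" --no-pager
```

The --no-pager flag just dumps the output so it can be copied into a post as text rather than a screenshot.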
 
Hi Christ,
Here is the Proxmox version for each PVE node:

For the first two PVE nodes:
pveversion -v
proxmox-ve: 8.0.1 (running kernel: 6.2.16-5-pve)

For the third PVE node:
proxmox-ve: 8.1.0 (running kernel: 6.5.11-7-pve)

And the output of the systemd journal:

journalctl --since "2024-02-28 00:00:00" --until "2024-02-28 00:15:00"

1709133119723.png

The second and third PVE nodes show the same kind of result:

1709133457165.png
 
It seems like the VMs are no longer responsive. Are they still operating normally after the backup? Please check whether the VM process is stuck in an uninterruptible sleep state via ps auwxf or htop. Could it be that your backup server is not able to handle the load? Please also check the systemd journal on the PBS side.
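For reference, uninterruptible sleep shows up as a state starting with "D" in the STAT column of ps. A minimal sketch for narrowing the listing down to those processes (filter_dstate is a hypothetical helper name; it assumes the state is the second column, as in the ps invocation shown in the comment):

```shell
#!/bin/sh
# Hypothetical helper: keep the header line plus every process whose
# state (second column) starts with "D", i.e. uninterruptible sleep.
filter_dstate() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Example use with procps ps (left commented out so the helper can be
# sourced on its own); wchan shows the kernel function being waited in:
#   ps -eo pid,stat,wchan:32,args | filter_dstate
```

If only the header line comes back, nothing was in D state at that moment, so it is worth running this while a backup is actually hanging.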

On a side note: you do not need to create an individual backup schedule for each VM. If you create a single schedule, the backups for that node are already performed sequentially, not concurrently.

Edit: Please also have a look at this thread, follow the suggestions given by @fiona, and post the requested outputs: https://forum.proxmox.com/threads/e...-action-on-proxmox-qmp-command-failed.139892/
 