Backup Error for ONE VM only - ERROR: Backup of VM 100 failed - job failed with err -125 - Operation canceled

Dec 23, 2022
Hi,

I need some help identifying the root cause of a backup job that fails with: ERROR: Backup of VM 100 failed - job failed with err -125 - Operation canceled

PREVIOUS ISSUE IN DECEMBER AND HOW IT WAS RESOLVED

In December, I started having problems with a backup job as well, but with a different error code: job failed with err -5 - Input/output error
It was resolved by reinstalling Proxmox from scratch on the server. That fixed the issue for a few weeks, then it came back. I suspended the backup job and, more recently, the VM itself, as it was decommissioned. From then until last Thursday, everything was running smoothly.

DETAILS

Please find some technical information in the attached file:
- Physical server info
- Basic Proxmox version and setup
- pveversion -v
- journalctl -xe (pertaining to the last occurrence)
- qm config 100 (affected VM)
- smartctl -a for both devices (sda and sdb)
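
For anyone who wants to gather the same information on their own node, the list above could be collected roughly as follows. This is only a sketch: the output path `OUT` and the `collect` helper are illustrative names, not part of the original attachment, and the device names assume the same sda/sdb layout. Tools that are missing on the host are noted and skipped.

```shell
#!/bin/sh
# Collect the diagnostics listed above into a single text file.
# OUT is a hypothetical output path; VMID defaults to the affected VM (100).
OUT="${OUT:-./vm-diagnostics.txt}"
VMID="${VMID:-100}"

# Run one command and append its output under a header. Tools that are
# not installed are noted and skipped instead of aborting the script.
collect() {
    printf '\n===== %s =====\n' "$*" >> "$OUT"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" >> "$OUT" 2>&1 || printf '(command exited with status %s)\n' "$?" >> "$OUT"
    else
        printf '(%s not found on this host)\n' "$1" >> "$OUT"
    fi
}

: > "$OUT"                          # start with an empty report
collect pveversion -v               # Proxmox VE package versions
collect journalctl -xe --no-pager   # most recent journal entries
collect qm config "$VMID"           # configuration of the affected VM
collect smartctl -a /dev/sda        # SMART data, first disk (assumed name)
collect smartctl -a /dev/sdb        # SMART data, second disk (assumed name)
echo "Report written to $OUT"
```

Run it as root on the node so that `smartctl` and `qm` can read what they need, and review the file for hostnames or serial numbers before posting it.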

Let me know if you need further information to assist in troubleshooting.

Regards,
 

Attachments

  • 2023-03-29 - Investigation erreur - VM 100 (plume-asset-server)_redacted.txt
Updating the case here. I opened a ticket with Dell to confirm whether or not the problem was hardware-related. They found no issue with the hardware or the drives.
So I'd appreciate input from anyone who has faced the same issue AND identified the root cause. :)
Also, I updated Proxmox to the latest version and restarted the node. Unfortunately, I assume the BTRFS CSUM errors will not be fixed by this.
The last time this issue occurred, as mentioned, I reinstalled completely and restored my VMs onto the fresh install. I'm hoping I won't have to start from scratch again.
What bugs me is not only the need to reinstall, but also the loss of confidence in the system and its integrity, and the question of whether I will face more serious issues in the future.
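
As a possible way to see whether the BTRFS CSUM errors are still accumulating, the per-device error counters can be read and a scrub run to re-verify checksums. This is a sketch under assumptions: the mount point `MNT` and the `check_btrfs` helper are illustrative, and nothing here is taken from the attached logs.

```shell
#!/bin/sh
# MNT is an assumption: the mount point of the BTRFS filesystem that
# holds the VM disks. Adjust it to the actual storage path.
MNT="${MNT:-/}"

check_btrfs() {
    if ! command -v btrfs >/dev/null 2>&1; then
        echo "btrfs tool not found; nothing checked"
        return 0
    fi
    # Per-device error counters; a non-zero corruption_errs value
    # points at checksum failures on that device.
    btrfs device stats "$1" || echo "device stats failed (not BTRFS, or not root?)"
    # Foreground scrub (-B): re-reads data and metadata and verifies checksums.
    btrfs scrub start -B "$1" || echo "scrub failed or reported uncorrectable errors"
}

check_btrfs "$MNT"
```

A scrub can only repair a bad block when a good copy exists (for example in a RAID1 profile); on a single-device filesystem it will report the errors but cannot fix them.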
 
I was eventually told by Proxmox that BTRFS support is still experimental.
Has anyone had a similar issue and found a way to resolve it, apart from rebuilding the node from zero?
 
