[SOLVED] Mismatched backup_status response from API.

dmilbert

New Member
Jul 13, 2021
Hi, I am writing a script that sends individual emails to our customers, notifying them of their VM's backup status (successful or failed).
While testing, I discovered that, depending on where the backup is cancelled or where a problem occurs, the two APIs (Proxmox API & PBS API) respond with different status messages.

For example, when a backup is cancelled on the Proxmox Backup Server via the interface, the two APIs give the following responses:
PBS (Output of PHP Array):

Code:
Array
(
    [endtime] => 1627389701
    [node] => localhost
    [pid] => 3715528
    [pstart] => 21200829
    [starttime] => 1627389697
    [status] => task aborted
    [upid] => UPID:prx-backup:0038B1C8:01437FBD:0000033C:60FFFF01:backup:pbs\x3avm-997:host_backup@pbs:
    [user] => host_backup@pbs
    [worker_id] => pbs:vm/997
    [worker_type] => backup
)
Proxmox (Output of pvesh command):

Code:
{
    "endtime": 1627389702,
    "id": "997",
    "node": "prx001",
    "saved": "1",
    "starttime": 1627389694,
    "status": "job errors",
    "type": "vzdump",
    "upid": "UPID:prx001:0016A67B:274C6555:60FFFEFE:vzdump:997:dmilbert@pam:",
    "user": "dmilbert@pam"
  },

And when the backup is cancelled on the Proxmox server, the following responses are given:
PBS (Output of PHP Array):

Code:
Array
(
    [endtime] => 1627390823
    [node] => localhost
    [pid] => 3715528
    [pstart] => 21200829
    [starttime] => 1627390818
    [status] => backup ended but finished flag is not set.
    [upid] => UPID:prx-backup:0038B1C8:01437FBD:0000033D:61000362:backup:pbs\x3avm-997:host_backup@pbs:
    [user] => host_backup@pbs
    [worker_id] => pbs:vm/997
    [worker_type] => backup
)
Proxmox (Output of pvesh command):

Code:
{
    "endtime": 1627390823,
    "id": "997",
    "node": "prx001",
    "saved": "1",
    "starttime": 1627390815,
    "status": "interrupted by signal",
    "type": "vzdump",
    "upid": "UPID:prx001:001824BE:274E1B27:6100035F:vzdump:997:dmilbert@pam:",
    "user": "dmilbert@pam"
  },

As you can see in the above examples, the status/error messages differ between the APIs and depend on the cancellation method.

This makes things very confusing and makes debugging a hassle. If a backup were interrupted or failed instead of being cancelled, how would you know which API is correct or what actually happened?
 
Personally, I'd always consider a cancelled backup a failure, since no backup was created.

The messages differ because cancelling a task does not really communicate with the task itself; it only aborts it.
So if the PBS task is cancelled, PVE only sees that the connection was closed and ends with 'job errors'.

And if you cancel the backup on the PVE side, the backup server only sees that the client ended the backup without actually finishing it
('backup ended but finished flag is not set.').

Both error conditions can also occur for reasons other than cancelling the task.
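For the notification script, the four status strings seen in this thread could be mapped to the likely cause described above. This is only a sketch: the names `LIKELY_CAUSE` and `describe_failure` are made up for illustration, the table covers only the strings quoted in this thread, and the same strings can appear for failures other than a cancellation.

```python
# Hypothetical lookup table: status strings quoted in this thread mapped to
# the likely cause given in the explanation above. It is not a complete list
# of statuses that PVE or PBS can return.
LIKELY_CAUSE = {
    # Reported by PBS when its own backup task is aborted.
    "task aborted": "backup task was aborted on the PBS side",
    # Reported by PVE when the connection to PBS is closed under it.
    "job errors": "PVE lost the connection to PBS or the job failed",
    # Reported by PVE when the vzdump worker is killed by a signal.
    "interrupted by signal": "vzdump worker on PVE was stopped by a signal",
    # Reported by PBS when the client ends the backup without finishing it.
    "backup ended but finished flag is not set.":
        "client (PVE) ended the backup without finishing it",
}


def describe_failure(status: str) -> str:
    """Return a human-readable hint for a failed task status."""
    return LIKELY_CAUSE.get(status.strip(), "unrecognised status, treat as a failure")


if __name__ == "__main__":
    for status in ("task aborted", "interrupted by signal", "something else"):
        print(f"{status!r}: {describe_failure(status)}")
```

The hint is only meant for the operator's log or the email text; it does not decide whether the backup succeeded.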

While it might be possible to have better cancellation logic in PBS, a backup is basically an HTTP/2 upload, so we cannot relay that information
back to PVE.

In PVE we cannot really improve the cancellation logic either, since there we must rely on inter-process signals to kill the worker.

Also: any 'status' that is not 'OK' or 'WARNINGS: ...' is an error.
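A minimal sketch of that rule in Python, assuming the script has already fetched the task entries and extracted the `status` field as a string. The function name `backup_succeeded` is hypothetical, and the check assumes the warning status always begins with the literal prefix `WARNINGS:` as written above.

```python
def backup_succeeded(status: str) -> bool:
    """Return True only for 'OK' or a status starting with 'WARNINGS:'.

    Every other value, including the cancellation messages quoted in this
    thread, is treated as a failed backup.
    """
    status = status.strip()
    return status == "OK" or status.startswith("WARNINGS:")


if __name__ == "__main__":
    # Status strings taken from the task entries shown in this thread,
    # plus the two non-error forms named in the rule.
    samples = (
        "OK",
        "WARNINGS: 1",
        "task aborted",
        "job errors",
        "interrupted by signal",
        "backup ended but finished flag is not set.",
    )
    for status in samples:
        result = "success" if backup_succeeded(status) else "failed"
        print(f"{status!r}: {result}")
```

Because the check is an allow-list, any new or unexpected status string from either API is reported as a failure rather than silently treated as a success.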
 
Ok, thank you. I will create a workaround in my script to accommodate the logic used by the PVE and PBS servers.
 
