Commandline to show status of last backup

CCWTech

Is there a command line option or something that can be queried that would show the status of the last Proxmox Backup or backups in the last 24 hours? Like success or failure?

We have a proprietary RMM and it would be nice to add this query into our dashboard.
 
Code:
pvesh get /nodes/$(hostname)/tasks --type vzdump --limit 500 --output-format json | sed 's/},/}\
/g' | while read -r line; do
  # Pull the start time (epoch seconds) out of the JSON fragment
  start=$(echo "$line" | grep -o '"starttime":[0-9]*' | grep -o '[0-9]*')
  [ -z "$start" ] && continue
  now=$(date +%s)
  # Only report tasks started within the last 24 hours
  if [ $((now - start)) -lt 86400 ]; then
    upid=$(echo "$line" | grep -o '"upid":"[^"]*"' | cut -d'"' -f4)
    status=$(echo "$line" | grep -o '"status":"[^"]*"' | cut -d'"' -f4)
    time_fmt=$(date -d "@$start")
    echo -e "$upid\t$status\t$time_fmt\tvzdump"
  fi
done

This does it I think. But I don't know if there is a better way to detect it.
 
We have had issues with mail delivery before. I'm looking for something that actually grabs info from the box itself.
 
I think using the API like that is a good choice. There are many ways to handle the returned data, but since it's JSON, why not use jq like this?
Bash:
pvesh get /nodes/$(hostname)/tasks --type vzdump --since $(($(date +%s) - 86400)) --limit 500 --output-format json | jq -r '.[] | "\(.upid) \(.status) \((.starttime | strflocaltime("%a %b %e %I:%M:%S %p %Z %Y")) ) \(.type)"' | column -t
or
Bash:
pvesh get /nodes/$(hostname)/tasks --type vzdump --limit 500 --output-format json | jq -r '.[] | select(.starttime >= (now - 86400)) | "\(.upid) \(.status) \((.starttime | strflocaltime("%a %b %e %I:%M:%S %p %Z %Y")) ) \(.type)"' | column -t
You might have to play around with the time formatting, but I feel like this is much more flexible/"proper" and less error-prone.
 
If you're running PBS, you can also look into <zpool>/<datastore>/vm/<vmid> and check whether the fidx file is present for the most recent backup.
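A rough sketch of that file-system check in Python. It assumes the datastore keeps one directory per snapshot under vm/<vmid>/, named with a sortable UTC timestamp, and that the index files end in .fidx. The function name and the example path are made up for illustration.

```python
from pathlib import Path


def latest_snapshot_has_fidx(datastore_root, vmid):
    """Return (snapshot_name, has_fidx) for the newest snapshot directory.

    Returns (None, False) if the VM has no snapshot directory at all.
    """
    vm_dir = Path(datastore_root) / "vm" / str(vmid)
    if not vm_dir.is_dir():
        return None, False
    # Assumption: snapshot directories carry UTC timestamps such as
    # 2025-01-31T22:00:05Z, so sorting by name puts the newest one last.
    snapshots = sorted((p for p in vm_dir.iterdir() if p.is_dir()),
                       key=lambda p: p.name)
    if not snapshots:
        return None, False
    newest = snapshots[-1]
    return newest.name, any(newest.glob("*.fidx"))


# Usage with a hypothetical datastore path and VM ID
print(latest_snapshot_has_fidx("/mnt/datastore/backup", 100))
```

This only shows that an index file exists, not that the backup job finished successfully, so it is best treated as a sanity check next to the task status.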
 
So there are several approaches for checking backups on a specific node or a specific PBS. That's great, and @Impact's one-liner is a good start.

What I really would like to see is a cluster-wide check telling me which VMs have not been backed up since <timespan>, e.g. since yesterday.
 
I think using the API like that is a good choice. There's a lot of ways to handle the returned data but since it's JSON why not use jq like this?
I thought about that. Is jq installed by default?
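In case jq is not available, the same filtering can be done with Python's standard library, assuming python3 is present on the node. This is only a sketch: the function names are made up, and it relies on the same task fields used in the commands above (upid, status, starttime).

```python
import json
import socket
import subprocess
import time


def recent_backup_tasks(tasks, now, max_age=86400):
    """Keep only tasks that started within the last max_age seconds."""
    return [t for t in tasks if now - int(t.get("starttime", 0)) < max_age]


def fetch_vzdump_tasks():
    """Run the same pvesh query as above and parse its JSON output."""
    node = socket.gethostname()
    result = subprocess.run(
        ["pvesh", "get", f"/nodes/{node}/tasks", "--type", "vzdump",
         "--limit", "500", "--output-format", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


# Usage on a PVE node (not run here, since it needs pvesh):
#   for task in recent_backup_tasks(fetch_vzdump_tasks(), time.time()):
#       # a task that is still running may not have a status yet
#       print(task["upid"], task.get("status", "running"), sep="\t")
```

Keeping the filter separate from the pvesh call means the filter can be tested with canned JSON before it goes into a dashboard.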
 
If you're running PBS, you can also look into <zpool>/<datastore>/vm/<vmid> and check whether the fidx file is present for the most recent backup.
All are at different locations (each server is at a different site), so we're not using PBS.
 
What I really would like to see is a cluster-wide check telling me which VMs have not been backed up since <timespan>, e.g. since yesterday.
Good idea. We have this with my check above, but only for selected VMs. The program takes a VM ID, checks the PBS for a backup, converts the last timestamp to epoch, and compares it with the current epoch. Everything runs on the PBS itself, and it provides its findings via NRPE for Icinga.

Having it for all VMs should be doable, but only if you also back up all VMs ;)
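The epoch comparison described above might look roughly like this. It is not the actual check, just a sketch: check_backup_age and its default threshold are hypothetical, and the return codes follow the usual Nagios/Icinga plugin convention (0 = OK, 2 = CRITICAL).

```python
import time


def check_backup_age(last_backup_epoch, now_epoch, max_age_hours=24):
    """Compare the last backup time with the current time (epoch seconds).

    Returns a (plugin exit code, message) tuple.
    """
    if last_backup_epoch is None:
        return 2, "CRITICAL - no backup found"
    age_hours = (now_epoch - last_backup_epoch) / 3600
    if age_hours > max_age_hours:
        return 2, f"CRITICAL - last backup is {age_hours:.1f}h old"
    return 0, f"OK - last backup is {age_hours:.1f}h old"


# Usage: a backup that finished two hours ago
code, message = check_backup_age(time.time() - 2 * 3600, time.time())
print(code, message)
```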
 
What I really would like to see is a cluster-wide check telling me which VMs have not been backed up since <timespan>, e.g. since yesterday.
I hacked something together that just displays the results. You need to write a bit of code around it to get an actual check:

Python:
#!/usr/bin/env python3

from datetime import datetime, timedelta

from proxmoxer import ProxmoxAPI

# Connection to the PVE cluster, authenticated with an API token
pve = ProxmoxAPI(
    "proxmox",
    user="monitoring@pve",
    token_name="pbs",
    token_value="<redacted>",
    verify_ssl=False  # certificate verification is disabled here
)


vmids = {}

for pve_node in pve.nodes.get():
    for container in pve.nodes(pve_node['node']).lxc.get():
        vmids[container['vmid']]=container['name']
    for vm in pve.nodes(pve_node["node"]).qemu.get():
        vmids[vm['vmid']]=vm['name']

pbs = ProxmoxAPI(
    "pbs",
    service="PBS",
    user="monitoring@pbs",
    token_name="pbs" ,
    token_value="<redacted>",
    verify_ssl=False
)

# datastore name on PBS
datastore = "datastore"

for vm in vmids:
    # List all snapshots on the PBS whose backup ID matches this VM/CT ID
    backup = pbs(f"admin/datastore/{datastore}/snapshots/?backup-id={vm}").get()
    backups = len(backup)
    if backups == 0:
        print(f"{vm} no backup present")
    else:
        # The newest snapshot has the highest backup-time (epoch seconds)
        sorted_list = sorted(backup, key=lambda x: x["backup-time"])
        ts = sorted_list[-1]["backup-time"]
        ago = datetime.now() - datetime.fromtimestamp(ts)
        if ago < timedelta(hours=24):
            print(f"{vm} was backed up less than 24h ago ({ago})")
        else:
            print(f"{vm} needs a new backup - {ago}")
 
It's good to see the community interaction on this forum.
It was a small problem to try to solve with Python, and I started to get deeper into it.

P.S. I haven't tested the code as I don't use proxmoxer.
It was the first one that Google found. Is there something better? The API wrapper was very simple and straightforward to use.
 
Excellent, thank you to all who have commented, I'll have our lead developer take a look at this.
 
I hacked something together
Great!

I can run it, but I get zero output. Probably wrong permissions for my token, on the PBS-side. We'll see...

Should it work with the backups being in a sub-namespace? The root-namespace is empty...
 
I can run it, but I get zero output. Probably wrong permissions for my token, on the PBS-side. We'll see...
I have a user with Audit permissions on both sides.

Should it work with the backups being in a sub-namespace? The root-namespace is empty...
Don't use that on my test machine here, so I cannot tell.
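On the namespace question: as far as the PBS API documentation goes, the snapshot listing takes an ns parameter, so for backups in a sub-namespace the query would probably need it. The following is an untested sketch. snapshot_params is a made-up helper, customers/site-a is a placeholder namespace, and passing the dictionary to proxmoxer as keyword arguments is an assumption about how the wrapper forwards query parameters.

```python
def snapshot_params(vmid, namespace=None):
    """Build the query parameters for the PBS snapshots listing."""
    params = {"backup-id": str(vmid)}
    if namespace:
        # Assumption: "ns" selects the namespace; omit it for the root namespace
        params["ns"] = namespace
    return params


# Possible use with the pbs object from the script above (not run here):
#   backup = pbs.admin.datastore(datastore).snapshots.get(
#       **snapshot_params(vm, "customers/site-a"))
print(snapshot_params(100, "customers/site-a"))
```

If the root namespace is empty and no ns is sent, an empty result would be expected, which matches the zero output described above.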

I can't tell you, but Proxmox themselves maintain (and rely heavily on) Perl. They do provide a list of community clients for Python (and others) here.
Yeah, I've seen that list and use most of them in other languages.
 
Hello, and thank you, LnxBil, for the script.

Based on your work, I made this Nagios/Centreon plugin:
Python:
#!/usr/bin/env python3

########################################################################
##
## Written by 6adminIT
## Based on LnxBil script on https://forum.proxmox.com/threads/commandline-to-show-status-of-last-backup.168175/#post-782193
##
## Licensed under GPL (see below)
##
## Python nagios/centreon script to monitor PVE server backups on PBS
## server.
##
## Need proxmoxer wrapper available here : https://pypi.org/project/proxmoxer/
##
## Tested PVE version : 8.4
## Tested PBS version : 3.4
##
## This program is free software: you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation, either version 3 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program.  If not, see <http://www.gnu.org/licenses/>.
##
########################################################################

from proxmoxer import ProxmoxAPI
import argparse
from datetime import datetime, timedelta

########### Defaults, overridden by command line arguments if given
##
excluded = []
time_range = 24  # in hours

########### Parse command line args
##
parser = argparse.ArgumentParser(
    prog='check_pve_backups_to_pbs',
    description='Check Proxmox VE backups age on Proxmox Backup Server',
    epilog='Note that the API token user needs Audit permission on both servers')
parser.add_argument('--pveserver', help="PVE server IP or FQDN and port - eg: proxmoxserver.lan:8006", type=str, required=True)
parser.add_argument('--pveuser', help="PVE server user", type=str, required=True)
parser.add_argument('--pvetokenid', help="PVE API token ID", type=str, required=True)
parser.add_argument('--pvetoken', help="PVE API token value", type=str, required=True)
parser.add_argument('--pbsserver', help="PBS server IP or FQDN and port - eg: proxmoxbackup.lan:8007", type=str, required=True)
parser.add_argument('--pbsuser', help="PBS server user", type=str, required=True)
parser.add_argument('--pbstokenid', help="PBS API token ID", type=str, required=True)
parser.add_argument('--pbstoken', help="PBS API token value", type=str, required=True)
parser.add_argument('--datastore', help="name of the datastore of PBS", type=str, required=True)
parser.add_argument('--exclude', help="Optional - ID of excluded VM or CT, separate with commas - eg: 100,101", type=str, required=False)
parser.add_argument('--time', help="Optional - Range in hours to check backups (default 24)", type=int, required=False)

## Get args
args = parser.parse_args()
pve_server = args.pveserver
pve_user = args.pveuser
pve_tokenid = args.pvetokenid
pve_token = args.pvetoken
pbs_server = args.pbsserver
pbs_user = args.pbsuser
pbs_tokenid = args.pbstokenid
pbs_token = args.pbstoken
datastore = args.datastore

if args.exclude is not None:
    excluded = args.exclude.split(",")

if args.time is not None:
    time_range = args.time

########### OTHER VARS
##
check_backups = True
msg_backup = ""
vm_count = 0
# return state for centreon/nagios: 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN
stateNum = 0

########### BEGIN CHECKS
##
### Proxmox VE server
pve = ProxmoxAPI(
    pve_server,
    user=pve_user,
    token_name=pve_tokenid,
    token_value=pve_token,
    verify_ssl=False
)

vmids = {}

for pve_node in pve.nodes.get():
    for container in pve.nodes(pve_node['node']).lxc.get():
        vmids[container['vmid']]=container['name']
    for vm in pve.nodes(pve_node["node"]).qemu.get():
        vmids[vm['vmid']]=vm['name']

### Proxmox Backup server
pbs = ProxmoxAPI(
    pbs_server,
    service="PBS",
    user=pbs_user,
    token_name=pbs_tokenid,
    token_value=pbs_token,
    verify_ssl=False
)

for vm in vmids:
    if str(vm) not in excluded:
        vm_count += 1
        backup = pbs(f"admin/datastore/{datastore}/snapshots/?backup-id={vm}").get()
        backups = len(backup)
        if backups == 0:
            msg_backup = msg_backup + f"{vm} no backup present - "
            check_backups = False
            stateNum = 2
        else:
            sorted_list = sorted(backup, key=lambda x: x["backup-time"])
            ts = sorted_list[-1]["backup-time"]
            ago = datetime.now() - datetime.fromtimestamp(ts)
            if ago < timedelta(hours=time_range):
                msg_backup = msg_backup + f"{vm} was backed up less than {time_range}h ago - "
            else:
                msg_backup = msg_backup + f"{vm} needs a new backup - "
                check_backups = False
                stateNum = 2

if check_backups:
    print(f"OK - {vm_count} BACKUPS LESS THAN {time_range}h")
#    print(f"OK - {msg_backup}")
else:
    print(f"KO - {msg_backup}")

# exit() is only a helper from the site module; SystemExit works without any import
raise SystemExit(stateNum)
 