Automated Backup Management in Proxmox via Pools

Apr 27, 2024
I would like to automatically add new VMs to a Pool.

I manage backups via Pool membership.
I would like new VMs to be backed up automatically.
So it would be great if I could automatically stick them in a "New VMs" Pool.

How would you do that? I've heard about hook scripts, and I can imagine much too complex methods involving shell scripts.

NOTE ... Scroll down. I built it all, screenshots and stuff ...
 
Magnus got me started. I still haven't found a way to evaluate VMs that don't have Pool memberships, but this might be just as good.

(@markusmaximus later contributed code using the .pool == null property, which was what I was looking for here.)

Code:
pvesh get /cluster/backup-info/not-backed-up -output-format=json

It gives me an ugly (I've always hated JSON) list of everything without a backup.
[{"name":"testbox","type":"qemu","vmid":626}]
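For what it's worth, that JSON is easy to pick apart with jq (assuming jq is installed; it's in the Debian repos PVE sits on). A quick sketch of pulling just the vmids out of a canned copy of that output:

```shell
# Static sample of the not-backed-up JSON; on a PVE node you would
# pipe the pvesh output in here instead of echoing a sample.
echo '[{"name":"testbox","type":"qemu","vmid":626}]' | jq -r '.[].vmid'
# → 626
```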

I've created dummy backup jobs for the two Pools that won't get backups, so they drop off this "not-backed-up" list.

So now all existing VMs are either members of Pools that have real backups or Pools that have dummy backups.
When I run this command, it's going to give me a list of new VMs that haven't been added to one of the backup Pools.
And it seems pretty straightforward to turn that into a command to add them to the NewVMs Pool, where they will be automatically backed up and gathered for further disposition later.

Using this API "not-backed-up" thing to manage Pool membership is definitely roundabout, but it's totally supported.
To be clear, I still don't have a way to explicitly check for lack of Pool membership, but linking Pools one-to-one to Backup Jobs lets you use the API to check backup status and infer Pool status. Instead of building that logic, I'll use this API thing.

I guess I'll have to code something. Blah.
 
Oooo.
This is ugly, but we are getting there.
This gives me a list of VMs with the rest of the data stripped out.

Code:
pvesh get /cluster/backup-info/not-backed-up -output-format=yaml | grep vmid

A bit of text cutting, and I'll have it down to the exact data I need.
Gotta write something to cycle through the list it produces.
I'm no coder, but this is pretty darn close to done.
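To sketch the rest of the text cutting on a canned sample (the field positions are an assumption based on the YAML output above, which indents each "vmid:" line with two spaces):

```shell
# Simulate the grep'd YAML lines, strip down to bare vmids, join with commas.
printf '  vmid: 626\n  vmid: 627\n' \
    | cut -f4 -d" " \
    | paste -sd,
# → 626,627
```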
 
Automated Backup Management in Proxmox via Pools

Taken as a whole, this is automated backup group management in Proxmox.
Is this new? Nobody's built this before?

Thanks to the community members that assisted.

Here it is.
  • In your PVE GUI, create role-designated Pools (UAT/Prod/etc). Create a NewVMs Pool (that should be empty).
    • (Click the Datacenter node and go to Permissions → Pools to create Pools.)
  • Add all VMs to a Pool. Everything gets Pool membership.
  • All Backup Jobs must be Pool-based. Create a Backup Job for every Pool.
    • (Just disable the Backup Job for Pools that you don't want to backup, but you must create the Job.)
    • Make a Job for NewVMs Pool (which should be empty) too.
[Screenshot: the role-designated Pools and their matching Backup Jobs in the PVE GUI.]
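If you'd rather script the setup than click through the GUI, the same steps can probably be done with pvesh. A sketch only; the schedule value and the storage name "local" are placeholders, and you should double-check the /cluster/backup parameters against your PVE version's API viewer:

```shell
# Create the catch-all Pool (POST /pools).
pvesh create /pools --poolid NewVMs

# Create a Pool-based Backup Job for it (POST /cluster/backup).
# Placeholder schedule and storage; adjust to taste.
pvesh create /cluster/backup --schedule "02:00" --storage local --pool NewVMs --enabled 1
```

These only run on a PVE node, so treat them as a starting point, not a tested recipe.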

The script logic is thus ...
  • If the script finds that you don't have a Backup, you must not be in a Pool ...
  • So you get added to the NewVMs pool, where you safely get backups ...
... even though we forgot about you. And then we come along later to figure out which Pool you really need to go in.

Forgive my coding if there are glaring mistakes. I'll fix anything really stupid if you tell me.
And for you one-liner freaks out there, this code is optimized to be understood, not for minimal byte-count.

Code:
#!/bin/bash
#2024 Generic Network Systems Proxmox team. Free to the world.
#This script adds VMs that are not backed up to a Pool called NewVMs.
#Create the Pool and a Backup Job for it before using this script.

#List the not-backed-up vmids as YAML, then strip to the vmid lines with grep.
notBackedUp="$(pvesh get /cluster/backup-info/not-backed-up -output-format=yaml \
    | grep vmid)"

#Exit the script if there are no results.
[ -z "$notBackedUp" ] && exit 0

#Strip the text, collapse the lines, add commas.
addToPool="$(echo "$notBackedUp" \
    | cut -f4 -d" " \
    | paste -sd,)"

#Execute: add the stragglers to the NewVMs Pool.
pvesh set /pools/NewVMs -vms "$addToPool"

#Log what was added.
echo "$(date) Added to NewVMs Pool: $addToPool" >> /var/log/newvms-pool.log

The above code runs, I don't get errors, and it does what's expected.
Save the script somewhere and run it from a cron job.
Your mileage may vary.
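For reference, the cron entry could look something like this (the path and the hourly schedule are just examples, not part of the script):

```shell
# /etc/cron.d/newvms-pool (example): run hourly, on ONE node in the cluster.
0 * * * * root /usr/local/sbin/newvms-pool.sh
```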


---------------------------------------
Other stuff:

Error checking could be added. Or logging. Or both. Sorry, it's a hack. I'll quite likely build them and add them here.
*Did that.

The "set /pools/" command throws errors if the VM is already in a pool (which shouldn't happen, but what if ...).
There is an option to override the errors and force a Pool change with "allow-move". Pool membership is exclusive; a VM can only be in one Pool.
Forcing pool membership changes could lead to unintended consequences if users unaware of the script were to add new VMs to the cluster and do unexpected things with Pools and Backup Jobs.
On the other hand, forcing NewVMs Pool membership on not-backed-up machines may help rein in users that are not correctly provisioning new VMs, and that is in fact part of the overall intent here. This is an option to be considered.
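If you wanted to go that route, the forced version of the pool-assignment line would look something like this (a sketch; "allow-move" is the parameter of PUT /pools/{poolid} mentioned above, so verify it against your PVE version before relying on it):

```shell
# Force VMs into NewVMs even if they already belong to another Pool.
pvesh set /pools/NewVMs -vms "$addToPool" -allow-move 1
```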

--------------------------------------
References:

pvesh
https://pve.proxmox.com/pve-docs/pvesh.1.html

pvesh /cluster/backup-info/not-backed-up
https://pve.proxmox.com/pve-docs/api-viewer/index.html#/cluster/backup-info/not-backed-up

pvesh set /pools/{poolid}
https://pve.proxmox.com/pve-docs/api-viewer/index.html#/pools/{poolid}

Proxmox API
https://pve.proxmox.com/wiki/Proxmox_VE_API
 
Updated the script for the 'no results' condition. It's the only error I've found to handle.

Fixed two formatting errors.

Tested it. Seems to work, no errors.

I'm adding it to a cron right now.
... Aaaaand ... It works!
VMs that don't have a backup go into the NewVMs Pool, just like they should.
And in that Pool, they will have a backup.

... And added logging.
 
I ran this script flawlessly in my lab for days. Tried deploying it to Prod at work today.
I failed to follow my own initial setup directions (which are in the header of the script itself), and ran my own script on a system I'd only half configured. Now I know what errors it produces when you do that.

I'm considering logging for the cases where people don't follow the directions and either don't set up the NewVMs Pool or don't have a Backup Job for every Pool. It fails quite gracefully. I could probably parse those returns and output them as configuration recommendations in a log.

Processing failures leads to a bit of a problem. The current logging model only logs successes. You get a record of completed changes, and I expect the entire log, over a long lifetime of many, many VMs being created, to be just a few K.

If I added failures to the log, and if the cron was misconfigured, the log file could fill the hard drive. If mistakes were made in both areas, it could fill a small drive in a few days of nonstop text-file growth. That would be bad.

So if I add failure logging, I also need to add log management.
And log management goes way beyond the simple demo script I intended to write.
But I'm actually using it. It's not a toy. It needs to always work.
I'm a hacker, I don't normally build durable code. So, maybe. After I learn how.

................................

... I've been running this in Prod. Decided to keep it simple and not log failures, so if it ever screws up, it will just stop working silently. It logs actions, and those actions seem to run correctly every time. Good deal. Done.

... I have it tested and running on PVE 7.2.4. No issues. Works. Pools work differently there. I don't like it. Gonna upgrade those boxen asap.

... This system is now running successfully on a number of clusters around the globe. It just cleans up if somebody makes a mess. Very nice.
 
I am bumping this, because it's probably the most significant thing I've written for Proxmox, and it's working very well for me.

In my systems, when somebody adds a new VM and forgets to back it up, this script adds them to a backup pool for NewVMs.

I can't tell you how many times this script has caught people neglecting the basics. Yes, even me. Today.

Everything in our systems is backed up, or consciously excluded from backups.
All of it. No exceptions.


... actually, I guess I could tell you. Because I implemented logging. Probly break some NDA if I did it tho.
 
Run it on one host in the cluster.
I run it with a cron every hour.

I'd be happy to help anyone that wants to setup a similar system.
The script relies on this logic, and for the logic to work, the Pools and Jobs need to be configured as described.
  • If the script finds that you don't have a Backup, you must not be in a Pool ...
  • So you get added to the NewVMs pool, where you safely get backups ...
 
Thanks, I modified it a little.

Instead of inferring it from whether a backup was taken, I am simply looking for any VM that is not in a pool already.
I am also adding all VMs to HA.

Bash:
#!/bin/bash
#2025 Generic Network Systems Proxmox team. Free to the world.
#This script adds VMs that are not in a pool to a Pool called NewVMs.
#Create the Pool and a Backup Job for it before using this script.

#List JSON vmids not in a pool in a single line with csv for bulk processing
notBackedUp=$(pvesh get /cluster/resources --type vm -output-format json | \
        jq -r '[.[] | select(.pool == null) | .vmid] | join(",")')
#List JSON vmids not setup for HA in multiple lines (due to no bulk adding feature)
noHaState=$(pvesh get /cluster/resources --type vm -output-format json | \
        jq -r '.[] | select(.hastate == null) | .vmid')

#Add the VMs to the Pool if no pool is assigned
if [ -n "$notBackedUp" ]; then
        pvesh set /pools/NewVMs -vms "$notBackedUp"
fi

#Add the VMs to HA one at a time (no bulk-add available)
if [ -n "$noHaState" ]; then
        for vmid in $noHaState; do
                ha-manager add vm:"$vmid" --state started --max_relocate 2 --max_restart 2
        done
fi

#Log what was added
echo "$(date) Added to NewVMs Pool: $notBackedUp" >> /var/log/newvms-pool.log
echo "$(date) Added to HA: $(echo "$noHaState" | paste -sd,)" >> /var/log/newvms-ha.log
 
.pool == null
Cool.
When I was working on this, I did not see this option. Thanks.

What I have works.
What you wrote is what I was originally trying to do, but couldn't find a direct way to evaluate Pool membership.

I don't think I'm going to change anything.
Everything on this page works and can be used as such or as an example about how to poke at and automate PVE Pool functions.
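For anyone following along, here is what that .pool == null selection does on a canned sample of /cluster/resources output (the field names match the real API, but the values here are made up for illustration):

```shell
# Two sample VMs: 100 is in a pool, 626 is not. Select the pool-less one.
echo '[{"vmid":100,"type":"qemu","pool":"Prod"},{"vmid":626,"type":"qemu"}]' \
    | jq -r '[.[] | select(.pool == null) | .vmid] | join(",")'
# → 626
```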
 
I just implemented this into my home environment also and it is working exactly as designed.

Thank you for this work, I have been pondering how I would achieve something like this myself.
 