Sequential backup for VMs/CTs

Matthieu Le Corre

We are trying to back up our whole cluster to a PBS server.

This was working quite well, but for the last two months some backups have started to fail (QEMU guest agent timeout, QMP command timeout, ...).
It seems that, for some unknown reason, the storage below the PBS datastore is not fast enough (it is shared with another backup system).

But the real problem is that PVE starts backing up all the nodes (24 nodes) at the same time!
This is too much for our system, and some backups fail.

Is there a way to limit the number of backups launched at the same time?
Maybe an option like the one for bulk migration?

We can of course split the backup into 24 parts, but this is far from optimal.
A vzdump hook script could maybe play the waiting-room role, but once again this is not that nice...
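For reference, such a waiting-room hook could look roughly like this. This is a minimal sketch, not tested against a real cluster: the shared mount point `/mnt/pve/shared` and the sleep interval are invented, and it assumes `mkdir` on that shared filesystem is atomic across nodes.

```shell
#!/bin/sh
# Hypothetical vzdump hook script acting as a cluster-wide "waiting room".
# Assumption: /mnt/pve/shared is a storage mounted on every node, and
# mkdir on it is atomic, so only one node can hold the lock at a time.
LOCKDIR=/mnt/pve/shared/vzdump-cluster.lock
phase="$1"   # vzdump passes the current phase as the first argument

case "$phase" in
  job-start)
    # Wait until no other node is backing up, then take the lock
    while ! mkdir "$LOCKDIR" 2>/dev/null; do
      sleep 60
    done
    ;;
  job-end|job-abort)
    # Release the lock so the next node's job can proceed
    rmdir "$LOCKDIR" 2>/dev/null
    ;;
esac
```

The script would be wired up via vzdump's hook-script option. Note that a node crashing mid-backup would leave the lock directory behind, so real-world use would need some stale-lock handling.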
 
there is no built-in cross-node coordination for backup jobs. you can of course manually tune your start times to distribute the load, or set bandwidth limits (bwlimit). in the end, it sounds like the root cause is that your backup system isn't powerful enough, though?
 
Some kind of queue or lock would be great. I have different customers with the same problem: when you have big clusters with a lot of nodes, a lot of VMs, and big backups, you also need a fast network for the backup server if you have 24 nodes sending backups at the same time.
It is really difficult to plan different schedules for specific VMs/nodes.

https://bugzilla.proxmox.com/show_bug.cgi?id=3086
 
yeah, maybe I should have added a 'yet' to my post above ;) it should be possible to adapt the protocol to have server-side "backup slots" and queue jobs once all are full, and that would be an interesting feature indeed.
 
I can't agree more about a server-side solution; PBS is really great, but it is missing a scheduling system for backups.

For now I have split the job into 24 parts, and it seems the issue is gone!
However, the total job length used to be around 2 hours, and now it takes 8 hours, as there is a lot of idle time between the different nodes' jobs.
 
Has any progress been made towards this?

We are saturating a 20Gbit link with backups from the cluster and need to limit the number of concurrent nodes.
 
Hi,

Any news about this?
With a cluster it is hard to build a global backup policy: running all backups on all nodes at the same time is not easy...
So far I have only found a workaround: create a job per node and run the nodes one after another.
I hope you can add the possibility to back up all nodes with a single job, with a concurrency limit, without saturating the network.

PS: Sorry to bother you @fabian, do you think I should open a ticket through the PBS subscription to move this subject forward, or is it not worth it?

Best Regards
 
Hello,
I haven't seen any such addition in PBS 2.3, and from what I've tested the behavior is the same.
This causes significant issues on the backup side as soon as we have a cluster, or several clusters.
The subject is becoming more and more pressing on my side.
Do you have any information on the progress of this topic?
PS: I have already checked ticket 3086 (no news).

Best Regards
 
there hasn't been any progress, otherwise the bug tracker would have been updated ;)
 
Any news on the subject?

We are still suffering from this. Queuing so that only one node backs up at a time across the whole cluster is long awaited...
 
Same for me; it seems the (my) PBS server cannot cope with too many incoming streams from different servers.

What would help is a per-server backup scheme that starts on the next server once the previous server has finished its job.

I tried labeling every node by hand and putting its guests in a pool named after the node, but that didn't work out, as VMs get moved around. It also makes the pooling mechanism useless.

E.g. a "maximum number of parallel nodes" option in the backup definition screen would help.

Alternatively: maybe someone could create a script that lists the VMs and CTs per node and then creates a backup script backing up the ones on node 1 first, then node 2, then node 3, etc. I think this should be possible, as there is script support.

I also have multiple backup locations (2 Synology NFS NAS, 1 PBS), so a round-robin scheme that uses each storage location in turn would help me keep the number of backup definitions to a minimum.
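Until a built-in option exists, a "maximum number of parallel nodes" can be approximated from a management host with `xargs -P`, which caps how many commands run concurrently. A sketch only: the node names, root SSH access from a single box, and the storage ID are all assumptions.

```shell
#!/bin/sh
# Run vzdump on every node, but on at most MAX_PARALLEL nodes at once.
# Node names and the storage ID below are placeholders.
MAX_PARALLEL=4

printf '%s\n' node1 node2 node3 node4 node5 node6 node7 node8 |
  xargs -P "$MAX_PARALLEL" -I{} \
    ssh root@{} 'vzdump --all --storage pbs-datastore'
```

`xargs` starts the next node's backup as soon as one of the `MAX_PARALLEL` slots frees up, so the link never sees more than that many streams at once.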
 
Alternatively: maybe someone could create a script that lists the VMs and CTs per node and then creates a backup script backing up the ones on node 1 first, then node 2, then node 3, etc. I think this should be possible, as there is script support.
you can do a stupid bash script doing:

ssh root@server1 'vzdump --all ...'
ssh root@server2 'vzdump --all ...'
ssh root@server3 'vzdump --all ...'
 
you can do a stupid bash script doing:

ssh root@server1 'vzdump --all ...'
ssh root@server2 'vzdump --all ...'
ssh root@server3 'vzdump --all ...'
Of course, stupid me. I will make a couple of cron jobs. Thanks.

Nevertheless, better support for a proper backup strategy in the Proxmox VE and/or PBS UI would be beneficial for many users, e.g. a way to create the famous 3-2-1 type of backups.
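As a stop-gap for the round-robin idea mentioned earlier in the thread, the target storage could be rotated by date from a cron job. A sketch under invented assumptions: the storage IDs are placeholders and would have to match real entries in your storage configuration.

```shell
#!/bin/sh
# Rotate the backup target daily: day-of-year N uses storage (N mod 3).
# Storage IDs are placeholders (two NFS NAS boxes and one PBS datastore).
STORAGE_1=nfs-syno1
STORAGE_2=nfs-syno2
STORAGE_3=pbs-main

# Strip leading zeros from %j so the shell does not parse it as octal
doy=$(date +%j | sed 's/^0*//')
idx=$(( doy % 3 + 1 ))
eval "storage=\$STORAGE_$idx"

vzdump --all --storage "$storage"
```

Run daily, each location is used in turn, which gets close to a 3-2-1 layout with a single job definition per node.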
 