Sequential backup for VMs/CTs

Matthieu Le Corre

We are trying to back up our whole cluster to a PBS server.

This was working quite well, but over the last two months some backups have started to fail (QEMU guest agent timeout, QMP command timeout, ...).
The cause seems to be that the storage below the PBS datastore is not fast enough (it is shared with another backup system).

But the real problem is that PVE starts the backups on all nodes (24 nodes) at the same time!
This is too much for our system, and some backups fail.

Is there a way to limit the number of backups launched at the same time?
Maybe an option like the one for bulk migration?

We can of course split the backup into 24 parts, but this is far from optimal.
A vzdump hook script could maybe play the role of a waiting room, but once again that is not very nice...
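
For illustration, the kind of "waiting room" I have in mind would be something like the rough sketch below. It assumes the hook is enabled on every node (e.g. via the script option in /etc/vzdump.conf), that vzdump passes the phase name as the first argument, and that a shared path such as /mnt/pve/shared/ (hypothetical, adapt to your setup) is visible from all nodes:

```python
#!/usr/bin/env python3
# Rough sketch of a vzdump hook script used as a cluster-wide "waiting room".
# Hypothetical assumptions: enabled with "script: /usr/local/bin/backup-slot.py"
# in /etc/vzdump.conf on every node, and /mnt/pve/shared/ is a filesystem that
# every node can see. Only the node holding the lock file runs its backup job.
import errno
import os
import socket
import sys
import time

LOCK_FILE = "/mnt/pve/shared/backup-slot.lock"  # hypothetical shared path
POLL_SECONDS = 60

def acquire_slot():
    """Wait until we can atomically create the lock file, then claim it."""
    while True:
        try:
            fd = os.open(LOCK_FILE, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.write(fd, socket.gethostname().encode())
            os.close(fd)
            return
        except OSError as exc:
            if exc.errno != errno.EEXIST:
                raise
            time.sleep(POLL_SECONDS)  # another node is backing up, retry later

def release_slot():
    """Drop the lock so the next node can start; ignore it if already gone."""
    try:
        os.unlink(LOCK_FILE)
    except FileNotFoundError:
        pass

if __name__ == "__main__":
    phase = sys.argv[1] if len(sys.argv) > 1 else ""
    if phase == "job-start":
        acquire_slot()    # vzdump waits here, so the whole job is delayed
    elif phase in ("job-end", "job-abort"):
        release_slot()    # caveat: a node crashing mid-job leaves a stale lock
```

The obvious weakness is the stale lock if a node dies mid-job, which is exactly why a proper server-side queue would be nicer.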
 
There is no built-in cross-node coordination for backup jobs. You can of course manually tune your start times to distribute the load, or implement bandwidth limits (bwlimit). In the end, it sounds like the root cause is that your backup storage isn't powerful enough, though?
 
Some kind of queue or lock would be great. I have different customers with the same problem: big clusters with many nodes, many VMs, and big backups. You also need a fast network for the backup server if you have 24 nodes sending backups at the same time.
It is really difficult to plan different schedules for specific VMs/nodes.

https://bugzilla.proxmox.com/show_bug.cgi?id=3086
 
Yeah, maybe I should have added a 'yet' to my post above ;) It should be possible to adapt the protocol to have server-side "backup slots" and queue jobs once all of them are full; that would be an interesting feature indeed.
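
Purely to illustrate the idea (this is not PBS code; the slot count and names are made up): the server would admit up to N concurrent backup sessions, and everything beyond that waits in a queue until a slot frees up:

```python
# Toy illustration of server-side "backup slots" (not PBS code, values arbitrary):
# at most MAX_SLOTS backup sessions run at once, the rest are queued.
import threading
import time

MAX_SLOTS = 4  # hypothetical server-side limit
slots = threading.BoundedSemaphore(MAX_SLOTS)

def handle_backup_session(node: str) -> None:
    """Each incoming backup connection blocks here until a slot is free."""
    with slots:                # queued while all slots are busy
        print(f"{node}: backup running")
        time.sleep(1)          # stand-in for the actual backup transfer
    print(f"{node}: slot released")

# Simulate 24 nodes connecting at the same time; only 4 run concurrently.
threads = [threading.Thread(target=handle_backup_session, args=(f"node{i:02d}",))
           for i in range(24)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```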
 
I couldn't agree more about a server-side solution; PBS is really great, but it is missing a scheduling system for backups.

For now I have split the job into 24 parts and the issue seems to be gone!
However, the job used to take around 2 hours and now takes 8 hours, as there is a lot of idle time between the different nodes' jobs.
 
Has any progress been made on this?

We are saturating a 20 Gbit/s link with backups from the cluster and need to limit the number of nodes backing up concurrently.
 
Hi,

Any news about this?
With one or more clusters it is hard to build a global backup policy when a single job runs the backups on all nodes at the same time. So far the only workaround I have found is to create one job per node and run them node after node.
I hope you can add the possibility to back up all nodes with a single job, with a concurrency limit, so the network is not saturated.

PS: Sorry to bother you @fabian, do you think I should open a ticket through the PBS subscription to move the subject forward, or is it not worth it?

Best Regards
 
Hello,
I haven't seen any such addition in PBS 2.3, and from what I've tested the behavior is the same.
This causes significant issues on the backup side as soon as we have a cluster, or several clusters.
The subject is becoming more and more pressing on my side.
Do you have any information on the progress of this topic?
PS: I have already checked bug 3086 (no news).

Best Regards
 
there hasn't been any progress, else the bug tracker would have been updated ;)
 
Any news on the subject?

We are still suffering from this. A queue that runs one backup at a time across all the nodes in a cluster is long awaited...
 
