Bandwidth throttling for PVE backup job

His.Dudeness

Member
Mar 8, 2020
Hi,

I have an issue with backups from my Proxmox VE system to a file share. The problem is that the file share is on an older-model HPE MicroServer. If I write to it at gigabit speed over a long period of time, it tends to "take a break" to flush all the data to the disks. When copying from another Windows box I can see that the transfer rate sometimes drops to nearly zero and then resumes, which is no big deal.

PVE on the other hand seems to mind quite a bit when the backup server stops responding for a few seconds.

All the CIFS shares mounted from the backup server get a little grey "?" in the PVE web UI for a minute or two and the backup job hangs completely. It won't even stop with "vzdump --stop", and I have to unlock the VM (qm unlock) and reboot the PVE host.

By searching the forum I found out that I can throttle the backup speed with the --bwlimit parameter, but that makes the backup very slow. A job that takes 10 minutes unthrottled over 1 GbE takes almost 2 hours with a 50 MB/s throttle, which makes no sense to me.

It seems to me that the --bwlimit parameter limits the reading from disk and not the writing to the target file?
Is there a way to tell PVE to wait a little longer before crashing the backup job when the target share is not the fastest one?

Any ideas how to cope with that, other than buying new backup hardware?


Cheers
Michael
 
By searching the forum I found out that I can throttle the backup speed with the --bwlimit parameter, but that makes the backup very slow. A job that takes 10 minutes unthrottled over 1 GbE takes almost 2 hours with a 50 MB/s throttle, which makes no sense to me.
Yes, that is correct. It limits I/O bandwidth.
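
As a rough sanity check on why a read-side limit stretches the job so much, here is a small sketch. It assumes the limit applies to every byte read from the virtual disk, including zero blocks that never reach the target. The 215 GiB size and 50 MiB/s rate are only illustrative numbers, and estimate_seconds is a helper name made up for this example.

```shell
#!/bin/sh
# Rough estimate: seconds needed to read a whole virtual disk at a fixed rate.
# Assumption: the limit counts all bytes read, including zero blocks.
estimate_seconds() {
    size_gib=$1     # virtual disk size in GiB
    rate_mibs=$2    # effective read rate in MiB/s
    awk -v s="$size_gib" -v r="$rate_mibs" 'BEGIN { printf "%d\n", (s * 1024) / r }'
}

# Illustrative values: a 215 GiB disk read at 50 MiB/s
secs=$(estimate_seconds 215 50)
echo "about $((secs / 60)) minutes"
```

With these numbers the read alone takes well over an hour, no matter how little data actually ends up in the archive. As far as I remember the vzdump man page gives the --bwlimit value in KiB/s, so double-check the unit before comparing figures.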

Do you use a separate network (VLAN?) for backups, or do you have the possibility to set one up?

If so, you could create a VLAN or another network, add the VLAN or network interface to Proxmox, and limit its bandwidth with tc or wondershaper, for example.
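
A minimal sketch of the tc variant, using a token bucket filter on the backup interface. The interface name vmbr0.50 and the 400mbit rate are assumptions for illustration only, and shape_cmd / unshape_cmd are made-up helper names. The script only prints the commands unless APPLY=1 is set, so it can be reviewed before anything is changed.

```shell
#!/bin/sh
# Sketch: cap outgoing traffic on a dedicated backup interface with tbf.
# IFACE and RATE are assumptions; adjust them to your own setup.
IFACE="${IFACE:-vmbr0.50}"   # hypothetical VLAN interface used for backups
RATE="${RATE:-400mbit}"      # illustrative cap, well below line rate

# Build the tc command lines as text so they can be inspected first.
shape_cmd()   { echo "tc qdisc replace dev $1 root tbf rate $2 burst 1mb latency 50ms"; }
unshape_cmd() { echo "tc qdisc del dev $1 root"; }

if [ "${APPLY:-0}" = "1" ]; then
    # Needs root; installs the limit on the interface.
    eval "$(shape_cmd "$IFACE" "$RATE")"
else
    # Dry run: show what would be executed, and how to undo it.
    shape_cmd "$IFACE" "$RATE"
    unshape_cmd "$IFACE"
fi
```

A root qdisc like this shapes only traffic leaving the interface, which is the direction that matters when writing to the share. The limit affects everything sent over that interface, so it works best on a link used for backups only.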

That said, I don't like fixing the symptoms of an issue.
So I would suggest finding out why the backup target behaves like that, and then either give it more cache, change the caching algorithm, or simply disable the cache.
I can't imagine anything else causing this issue, but I think you should figure it out.
 
Two things:

1) If you have enough space on a local disk, back it up locally and then copy it across with the backup job hook.

2) Depending on what you are backing up, maybe get the free version of Veeam, install the agent on the VM, and do incremental backups.
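
For option 1, a hook script could look roughly like the sketch below. It assumes vzdump calls the script with the phase as the first argument and exports the archive path as TARGET during backup-end (older releases used TARFILE, as far as I know). REMOTE_DIR and handle_phase are hypothetical names chosen for this example.

```shell
#!/bin/sh
# Sketch of a vzdump hook: once a guest has been backed up to local storage,
# copy the finished archive to the slow share.
# REMOTE_DIR is a hypothetical mount point for the CIFS share.
REMOTE_DIR="${REMOTE_DIR:-/mnt/backup-share}"

handle_phase() {
    phase=$1
    # Archive path as exported by vzdump (assumed: TARGET, formerly TARFILE).
    archive="${TARGET:-${TARFILE:-}}"
    if [ "$phase" = "backup-end" ] && [ -n "$archive" ]; then
        cp -- "$archive" "$REMOTE_DIR/" || return 1
    fi
    return 0
}

# vzdump passes the phase as the first argument; all other phases are ignored.
handle_phase "${1:-}"
```

The script would be attached to the job with the --script option of vzdump, with the job itself writing to a local storage. Replacing cp with rsync and its --bwlimit option would throttle the write to the share without slowing down the backup of the VM itself.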
 
@AKA Veeam has a free tier... but you make a great point. I need to check out the backup server; it may be the better option.
 
Hi all!

Thanks for your help!

@Rassillon: I am already using the Veeam agent for the Windows VMs. I also used Veeam to migrate the VMs from XenServer to PVE :) The downside is that I have to manage and monitor the jobs on each VM individually. That's OK as there are only a few, but a complete VM backup now and then for disaster recovery would sure be nice.

@AKA: Proxmox Backup Server might be nice, but AFAIK it only runs on Linux. My backup server is a Windows box and unfortunately it needs to stay that way because it also does a couple of other things.
And yes: I also think the right thing to do would be to buy a faster backup target that can sustain gigabit write speeds to disk over a long period of time.

One thing makes me wonder: If the job fails it ALWAYS fails (or hangs) AFTER the backup of a VM.

e.g.:

Code:
INFO: 93% (200.0 GiB of 215.0 GiB) in 2h 11m 23s, read: 39.1 MiB/s, write: 449.0 KiB/s
INFO: 94% (202.1 GiB of 215.0 GiB) in 2h 12m 19s, read: 39.5 MiB/s, write: 39.5 MiB/s
INFO: 95% (204.3 GiB of 215.0 GiB) in 2h 13m 15s, read: 39.1 MiB/s, write: 19.9 MiB/s
INFO: 96% (206.4 GiB of 215.0 GiB) in 2h 14m 11s, read: 39.9 MiB/s, write: 0 B/s
INFO: 97% (208.6 GiB of 215.0 GiB) in 2h 15m 6s, read: 39.9 MiB/s, write: 0 B/s
INFO: 98% (210.7 GiB of 215.0 GiB) in 2h 16m 1s, read: 39.9 MiB/s, write: 0 B/s
INFO: 99% (212.9 GiB of 215.0 GiB) in 2h 16m 57s, read: 39.2 MiB/s, write: 0 B/s
INFO: 100% (215.0 GiB of 215.0 GiB) in 2h 17m 52s, read: 39.8 MiB/s, write: 74.0 B/s
INFO: backup is sparse: 68.77 GiB (31%) total zero data
INFO: transferred 215.00 GiB in 8272 seconds (26.6 MiB/s)

At that point it just stops, and it would stay there indefinitely if I didn't unlock the VM and reboot the host.

Even if the backup target is slow, locking up completely and then needing a host reboot is surely not the desired behavior?
Is there no way to tell PVE just to wait a little longer before crashing the job?

cheers
Michael
 
Can't you just set up a VM on your Proxmox host with Proxmox Backup Server and use the CIFS share as a datastore? That would solve the issue. I'm doing that with a Synology HA setup at a client's site too, and it works fine.
 
