[SOLVED] huge difference between benchmark and real backups

May 5, 2021
Hi everyone,

First of all, congratulations on PBS: it literally saved us during the massive fire at the OVH data center in Strasbourg.

For several months we have been taking daily backups of a 1 TB LXC container (Nextcloud), and everything works fine except for the duration, which is always around 2h30 regardless of how much incremental data there is to upload. I looked at other discussions here and ran some tests, but I'm not sure about everything. I already understand that CTs are re-read entirely at each backup, so I'm looking into migrating this container to a VM. Still, I'm trying to understand whether this is the only explanation.

The usual result:

Bash:
100: 2021-05-04 02:01:32 INFO: root.pxar: had to backup 28.11 GiB of 965.61 GiB (compressed 9.96 GiB) in 9087.07s
100: 2021-05-04 02:01:32 INFO: root.pxar: average backup speed: 3.17 MiB/s
100: 2021-05-04 02:01:32 INFO: root.pxar: backup was done incrementally, reused 937.50 GiB (97.1%)

What matters here is the average of 3.17 MiB/s.

Our PBS is in another data center. It's an old QNAP that just meets the minimum requirements for PBS (no SSD, not very powerful)... but 3.17 MiB/s? So I ran two benchmarks:

From PVE to PBS

Bash:
➜  ~ proxmox-backup-client benchmark --repository root@pam@[remotePBS]:8007:[repository]

Uploaded 124 chunks in 5 seconds.
Time per request: 42582 microseconds.
TLS speed: 98.50 MB/s

┌───────────────────────────────────┬────────────────────┐
│ Name                              │ Value              │
╞═══════════════════════════════════╪════════════════════╡
│ TLS (maximal backup upload speed) │ 98.50 MB/s (8%)    │
├───────────────────────────────────┼────────────────────┤
│ SHA256 checksum computation speed │ 332.37 MB/s (16%)  │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 compression speed    │ 441.44 MB/s (59%)  │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 decompression speed  │ 686.69 MB/s (57%)  │
├───────────────────────────────────┼────────────────────┤
│ Chunk verification speed          │ 230.50 MB/s (30%)  │
├───────────────────────────────────┼────────────────────┤
│ AES256 GCM encryption speed       │ 1681.82 MB/s (46%) │
└───────────────────────────────────┴────────────────────┘

From PBS to itself

Bash:
root@pbs:~# proxmox-backup-client benchmark --repository [repository]

Uploaded 192 chunks in 5 seconds.
Time per request: 26597 microseconds.

┌───────────────────────────────────┬───────────────────┐
│ Name                              │ Value             │
╞═══════════════════════════════════╪═══════════════════╡
│ TLS (maximal backup upload speed) │ 157.70 MB/s (13%) │
├───────────────────────────────────┼───────────────────┤
│ SHA256 checksum computation speed │ 108.49 MB/s (5%)  │
├───────────────────────────────────┼───────────────────┤
│ ZStd level 1 compression speed    │ 178.54 MB/s (24%) │
├───────────────────────────────────┼───────────────────┤
│ ZStd level 1 decompression speed  │ 225.02 MB/s (19%) │
├───────────────────────────────────┼───────────────────┤
│ Chunk verification speed          │ 72.47 MB/s (10%)  │
├───────────────────────────────────┼───────────────────┤
│ AES256 GCM encryption speed       │ 240.78 MB/s (7%)  │
└───────────────────────────────────┴───────────────────┘

Conclusions:
  1. There is a significant difference between the two benchmarks. PVE seems to think that PBS is faster than it really is. Why?
  2. Even if we keep only the PBS self-test, its benchmark results are far faster than the real backup result. How is that possible?
To sum up: I know I'll have to migrate from CT to VM if I want a truly incremental backup, but the average speed of my backups is still really slow compared to what the benchmarks report.

Where can I investigate?

Regards,
 
the reported speed is "uploaded (logical) data / total time", so in your case, with most of the time spent reading data that is not uploaded, this speed will indeed be rather low, since it represents the average transfer speed without accounting for compression. if we calculated the speed based on total logical data instead of transferred data, we'd have the opposite problem (too high speeds reported) if the backup is of a VM with dirty-bitmaps, or if the data is read from cache rather than from disk.
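To make the arithmetic concrete, here is a small sketch that recomputes the figures from the backup log quoted in the first post. The variable names are only illustrative; the numbers are copied from that log, and the "read rate" is simply the total archive size divided by the same duration, assuming the whole container was read once.

```python
# Figures copied from the backup log in the first post.
GIB_IN_MIB = 1024

uploaded_gib = 28.11    # "had to backup" (logical data that was new)
compressed_gib = 9.96   # the same data after compression
total_gib = 965.61      # full size of root.pxar, re-read on every CT backup
duration_s = 9087.07    # total backup time in seconds


def mib_per_s(gib, seconds):
    """Convert an amount in GiB and a duration in seconds to MiB/s."""
    return gib * GIB_IN_MIB / seconds


reported_speed = mib_per_s(uploaded_gib, duration_s)  # what the log prints
read_speed = mib_per_s(total_gib, duration_s)         # data scanned on the client
wire_speed = mib_per_s(compressed_gib, duration_s)    # compressed bytes sent

print(f"reported (uploaded / total time): {reported_speed:.2f} MiB/s")
print(f"effective read rate:              {read_speed:.2f} MiB/s")
print(f"compressed data on the wire:      {wire_speed:.2f} MiB/s")
```

The first value reproduces the 3.17 MiB/s from the log, while the container itself was being read at a bit over 100 MiB/s for the whole 2.5 hours, which is where the time actually goes.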

the difference between the benchmark outputs is also easily explained - except for the TLS benchmark (which actually mimics a backup upload but with test data), the benchmark is only testing the machine where you execute the command. you likely have rather different CPUs on both ends, with the server being weaker. the TLS benchmark is faster when run locally on the PBS server, since your loopback interface (hopefully ;)) has lower latency and higher speed than your client -> server link.
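As a rough cross-check of the TLS numbers, the sketch below derives the upload speed from the "Time per request" line of each run. It assumes that every request uploads one 4 MiB chunk and that MB means 10^6 bytes; neither is stated in the output above, but under those assumptions the results match both reported TLS speeds.

```python
# Assumption: each benchmark request uploads one 4 MiB chunk.
CHUNK_BYTES = 4 * 1024 * 1024


def tls_speed_mb_s(time_per_request_us):
    """Upload speed in MB/s (10^6 bytes) for one chunk per request."""
    seconds = time_per_request_us / 1_000_000
    return CHUNK_BYTES / seconds / 1_000_000


remote = tls_speed_mb_s(42582)  # PVE -> remote PBS
local = tls_speed_mb_s(26597)   # PBS -> itself over loopback

print(f"remote: {remote:.2f} MB/s")
print(f"local:  {local:.2f} MB/s")
```

So the only number that differs because of the network is the per-request latency; every other row in the two tables just reflects the CPU of the machine the command was run on.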
 
I knew I missed something... Thank you very much!

Perhaps the purpose of the benchmark command isn't very clear: I thought it sent the entire benchmark to the repository's host... Apart from the TLS speed, is it really useful to benchmark a remote repository in that case?

Apart from the benchmark question, a suggestion for backups of an LXC container: output a different, more explicit message about how much data actually had to be read? Or just explain it in the documentation.

I guess this post should be marked as resolved anyway, thanks again :)
 
> Perhaps the purpose of the benchmark command isn't very clear: I thought it sent the entire benchmark to the repository's host... Apart from the TLS speed, is it really useful to benchmark a remote repository in that case?

yeah, the remote part only enables the TLS benchmark, nothing else.

> Apart from the benchmark question, a suggestion for backups of an LXC container: output a different, more explicit message about how much data actually had to be read? Or just explain it in the documentation.

we've been through a few iterations of how to display those stats, if you have any concrete suggestions feel free to post them ;)

> I guess this post should be marked as resolved anyway, thanks again :)

you can do that yourself by editing your initial post in the thread! :)
 
> we've been through a few iterations of how to display those stats, if you have any concrete suggestions feel free to post them ;)

Maybe something like this would lead to less confusion?

Bash:
100: 2021-05-04 02:01:32 INFO: root.pxar: had to read 965.61 GiB
100: 2021-05-04 02:01:32 INFO: root.pxar: had to backup 28.11 GiB (compressed 9.96 GiB)
100: 2021-05-04 02:01:32 INFO: root.pxar: total spent time: 9087.07s, average speed: 3.17 MiB/s
100: 2021-05-04 02:01:32 INFO: root.pxar: backup was done incrementally, reused 937.50 GiB (97.1%)
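For illustration only, the proposed layout can be generated from the same four raw values. This is a sketch of the suggestion, not how PBS produces its log; `format_summary` is a hypothetical helper, and the reused amount is assumed to be the total size minus the uploaded data.

```python
def format_summary(total_gib, uploaded_gib, compressed_gib, duration_s):
    """Render the proposed four summary lines (hypothetical helper)."""
    # Assumption: everything that was read but not uploaded was reused.
    reused_gib = total_gib - uploaded_gib
    reused_pct = 100 * reused_gib / total_gib
    speed_mib_s = uploaded_gib * 1024 / duration_s
    return [
        f"had to read {total_gib:.2f} GiB",
        f"had to backup {uploaded_gib:.2f} GiB (compressed {compressed_gib:.2f} GiB)",
        f"total spent time: {duration_s:.2f}s, average speed: {speed_mib_s:.2f} MiB/s",
        f"backup was done incrementally, reused {reused_gib:.2f} GiB ({reused_pct:.1f}%)",
    ]


lines = format_summary(965.61, 28.11, 9.96, 9087.07)
for line in lines:
    print(f"root.pxar: {line}")
```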


Thanks for being understanding :)
 