[SOLVED] Remote Sync Speed suddenly slow

pixelpoint

Member
Mar 25, 2021
Dear Proxmox Forum Community,

I have encountered a problem with the sync between local PBS and remote PBS.

The Setup
We run a Proxmox Backup Server internally for backing up Proxmox VE VMs (+ proxmox-backup-client for 2 smaller workstation hosts).
The internal PBS is of course behind a firewall (MikroTik); apart from a NAT rule for port 8007, nothing else has been configured in the firewall for the Proxmox Backup Server.
The internal PBS has been added as a Remote on our external PBS, which syncs the changes every day.
The external PBS also has another PBS (deployed @ customer) which has also been added as a Remote and is synced.
Proxmox-Backup-Client Benchmarks from some cloud host -> PBS WAN = top speed.

PBS LAN
- Is a virtual Machine on PVE
- RAID5 HDDs
- 6 CPUs
- 8GB RAM
- Latency PBS LAN -> PBS WAN ~ 50 ms

PBS WAN
- Is a physical machine in a datacenter
- RAID5 HDDs
- 8 CPUs
- 64 GB RAM

Some time ago (I don't know exactly when) the remote sync (LAN -> WAN) became very slow.
Since then, the sync from our PBS (LAN) to PBS (WAN) reaches only ~1.1 MiB/s to 2 MiB/s, sometimes a little more (our internet connection is 200 Mbit/s up and 200 Mbit/s down).

This is an excerpt from the sync job running right now, which started at 05:30 today.
Code:
2022-06-10T09:36:04+02:00: re-sync snapshot vm/104/2022-06-08T22:26:03Z done
2022-06-10T09:36:04+02:00: percentage done: 13.85% (5/43 groups, 22/23 snapshots in group #6)
2022-06-10T09:36:04+02:00: sync snapshot vm/104/2022-06-09T22:15:46Z
2022-06-10T09:36:04+02:00: sync archive qemu-server.conf.blob
2022-06-10T09:36:04+02:00: sync archive drive-scsi0.img.fidx
2022-06-10T09:44:49+02:00: downloaded 752783404 bytes (1.37 MiB/s)

1.37 MiB/s is about 11.5 Mbit/s, while our internet connection provides 200 Mbit/s down and 200 Mbit/s up.
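
For anyone who wants to double-check that figure, here is a minimal Python sketch that recomputes the rate from the log excerpt above. It assumes the download began at the moment "sync archive drive-scsi0.img.fidx" was logged, which the log does not state explicitly; the function name `transfer_rate` is just for illustration.

```python
from datetime import datetime

LINK_MBIT_S = 200.0  # link speed quoted in the post


def transfer_rate(start_iso: str, end_iso: str, n_bytes: int) -> dict:
    """Return elapsed seconds and the average rate in MiB/s and Mbit/s."""
    start = datetime.fromisoformat(start_iso)
    end = datetime.fromisoformat(end_iso)
    elapsed = (end - start).total_seconds()
    bytes_per_s = n_bytes / elapsed
    return {
        "seconds": elapsed,
        "mib_per_s": bytes_per_s / 2**20,     # binary megabytes, as in the PBS log
        "mbit_per_s": bytes_per_s * 8 / 1e6,  # decimal megabits, as link speeds are quoted
    }


# Assumption: the download started when the .fidx archive line was logged.
rate = transfer_rate(
    "2022-06-10T09:36:04+02:00",
    "2022-06-10T09:44:49+02:00",
    752783404,
)
share = 100 * rate["mbit_per_s"] / LINK_MBIT_S
print(f"{rate['seconds']:.0f} s, {rate['mib_per_s']:.2f} MiB/s, "
      f"{rate['mbit_per_s']:.1f} Mbit/s, {share:.1f}% of the link")
```

The point of keeping MiB/s and Mbit/s separate is that the log uses binary units while the line speed is quoted in decimal megabits, so a plain "times 8" is slightly off.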

iperf3 PBS LAN -> PBS WAN
Code:
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  3.24 MBytes  27.1 Mbits/sec  207    106 KBytes        
[  5]   1.00-2.00   sec  1.86 MBytes  15.6 Mbits/sec    4   89.1 KBytes        
[  5]   2.00-3.00   sec  1.86 MBytes  15.6 Mbits/sec    0    102 KBytes        
[  5]   3.00-4.00   sec  2.49 MBytes  20.9 Mbits/sec    0    117 KBytes        
[  5]   4.00-5.00   sec  2.49 MBytes  20.9 Mbits/sec    0    133 KBytes        
[  5]   5.00-6.00   sec  2.49 MBytes  20.9 Mbits/sec    0    147 KBytes        
[  5]   6.00-7.00   sec  2.49 MBytes  20.8 Mbits/sec   15    120 KBytes        
[  5]   7.00-8.00   sec  2.49 MBytes  20.9 Mbits/sec    0    141 KBytes        
[  5]   8.00-9.00   sec  2.49 MBytes  20.9 Mbits/sec   10   79.2 KBytes        
[  5]   9.00-10.00  sec  1.24 MBytes  10.4 Mbits/sec    0   91.9 KBytes        
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  23.1 MBytes  19.4 Mbits/sec  236             sender
[  5]   0.00-10.05  sec  22.0 MBytes  18.4 Mbits/sec                  receiver
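
As a rough sanity check on the numbers above: a single TCP stream can send at most about one congestion window per round trip, so throughput is bounded by roughly cwnd / RTT. The sketch below applies that rule of thumb to a few Cwnd values from the output. It assumes the ~50 ms latency from the setup description is a round-trip time, and it ignores that iperf3 only samples Cwnd at the end of each interval, so treat the results as an order-of-magnitude estimate and not as a diagnosis.

```python
def cwnd_limited_mbit_s(cwnd_kbytes: float, rtt_ms: float) -> float:
    """Upper bound for one TCP stream: one congestion window per round trip."""
    window_bits = cwnd_kbytes * 1024 * 8  # iperf3 reports KBytes as 1024 bytes
    return window_bits / (rtt_ms / 1000.0) / 1e6


RTT_MS = 50.0  # assumption: the ~50 ms latency from the post is a round-trip time

# Cwnd values taken from the iperf3 output above (smallest, first, largest).
for cwnd in (79.2, 106.0, 147.0):
    print(f"cwnd {cwnd:6.1f} KBytes -> at most ~{cwnd_limited_mbit_s(cwnd, RTT_MS):.1f} Mbit/s")
```

If the bound lands in the same range as the measured bitrate, the stream is probably limited by a window that stays small (for example because of loss) and not by raw link capacity.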

iperf3 PBS WAN -> PBS LAN
Code:
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  4.47 MBytes  37.5 Mbits/sec                   
[  5]   1.00-2.00   sec  5.21 MBytes  43.7 Mbits/sec                   
[  5]   2.00-3.00   sec  4.42 MBytes  37.1 Mbits/sec                   
[  5]   3.00-4.00   sec  4.28 MBytes  35.9 Mbits/sec                   
[  5]   4.00-5.00   sec  4.49 MBytes  37.7 Mbits/sec                   
[  5]   5.00-6.00   sec  1.94 MBytes  16.3 Mbits/sec                   
[  5]   6.00-7.00   sec  5.36 MBytes  45.0 Mbits/sec                   
[  5]   7.00-8.00   sec  5.37 MBytes  45.1 Mbits/sec                   
[  5]   8.00-9.00   sec  3.11 MBytes  26.1 Mbits/sec                   
[  5]   9.00-10.00  sec  1.90 MBytes  15.9 Mbits/sec                   
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.05  sec  43.5 MBytes  36.3 Mbits/sec  767             sender
[  5]   0.00-10.00  sec  40.6 MBytes  34.0 Mbits/sec                  receiver

Proxmox Backup Client Benchmark PBS LAN -> PBS WAN
Code:
Uploaded 13 chunks in 25 seconds.
Time per request: 1998750 microseconds.
TLS speed: 2.10 MB/s
SHA256 speed: 247.86 MB/s
Compression speed: 310.05 MB/s
Decompress speed: 419.64 MB/s
AES256/GCM speed: 108.28 MB/s
Verify speed: 159.78 MB/s
┌───────────────────────────────────┬───────────────────┐
│ Name                              │ Value             │
╞═══════════════════════════════════╪═══════════════════╡
│ TLS (maximal backup upload speed) │ 2.10 MB/s (0%)    │
├───────────────────────────────────┼───────────────────┤
│ SHA256 checksum computation speed │ 247.86 MB/s (12%) │
├───────────────────────────────────┼───────────────────┤
│ ZStd level 1 compression speed    │ 310.05 MB/s (41%) │
├───────────────────────────────────┼───────────────────┤
│ ZStd level 1 decompression speed  │ 419.64 MB/s (35%) │
├───────────────────────────────────┼───────────────────┤
│ Chunk verification speed          │ 159.78 MB/s (21%) │
├───────────────────────────────────┼───────────────────┤
│ AES256 GCM encryption speed       │ 108.28 MB/s (3%)  │
└───────────────────────────────────┴───────────────────┘

Proxmox Backup Client Benchmark PBS WAN -> PBS LAN
Code:
Uploaded 16 chunks in 8 seconds.
Time per request: 532943 microseconds.
TLS speed: 7.87 MB/s
SHA256 speed: 531.03 MB/s
Compression speed: 602.46 MB/s
Decompress speed: 771.34 MB/s
AES256/GCM speed: 1879.78 MB/s
Verify speed: 325.92 MB/s
┌───────────────────────────────────┬────────────────────┐
│ Name                              │ Value              │
╞═══════════════════════════════════╪════════════════════╡
│ TLS (maximal backup upload speed) │ 7.87 MB/s (1%)     │
├───────────────────────────────────┼────────────────────┤
│ SHA256 checksum computation speed │ 531.03 MB/s (26%)  │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 compression speed    │ 602.46 MB/s (80%)  │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 decompression speed  │ 771.34 MB/s (64%)  │
├───────────────────────────────────┼────────────────────┤
│ Chunk verification speed          │ 325.92 MB/s (43%)  │
├───────────────────────────────────┼────────────────────┤
│ AES256 GCM encryption speed       │ 1879.78 MB/s (52%) │
└───────────────────────────────────┴────────────────────┘

Proxmox Backup Client Benchmark cloud-host -> PBS WAN
Code:
Uploaded 94 chunks in 5 seconds.
Time per request: 57836 microseconds.
TLS speed: 72.52 MB/s
SHA256 speed: 354.83 MB/s
Compression speed: 531.84 MB/s
Decompress speed: 1055.62 MB/s
AES256/GCM speed: 2171.38 MB/s
Verify speed: 271.71 MB/s
┌───────────────────────────────────┬────────────────────┐
│ Name                              │ Value              │
╞═══════════════════════════════════╪════════════════════╡
│ TLS (maximal backup upload speed) │ 72.52 MB/s (6%)    │
├───────────────────────────────────┼────────────────────┤
│ SHA256 checksum computation speed │ 354.83 MB/s (18%)  │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 compression speed    │ 531.84 MB/s (71%)  │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 decompression speed  │ 1055.62 MB/s (88%) │
├───────────────────────────────────┼────────────────────┤
│ Chunk verification speed          │ 271.71 MB/s (36%)  │
├───────────────────────────────────┼────────────────────┤
│ AES256 GCM encryption speed       │ 2171.38 MB/s (60%) │
└───────────────────────────────────┴────────────────────┘

Proxmox Backup Client PBS-Customer -> PBS WAN
Code:
Uploaded 16 chunks in 10 seconds.
Time per request: 640751 microseconds.
TLS speed: 6.55 MB/s
SHA256 speed: 269.04 MB/s
Compression speed: 338.45 MB/s
Decompress speed: 464.10 MB/s
AES256/GCM speed: 116.44 MB/s
Verify speed: 170.28 MB/s
┌───────────────────────────────────┬───────────────────┐
│ Name                              │ Value             │
╞═══════════════════════════════════╪═══════════════════╡
│ TLS (maximal backup upload speed) │ 6.55 MB/s (1%)    │
├───────────────────────────────────┼───────────────────┤
│ SHA256 checksum computation speed │ 269.04 MB/s (13%) │
├───────────────────────────────────┼───────────────────┤
│ ZStd level 1 compression speed    │ 338.45 MB/s (45%) │
├───────────────────────────────────┼───────────────────┤
│ ZStd level 1 decompression speed  │ 464.10 MB/s (39%) │
├───────────────────────────────────┼───────────────────┤
│ Chunk verification speed          │ 170.28 MB/s (22%) │
├───────────────────────────────────┼───────────────────┤
│ AES256 GCM encryption speed       │ 116.44 MB/s (3%)  │
└───────────────────────────────────┴───────────────────┘
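
A note on how the first three lines of each benchmark output relate to each other. The reported TLS speeds are consistent with every uploaded chunk being 4 MiB, so that TLS speed is roughly chunk size divided by time per request. The 4 MiB chunk size is an assumption inferred from the numbers in this thread, not something the output states. A minimal sketch that checks it against the four runs above:

```python
CHUNK_BYTES = 4 * 1024 * 1024  # assumption: 4 MiB per benchmark chunk


def tls_speed_mb_s(time_per_request_us: int, chunk_bytes: int = CHUNK_BYTES) -> float:
    """Throughput in decimal MB/s if one chunk is sent per request."""
    return chunk_bytes / (time_per_request_us / 1e6) / 1e6


# (chunks uploaded, time per request in microseconds, reported TLS speed in MB/s)
runs = {
    "PBS LAN -> PBS WAN": (13, 1998750, 2.10),
    "PBS WAN -> PBS LAN": (16, 532943, 7.87),
    "cloud-host -> PBS WAN": (94, 57836, 72.52),
    "PBS-Customer -> PBS WAN": (16, 640751, 6.55),
}

for name, (chunks, per_request_us, reported) in runs.items():
    computed = tls_speed_mb_s(per_request_us)
    elapsed_s = chunks * per_request_us / 1e6
    print(f"{name}: computed {computed:.2f} MB/s, reported {reported:.2f} MB/s, "
          f"{chunks} chunks in {elapsed_s:.1f} s")
```

Under this reading, the number of chunks is simply how many uploads completed while the test was running, so a slower path shows fewer chunks even when the command is unchanged.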

Copying a 5 GB .txt file with scp from PBS LAN -> PBS WAN reaches ~2 MB/s (~16 Mbit/s).

So after checking hardware, software and firewall, I am still not sure why exactly this is happening.
Maybe someone here has a tip for me?

Thanks for taking the time to read!
Best regards,
pixelpoint
 
your iperf output is also nowhere close to line speed (but pretty close to the TLS benchmarks and the actual results), especially in the LAN -> WAN direction (20 Mbit/s -> ~2.5 MB/s). it also logs retries. maybe something is off in your LAN? since the cloud host to PBS WAN TLS benchmark is okay (~600 Mbit/s), the WAN PBS is not the bottleneck.
 
Dear fabian,

thank you for your response.
We're analysing our network right now and I will update as soon as we're finished.

Best regards,
pixelpoint
 
OK, we found the "culprit": one of our employees was uploading a full backup (2 TB) to the cloud, which took ~6 days to complete.
That is why the sync was limited in speed.

Regarding the iperf3 retries:
I tested again today (the full backup upload has already finished) and got this:

  • cloud server -> PBS WAN = 900 retries
  • LAN PC -> PBS WAN = 80 retries
  • LAN PC <-> PBS LAN = 5000 retries (tested in both directions)
  • PBS LAN -> PBS WAN = 180 retries
Do you maybe have a tip for me on where to start looking for the cause of the retries between the cloud server and PBS WAN?
900 retries seems like a number that's just WAY too big, and I'm not sure how to start diagnosing something like that in the cloud.

The 5k retries from LAN PC -> PBS LAN (and vice versa) are probably because of a switch in between LAN PBS and LAN PC (or a faulty NIC maybe, I'll look into that).
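
One caveat when comparing raw retry counts like these: a faster path sends far more data in the same 10 seconds, so its count is naturally higher. A hedged sketch of how the counts could be normalised, using the JSON output of iperf3 (`iperf3 -J`) and its `end.sum_sent` fields `bytes`, `bits_per_second` and `retransmits`. The sample below is not a real capture; it is reconstructed from the sender summary line of the LAN -> WAN run in the first post (23.1 MBytes, 19.4 Mbits/sec, 236 retries), and `summarize` is a hypothetical helper name.

```python
import json


def summarize(iperf_json: str) -> dict:
    """Extract sender bitrate and retransmits from `iperf3 -J` output."""
    sent = json.loads(iperf_json)["end"]["sum_sent"]
    megabytes = sent["bytes"] / 1e6
    return {
        "mbit_per_s": sent["bits_per_second"] / 1e6,
        "retransmits": sent["retransmits"],
        # Retransmits per decimal megabyte sent, so fast and slow paths compare fairly.
        "retr_per_mb": sent["retransmits"] / megabytes,
    }


# Illustrative input rebuilt from the summary line in the first post, trimmed
# to the fields used here; real iperf3 JSON contains many more keys.
SAMPLE = json.dumps({
    "end": {
        "sum_sent": {
            "bytes": 24222106,          # ~23.1 MBytes (iperf3 uses 2**20 bytes)
            "bits_per_second": 19.4e6,  # 19.4 Mbits/sec
            "retransmits": 236,
        }
    }
})

summary = summarize(SAMPLE)
print(f"{summary['mbit_per_s']:.1f} Mbit/s, {summary['retransmits']} retransmits, "
      f"{summary['retr_per_mb']:.1f} per MB sent")
```

Running the same summary for each of the four paths would show whether the 5000 retries on the LAN are actually worse per megabyte than the 180 on the WAN path, or just the result of moving more data.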

Proxmox Backup Client Benchmark PBS LAN -> PBS WAN
Code:
Uploaded 13 chunks in 11 seconds.
Time per request: 908481 microseconds.
TLS speed: 4.62 MB/s
SHA256 speed: 255.84 MB/s
Compression speed: 299.74 MB/s
Decompress speed: 385.41 MB/s
AES256/GCM speed: 103.26 MB/s
Verify speed: 142.12 MB/s
┌───────────────────────────────────┬───────────────────┐
│ Name                              │ Value             │
╞═══════════════════════════════════╪═══════════════════╡
│ TLS (maximal backup upload speed) │ 4.62 MB/s (0%)    │
├───────────────────────────────────┼───────────────────┤
│ SHA256 checksum computation speed │ 255.84 MB/s (13%) │
├───────────────────────────────────┼───────────────────┤
│ ZStd level 1 compression speed    │ 299.74 MB/s (40%) │
├───────────────────────────────────┼───────────────────┤
│ ZStd level 1 decompression speed  │ 385.41 MB/s (32%) │
├───────────────────────────────────┼───────────────────┤
│ Chunk verification speed          │ 142.12 MB/s (19%) │
├───────────────────────────────────┼───────────────────┤
│ AES256 GCM encryption speed       │ 103.26 MB/s (3%)  │
└───────────────────────────────────┴───────────────────┘

Proxmox Backup Client Benchmark PBS WAN -> PBS LAN
Code:
Uploaded 13 chunks in 27 seconds.
Time per request: 2108709 microseconds.
TLS speed: 1.99 MB/s
SHA256 speed: 472.65 MB/s
Compression speed: 539.41 MB/s
Decompress speed: 757.89 MB/s
AES256/GCM speed: 1648.82 MB/s
Verify speed: 289.36 MB/s
┌───────────────────────────────────┬────────────────────┐
│ Name                              │ Value              │
╞═══════════════════════════════════╪════════════════════╡
│ TLS (maximal backup upload speed) │ 1.99 MB/s (0%)     │
├───────────────────────────────────┼────────────────────┤
│ SHA256 checksum computation speed │ 472.65 MB/s (23%)  │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 compression speed    │ 539.41 MB/s (72%)  │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 decompression speed  │ 757.89 MB/s (63%)  │
├───────────────────────────────────┼────────────────────┤
│ Chunk verification speed          │ 289.36 MB/s (38%)  │
├───────────────────────────────────┼────────────────────┤
│ AES256 GCM encryption speed       │ 1648.82 MB/s (45%) │
└───────────────────────────────────┴────────────────────┘

These are not optimal values, but the sync now finishes within a few hours, which is totally acceptable for us.

Thank you again for your help.
Best regards,
pixelpoint
 
but the speed is still slow - you just reduced the amount of data to be transferred? I can't really tell you what's going on with the retries - unless you have control over some of the hops in between (in which case you could "segment" the benchmark and see whether just some of the segments are affected). it might just be some backpressure/congestion control kicking in.
 
"but the speed is still slow - you just reduced the amount of data to be transferred?"
I actually didn't do anything different than the last time I tested.
I just set the environment variables for the repository and password and ran proxmox-backup-client benchmark.
I just assumed the change in number of chunks sent is automatic and probably rises and falls with available bandwidth.
The transferred chunks in my first post are also different for some benchmarks, even though I copy-pasted commands without changing anything.
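
For reference, a minimal sketch of the invocation described above, wrapped in Python so that repeated runs use exactly the same settings. `PBS_REPOSITORY` and `PBS_PASSWORD` are the environment variables the client reads for the repository and password; the repository string, the password and the helper names below are hypothetical placeholders.

```python
import os
import subprocess


def benchmark_invocation(repository: str, password: str):
    """Build argv and environment for a benchmark run without executing it."""
    env = dict(os.environ)
    env["PBS_REPOSITORY"] = repository  # format: user@realm@host:datastore
    env["PBS_PASSWORD"] = password
    return ["proxmox-backup-client", "benchmark"], env


def run_benchmark(repository: str, password: str) -> str:
    """Run the benchmark and return everything it printed."""
    argv, env = benchmark_invocation(repository, password)
    result = subprocess.run(argv, env=env, check=True,
                            capture_output=True, text=True)
    return result.stdout + result.stderr


# Hypothetical values; replace with your own repository and credentials.
argv, env = benchmark_invocation("backup@pbs@pbs-wan.example.com:store1",
                                 "not-a-real-password")
print(argv, env["PBS_REPOSITORY"])
```

Calling `run_benchmark(...)` a few times in a row and comparing the "Uploaded N chunks" lines is a simple way to see how much the result varies between otherwise identical runs.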

The speed is still slow, yes, but that's how it's been pretty much constantly.
I never found out why exactly that is, but because the sync always finished in a few hours, this problem wasn't too high on the priority list.

"unless you have control over some of the hops in between"
I only have control over things inside our own LAN.
The infrastructure surrounding the cloud server and PBS WAN is outside of our reach, as both are just rented servers somewhere in a datacenter (both in the EU).

Best regards,
pixelpoint
 
in that case I'd not worry too much about the retries unless you actually see issues that might be caused by them.
 
