[SOLVED] Small chunk size = slow remote sync

mhaluska

Well-Known Member
Sep 23, 2018
53
6
48
Is planned to have possibility define chunk size? Due to small chunk size, remote sync is really slow.

On 300Mbps line, sync was reaching max 28Mbps (latency 23ms).
PBS client/server: 0.8.13

Backup to local PBS -> 14s:
Code:
2020-09-03T16:15:55+02:00: starting new backup on datastore 'HOME-PVE': "ct/666252/2020-09-03T14:15:55Z"
2020-09-03T16:15:55+02:00: GET /previous: 400 Bad Request: no previous backup
2020-09-03T16:15:55+02:00: add blob "/mnt/pbs-backup/local/home-pve/ct/666252/2020-09-03T14:15:55Z/pct.conf.blob" (391 bytes, comp: 391)
2020-09-03T16:15:55+02:00: created new dynamic index 1 ("ct/666252/2020-09-03T14:15:55Z/catalog.pcat1.didx")
2020-09-03T16:15:55+02:00: created new dynamic index 2 ("ct/666252/2020-09-03T14:15:55Z/root.pxar.didx")
2020-09-03T16:16:09+02:00: Upload statistics for 'root.pxar.didx'
2020-09-03T16:16:09+02:00: UUID: 3c1b06de027a4708822903bfd989439d
2020-09-03T16:16:09+02:00: Checksum: 94e122b816f18fb0496046794937eb1534d7b3fd1c4ab435b8507982ad9c32db
2020-09-03T16:16:09+02:00: Size: 1433917065
2020-09-03T16:16:09+02:00: Chunk count: 372
2020-09-03T16:16:09+02:00: Upload size: 1402519050 (97%)
2020-09-03T16:16:09+02:00: Duplicates: 12+0 (3%)
2020-09-03T16:16:09+02:00: Compression: 46%
2020-09-03T16:16:09+02:00: successfully closed dynamic index 2
2020-09-03T16:16:09+02:00: Upload statistics for 'catalog.pcat1.didx'
2020-09-03T16:16:09+02:00: UUID: ea97fa3f6af84f98bc6691d4751809b3
2020-09-03T16:16:09+02:00: Checksum: 66593bbfea7831b6a113f7768b49205a661aac20b76565f70167cb8537dfe3a6
2020-09-03T16:16:09+02:00: Size: 681797
2020-09-03T16:16:09+02:00: Chunk count: 2
2020-09-03T16:16:09+02:00: Upload size: 681797 (100%)
2020-09-03T16:16:09+02:00: Compression: 42%
2020-09-03T16:16:09+02:00: successfully closed dynamic index 1
2020-09-03T16:16:09+02:00: add blob "/mnt/pbs-backup/local/home-pve/ct/666252/2020-09-03T14:15:55Z/index.json.blob" (412 bytes, comp: 412)
2020-09-03T16:16:09+02:00: successfully finished backup
2020-09-03T16:16:09+02:00: backup finished successfully
2020-09-03T16:16:09+02:00: TASK OK

Sync PBS to remote -> almost 3m39s (around 1 chunk per second):
Code:
2020-09-03T16:18:53+02:00: Starting datastore sync job 'pbs1-home-pve'
2020-09-03T16:18:53+02:00: Sync datastore 'PBS1-HOME-PVE' from 'vm-pbs1/HOME-PVE'
2020-09-03T16:18:53+02:00: sync snapshot "ct/666252/2020-09-03T14:15:55Z"
2020-09-03T16:18:53+02:00: sync archive pct.conf.blob
2020-09-03T16:18:53+02:00: sync archive root.pxar.didx
2020-09-03T16:22:32+02:00: sync archive catalog.pcat1.didx
2020-09-03T16:22:32+02:00: got backup log file "client.log.blob"
2020-09-03T16:22:32+02:00: sync snapshot "ct/666252/2020-09-03T14:15:55Z" done
2020-09-03T16:22:32+02:00: delete vanished group 'ct/666251'
2020-09-03T16:22:32+02:00: sync job 'pbs1-home-pve' end
2020-09-03T16:22:32+02:00: TASK OK

With this sync speed, I'll be syncing my initial backup few days.

Thanks
 
Adding part of log on local side where is visible slow chunk download:
Code:
2020-09-03T16:18:53+02:00: starting new backup reader datastore 'HOME-PVE': "/mnt/pbs-backup/local/home-pve"
2020-09-03T16:18:53+02:00: protocol upgrade done
2020-09-03T16:18:53+02:00: GET /download
2020-09-03T16:18:53+02:00: download "/mnt/pbs-backup/local/home-pve/ct/666252/2020-09-03T14:15:55Z/index.json.blob"
2020-09-03T16:18:53+02:00: GET /download
2020-09-03T16:18:53+02:00: download "/mnt/pbs-backup/local/home-pve/ct/666252/2020-09-03T14:15:55Z/pct.conf.blob"
2020-09-03T16:18:53+02:00: GET /download
2020-09-03T16:18:53+02:00: download "/mnt/pbs-backup/local/home-pve/ct/666252/2020-09-03T14:15:55Z/root.pxar.didx"
2020-09-03T16:18:53+02:00: GET /chunk
2020-09-03T16:18:53+02:00: download chunk "/mnt/pbs-backup/local/home-pve/.chunks/5da9/5da9a61ff8521a13d2b071f41bea012972553ce46cf1b81d2a9ce0b2f944fd9d"
2020-09-03T16:18:54+02:00: GET /chunk
2020-09-03T16:18:54+02:00: download chunk "/mnt/pbs-backup/local/home-pve/.chunks/92a2/92a201c1301f641aced62e3678186f2c7fbeb0f91c8c9a19c84ccbb93ace8d72"
2020-09-03T16:18:54+02:00: GET /chunk
2020-09-03T16:18:54+02:00: download chunk "/mnt/pbs-backup/local/home-pve/.chunks/7dbe/7dbefab7e76ae37175873bff087a29e8a5207891e2fb401d862c26bb325b271a"
2020-09-03T16:18:54+02:00: GET /chunk
2020-09-03T16:18:54+02:00: download chunk "/mnt/pbs-backup/local/home-pve/.chunks/b5c9/b5c9af7469b6d77cdfbf52a33c4906c3a8f6f3f1a4040cc1e0478ae870569ad9"
2020-09-03T16:18:56+02:00: GET /chunk
2020-09-03T16:18:56+02:00: download chunk "/mnt/pbs-backup/local/home-pve/.chunks/640a/640a1e1968ffcae1d194f70c0baa79aad763a01367ce7a48d79f3d23213c3743"
2020-09-03T16:18:56+02:00: GET /chunk
2020-09-03T16:18:56+02:00: download chunk "/mnt/pbs-backup/local/home-pve/.chunks/4868/486876e99665e817bbb4fa67aec624c38ea40b37886d64572e7f9733a03d139d"
2020-09-03T16:18:57+02:00: GET /chunk
2020-09-03T16:18:57+02:00: download chunk "/mnt/pbs-backup/local/home-pve/.chunks/6226/6226ba07359cb87da2db1fa968c05c742b9724761212676aa90b27fb6940b42c"
...
...
 
it's not slow because of the small chunk size.. how fast is the PBS instance doing the sync? can you do a benchmark there?
 
it's not slow because of the small chunk size.. how fast is the PBS instance doing the sync? can you do a benchmark there?

On router I can see max 28Mbps. On PBS doing sync, in reports I can see around 3MB/s. Of course it's failing after 2 hours, but I successfully finished initial sync of my backup test data (around 70GB), sync restarted manually after 2hrs fail.

If you can tell me how you want to measure sync, I can prepare another backups for test and send output.
 
you can just use the proxmox-backup-client benchmark feature to get a rough impression how fast your PBS systems are.
 
Destination PBS:

Code:
# proxmox-backup-client benchmark --repository 10.50.51.11:HOME-PVE
Uploaded 224 chunks in 5 seconds.
Time per request: 22532 microseconds.
TLS speed: 186.14 MB/s
SHA256 speed: 209.98 MB/s
Compression speed: 578.09 MB/s
Decompress speed: 2084.20 MB/s
AES256/GCM speed: 655.03 MB/s
┌───────────────────────────────────┬────────────────────┐
│ Name                              │ Value              │
╞═══════════════════════════════════╪════════════════════╡
│ TLS (maximal backup upload speed) │ 186.14 MB/s (32%)  │
├───────────────────────────────────┼────────────────────┤
│ SHA256 checksum computation speed │ 209.98 MB/s (10%)  │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 compression speed    │ 578.09 MB/s (27%)  │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 decompression speed  │ 2084.20 MB/s (26%) │
├───────────────────────────────────┼────────────────────┤
│ AES256 GCM encryption speed       │ 655.03 MB/s (17%)  │
└───────────────────────────────────┴────────────────────┘

Source PBS:

Code:
# proxmox-backup-client benchmark --repository 10.20.3.99:HOME-PVE
Uploaded 698 chunks in 5 seconds.
Time per request: 7183 microseconds.
TLS speed: 583.91 MB/s
SHA256 speed: 436.20 MB/s
Compression speed: 1073.79 MB/s
Decompress speed: 3451.78 MB/s
AES256/GCM speed: 2212.10 MB/s
┌───────────────────────────────────┬────────────────────┐
│ Name                              │ Value              │
╞═══════════════════════════════════╪════════════════════╡
│ TLS (maximal backup upload speed) │ 583.91 MB/s (99%)  │
├───────────────────────────────────┼────────────────────┤
│ SHA256 checksum computation speed │ 436.20 MB/s (21%)  │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 compression speed    │ 1073.79 MB/s (50%) │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 decompression speed  │ 3451.78 MB/s (43%) │
├───────────────────────────────────┼────────────────────┤
│ AES256 GCM encryption speed       │ 2212.10 MB/s (58%) │
└───────────────────────────────────┴────────────────────┘
 
Destination -> Source PBS:

Code:
# proxmox-backup-client benchmark --repository 10.20.3.99:HOME-PVE
Uploaded 13 chunks in 19 seconds.
Time per request: 1490096 microseconds.
TLS speed: 2.81 MB/s
SHA256 speed: 183.53 MB/s
Compression speed: 604.03 MB/s
Decompress speed: 2043.62 MB/s
AES256/GCM speed: 593.55 MB/s
┌───────────────────────────────────┬────────────────────┐
│ Name                              │ Value              │
╞═══════════════════════════════════╪════════════════════╡
│ TLS (maximal backup upload speed) │ 2.81 MB/s (0%)     │
├───────────────────────────────────┼────────────────────┤
│ SHA256 checksum computation speed │ 183.53 MB/s (9%)   │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 compression speed    │ 604.03 MB/s (28%)  │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 decompression speed  │ 2043.62 MB/s (25%) │
├───────────────────────────────────┼────────────────────┤
│ AES256 GCM encryption speed       │ 593.55 MB/s (16%)  │
└───────────────────────────────────┴────────────────────┘

Source -> Destination PBS:

Code:
# proxmox-backup-client benchmark --repository 10.50.51.11:HOME-PVE
Uploaded 13 chunks in 16 seconds.
Time per request: 1241470 microseconds.
TLS speed: 3.38 MB/s
SHA256 speed: 438.95 MB/s
Compression speed: 1147.54 MB/s
Decompress speed: 3740.22 MB/s
AES256/GCM speed: 1977.26 MB/s
┌───────────────────────────────────┬────────────────────┐
│ Name                              │ Value              │
╞═══════════════════════════════════╪════════════════════╡
│ TLS (maximal backup upload speed) │ 3.38 MB/s (1%)     │
├───────────────────────────────────┼────────────────────┤
│ SHA256 checksum computation speed │ 438.95 MB/s (21%)  │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 compression speed    │ 1147.54 MB/s (53%) │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 decompression speed  │ 3740.22 MB/s (46%) │
├───────────────────────────────────┼────────────────────┤
│ AES256 GCM encryption speed       │ 1977.26 MB/s (52%) │
└───────────────────────────────────┴────────────────────┘

I see now, probably it's latency issue (ping ~20ms) because it's in another location.
 
sounds like there is some network/TLS issue in your network then ;)
 
sounds like there is some network/TLS issue in your network then ;)

Only TLS can be on PBS itself ;)
There is no network issue, everything is fine, except PBS cross-site performance. There is cross-site IPsec tunnel, scp is giving me 25.4MB/s on 1GB test file, router CPU running at 10% avg during this test. PBS sync test using WAN IP directly is giving me same poor results.

Later I'll try on-site tests with different latency emulation and we'll see.

Anyway, thanks for help and good luck with future PBS development, it looks really promising ;)
 
So, seems your backup software performance is really heavy dependent on latency. This is not so nice for sync/backups over WAN. And also confirming there is no issue on my side :p

Test systems are on same VLAN, running on PVE using same bridge interface. Both are PBS with latest packages, 4 vCPU and 4GB RAM. As you can see also performance with 5ms latency already "sucks" ;)

Adding latency to system:
tc qdisc add dev ens18 root netem delay <NUMBER>ms

25ms:
Code:
Uploaded 10 chunks in 16 seconds.
Time per request: 1698077 microseconds.
TLS speed: 2.47 MB/s
SHA256 speed: 428.81 MB/s
Compression speed: 1152.45 MB/s
Decompress speed: 3735.80 MB/s
AES256/GCM speed: 1979.45 MB/s
┌───────────────────────────────────┬────────────────────┐
│ Name                              │ Value              │
╞═══════════════════════════════════╪════════════════════╡
│ TLS (maximal backup upload speed) │ 2.47 MB/s (0%)     │
├───────────────────────────────────┼────────────────────┤
│ SHA256 checksum computation speed │ 428.81 MB/s (20%)  │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 compression speed    │ 1152.45 MB/s (53%) │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 decompression speed  │ 3735.80 MB/s (46%) │
├───────────────────────────────────┼────────────────────┤
│ AES256 GCM encryption speed       │ 1979.45 MB/s (52%) │
└───────────────────────────────────┴────────────────────┘

10ms:
Code:
Uploaded 16 chunks in 11 seconds.
Time per request: 715939 microseconds.
TLS speed: 5.86 MB/s
SHA256 speed: 431.57 MB/s
Compression speed: 1153.24 MB/s
Decompress speed: 3719.71 MB/s
AES256/GCM speed: 1976.92 MB/s
┌───────────────────────────────────┬────────────────────┐
│ Name                              │ Value              │
╞═══════════════════════════════════╪════════════════════╡
│ TLS (maximal backup upload speed) │ 5.86 MB/s (1%)     │
├───────────────────────────────────┼────────────────────┤
│ SHA256 checksum computation speed │ 431.57 MB/s (20%)  │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 compression speed    │ 1153.24 MB/s (53%) │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 decompression speed  │ 3719.71 MB/s (46%) │
├───────────────────────────────────┼────────────────────┤
│ AES256 GCM encryption speed       │ 1976.92 MB/s (52%) │
└───────────────────────────────────┴────────────────────┘

5ms:
Code:
Uploaded 21 chunks in 7 seconds.
Time per request: 368201 microseconds.
TLS speed: 11.39 MB/s
SHA256 speed: 433.42 MB/s
Compression speed: 1156.27 MB/s
Decompress speed: 3750.29 MB/s
AES256/GCM speed: 1970.52 MB/s
┌───────────────────────────────────┬────────────────────┐
│ Name                              │ Value              │
╞═══════════════════════════════════╪════════════════════╡
│ TLS (maximal backup upload speed) │ 11.39 MB/s (2%)    │
├───────────────────────────────────┼────────────────────┤
│ SHA256 checksum computation speed │ 433.42 MB/s (20%)  │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 compression speed    │ 1156.27 MB/s (54%) │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 decompression speed  │ 3750.29 MB/s (47%) │
├───────────────────────────────────┼────────────────────┤
│ AES256 GCM encryption speed       │ 1970.52 MB/s (52%) │
└───────────────────────────────────┴────────────────────┘

1ms:
Code:
Uploaded 67 chunks in 5 seconds.
Time per request: 84826 microseconds.
TLS speed: 49.45 MB/s
SHA256 speed: 434.19 MB/s
Compression speed: 1147.26 MB/s
Decompress speed: 3767.07 MB/s
AES256/GCM speed: 1989.25 MB/s
┌───────────────────────────────────┬────────────────────┐
│ Name                              │ Value              │
╞═══════════════════════════════════╪════════════════════╡
│ TLS (maximal backup upload speed) │ 49.45 MB/s (8%)    │
├───────────────────────────────────┼────────────────────┤
│ SHA256 checksum computation speed │ 434.19 MB/s (20%)  │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 compression speed    │ 1147.26 MB/s (53%) │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 decompression speed  │ 3767.07 MB/s (47%) │
├───────────────────────────────────┼────────────────────┤
│ AES256 GCM encryption speed       │ 1989.25 MB/s (52%) │
└───────────────────────────────────┴────────────────────┘

without limit (~0.3ms):
Code:
Uploaded 537 chunks in 5 seconds.
Time per request: 9356 microseconds.
TLS speed: 448.29 MB/s
SHA256 speed: 435.31 MB/s
Compression speed: 1098.92 MB/s
Decompress speed: 3554.11 MB/s
AES256/GCM speed: 2231.60 MB/s
┌───────────────────────────────────┬────────────────────┐
│ Name                              │ Value              │
╞═══════════════════════════════════╪════════════════════╡
│ TLS (maximal backup upload speed) │ 448.29 MB/s (76%)  │
├───────────────────────────────────┼────────────────────┤
│ SHA256 checksum computation speed │ 435.31 MB/s (21%)  │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 compression speed    │ 1098.92 MB/s (51%) │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 decompression speed  │ 3554.11 MB/s (44%) │
├───────────────────────────────────┼────────────────────┤
│ AES256 GCM encryption speed       │ 2231.60 MB/s (59%) │
└───────────────────────────────────┴────────────────────┘
 
  • Like
Reactions: guletz
I have the same problem. Incredibly slow: 500kb/s download from remote over a 10 Mbit/s line and fast idleing servers.
 
there's a bug affecting higher latency links, like I indicated in the bug entry we can reproduce it and are investigating
 
Looks latest version 0.8.15-1 is working fine now. This on 300/300Mbps line to PBS over IPsec tunnel and 20ms latency. Nice work ;)
Code:
Uploaded 37 chunks in 6 seconds.
Time per request: 167692 microseconds.
TLS speed: 25.01 MB/s
SHA256 speed: 433.44 MB/s
Compression speed: 1079.73 MB/s
Decompress speed: 3394.75 MB/s
AES256/GCM speed: 2226.12 MB/s
┌───────────────────────────────────┬────────────────────┐
│ Name                              │ Value              │
╞═══════════════════════════════════╪════════════════════╡
│ TLS (maximal backup upload speed) │ 25.01 MB/s (4%)    │
├───────────────────────────────────┼────────────────────┤
│ SHA256 checksum computation speed │ 433.44 MB/s (20%)  │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 compression speed    │ 1079.73 MB/s (50%) │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 decompression speed  │ 3394.75 MB/s (42%) │
├───────────────────────────────────┼────────────────────┤
│ AES256 GCM encryption speed       │ 2226.12 MB/s (59%) │
└───────────────────────────────────┴────────────────────┘
 
  • Like
Reactions: TwiX
that one was definitely a team effort (with special thanks to @dcsapak). glad it also showed the desired results in this real-world scenario :)
 
  • Like
Reactions: TwiX and mhaluska