S3 Upload Fails with Large VMs (Cache Issue?)

Hello,
I have a very similar problem. After upgrading to v4, I wanted to use S3 storage as a secondary backup.

S3 storage used: wasabi.com
Configuration: no problem
Upload speed: 100Mbit/s (my provider limit)

When backing up a smaller VM (32 GB), everything works fine. However, when I try a larger one (approx. 1.7 TB), the job gets stuck at 100% and does not complete. Interestingly, at this point PBS enters a strange state: PVE loses access to the API (the backup storage appears to be unavailable), and even the PBS UI is unreachable. On the terminal, PBS appears to be running. Only after a reboot (or systemctl restart proxmox-backup-proxy) does PBS return to a stable state - but the job is never completed.
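For completeness, these are the commands I use to check and recover the service when it gets into this state (plain systemd tooling, nothing PBS-specific):

Code:
# check whether the PBS proxy is still running and what it last logged
systemctl status proxmox-backup-proxy
journalctl -u proxmox-backup-proxy -e
# restart the proxy to get the UI/API back
systemctl restart proxmox-backup-proxy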

FULL Job log in PVE
INFO: starting new backup job: vzdump 100 --storage backup-s3 --notification-mode notification-system --notes-template '{{guestname}}' --remove 0 --node px --mode snapshot
INFO: Starting Backup of VM 100 (qemu)
INFO: Backup started at 2025-08-15 21:15:10
INFO: status = running
INFO: VM Name: trinity
INFO: include disk 'scsi0' 'local-zfs:vm-100-disk-0' 180G
INFO: include disk 'scsi1' 'bpool:vm-100-disk-0' 1500G
INFO: include disk 'scsi2' 'local-zfs:vm-100-disk-1' 100G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/100/2025-08-15T19:15:10Z'
INFO: enabling encryption
INFO: skipping guest-agent 'fs-freeze', disabled in VM options
INFO: started backup task 'f7c90457-5e9a-49a3-af82-739d11495d8c'
INFO: resuming VM again
INFO: scsi0: dirty-bitmap status: existing bitmap was invalid and has been cleared
INFO: scsi1: dirty-bitmap status: existing bitmap was invalid and has been cleared
INFO: scsi2: dirty-bitmap status: existing bitmap was invalid and has been cleared
INFO: 0% (1.1 GiB of 1.7 TiB) in 3s, read: 374.7 MiB/s, write: 336.0 MiB/s
INFO: 1% (18.0 GiB of 1.7 TiB) in 54s, read: 339.7 MiB/s, write: 325.8 MiB/s
INFO: 2% (35.8 GiB of 1.7 TiB) in 1m 45s, read: 356.1 MiB/s, write: 332.6 MiB/s
INFO: 3% (53.7 GiB of 1.7 TiB) in 2m 34s, read: 375.5 MiB/s, write: 334.4 MiB/s
INFO: 4% (72.4 GiB of 1.7 TiB) in 3m 1s, read: 706.8 MiB/s, write: 251.4 MiB/s
INFO: 5% (90.2 GiB of 1.7 TiB) in 3m 28s, read: 675.0 MiB/s, write: 241.3 MiB/s
INFO: 6% (107.0 GiB of 1.7 TiB) in 4m 20s, read: 330.8 MiB/s, write: 131.1 MiB/s
INFO: 7% (124.6 GiB of 1.7 TiB) in 6m 40s, read: 129.1 MiB/s, write: 124.5 MiB/s
INFO: 8% (142.6 GiB of 1.7 TiB) in 8m 48s, read: 143.6 MiB/s, write: 141.7 MiB/s
INFO: 9% (160.3 GiB of 1.7 TiB) in 11m 4s, read: 133.7 MiB/s, write: 131.8 MiB/s
INFO: 10% (178.1 GiB of 1.7 TiB) in 13m 4s, read: 151.6 MiB/s, write: 149.5 MiB/s
INFO: 11% (195.9 GiB of 1.7 TiB) in 14m 50s, read: 172.2 MiB/s, write: 166.5 MiB/s
INFO: 12% (213.7 GiB of 1.7 TiB) in 16m 12s, read: 221.7 MiB/s, write: 129.6 MiB/s
INFO: 13% (233.7 GiB of 1.7 TiB) in 18m 8s, read: 176.7 MiB/s, write: 85.6 MiB/s
INFO: 14% (249.9 GiB of 1.7 TiB) in 18m 13s, read: 3.2 GiB/s, write: 26.4 MiB/s
INFO: 15% (270.8 GiB of 1.7 TiB) in 18m 19s, read: 3.5 GiB/s, write: 22.0 MiB/s
INFO: 16% (286.8 GiB of 1.7 TiB) in 18m 24s, read: 3.2 GiB/s, write: 27.2 MiB/s
INFO: 17% (305.1 GiB of 1.7 TiB) in 18m 30s, read: 3.1 GiB/s, write: 22.0 MiB/s
INFO: 18% (323.5 GiB of 1.7 TiB) in 18m 36s, read: 3.1 GiB/s, write: 24.7 MiB/s
INFO: 19% (338.6 GiB of 1.7 TiB) in 18m 41s, read: 3.0 GiB/s, write: 26.4 MiB/s
INFO: 20% (356.7 GiB of 1.7 TiB) in 18m 47s, read: 3.0 GiB/s, write: 26.7 MiB/s
INFO: 21% (375.9 GiB of 1.7 TiB) in 18m 54s, read: 2.7 GiB/s, write: 34.9 MiB/s
INFO: 22% (393.2 GiB of 1.7 TiB) in 19m, read: 2.9 GiB/s, write: 22.0 MiB/s
INFO: 23% (409.5 GiB of 1.7 TiB) in 19m 5s, read: 3.3 GiB/s, write: 15.2 MiB/s
INFO: 24% (430.8 GiB of 1.7 TiB) in 19m 12s, read: 3.0 GiB/s, write: 20.6 MiB/s
INFO: 25% (445.7 GiB of 1.7 TiB) in 19m 17s, read: 3.0 GiB/s, write: 27.2 MiB/s
INFO: 26% (466.4 GiB of 1.7 TiB) in 19m 23s, read: 3.5 GiB/s, write: 1.3 MiB/s
INFO: 27% (483.7 GiB of 1.7 TiB) in 19m 29s, read: 2.9 GiB/s, write: 22.7 MiB/s
INFO: 28% (500.7 GiB of 1.7 TiB) in 19m 35s, read: 2.8 GiB/s, write: 26.7 MiB/s
INFO: 29% (516.7 GiB of 1.7 TiB) in 19m 41s, read: 2.7 GiB/s, write: 29.3 MiB/s
INFO: 30% (535.5 GiB of 1.7 TiB) in 19m 47s, read: 3.1 GiB/s, write: 12.0 MiB/s
INFO: 31% (554.8 GiB of 1.7 TiB) in 19m 53s, read: 3.2 GiB/s, write: 1.3 MiB/s
INFO: 32% (571.0 GiB of 1.7 TiB) in 20m, read: 2.3 GiB/s, write: 20.0 MiB/s
INFO: 33% (588.8 GiB of 1.7 TiB) in 20m 6s, read: 3.0 GiB/s, write: 22.0 MiB/s
INFO: 34% (606.8 GiB of 1.7 TiB) in 20m 12s, read: 3.0 GiB/s, write: 22.7 MiB/s
INFO: 35% (624.8 GiB of 1.7 TiB) in 20m 17s, read: 3.6 GiB/s, write: 1.6 MiB/s
INFO: 36% (642.7 GiB of 1.7 TiB) in 20m 22s, read: 3.6 GiB/s, write: 819.2 KiB/s
INFO: 37% (660.8 GiB of 1.7 TiB) in 20m 27s, read: 3.6 GiB/s, write: 1.6 MiB/s
INFO: 38% (677.7 GiB of 1.7 TiB) in 20m 32s, read: 3.4 GiB/s, write: 3.2 MiB/s
INFO: 39% (695.4 GiB of 1.7 TiB) in 20m 37s, read: 3.5 GiB/s, write: 4.0 MiB/s
INFO: 40% (713.8 GiB of 1.7 TiB) in 20m 47s, read: 1.8 GiB/s, write: 137.2 MiB/s
INFO: 41% (730.9 GiB of 1.7 TiB) in 20m 53s, read: 2.8 GiB/s, write: 100.7 MiB/s
INFO: 42% (748.7 GiB of 1.7 TiB) in 20m 58s, read: 3.6 GiB/s, write: 3.2 MiB/s
INFO: 43% (766.0 GiB of 1.7 TiB) in 21m 4s, read: 2.9 GiB/s, write: 22.7 MiB/s
INFO: 44% (783.8 GiB of 1.7 TiB) in 21m 9s, read: 3.6 GiB/s, write: 1.6 MiB/s
INFO: 45% (801.5 GiB of 1.7 TiB) in 21m 14s, read: 3.5 GiB/s, write: 1.6 MiB/s
INFO: 46% (819.6 GiB of 1.7 TiB) in 21m 20s, read: 3.0 GiB/s, write: 22.7 MiB/s
INFO: 47% (836.7 GiB of 1.7 TiB) in 21m 26s, read: 2.8 GiB/s, write: 34.0 MiB/s
INFO: 48% (855.9 GiB of 1.7 TiB) in 21m 52s, read: 755.8 MiB/s, write: 22.5 MiB/s
INFO: 49% (873.1 GiB of 1.7 TiB) in 21m 58s, read: 2.9 GiB/s, write: 22.7 MiB/s
INFO: 50% (890.8 GiB of 1.7 TiB) in 22m 4s, read: 3.0 GiB/s, write: 22.0 MiB/s
INFO: 51% (908.5 GiB of 1.7 TiB) in 22m 11s, read: 2.5 GiB/s, write: 19.4 MiB/s
INFO: 52% (926.6 GiB of 1.7 TiB) in 22m 17s, read: 3.0 GiB/s, write: 22.7 MiB/s
INFO: 53% (945.5 GiB of 1.7 TiB) in 22m 23s, read: 3.1 GiB/s, write: 22.0 MiB/s
INFO: 54% (964.7 GiB of 1.7 TiB) in 22m 29s, read: 3.2 GiB/s, write: 23.3 MiB/s
INFO: 55% (980.8 GiB of 1.7 TiB) in 22m 35s, read: 2.7 GiB/s, write: 35.3 MiB/s
INFO: 56% (996.9 GiB of 1.7 TiB) in 22m 41s, read: 2.7 GiB/s, write: 30.0 MiB/s
INFO: 57% (1014.8 GiB of 1.7 TiB) in 22m 46s, read: 3.6 GiB/s, write: 2.4 MiB/s
INFO: 58% (1.0 TiB of 1.7 TiB) in 22m 52s, read: 3.0 GiB/s, write: 22.0 MiB/s
INFO: 59% (1.0 TiB of 1.7 TiB) in 22m 58s, read: 3.1 GiB/s, write: 22.7 MiB/s
INFO: 60% (1.0 TiB of 1.7 TiB) in 23m 4s, read: 2.9 GiB/s, write: 22.0 MiB/s
INFO: 61% (1.1 TiB of 1.7 TiB) in 23m 10s, read: 3.0 GiB/s, write: 23.3 MiB/s
INFO: 62% (1.1 TiB of 1.7 TiB) in 23m 16s, read: 3.0 GiB/s, write: 21.3 MiB/s
INFO: 63% (1.1 TiB of 1.7 TiB) in 23m 25s, read: 1.8 GiB/s, write: 110.7 MiB/s
INFO: 64% (1.1 TiB of 1.7 TiB) in 23m 37s, read: 1.6 GiB/s, write: 128.0 MiB/s
INFO: 65% (1.1 TiB of 1.7 TiB) in 23m 47s, read: 1.7 GiB/s, write: 123.2 MiB/s
INFO: 66% (1.1 TiB of 1.7 TiB) in 23m 54s, read: 2.6 GiB/s, write: 48.0 MiB/s
INFO: 67% (1.2 TiB of 1.7 TiB) in 24m, read: 3.0 GiB/s, write: 22.0 MiB/s
INFO: 68% (1.2 TiB of 1.7 TiB) in 24m 6s, read: 3.1 GiB/s, write: 22.0 MiB/s
INFO: 69% (1.2 TiB of 1.7 TiB) in 24m 13s, read: 2.6 GiB/s, write: 18.9 MiB/s
INFO: 70% (1.2 TiB of 1.7 TiB) in 24m 19s, read: 3.1 GiB/s, write: 22.7 MiB/s
INFO: 71% (1.2 TiB of 1.7 TiB) in 24m 25s, read: 3.0 GiB/s, write: 22.0 MiB/s
INFO: 72% (1.3 TiB of 1.7 TiB) in 24m 31s, read: 3.0 GiB/s, write: 22.0 MiB/s
INFO: 73% (1.3 TiB of 1.7 TiB) in 24m 42s, read: 1.4 GiB/s, write: 79.6 MiB/s
INFO: 74% (1.3 TiB of 1.7 TiB) in 24m 49s, read: 2.7 GiB/s, write: 37.7 MiB/s
INFO: 75% (1.3 TiB of 1.7 TiB) in 24m 55s, read: 3.1 GiB/s, write: 22.7 MiB/s
INFO: 76% (1.3 TiB of 1.7 TiB) in 25m 1s, read: 3.0 GiB/s, write: 22.0 MiB/s
INFO: 77% (1.3 TiB of 1.7 TiB) in 25m 7s, read: 2.9 GiB/s, write: 22.7 MiB/s
INFO: 78% (1.4 TiB of 1.7 TiB) in 25m 13s, read: 2.9 GiB/s, write: 22.0 MiB/s
INFO: 79% (1.4 TiB of 1.7 TiB) in 25m 19s, read: 3.1 GiB/s, write: 22.0 MiB/s
INFO: 80% (1.4 TiB of 1.7 TiB) in 25m 25s, read: 2.9 GiB/s, write: 22.0 MiB/s
INFO: 81% (1.4 TiB of 1.7 TiB) in 25m 31s, read: 3.0 GiB/s, write: 23.3 MiB/s
INFO: 82% (1.4 TiB of 1.7 TiB) in 25m 37s, read: 3.0 GiB/s, write: 25.3 MiB/s
INFO: 83% (1.4 TiB of 1.7 TiB) in 25m 44s, read: 2.8 GiB/s, write: 36.0 MiB/s
INFO: 84% (1.5 TiB of 1.7 TiB) in 25m 50s, read: 3.1 GiB/s, write: 22.0 MiB/s
INFO: 85% (1.5 TiB of 1.7 TiB) in 25m 56s, read: 2.4 GiB/s, write: 22.7 MiB/s
INFO: 86% (1.5 TiB of 1.7 TiB) in 26m 2s, read: 3.1 GiB/s, write: 23.3 MiB/s
INFO: 87% (1.5 TiB of 1.7 TiB) in 26m 8s, read: 3.1 GiB/s, write: 22.7 MiB/s
INFO: 88% (1.5 TiB of 1.7 TiB) in 26m 14s, read: 3.0 GiB/s, write: 22.7 MiB/s
INFO: 89% (1.5 TiB of 1.7 TiB) in 26m 19s, read: 3.2 GiB/s, write: 26.4 MiB/s
INFO: 90% (1.6 TiB of 1.7 TiB) in 27m 11s, read: 342.5 MiB/s, write: 55.2 MiB/s
INFO: 91% (1.6 TiB of 1.7 TiB) in 40m 3s, read: 23.5 MiB/s, write: 23.3 MiB/s
INFO: 92% (1.6 TiB of 1.7 TiB) in 54m 57s, read: 20.4 MiB/s, write: 20.3 MiB/s
INFO: 93% (1.6 TiB of 1.7 TiB) in 1h 14m 45s, read: 15.4 MiB/s, write: 15.2 MiB/s
INFO: 94% (1.6 TiB of 1.7 TiB) in 1h 33m 14s, read: 16.4 MiB/s, write: 16.3 MiB/s
INFO: 95% (1.7 TiB of 1.7 TiB) in 1h 52m 19s, read: 16.1 MiB/s, write: 15.9 MiB/s
INFO: 96% (1.7 TiB of 1.7 TiB) in 2h 8m 4s, read: 19.1 MiB/s, write: 18.8 MiB/s
INFO: 97% (1.7 TiB of 1.7 TiB) in 2h 23m 57s, read: 19.1 MiB/s, write: 18.7 MiB/s
INFO: 98% (1.7 TiB of 1.7 TiB) in 2h 38m 43s, read: 20.6 MiB/s, write: 19.8 MiB/s
INFO: 99% (1.7 TiB of 1.7 TiB) in 2h 52m 19s, read: 22.5 MiB/s, write: 21.4 MiB/s
INFO: 100% (1.7 TiB of 1.7 TiB) in 3h 4m 30s, read: 24.7 MiB/s, write: 22.6 MiB/s
------------------------------------ PBS WAS REBOOTED HERE, AFTER WAITING 2 DAYS --------------------------------------------
ERROR: backup close image failed: command error: stream closed because of a broken pipe
INFO: aborting backup job
INFO: resuming VM again
ERROR: Backup of VM 100 failed - backup close image failed: command error: stream closed because of a broken pipe
INFO: Failed at 2025-08-18 11:54:17
INFO: Backup job finished with errors
INFO: skipping disabled matcher 'default-matcher'
INFO: notified via target `banancz`
TASK ERROR: job error

Truncated job log in PBS


2025-08-15T21:15:11+02:00: starting new backup on datastore 'backup-s3' from ::ffff:10.1.50.2: "vm/100/2025-08-15T19:15:10Z"
2025-08-15T21:15:11+02:00: GET /previous: 400 Bad Request: no valid previous backup
2025-08-15T21:15:11+02:00: created new fixed index 1 ("vm/100/2025-08-15T19:15:10Z/drive-scsi0.img.fidx")
2025-08-15T21:15:11+02:00: Skip upload of already encountered chunk 52f629dbf21a2afae6dd39af8f513b59e381317fd2af689bee5d68b555126670
2025-08-15T21:15:11+02:00: created new fixed index 2 ("vm/100/2025-08-15T19:15:10Z/drive-scsi1.img.fidx")
2025-08-15T21:15:11+02:00: Skip upload of already encountered chunk 52f629dbf21a2afae6dd39af8f513b59e381317fd2af689bee5d68b555126670
2025-08-15T21:15:11+02:00: created new fixed index 3 ("vm/100/2025-08-15T19:15:10Z/drive-scsi2.img.fidx")
2025-08-15T21:15:11+02:00: Skip upload of already encountered chunk 52f629dbf21a2afae6dd39af8f513b59e381317fd2af689bee5d68b555126670
2025-08-15T21:15:11+02:00: Uploaded blob to object store: .cnt/vm/100/2025-08-15T19:15:10Z/qemu-server.conf.blob
2025-08-15T21:15:11+02:00: add blob "/s3cache/vm/100/2025-08-15T19:15:10Z/qemu-server.conf.blob" (489 bytes, comp: 489)
2025-08-15T21:15:11+02:00: Skip upload of already encountered chunk f876990d6815b9b307a7292129f4ca9537167ef54ec66335e2c67c7d950b3d44
2025-08-15T21:15:11+02:00: Upload of new chunk 0acf9b0bcaf5d6d6fe68fa98ad5ef8a7fcda3373eb508e99c0832323b3e13b15
2025-08-15T21:15:11+02:00: Skip upload of already encountered chunk a1f8f63b97aecbcfae8da192e61b41c23acb653035af162c92a3451277b10989
2025-08-15T21:15:11+02:00: Skip upload of already encountered chunk aec08a01ceb37a2eb1b6dc38473fe45f92f34b772f59ff80f126aa0b808992a7
2025-08-15T21:15:11+02:00: Skip upload of already encountered chunk ae3741081018658cd75df1a41796e7d2e683d823d706098629462eae19358797
2025-08-15T21:15:11+02:00: Skip upload of already encountered chunk f8fe5b82bbf4d6d00743550b333343b163a22c68222a49e2625efc52c6b68233
2025-08-15T21:15:11+02:00: Skip upload of already encountered chunk 9ba8dee55d8acbe2d52be7345ccbb9f2a55214a2d6dc64a139f8f0d400fdde11
2025-08-15T21:15:12+02:00: Skip upload of already encountered chunk f29cf54eca0e51c01c0feee32b11ac21cdb4fea27e232e826a5319ba437b6816
2025-08-15T21:15:12+02:00: Skip upload of already encountered chunk 830483e326ba71876b6f62633fc5c3cabcab05f9716dc2e3ef5388bf6e0560a0
...
2025-08-16T00:19:40+02:00: Caching of chunk 9241292c5420dfbce235dd77bfb01596ac9502f548ad9dfd0b540dca2d28c71e
2025-08-16T00:19:40+02:00: Caching of chunk 6fc9c3e0f2eb966725c1c611e01ed16884115d1052a8276bf67fcd5a96f58a71
2025-08-16T00:19:41+02:00: Caching of chunk f9befd5c8804019ad8175510eecc8c33829dca6fa07b5b8ecc73a0a86bd8f146
2025-08-16T00:19:41+02:00: Caching of chunk a7fcaf8f7bc71a6d8ce960fc0af2d72faa2fc6eb219051d7a1e25eebc04573db
2025-08-16T00:19:42+02:00: Caching of chunk 785d1806dabe619c9fdadb01d4f96935b58d60b4093030d6dec04fb694dfdd7c
2025-08-16T00:19:42+02:00: Caching of chunk 21dd57eebc39db04fbff205218990f2115db6d70bcf3e59f80965246fe91f9df
2025-08-16T00:19:42+02:00: Caching of chunk 4be4d182220e6567742a90680cbd90debb28c7e9918ae09a58c64f8de097ebe6
The task status says: stopped: unknown

I can reproduce this reliably; the problem occurs every time. If you need additional logs, please tell me the path and I will send them to you.
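In case it helps, I would pull them from what I assume are the default log locations:

Code:
# PBS: proxy journal around the time the job got stuck
journalctl -u proxmox-backup-proxy --since "2025-08-15 21:00"
# PBS: per-task logs (default location, as far as I know)
ls /var/log/proxmox-backup/tasks/
# PVE: vzdump/backup task logs
ls /var/log/pve/tasks/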

Thanks for any help.
 
Hi,
this seems to be an issue unrelated to the one discussed in this thread, with the chunk upload being at fault. Therefore, please open an issue at bugzilla.proxmox.com so this can be tracked and investigated, thanks.
 
Is there a way to check what this cache size is?
This is logged in the systemd journal on the first access to the datastore after a service restart or a maintenance-mode change. You should find a log line such as:
Code:
Using datastore cache with capacity 2328 for store aws-s3-store
the given number being the available in-memory cache slots for storing chunk digests. Note that the cache currently does not re-warm previously cached chunks, so the available cache capacity will decrease with each service restart; this is, however, being worked on, see https://lore.proxmox.com/pbs-devel/20250801141024.626365-1-c.ebner@proxmox.com/T/
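For example, a quick way to find that line is to grep the journal for it (the time window below is arbitrary):

Code:
journalctl --since "7 days ago" | grep -i 'datastore cache with capacity'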
 
I was able to reproduce this issue; unfortunately, I cannot provide an easy workaround for the time being. The request timeout is too short for chunk uploads over low-bandwidth connections. Working on a fix for this.

See also https://forum.proxmox.com/threads/pbs-4-0-wasabi-post-fixed_chunk-400-bad-request.169620/post-793007
Some further investigation showed that the request timeout is not the only problem here; rather, it seems that the requests to the S3 endpoint congest the network enough for other issues to arise, e.g. DNS requests failing, which in turn causes chunk upload requests to fail as well. You can check whether the chunk upload issues become less likely if you add an entry in /etc/hosts for your S3 endpoint, so the additional DNS queries are no longer needed, and/or if you shape the upload traffic with tools such as tc, limiting the upload to below the maximum available bandwidth.
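As a rough sketch, such an /etc/hosts entry could look like the following - the IP address and hostname are placeholders, use the address your endpoint actually resolves to and the hostname exactly as configured for the S3 endpoint (with vhost-style bucket addressing, the bucket subdomain needs its own entry as well):

Code:
# /etc/hosts on the PBS host (placeholder values)
203.0.113.10   s3.eu-central-1.wasabisys.com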

Unfortunately, there is no easy fix for the time being; a proper shared rate-limiting implementation for the S3 client is most likely required.
 
Can any of you check if setting the put-rate-limit in the endpoint configuration located at /etc/proxmox-backup/s3.cfg can be used to circumvent the upload issues for the time being?

This parameter is currently not exposed in the UI and has to be set manually in the config. The given value is requests per second. An example config limiting to 10 put requests per second is given by:

Code:
s3-endpoint: <ID>
    access-key <ACCESS-KEY>
    endpoint <ENDPOINT>
    put-rate-limit 10
    region <REGION>
    secret-key <SECRET-KEY>

I've been testing with the suggested additional "put-rate-limit 10" in the s3.cfg and I haven't had any issues since, so it seems that this workaround is working for me.
 
It's certainly better with those limits, but still fails for me.
 
Same problem here - no luck with put-rate-limit 10 and/or a traffic limit in my router with different values. No larger backup (roughly > 10 GB) is uploaded completely: the network becomes completely congested, no other traffic gets through, and then it fails. I have 250 MB down and 50 MB up capacity.
I have been using restic with the same S3 endpoint for ages and that works fine.
 
Same problem here - no luck with put-rate-limit 10 and/or a traffic limit in my router with different values.
Have you also tried to set the put-rate-limit to 1? So 1 request/second.

I have 250 MB down and 50 MB up capacity.
Is this MB or Mbit? But yes, the issue is that the request timeout for the put-object calls was too low. The network congestion will be addressed by implementing a shared rate limiter for the S3 client in PBS.
 
put-rate-limit 1 worked for the first time.
My upload is MBit, not MB.
Will the shared rate limiter be a feature of the next minor release?
No ETA, sorry. But the proposed stopgap should already allow custom traffic shaping using e.g. tc filters. With this, one can effectively limit the traffic to/from the S3 endpoint.
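A minimal tc sketch, assuming the uplink interface is vmbr0, a 100 Mbit/s uplink, and 203.0.113.10 as a placeholder for the S3 endpoint's IP - adjust all three to your setup:

Code:
# HTB root qdisc; unclassified traffic goes to class 1:20, which is left at full link speed
tc qdisc add dev vmbr0 root handle 1: htb default 20
tc class add dev vmbr0 parent 1: classid 1:20 htb rate 1000mbit
# shaped class for S3 traffic, kept below the available upload bandwidth
tc class add dev vmbr0 parent 1: classid 1:10 htb rate 80mbit ceil 80mbit
# send traffic destined for the S3 endpoint IP into the shaped class
tc filter add dev vmbr0 protocol ip parent 1:0 prio 1 u32 match ip dst 203.0.113.10/32 flowid 1:10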
 
I'm seeing:
Code:
2025-08-21T00:01:15-07:00: SKIPPED: verify Backblaze:ct/111/2025-08-18T14:29:19Z (recently verified)
2025-08-21T00:01:15-07:00: percentage done: 56.25% (9/16 groups)
2025-08-21T00:01:15-07:00: verify group Backblaze:ct/113 (5 snapshots)
2025-08-21T00:01:15-07:00: verify Backblaze:ct/113/2025-08-21T05:07:14Z
2025-08-21T00:01:15-07:00:   check pct.conf.blob
2025-08-21T00:01:15-07:00:   check root.pxar.didx
2025-08-21T00:04:14-07:00: "can't verify chunk, load failed - client error (SendRequest)"
2025-08-21T00:04:14-07:00: corrupted chunk renamed to "/mnt/backblaze/.chunks/f3ae/f3ae24280a491318b53cd0b43df10a57082a4bccf50d97e5bdfd9adc30d02046.0.bad"
2025-08-21T00:04:40-07:00:   verified 1768.08/3116.49 MiB in 204.99 seconds, speed 8.63/15.20 MiB/s (1 errors)
2025-08-21T00:04:40-07:00: verify Backblaze:ct/113/2025-08-21T05:07:14Z/root.pxar.didx failed: chunks could not be verified
2025-08-21T00:04:40-07:00:   check catalog.pcat1.didx
2025-08-21T00:04:41-07:00:   verified 0.24/0.68 MiB in 0.85 seconds, speed 0.29/0.80 MiB/s (0 errors)
2025-08-21T00:04:41-07:00: percentage done: 57.50% (9/16 groups, 1/5 snapshots in group #10)
2025-08-21T00:04:41-07:00: SKIPPED: verify Backblaze:ct/113/2025-08-20T05:06:56Z (recently verified)
2025-08-21T00:04:41-07:00: percentage done: 58.75% (9/16 groups, 2/5 snapshots in group #10)
2025-08-21T00:04:41-07:00: SKIPPED: verify Backblaze:ct/113/2025-08-19T13:56:33Z (recently verified)
2025-08-21T00:04:41-07:00: percentage done: 60.00% (9/16 groups, 3/5 snapshots in group #10)
2025-08-21T00:04:41-07:00: SKIPPED: verify Backblaze:ct/113/2025-08-19T11:06:19Z (recently verified)
2025-08-21T00:04:41-07:00: percentage done: 61.25% (9/16 groups, 4/5 snapshots in group #10)
2025-08-21T00:04:41-07:00: SKIPPED: verify Backblaze:ct/113/2025-08-18T14:45:01Z (recently verified)
2025-08-21T00:04:41-07:00: percentage done: 62.50% (10/16 groups)
2025-08-21T00:04:41-07:00: verify group Backblaze:ct/115 (5 snapshots)
2025-08-21T00:04:41-07:00: verify Backblaze:ct/115/2025-08-21T05:09:04Z
2025-08-21T00:04:41-07:00:   check pct.conf.blob
2025-08-21T00:04:42-07:00:   check root.pxar.didx
2025-08-21T00:04:57-07:00: "can't verify chunk, load failed - client error (SendRequest)"
2025-08-21T00:04:57-07:00: corrupted chunk renamed to "/mnt/backblaze/.chunks/1581/158149be41b8488bee40e202dc4e599b80333d76016b3c4484e47fdc5bd63e2b.0.bad"
2025-08-21T00:12:58-07:00:   verified 3859.46/8820.43 MiB in 496.54 seconds, speed 7.77/17.76 MiB/s (1 errors)
2025-08-21T00:12:58-07:00: verify Backblaze:ct/115/2025-08-21T05:09:04Z/root.pxar.didx failed: chunks could not be verified
 
Thanks for providing these insights. This seems to be a combination of the network congestion issue (here on the download side) and the incorrect renaming of chunks when fetching them fails, as reported in https://bugzilla.proxmox.com/show_bug.cgi?id=6665. Please add this information to that issue if possible, thanks.
 
Hi,

the traffic-control rate limits are currently not honored by the S3 API client. But your observation with respect to the rate limits is indeed interesting, so it might indeed be that the S3 client is running into timeouts.

Can any of you check if setting the put-rate-limit in the endpoint configuration located at /etc/proxmox-backup/s3.cfg can be used to circumvent the upload issues for the time being?

This parameter is currently not exposed in the ui and has to be set manually in the config. The given value is requests per second. An example config to limit to 10 put request per second is given by:

Code:
s3-endpoint: <ID>
    access-key <ACCESS-KEY>
    endpoint <ENDPOINT>
    put-rate-limit 10
    region <REGION>
    secret-key <SECRET-KEY>
Setting put-rate-limit=10 has also improved the situation for me. Failed S3 uploads are much rarer now, on two machines with upload bandwidths of ~10 Mbit/s and 1 GBps.
 
My upload bandwidth is 50 Mbit/s and I was encountering tons of errors. I tried all kinds of tweaks on MinIO (adjusting requests_max) and on my router (traffic shaping, etc.), but those didn't do anything.

The only thing that worked was setting:

put-rate-limit=3

Everything is pretty solid now.
 
Please try without any put-rate-limit. There were changes to the retry logic and request timeouts.
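A rough sketch of how to test this, assuming the updated packages are already available from your configured repositories:

Code:
# pull in the updated proxmox-backup packages
apt update && apt full-upgrade
# then remove (or comment out) the workaround in /etc/proxmox-backup/s3.cfg:
#     put-rate-limit 10
# and re-run the backup job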