S3 Upload Fails with Large VMs (Cache Issue?)

Oachkatze

Hi everyone,


since upgrading to Proxmox Backup Server 4.0, I’ve started using the new S3 target support, which is absolutely great so far. However, I’m running into an issue with larger VMs:
My upload bandwidth is limited to around 10 Mbit/s, and for some larger backups, the upload process fails partway through. From what I understand, PBS uses a local cache for S3, which defaults to 25 GB.

Smaller CTs run great with the S3 bucket.

My suspicion is that the cache fills up before the data can be uploaded to the S3 storage, causing the backup to abort.
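(For scale, assuming that 25 GB figure: 10 Mbit/s is roughly 1.25 MB/s, so draining a full 25 GB cache at that rate would take about 25,600 MB / 1.25 MB/s ≈ 20,500 s, i.e. around 5.7 hours.)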


My questions:

  • Is there a way to increase or configure the S3 cache size in PBS?
  • Has anyone else experienced similar issues when uploading large backups with limited upstream bandwidth?
Any help or pointers would be greatly appreciated!
Thanks in advance,
Simon

 
Hi,
please post the contents of the systemd journal from around the time when the chunk upload error happens. This should give more error context.
PBS uses a local cache for S3, which defaults to 25 GB.
No, the cache capacity is dynamically determined when first accessing the datastore. Did you follow the recommendation and use a dedicated device, partition or ZFS dataset with a quota for the local cache? See https://pbs.proxmox.com/docs/storage.html#datastores-with-s3-backend
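For reference, a minimal sketch of that setup, assuming a ZFS pool named rpool and a 128G quota (both placeholders, adapt them to your system):

Code:
# dedicated ZFS dataset with a fixed quota for the local S3 cache
zfs create -o quota=128G -o mountpoint=/mnt/pbs-s3-cache rpool/pbs-s3-cache

With a fixed quota, the capacity PBS detects on first access stays stable instead of shifting with whatever else fills the disk.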

My suspicion is that the cache fills up before the data can be uploaded to the S3 storage, causing the backup to abort.
If your usable storage changed since the last time the cache capacity was determined, then this can be the cause, yes. Also, there is a known issue with the cache not being rewarmed correctly when the PBS services are restarted, the host is rebooted, or the datastore is set into maintenance mode. This is actively being worked on, see https://lore.proxmox.com/pbs-devel/20250801141024.626365-1-c.ebner@proxmox.com/T/

For the time being, manual intervention is required to reclaim this space for the cases mentioned above.
 


I was able to quickly grab a screenshot – I’ll take a closer look at the other things a bit later. But thank you already for your help!


Best regards,
Simon
 
Sorry, I was not precise enough. The interesting part is to be found in the systemd journal of the PBS host.
 
I think I have the same issue.

Small containers back up just fine; bigger ones, or VMs, end up with an error.

The relevant parts of journalctl are below.

It's between two reboots: the first failure is with a VM, the second with a CT.
Both were manual backup tasks for testing.

journalctl on PBS
Code:
Aug 09 02:05:34 pbs proxmox-backup-proxy[719]: TASK ERROR: backup ended but finished flag is not set.


-- Boot 0a0d5fa66f1d4c6ea1f85e17ad4223f0 --
Aug 09 02:19:07 pbs systemd[1]: Starting proxmox-backup-proxy.service - Proxmox Backup API Proxy Server...
Aug 09 02:19:08 pbs proxmox-backup-proxy[650]: applied rrd journal (2 entries in 0.005 seconds)
Aug 09 02:19:08 pbs proxmox-backup-proxy[650]: rrd journal successfully committed (0 files in 0.000 seconds)
Aug 09 02:19:08 pbs proxmox-backup-proxy[650]: catching shutdown signal
Aug 09 02:19:08 pbs proxmox-backup-proxy[650]: catching reload signal
Aug 09 02:19:08 pbs systemd[1]: Started proxmox-backup-proxy.service - Proxmox Backup API Proxy Server.
Aug 09 02:19:08 pbs proxmox-backup-proxy[650]: Using datastore cache with capacity 7499 for store s3-backup
Aug 09 02:21:11 pbs proxmox-backup-proxy[650]: TASK ERROR: backup ended but finished flag is not set.



And after that, the corresponding logs from the PVE:
Code:
INFO: starting new backup job: vzdump 400 --storage pbs --node pve-i9 --notification-mode notification-system --mode snapshot --remove 0 --notes-template '{{guestname}}'
INFO: Starting Backup of VM 400 (qemu)
INFO: Backup started at 2025-08-09 02:04:54
INFO: status = stopped
INFO: backup mode: stop
INFO: ionice priority: 7
INFO: VM Name: win10
INFO: include disk 'scsi0' 'vmstorage:vm-400-disk-1' 100G
INFO: include disk 'efidisk0' 'vmstorage:vm-400-disk-0' 1M
INFO: include disk 'tpmstate0' 'vmstorage:vm-400-disk-2' 4M
INFO: creating Proxmox Backup Server archive 'vm/400/2025-08-09T00:04:54Z'
INFO: starting kvm to execute backup task
swtpm_setup: Not overwriting existing state file.
INFO: attaching TPM drive to QEMU for backup
INFO: started backup task '0f2d4dfb-8b11-4bd0-b6f4-7a14634f98f6'
INFO: efidisk0: dirty-bitmap status: created new
INFO: scsi0: dirty-bitmap status: created new
INFO: tpmstate0-backup: dirty-bitmap status: created new
INFO:   0% (432.0 MiB of 100.0 GiB) in 3s, read: 144.0 MiB/s, write: 17.3 MiB/s
INFO:   1% (1.0 GiB of 100.0 GiB) in 6s, read: 208.0 MiB/s, write: 133.3 MiB/s
INFO:   1% (1.1 GiB of 100.0 GiB) in 37s, read: 1.4 MiB/s, write: 1.4 MiB/s
ERROR: backup write data failed: command error: write_data upload error: pipelined request failed: failed to upload chunk to s3 backend
INFO: aborting backup job
INFO: stopping kvm after backup task
ERROR: Backup of VM 400 failed - backup write data failed: command error: write_data upload error: pipelined request failed: failed to upload chunk to s3 backend
INFO: Failed at 2025-08-09 02:05:34
INFO: Backup job finished with errors
INFO: notified via target `smtp-mail`
TASK ERROR: job errors




INFO: starting new backup job: vzdump 206 --notification-mode notification-system --storage pbs --node pve-i9 --remove 0 --notes-template '{{guestname}}' --mode snapshot
INFO: Starting Backup of VM 206 (lxc)
INFO: Backup started at 2025-08-09 02:19:55
INFO: status = running
INFO: CT Name: immich.******.**
INFO: including mount point rootfs ('/') in backup
INFO: excluding bind mount point mp0 ('/mnt/fotos') from backup (not a volume)
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: create storage snapshot 'vzdump'
INFO: creating Proxmox Backup Server archive 'ct/206/2025-08-09T00:19:55Z'
INFO: set max number of entries in memory for file-based backups to 1048576
INFO: run: lxc-usernsexec -m u:0:100000:65536 -m g:0:100000:65536 -- /usr/bin/proxmox-backup-client backup --crypt-mode=none pct.conf:/var/tmp/vzdumptmp1980205_206/etc/vzdump/pct.conf root.pxar:/mnt/vzsnap0 --include-dev /mnt/vzsnap0/./ --skip-lost-and-found --exclude=/tmp/?* --exclude=/var/tmp/?* --exclude=/var/run/?*.pid --backup-type ct --backup-id 206 --backup-time 1754698795 --entries-max 1048576 --repository root@pam@192.168.179.26:s3-backup --ns pve-i9
INFO: Starting backup: [pve-i9]:ct/206/2025-08-09T00:19:55Z  
INFO: Client name: pve-i9  
INFO: Starting backup protocol: Sat Aug  9 02:19:55 2025  
INFO: No previous manifest available.  
INFO: Upload config file '/var/tmp/vzdumptmp1980205_206/etc/vzdump/pct.conf' to 'root@pam@192.168.179.26:8007:s3-backup' as pct.conf.blob  
INFO: Upload directory '/mnt/vzsnap0' to 'root@pam@192.168.179.26:8007:s3-backup' as root.pxar.didx  
INFO: processed 900.984 MiB in 1m, uploaded 602.117 MiB
INFO: unclosed encoder dropped
INFO: closed encoder dropped with state
INFO: unfinished encoder state dropped
INFO: finished encoder state with errors
INFO: catalog upload error - channel closed  
INFO: Error: failed to upload chunk to s3 backend
INFO: cleanup temporary 'vzdump' snapshot
ERROR: Backup of VM 206 failed - command 'lxc-usernsexec -m u:0:100000:65536 -m g:0:100000:65536 -- /usr/bin/proxmox-backup-client backup '--crypt-mode=none' pct.conf:/var/tmp/vzdumptmp1980205_206/etc/vzdump/pct.conf root.pxar:/mnt/vzsnap0 --include-dev /mnt/vzsnap0/./ --skip-lost-and-found '--exclude=/tmp/?*' '--exclude=/var/tmp/?*' '--exclude=/var/run/?*.pid' --backup-type ct --backup-id 206 --backup-time 1754698795 --entries-max 1048576 --repository root@pam@192.168.179.26:s3-backup --ns pve-i9' failed: exit code 255
INFO: Failed at 2025-08-09 02:21:12
INFO: Backup job finished with errors
INFO: notified via target `smtp-mail`
TASK ERROR: job errors
 
backup write data failed: command error: write_data upload error: pipelined request failed: failed to upload chunk to s3 backend
Double-checked the code on this one: the error points at the upload failing, not at an issue with the cache. Since there is no further error in the systemd journal, the requests to the PBS API seem to be fine. That points more towards errors while sending the requests, running into a request timeout, or concurrent uploads of the same chunk object hindering each other.

Is this the only backup running to this datastore at the time? Can you exclude network dropouts, e.g. by continuously pinging the S3 endpoint from the PBS host during the backup run?
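A minimal way to do that (just a sketch; <s3-endpoint> is a placeholder for your actual endpoint hostname):

Code:
# timestamped pings, logged so dropouts can be correlated with the backup task later
ping -D <s3-endpoint> | tee /tmp/s3-ping.log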

Also, in order to reproduce, what provider are you using?
 

Hi Chris,

I just saw that crz has already sent the log file – yeah, it's really strange...
The only thing I can imagine is that, since my upload speed is only 10 Mbit/s and the cache shows just 25 GB, this might somehow be playing a role here. With smaller CTs it works perfectly, since the cache doesn't fill up.

The S3 provider I’m using is Hetzner.
The connection setup itself went without any problems.
 
Right now it's from my homelab to Hetzner's S3 for evaluation purposes.
And yes, I'm only manually backing up one VM/CT at a time.


I have 50 Mbit/s upload to it,
and I have set up an extra cache disk as described, 128 GB formatted as ext4
(https://pbs.proxmox.com/docs/storage.html#datastore-s3-backend).

Is there a way to set a maximum number of concurrent connections?
That way I could test whether this is the problem.

Best regards,
crz
 
Hello,
I have now created a bucket in Nuremberg for testing (which shouldn't cause any issues), but unfortunately the same problem occurs. The backup briefly starts and then stops again with the same error message.

Attached you’ll find the log file from the PBS.

Thanks in advance for your help.



Code:
Aug 11 08:50:23 pbs proxmox-backup-proxy[820]: Using datastore cache with capacity 976 for store Hetzner
Aug 11 09:05:43 pbs proxmox-backup-proxy[820]: rrd journal successfully committed (41 files in 1.166 seconds)
Aug 11 09:13:41 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:13:41 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:13:57 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:14:12 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:14:27 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:14:42 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:14:58 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:15:13 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:15:28 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:15:44 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:15:59 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:16:14 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:16:30 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:16:45 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:17:00 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:17:01 pbs CRON[270455]: pam_unix(cron:session): session opened for user root(uid=0) by root(uid=0)
Aug 11 09:17:01 pbs CRON[270457]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Aug 11 09:17:01 pbs CRON[270455]: pam_unix(cron:session): session closed for user root
Aug 11 09:17:15 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:17:31 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:17:46 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:18:01 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:18:17 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:18:32 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:18:48 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:19:03 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:19:18 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:19:33 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:19:49 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:20:04 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:20:19 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:20:35 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:20:50 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:21:06 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:21:22 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:21:38 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:21:54 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:22:09 pbs proxmox-backup-proxy[820]: <?xml version="1.0" encoding="UTF-8"?>
                                               <Error>
                                                   <Code>BadRequest</Code>
                                                   <Message>N/A</Message>
                                                   <RequestId>N/A</RequestId>
                                                   <HostId>N/A</HostId>
                                               </Error>
Aug 11 09:22:11 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:22:26 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:22:39 pbs proxmox-backup-proxy[820]: <html><body><h1>408 Request Time-out</h1>
                                               Your browser didn't send a complete request in time.
                                               </body></html>
Aug 11 09:22:39 pbs proxmox-backup-proxy[820]: <html><body><h1>408 Request Time-out</h1>
                                               Your browser didn't send a complete request in time.
                                               </body></html>
Aug 11 09:22:42 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:22:58 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:23:14 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:23:24 pbs proxmox-backup-proxy[820]: TASK ERROR: backup ended but finished flag is not set.
Aug 11 09:23:29 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:23:45 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:24:00 pbs proxmox-backup-proxy[820]: unable to parse worker status 'UPID:pbs:00000D50:00C88D0D:00000000:67ABFBAE:aptupdate::roUPID:pbs:000002C4:00002415:0000002D:67ABF998:verifica>
Aug 11 09:24:13 pbs login[271107]: pam_unix(login:session): session opened for user root(uid=0) by root(uid=0)
Aug 11 09:24:13 pbs systemd-logind[588]: New session 112 of user root.
Aug 11 09:24:13 pbs systemd[1]: Started session-112.scope - Session 112 of User root.
 
Hello,

I have been having what looks like the same issue as Oachkatze and crz.

I only have ~18 Mbit/s upload on my connection and have been seeing backups fail a few minutes after starting, with the same errors as above. I am using Storj as my S3 provider.

In the course of investigating this, I have found that if I set a "Rate In" limit under Configuration -> Traffic Control in PBS to 1.5 MiB/s (~12.5 Mbit/s, i.e. slightly under what my connection can support), then my backups generally proceed to completion.

I'm not sure what this points towards in terms of root cause but I hope it is at least useful as a workaround for people experiencing this issue.

On a potentially related note, I first set a "Rate Out" limit, but this was not respected: PBS would happily max out my upload whilst reporting 0% used of a 1.5 MiB/s "Rate Out" limit.
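For anyone who wants to apply the same workaround from the CLI instead of the GUI, something along these lines should be equivalent (a sketch only; rule name and network are placeholders, and the rate value syntax may need adapting):

Code:
# traffic-control rule capping inbound traffic at ~1.5 MiB/s for all client networks
proxmox-backup-manager traffic-control create s3-workaround \
    --network 0.0.0.0/0 --rate-in 1500KB --comment "workaround for S3 upload failures"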
 
Do you have an idea how to better test and hunt the error down?
Do you also see the request timeout error in the systemd journal? Please post an excerpt from around the time when the upload fails. Also, does the error always happen at the same percentage/upload size?
 
This is syslog from the PVE:

Code:
Aug 18 05:00:02 pve-i9 pvescheduler[311581]: <root@pam> starting task UPID:pve-i9:0004C11E:058B3B4F:68A29732:vzdump::root@pam:
Aug 18 05:00:02 pve-i9 pvescheduler[311582]: INFO: starting new backup job: vzdump 202 200 201 203 206 209 210 211 --fleecing '1,storage=vmstorage' --storage s3-hetzner-nbg1 --mode snapshot --prune-backups 'keep-all=1' --notification-mode legacy-sendmail --mailnotification failure --quiet 1 --notes-template 'Node:{{node}} -- VMID: {{vmid}} -- Hostname: {{guestname}}'
Aug 18 05:00:02 pve-i9 pvescheduler[311582]: INFO: Starting Backup of VM 200 (lxc)
Aug 18 05:00:27 pve-i9 pvescheduler[311582]: INFO: Finished Backup of VM 200 (00:00:25)
Aug 18 05:00:27 pve-i9 pvescheduler[311582]: INFO: Starting Backup of VM 201 (lxc)
Aug 18 05:00:51 pve-i9 pvescheduler[311582]: INFO: Finished Backup of VM 201 (00:00:24)
Aug 18 05:00:51 pve-i9 pvescheduler[311582]: INFO: Starting Backup of VM 202 (lxc)
Aug 18 05:01:03 pve-i9 pvescheduler[311582]: INFO: Finished Backup of VM 202 (00:00:12)
Aug 18 05:01:03 pve-i9 pvescheduler[311582]: INFO: Starting Backup of VM 203 (lxc)
Aug 18 05:01:42 pve-i9 pvescheduler[311582]: INFO: Finished Backup of VM 203 (00:00:39)
Aug 18 05:01:42 pve-i9 pvescheduler[311582]: INFO: Starting Backup of VM 206 (lxc)
Aug 18 05:02:22 pve-i9 pvescheduler[311582]: INFO: Finished Backup of VM 206 (00:00:40)
Aug 18 05:02:23 pve-i9 pvescheduler[311582]: INFO: Starting Backup of VM 209 (lxc)
Aug 18 05:03:01 pve-i9 pvescheduler[311582]: INFO: Finished Backup of VM 209 (00:00:38)
Aug 18 05:03:01 pve-i9 pvescheduler[311582]: INFO: Starting Backup of VM 210 (lxc)
Aug 18 05:03:17 pve-i9 pvescheduler[311582]: ERROR: Backup of VM 210 failed - command 'lxc-usernsexec -m u:0:100000:65536 -m g:0:100000:44 -m g:44:44:1 -m g:45:100045:61 -m g:106:104:1 -m g:107:100107:65429 -- /usr/bin/proxmox-backup-client backup '--crypt-mode=none' pct.conf:/var/tmp/vzdumptmp311582_210/etc/vzdump/pct.conf root.pxar:/mnt/vzsnap0 --include-dev /mnt/vzsnap0/./ --skip-lost-and-found '--exclude=/tmp/?*' '--exclude=/var/tmp/?*' '--exclude=/var/run/?*.pid' --backup-type ct --backup-id 210 --backup-time 1755486181 --entries-max 1048576 --repository root@pam@192.168.179.26:pve-i9' failed: exit code 255
Aug 18 05:03:17 pve-i9 pvescheduler[311582]: INFO: Starting Backup of VM 211 (lxc)
Aug 18 05:03:34 pve-i9 pvescheduler[311582]: INFO: Finished Backup of VM 211 (00:00:17)
Aug 18 05:03:34 pve-i9 pvescheduler[311582]: INFO: Backup job finished with errors
Aug 18 05:03:34 pve-i9 pvescheduler[311582]: job errors

This is Syslog from PBS:

Code:
Aug 18 05:00:27 pbs proxmox-backup-proxy[285108]: Upload backup log to datastore 'pve-i9', root namespace ct/200/2025-08-18T03:00:02Z/client.log.blob
Aug 18 05:00:51 pbs proxmox-backup-proxy[285108]: Upload backup log to datastore 'pve-i9', root namespace ct/201/2025-08-18T03:00:27Z/client.log.blob
Aug 18 05:01:03 pbs proxmox-backup-proxy[285108]: Upload backup log to datastore 'pve-i9', root namespace ct/202/2025-08-18T03:00:51Z/client.log.blob
Aug 18 05:01:42 pbs proxmox-backup-proxy[285108]: Upload backup log to datastore 'pve-i9', root namespace ct/203/2025-08-18T03:01:03Z/client.log.blob
Aug 18 05:02:22 pbs proxmox-backup-proxy[285108]: Upload backup log to datastore 'pve-i9', root namespace ct/206/2025-08-18T03:01:42Z/client.log.blob
Aug 18 05:03:01 pbs proxmox-backup-proxy[285108]: Upload backup log to datastore 'pve-i9', root namespace ct/209/2025-08-18T03:02:23Z/client.log.blob
Aug 18 05:03:16 pbs proxmox-backup-proxy[285108]: TASK ERROR: backup ended but finished flag is not set.
Aug 18 05:03:34 pbs proxmox-backup-proxy[285108]: Upload backup log to datastore 'pve-i9', root namespace ct/211/2025-08-18T03:03:17Z/client.log.blob
 
Please post the systemd journal excerpt from the PBS as well, thanks!
 
Thanks, overlooked that one. What about:
Also, does the error always happen at the same percentage/upload size?
 
I'm not sure, but I would suspect it's not the same percentage.

That's from the backup log on the PVE side.
Maybe this will help?
Also, there are no percentage counts for this backup.

Code:
INFO: Starting Backup of VM 210 (lxc)
INFO: Backup started at 2025-08-18 05:03:01
INFO: status = running
INFO: CT Name: jellyfin
INFO: including mount point rootfs ('/') in backup
INFO: excluding bind mount point mp0 ('/mnt/jellyfin-media') from backup (not a volume)
INFO: excluding bind mount point mp1 ('/mnt/jellyfin-metadata') from backup (not a volume)
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: create storage snapshot 'vzdump'
INFO: creating Proxmox Backup Server archive 'ct/210/2025-08-18T03:03:01Z'
INFO: set max number of entries in memory for file-based backups to 1048576
INFO: run: lxc-usernsexec -m u:0:100000:65536 -m g:0:100000:44 -m g:44:44:1 -m g:45:100045:61 -m g:106:104:1 -m g:107:100107:65429 -- /usr/bin/proxmox-backup-client backup --crypt-mode=none pct.conf:/var/tmp/vzdumptmp311582_210/etc/vzdump/pct.conf root.pxar:/mnt/vzsnap0 --include-dev /mnt/vzsnap0/./ --skip-lost-and-found --exclude=/tmp/?* --exclude=/var/tmp/?* --exclude=/var/run/?*.pid --backup-type ct --backup-id 210 --backup-time 1755486181 --entries-max 1048576 --repository root@pam@192.168.179.26:pve-i9
INFO: Starting backup: ct/210/2025-08-18T03:03:01Z
INFO: Client name: pve-i9
INFO: Starting backup protocol: Mon Aug 18 05:03:01 2025
INFO: Downloading previous manifest (Sun Aug 17 05:02:57 2025)
INFO: Upload config file '/var/tmp/vzdumptmp311582_210/etc/vzdump/pct.conf' to 'root@pam@192.168.179.26:8007:pve-i9' as pct.conf.blob
INFO: Upload directory '/mnt/vzsnap0' to 'root@pam@192.168.179.26:8007:pve-i9' as root.pxar.didx
INFO: unclosed encoder dropped
INFO: closed encoder dropped with state
INFO: unfinished encoder state dropped
INFO: finished encoder state with errors
INFO: catalog upload error - channel closed

INFO: Error: failed to upload chunk to s3 backend
INFO: cleanup temporary 'vzdump' snapshot
ERROR: Backup of VM 210 failed - command 'lxc-usernsexec -m u:0:100000:65536 -m g:0:100000:44 -m g:44:44:1 -m g:45:100045:61 -m g:106:104:1 -m g:107:100107:65429 -- /usr/bin/proxmox-backup-client backup '--crypt-mode=none' pct.conf:/var/tmp/vzdumptmp311582_210/etc/vzdump/pct.conf root.pxar:/mnt/vzsnap0 --include-dev /mnt/vzsnap0/./ --skip-lost-and-found '--exclude=/tmp/?*' '--exclude=/var/tmp/?*' '--exclude=/var/run/?*.pid' --backup-type ct --backup-id 210 --backup-time 1755486181 --entries-max 1048576 --repository root@pam@192.168.179.26:pve-i9' failed: exit code 255
INFO: Failed at 2025-08-18 05:03:17
 
Hi,
The traffic control rate limits are currently not honored by the S3 API client, but your observation with respect to the "Rate In" limit is indeed interesting. So it might indeed be that the S3 client is running into timeouts.

Can any of you check if setting the put-rate-limit in the endpoint configuration located at /etc/proxmox-backup/s3.cfg can be used to circumvent the upload issues for the time being?

This parameter is currently not exposed in the UI and has to be set manually in the config. The given value is requests per second. An example config limiting to 10 put requests per second:

Code:
s3-endpoint: <ID>
    access-key <ACCESS-KEY>
    endpoint <ENDPOINT>
    put-rate-limit 10
    region <REGION>
    secret-key <SECRET-KEY>
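(Note: VM backups use fixed 4 MiB chunks, so 10 PUTs per second still allows up to ~40 MiB/s of payload, far more than the uplinks discussed here; the point of the limit is presumably to cap the number of concurrent in-flight requests rather than raw bandwidth.)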
 