PBS 4.0.20 - s3 cache becomes unavailable

Sep 12, 2024
Hi!

Since PBS 4.0.20 (most probably it was this update), the local s3 cache becomes unavailable during sync or verification tasks. I did not find a pattern for when this happens, but in the shell the actual directory is mounted and the files are accessible. A reboot solves the problem for some time.

syslog error:
Nov 19 07:46:07 pbs-cloud proxmox-backup-proxy[676]: GET /api2/json/admin/datastore/pbs-cloud-s3/status: 400 Bad Request: [client [::ffff:192.168.1.4]:56556] mkstemp "/run/proxmox-backup/active-operations/pbs-cloud-s3.tmp_XXXXXX" failed: ENOSPC: No space left on device

Str1atum
 
the local s3 cache becomes unavailable during sync or verification tasks
what do you mean by this exactly? If the cache is mounted and accessible, then there should be no problem.

GET /api2/json/admin/datastore/pbs-cloud-s3/status: 400 Bad Request: [client [::ffff:192.168.1.4]:56556] mkstemp "/run/proxmox-backup/active-operations/pbs-cloud-s3.tmp_XXXXXX" failed: ENOSPC: No space left on device
Looks more like the tmpfs mounted on /run is out of space. Please check and post the output of mount, df -h and df -hi the next time this happens.
 
please post "df -i /run" in any case!
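The requested diagnostics can be gathered in one go. This is only a sketch: `collect_run_diagnostics` is a made-up helper name, it filters the `mount` output down to the one mount point instead of dumping everything, and it assumes GNU `df` (for `-i`). It only reads information and changes nothing.

```shell
# Print the mount entry, block usage and inode usage for one mount point.
# Hypothetical helper; defaults to /run, where the ENOSPC error occurred.
collect_run_diagnostics() {
    target="${1:-/run}"

    echo "== mount entry for $target =="
    # Filtered view of `mount`; post the full output if asked for it.
    mount 2>/dev/null | grep -F " on $target " || echo "(no mount entry found for $target)"

    echo "== block usage =="
    df -h "$target" 2>/dev/null || echo "(df -h failed for $target)"

    echo "== inode usage =="
    df -i "$target" 2>/dev/null || echo "(df -i failed for $target)"
}

collect_run_diagnostics /run
```

Redirecting the output to a file (for example `collect_run_diagnostics /run > run-diag.txt`) makes it easier to paste into the thread the next time the error shows up.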
 
as a workaround, you can bump the number of inodes available for /run - it has slight security implications (if somebody can create arbitrary files under /run they might be able to DoS other services running on your system):

Code:
mount -o remount,nr_inodes=2M /run

for example would quadruple the limit, but you can also pick a higher number if PBS is the only thing running there.
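To pick a different limit, the arithmetic can be sketched like this. `suggest_remount` is a hypothetical helper, not part of PBS; it takes the current inode total (the `Inodes` column of `df -i /run`) and a factor, and only prints the resulting command so it can be reviewed before running it as root.

```shell
# Print (do not execute) a remount command that scales the inode limit of /run.
#   $1 = current inode total, taken from the "Inodes" column of `df -i /run`
#   $2 = multiplication factor (defaults to 4)
suggest_remount() {
    current="$1"
    factor="${2:-4}"
    echo "mount -o remount,nr_inodes=$((current * factor)) /run"
}

# Example with an inode total of 489124 and a factor of 4:
suggest_remount 489124 4
```

After running the printed command, `df -i /run` should show the new total. As far as I know, a limit set this way does not survive a reboot, so it would have to be reapplied until a fixed package is installed.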
 
root@pbs-cloud:~# df -i /run
Filesystem     Inodes  IUsed IFree IUse% Mounted on
tmpfs          489124 489124     0  100% /run
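Since a reboot only helps until the inodes run out again, it may be worth watching the `IUse%` column before it reaches 100%. The following is a minimal sketch, assuming GNU `df`; the function names and the 90% threshold are made up for illustration.

```shell
# Read `df -i` output on stdin and print IUse% (without the % sign)
# for the line whose last column equals the given mount point.
inode_use_percent() {
    awk -v mp="$1" 'NR > 1 && $NF == mp { sub(/%/, "", $(NF-1)); print $(NF-1) }'
}

# Warn when inode usage of a mount point reaches a threshold (default 90).
warn_if_inodes_low() {
    pct=$(df -i "$1" 2>/dev/null | inode_use_percent "$1")
    case "$pct" in
        ''|*[!0-9]*)
            # Empty or non-numeric, e.g. "-" on filesystems without an inode limit
            echo "could not read inode usage for $1" ;;
        *)
            if [ "$pct" -ge "${2:-90}" ]; then
                echo "WARNING: $1 inode usage at ${pct}%"
            else
                echo "OK: $1 inode usage at ${pct}%"
            fi ;;
    esac
}

warn_if_inodes_low /run 90
```

Run from cron or a systemd timer, this would give some warning before the status API starts returning ENOSPC.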

Strangely, everything was working without issues for nearly 2 years. PBS is the only service running. Is 4 GB of RAM just too low for the latest releases?
 
well, the S3 feature is not that old yet ;) the latest release brought some bug fixes for S3 to close some problematic edge cases/races, and those introduced a new locking mechanism that requires a lock file per touched chunk. those lockfiles live under /run, and if you have very little memory (or a lot of chunks are touched) you might run into this inode limit. a fix is already on the list:

https://lore.proxmox.com/pbs-devel/20251119143148.9383-1-c.ebner@proxmox.com/T/#u
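The link between memory and the inode limit can be checked with a bit of arithmetic. The kernel's tmpfs documentation gives the default `nr_inodes` as half the number of physical RAM pages. The sketch below assumes that /run was mounted with that default and that the page size is 4096 bytes; both function names are made up for illustration.

```shell
# Default tmpfs inode limit for a given amount of RAM:
# half the number of physical RAM pages.
#   $1 = RAM in bytes, $2 = page size in bytes (assumed 4096)
default_tmpfs_inodes() {
    echo $(( $1 / ${2:-4096} / 2 ))
}

# Reverse direction: RAM (in MiB) implied by a default inode limit.
#   $1 = inode total from `df -i`, $2 = page size in bytes (assumed 4096)
ram_mib_from_inodes() {
    echo $(( $1 * 2 * ${2:-4096} / 1024 / 1024 ))
}

default_tmpfs_inodes 4294967296   # exactly 4 GiB of RAM -> 524288
ram_mib_from_inodes 489124        # inode total from the df output above -> 3821
```

So the 489124 inodes reported for /run correspond to roughly 3.7 GiB of usable memory, which fits a 4 GB machine, and `nr_inodes=2M` (2097152) is four times the 524288 default of a machine with exactly 4 GiB.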