PBS 4.0.20 - s3 cache becomes unavailable

Sep 12, 2024
Hi!

Since PBS 4.0.20 (most probably it was this update), the local S3 cache becomes unavailable during sync or verification tasks. I have not found a pattern for when this happens, but in the shell the actual directory is mounted and the files are accessible. A reboot solves the problem for some time.

syslog error:
Nov 19 07:46:07 pbs-cloud proxmox-backup-proxy[676]: GET /api2/json/admin/datastore/pbs-cloud-s3/status: 400 Bad Request: [client [::ffff:192.168.1.4]:56556] mkstemp "/run/proxmox-backup/active-operations/pbs-cloud-s3.tmp_XXXXXX" failed: ENOSPC: No space left on device

Str1atum
 
the local s3 cache becomes unavailable during sync or verification tasks
What do you mean by this exactly? If the cache is mounted and accessible, then there should be no problem.

GET /api2/json/admin/datastore/pbs-cloud-s3/status: 400 Bad Request: [client [::ffff:192.168.1.4]:56556] mkstemp "/run/proxmox-backup/active-operations/pbs-cloud-s3.tmp_XXXXXX" failed: ENOSPC: No space left on device
Looks more like the tmpfs mounted on /run is out of space. Please check and post the output of mount, df -h and df -hi the next time this happens.
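For convenience, the requested diagnostics can be captured in one go (a plain sketch; the output file name is arbitrary, attach it to your next post):

```shell
# Capture the mount table plus block-space and inode usage in one pass:
{ mount; df -h; df -hi; } > /tmp/pbs-diag.txt 2>&1
```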
 
Please post the output of "df -i /run" in any case!
 
As a workaround, you can bump the number of inodes available for /run. This has slight security implications: if somebody can create arbitrary files under /run, they might be able to DoS other services running on your system.

Code:
mount -o remount,nr_inodes=2M /run

This example would roughly quadruple the limit, but you can also pick a higher number if PBS is the only thing running there.
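To confirm the remount took effect, the new inode budget can be checked directly (2M maps to 2097152 inodes in the df output):

```shell
# After remounting, the raised ceiling should show up in the
# Inodes column for /run (note: the remount does not persist
# across reboots):
df -i /run
```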
 
Code:
root@pbs-cloud:~# df -i /run
Filesystem Inodes  IUsed IFree IUse% Mounted on
tmpfs      489124 489124     0  100% /run

Strangely, everything was working without issues for nearly 2 years. PBS is the only service running. Is 4 GB of RAM just too low for the latest releases?
 
Well, the S3 feature is not that old yet ;) The latest release brought some bug fixes for S3 to close some problematic edge cases/races, and those introduced a new locking mechanism that requires a lock file per touched chunk. Those lock files live under /run, and if you have very little memory (or a lot of chunks are touched) you might run into this inode limit. A fix is already on the list:

https://lore.proxmox.com/pbs-devel/20251119143148.9383-1-c.ebner@proxmox.com/T/#u
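To see how many of those per-chunk lock files are currently eating into the /run inode budget, something like this can be run (the locks path is an assumption based on the error messages in this thread; adjust it to your layout):

```shell
# Count lock files under the assumed PBS lock directory and compare
# against the tmpfs inode budget for /run:
find /run/proxmox-backup/locks -type f 2>/dev/null | wc -l
df -i /run
```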
 
I'm seeing the same behavior today.

Code:
Filesystem              Inodes  IUsed     IFree IUse% Mounted on
udev                    489697    559    489138    1% /dev
tmpfs                   500602 500601         1  100% /run

Let me know if there is anything else I can provide. I too am using the S3 feature, attempting to back up a fairly sizable disk.

Code:
INFO:  42% (172.3 GiB of 408.0 GiB) in 11m 43s, read: 102.7 MiB/s, write: 102.7 MiB/s
ERROR: backup write data failed: command error: write_data upload error: pipelined request failed: mkstemp "/run/proxmox-backup/locks/XxXxXxXx/.chunks/XxXx/XxXxXx7fb6f6b47de4daafd458611b9e65956edb5d6feecdb6219d8.tmp_XXXXXX" failed: ENOSPC: No space left on device
INFO: aborting backup job
INFO: resuming VM again
 
I'm experiencing the same behaviour: after PBS starts, it backs up the whole cluster once, then fails on subsequent backups with 0 free inodes left in /run.

Bash:
Nov 22 09:47:14 backup proxmox-backup-proxy[615]: GET /api2/json/admin/datastore/s3-eu/status: 400 Bad Request: [client [::ffff:192.168.99.117]:58556] mkstemp "/run/proxmox-backup/active-operations/s3-eu.tmp_XXXXXX" failed: ENOSPC: No space left on device

Bash:
# df -i
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
udev                 2035659     494 2035165    1% /dev
tmpfs                2047213 2047213       0  100% /run
/dev/mapper/pbs-root 3908128 1949116 1959012   50% /
tmpfs                2047213       1 2047212    1% /dev/shm
tmpfs                2047213       3 2047210    1% /run/lock
tmpfs                   1024       1    1023    1% /run/credentials/systemd-journald.service
tmpfs                1048576       9 1048567    1% /tmp
tmpfs                   1024       1    1023    1% /run/credentials/getty@tty1.service
 