PBS4: Storage usage on S3

devaux

Active Member
Feb 3, 2024
Hi everyone,

I’m currently testing the new S3-compatible object storage backend in Proxmox Backup Server, and I’m really impressed with its potential! The integration of S3 storage as a datastore is a game-changer for offsite backups. However, I’ve run into an issue with storage usage reporting and have a question about caching that I’d love to get some clarity on from the community.

Issue Description:

I’ve set up a datastore using an S3-compatible bucket, and backups are working well. However, there’s a discrepancy between the storage usage reported in the PBS UI and what my S3 provider shows:
  • PBS UI: Reports ~105 GB of storage used.
  • S3 Provider: Reports ~600 GB of storage used.
Interestingly, when I sum up the storage of all the VMs being backed up, the ~600 GB reported by the S3 provider seems accurate. I’ve also noticed that the PBS UI storage usage sometimes decreases even after adding new VMs to the backup, which is puzzling. I’m trying to understand if this is expected behavior, a bug in the UI, or a misconfiguration.

Caching Question:

I’ve configured a local cache for the S3-backed datastore as recommended in the documentation. Can someone explain how the caching mechanism works in detail? Specifically:
  • Does the local cache store temporary data, or is it a permanent copy of some chunks?
  • Do I need to manually maintain or clear the cache, or is it fully managed by PBS (e.g., during garbage collection)?
 
In the same thread, I have an issue to share.
I began testing the S3 storage functionality.
I created a new bucket on Backblaze and a new pair of keys for PBS.
The first problem was that PBS doesn't accept a bucket name containing capital letters: it causes a regex error during datastore creation.
Second (with an accepted bucket name), I created a new datastore on an S3 backend, but the creation was interrupted after a few seconds by an error: "access time safety check failed: failed to upload chunk to s3 backend: chunk upload failed: unexpected status code 501 Not Implemented"
In the bucket, PBS created a directory named after the datastore containing only a file named ".inuse", and nothing in the local cache directory either.
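For context on the capital-letter rejection: the S3 specification itself only allows lowercase letters, digits, dots, and hyphens in bucket names (3–63 characters, starting and ending with a letter or digit), so PBS's regex check is in line with the standard. A minimal sketch of such a check (illustrative; this is not the actual PBS validation code):

```python
import re

# S3 bucket naming rules (per the AWS spec): 3-63 characters,
# lowercase letters, digits, dots and hyphens only,
# starting and ending with a letter or digit.
BUCKET_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")

def is_valid_bucket_name(name: str) -> bool:
    """Return True if `name` is an acceptable S3 bucket name."""
    return BUCKET_NAME_RE.fullmatch(name) is not None
```

With this check, a name like "MyBucket" is rejected because of the capital letters, while "my-bucket" passes, which matches the behavior seen during datastore creation.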
 

Configuration => S3 Endpoints => Select/Create your Endpoint => Provider Quirks => Skip If-None-Match header

Then proceed to import the datastore under "Datastore". You may first have to clean/delete your specified cache folder on the PBS host.
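For anyone wondering what the quirk does: `If-None-Match: *` is a standard HTTP conditional-request header that makes a PUT succeed only if the object does not already exist; providers that have not implemented conditional writes answer with 501 Not Implemented. The quirk simply omits that header from chunk uploads. A hypothetical sketch of the idea (not the actual PBS implementation):

```python
def build_put_headers(content_md5: str, skip_if_none_match: bool = False) -> dict:
    """Build HTTP headers for a chunk PUT request.

    Normally `If-None-Match: *` is sent so the upload only succeeds
    if the object does not yet exist. Providers without conditional
    write support reject it with 501, so the "Skip If-None-Match
    header" quirk leaves it out. (Illustrative sketch only.)
    """
    headers = {"Content-MD5": content_md5}
    if not skip_if_none_match:
        headers["If-None-Match"] = "*"
    return headers
```

The trade-off is that without the header, an upload can silently overwrite an existing object instead of failing, which is why this is an opt-in per-provider quirk rather than the default.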
 
The current dashboard shows only the statistics for the local cache, an s3 specific implementation is still lacking, see https://bugzilla.proxmox.com/show_bug.cgi?id=6563
The local cache is a least-recently-used cache, storing the most recently used chunks and all the logical metadata for PBS datastore operation. The latter is the reason the cache needs to be persistent, as the PBS instance relies on it (there is, however, an S3 Refresh button to refresh these contents in the More dropdown on the datastore Contents tab).

The cache will evict chunks once its capacity has been reached, so no manual intervention is required there. There is, however, still an issue with the missing cache rewarm not reclaiming cache space on service restart, system reboot, or when setting the datastore maintenance mode. This is actively being worked on already, see https://lore.proxmox.com/pbs-devel/20250801141024.626365-1-c.ebner@proxmox.com/T/. So in these cases it might currently be required to intervene manually.
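The eviction behavior described above can be pictured with a toy LRU cache (illustrative only; the actual PBS cache is implemented in Rust and keeps chunks on disk):

```python
from collections import OrderedDict

class ChunkLRUCache:
    """Toy LRU chunk cache: once capacity is reached, the least
    recently used chunk is evicted (illustration, not PBS code)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._chunks = OrderedDict()

    def get(self, digest: str):
        if digest not in self._chunks:
            return None  # cache miss: PBS would fetch from S3
        self._chunks.move_to_end(digest)  # mark as recently used
        return self._chunks[digest]

    def insert(self, digest: str, data: bytes):
        if digest in self._chunks:
            self._chunks.move_to_end(digest)
        self._chunks[digest] = data
        if len(self._chunks) > self.capacity:
            self._chunks.popitem(last=False)  # evict LRU chunk
```

This also illustrates why PBS UI usage can shrink while the bucket grows: the dashboard reflects what the bounded local cache currently holds, not the full set of chunks stored in S3.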
 
All done: I created the new datastore with this option activated and did a test backup. Thank you.
 
Thank you for this little tidbit: "(there is, however, an S3 Refresh button to refresh these contents in the More dropdown on the datastore Contents tab)".

Being curious, I had been trying to move locally stored backups to S3 and was disappointed not to see any of them in PBS, even though I thought they should appear.

Anyway, after doing the S3 refresh, and a brief wait, they all appeared.

Just need to do some more tests to see if this is viable, or if I have to start a new backup history in cloud storage.