storj for s3 storage

daschmidt

Member
Feb 5, 2023
Austria
Good morning,

Has anyone successfully integrated Storj as an S3 datastore? The permissions should be correct, since I was able to set up rclone with the same credentials. When setting up the datastore, I get the following error: failed to access bucket: bucket does not exist or no permission to access it (400)

Code:
s3-endpoint: storj
        access-key xxxxxx
        endpoint gateway.eu1.storjshare.io
        secret-key xxxxxx


I also tried gateway.storjshare.io.
 
Please try setting the region accordingly (I guess it should be eu1 in your case?). To explain: the region is part of the AWS Signature Version 4 (SigV4) scheme used to authenticate requests to the API.
 
Also, I forgot to mention that you should use path-style bucket addressing, since your bucket name is not part of the endpoint URL.
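For reference, the endpoint section would then look something like this (the region and path-style option names are taken from the PBS S3 documentation; double-check against your PBS version):

Code:
s3-endpoint: storj
        access-key xxxxxx
        secret-key xxxxxx
        endpoint gateway.eu1.storjshare.io
        region eu1
        path-style true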
 
Hello,

I connected PBS ver. 4.0.20 to a Storj bucket without any problems, but after upgrading to 4.1.0 and rebooting the server it no longer works: "Datastore is not available". What should I try?

Thanks.
 
Check whether the datastore's local cache path is correctly mounted, e.g. in the output of mount. Further, you can check whether the S3 endpoint is reachable by using proxmox-backup-manager s3 check <endpoint-id> <bucket-name>.
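For example (the cache path, endpoint id, and bucket name below are placeholders; findmnt is just a convenient alternative to scanning the mount output):

Code:
# is the cache path actually a mountpoint?
findmnt /mnt/datastore/storj-cache

# is the endpoint reachable with the configured credentials?
proxmox-backup-manager s3 check storj my-bucket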
 
I see the point. It was obviously a bad idea to set /tmp as the local cache path. The reboot probably deleted the /tmp/.chunks directory.

Code:
lookup_datastore failed - unable to open chunk store at "/tmp/.chunks" - No such file or directory (os error 2)

Is the solution to create /tmp/.chunks or how to change the Local Cache path for an existing S3 datastore?
 
Is the solution to create /tmp/.chunks or how to change the Local Cache path for an existing S3 datastore?
I would recommend just removing the datastore without deleting any contents and then recreating it using the same name and endpoint (with the reuse datastore and overwrite in-use marker flags set), but with a persistent local cache path. Further, you might want to set the cache up on a dedicated partition or with a quota, see https://pbs.proxmox.com/docs/storage.html#datastores-with-s3-backend
 
Hello everyone

I've been trying to use Storj as an S3 backend for PBS 4.1.6 and I'm running into multiple issues that in combination make it effectively unusable for me right now. I'd like to share the full picture in one place in case others see the same and to get some feedback from the team.

Setup
  • PBS 4.1.6 with the S3 datastore backend (other S3 backends such as Hetzner or Backblaze work, but they aren't super reliable for PBS)
  • Tried with the hosted Storj gateway (gateway.eu1.storjshare.io, path-style, region eu1, quirk skip-if-none-match-header)
  • Also tried with a local self-hosted Storj gateway (gateway-st) behind nginx as TLS terminator, since PBS requires HTTPS and the gateway only speaks HTTP
  • All other S3 clients I tested against the same endpoints (mc, aws cli, rclone, curl with SigV4) work without issues
Problem 1 — hosted Storj gateway: delete snapshot always fails with 502

Writing backups and verifying them works fine. But deleting a snapshot through the web UI or via proxmox-backup-manager snapshot forget fails every single time with:

Code:
failed to delete snapshot - "unexpected status code 502 Bad Gateway" (400)

The proxy log shows the raw upstream HTML:

Code:
proxmox-backup-proxy[xxxxx]: <html><body><h1>502 Bad Gateway</h1>
The server returned an invalid or incomplete response.
</body></html>

This is not intermittent — it happens on every delete attempt I've made. Retrying does not help.

Problem 2 — hosted Storj gateway: verify occasionally fails with missing etag

During verify jobs I sometimes see:

Code:
can't verify chunk, load failed - missing header 'etag'
TASK ERROR: verification failed - please check the log for details

It doesn't affect every chunk but it does cause verify jobs to fail.

Problem 3 — local gateway-st behind nginx: verify fails massively

To work around the hosted gateway issues, I set up a local Storj gateway and put nginx in front of it for TLS. Writing backups works, and verify succeeded when the backup was freshly written and read back through the local gateway during the initial verify. A subsequent verify run, however, fails on thousands of chunks:

Code:
verified 24.72/116.00 MiB in 29.15 seconds, speed 0.85/3.98 MiB/s (2531 errors)
verify ... failed: chunks could not be verified

The gateway itself logs for each failed GET:

Code:
Error: Unable to write all the data to client uplink:
eestream: unexpected error: eestream: read completed buffer

I spent quite a bit of time narrowing this down:
  • Pulling the same chunks directly via mc against the gateway (127.0.0.1:7777) works every time, including recursive parallel downloads
  • The same via mc through nginx also works, single and parallel
  • Single curl range request with SigV4 through nginx works (206, correct Content-Range)
  • Only the PBS verify reliably triggers the eestream error in the gateway
So the gateway clearly reacts differently to PBS's access pattern than to that of any other S3 client I tested, but the behavior only shows up in combination with PBS.
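For completeness, the single SigV4 range request mentioned above can be reproduced with curl's built-in --aws-sigv4 support (keys, host, and object path are placeholders for my setup):

Code:
curl -sS -o /dev/null -w '%{http_code}\n' \
    --user "ACCESS_KEY:SECRET_KEY" \
    --aws-sigv4 "aws:amz:eu1:s3" \
    -H "Range: bytes=0-1023" \
    "https://s3.example.internal/my-bucket/some-object"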


Current state
  • Backups are written successfully to Storj using the hosted gateway
  • Verify partially works on the hosted gateway, almost entirely fails on the local one
  • Delete does not work at all on the hosted gateway
  • I haven't even gotten to prune and GC in any meaningful way because of the above
For me this means Storj as an S3 backend is currently not usable: I can put data in, but I can't reliably check it, and I can't remove it.

Questions
  • Is anyone running PBS against Storj successfully with full delete/prune/GC working?
  • Would it be reasonable for the S3 client in PBS to retry on transient 5xx from the endpoint? A single 502 currently aborts the whole snapshot delete.
  • Is the missing-etag case something PBS could be more lenient about, or should that strictly be enforced?
  • Any suggestions from the team where to dig further, or is this something that would warrant a proper bug report since other tools like mc or aws work fine with both gateways?
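On the retry question: until something like that exists in PBS itself, a crude client-side workaround for the delete case is a generic shell retry wrapper with exponential backoff. This is purely my own sketch, nothing PBS-specific; the RETRY_MAX/RETRY_DELAY names are made up:

```shell
# retry COMMAND...: rerun COMMAND on non-zero exit, with exponential backoff
retry() {
    max=${RETRY_MAX:-5}      # total attempts before giving up
    delay=${RETRY_DELAY:-2}  # initial backoff in seconds, doubled each round
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$max" ]; then
            echo "retry: giving up after $n attempts" >&2
            return 1
        fi
        sleep "$delay"
        delay=$((delay * 2))
        n=$((n + 1))
    done
}

# e.g.: retry proxmox-backup-manager snapshot forget <snapshot-path>
```

That obviously only papers over transient 502s; it doesn't help if the gateway fails the delete deterministically, as it does for me.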

Happy to provide more logs, packet captures, or run targeted tests. I know S3 backend is still tech preview, so this is meant as constructive feedback, not a complaint.

Thanks and Best Regards from
Switzerland
 
EDIT:

After digging deeper and deeper into this I got kinda lost and forgot to "fail fast". So I stepped back and looked for an alternative and it turns out rclone talks to Storj natively via the uplink library. No hosted gateway, no self-hosted gateway-st, no nginx in front.

Setup is embarrassingly simple:

Code:
rclone config create storj storj \
    provider=existing \
    access_grant="YOUR_ACCESS_GRANT"

Code:
rclone serve s3 storj: \
    --addr 127.0.0.1:7443 \
    --cert /path/to/fullchain.pem \
    --key  /path/to/privkey.pem \
    --auth-key "myaccess,mysecret" \
    --etag-hash MD5

Point PBS at it as a regular S3 endpoint and you're done. rclone handles TLS itself, handles the SigV4 signing and verification, and talks uplink directly to the Storj satellite and storage nodes.
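Assuming the rclone command above, the matching PBS endpoint section might look like this (the port and path-style option names are assumptions based on the PBS S3 docs; the endpoint hostname has to match the TLS certificate you gave rclone, or you pin the fingerprint instead):

Code:
s3-endpoint: storj-rclone
        access-key myaccess
        secret-key mysecret
        endpoint 127.0.0.1
        port 7443
        path-style true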

Result so far:
  • Writes work
  • Verify works (no more missing etag)
  • Delete works (the original reason I started this whole rabbit hole)
  • Prune + GC run through cleanly
Basically all three issues I described above disappeared at once, because the entire Storj gateway layer (both the hosted edge and the self-hosted gateway) is no longer in the path. The only Storj code running is the uplink library embedded in rclone, which is maintained upstream.

Caveat: encryption and erasure coding now happen locally on the PBS host instead of on Storj's side. CPU-wise that's fine in my setup, but worth mentioning if someone is on a very small machine.

Leaving this here in case anyone else hits the same wall. For me this is now the setup I'll run long-term. Storj itself as storage is great (and super fast), but for PBS the gateway layer was the actual problem.