[TUTORIAL] PBS on TrueNAS - Have your cake and eat it too

plus this was in the first few lines of the OP....
But Udo's remark wasn't on the OP's post but on antubis' question:
Just a thought... since TrueNAS Scale is based on Debian, just like PBS... why not install PBS directly on the TrueNAS system (without LXC or VMs)?

The answer to that is "no", which is exactly what Udo explained. So your dispute is just a misunderstanding, because both of you were talking about different things ;)
 
  • Like
Reactions: UdoB and scyto
@PwrBank , I think I have nailed the permissions needed, and I went one step further and wrote a script that can be run as a cron job on TrueNAS that backs up the PBS dataset to Azure (in my case). The script checks PBS isn't writing to the store, stops the proxmox-backup service (not the container), takes a snapshot, restarts the PBS service in the container, then the snapshot is mounted, rcloned to Azure, and everything is cleaned up (snapshot deleted, mount unmounted). Let me know if you are interested; for me this is going to be my production pbs1 from now on.

(screenshots attached)

Having done this, it's probably not worth stopping the PBS service since snapshots are so quick; a pre-script for the sync task that makes sure no backup jobs are running is probably good enough for 99% of use cases, with the sync task managing the snapshot lifecycle. But now I have a template to use with other services where that might be less true...
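For anyone who wants to go that lighter-weight route, here is a minimal sketch of such a pre-script. It reuses the same container name and the same idle check as the full script later in the thread; whether your Cloud Sync task really skips the run on a non-zero pre-script exit is something to verify on your TrueNAS version.

Bash:
#!/bin/bash
# Pre-script for the TrueNAS Cloud Sync task: exit non-zero if PBS is busy,
# so the sync run is skipped instead of copying a datastore that is being written to.
INCUS_CNAME="pbs1"   # same container name as in the full script below

if /usr/bin/incus exec "$INCUS_CNAME" -- \
    proxmox-backup-manager task list --output-format json \
    | jq -e '.[] | select(has("endtime") | not)' >/dev/null; then
  echo "PBS still has running tasks, skipping this sync run" >&2
  exit 1
fi
exit 0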
 
@PwrBank , I think I have nailed the permissions needed, and I went one step further and wrote a script that can be run as a cron job on TrueNAS that backs up the PBS dataset to Azure (in my case). The script checks PBS isn't writing to the store, stops the proxmox-backup service (not the container), takes a snapshot, restarts the PBS service in the container, then the snapshot is mounted, rcloned to Azure, and everything is cleaned up (snapshot deleted, mount unmounted). Let me know if you are interested; for me this is going to be my production pbs1 from now on.

I really wouldn't do this if you care about your data; in this thread a similar script failed:

 
  • Like
Reactions: UdoB
I really wouldn't do this if you care about your data; in this thread a similar script failed:
and that is why

1. I stop the PBS backup service (only when there are no jobs currently running)
2. I snapshot the PBS datasets
3. I restart the PBS server so it is available for backups (because at this point it's out of the picture)
4. I only back up the read-only mounted snapshot

Backing up a live file system like that script does - that's hilarious! That post shows the race condition of blocks rclone indexed at the start of the job 'disappearing' before it gets to them.

The primary backup of the PBS will be to another PBS I haven't yet built, though considering my number of backups and the speed with the metadata option, I won't use PBS sync but two backup jobs in the cluster - one to each PBS (my whole backup cycle every 2 hours takes about 45 seconds).

tl;dr: never let a backup tool back up live files without understanding the implications... and my PBS data is quite safe with this approach; if it isn't, then we have bigger issues, as it would mean the on-disk state can't be trusted...
 
  • Like
Reactions: Johannes S and UdoB
Oh god, they did it over Tailscale too, rofl. I have a 10 Gb external link, I am very close to Azure US West 2, and a year's worth of PBS data for my main Proxmox cluster is maybe 500 GB tops.
 
  • Like
Reactions: Johannes S
considering my number of backups and the speed with the metadata option, I won't use PBS sync but two backup jobs in the cluster
You probably know that, but I think it's worth mentioning: every time you switch the destination, the "dirty bitmap" is dropped. If you toggle between targets every time, then every backup needs to read the complete source.

That's why "one single primary PBS" plus "some secondaries via remote sync" is (or at least was) the better approach.

(( Disclaimer: that was the actual behavior last time I checked. Any news on this one...? ))
 
  • Like
Reactions: Johannes S
tl;dr: never let a backup tool back up live files without understanding the implications... and my PBS data is quite safe with this approach; if it isn't, then we have bigger issues, as it would mean the on-disk state can't be trusted...


All good :) You obviously know what you are doing and know of possible problems and how to avoid them. My sermon was more for people googling for rclone PBS backups or something like that and ending up in this thread, so that they think things through before they proceed ;)
 
You probably know that, but I think it's worth mentioning: every time you switch the destination, the "dirty bitmap" is dropped. If you toggle between targets every time, then every backup needs to read the complete source.
No, I didn't. I wasn't going to change the destination, just have two jobs; I had assumed each job maintains its own bitmap and compares against the metadata (I used metadata mode). I will do some testing and look out for what you said; it's not like the sync jobs will be long, I just heard they were slower. Thanks for the tip.
 
  • Like
Reactions: Johannes S and UdoB
You obviously know what you are doing and know of possible problems
Well, I like to think that, but shhhh, do you hear that? Sounds like my pride coming before a fall - like this week, when I eventually realized I was having CephFS issues on one node because, with a script, I had accidentally overwritten the keyfile in /etc/pve/priv with a different key... thank god I never rebooted the other two nodes...

Yeah, on the script: I spent two hours designing it with ChatGPT to account for a bunch of failure modes before I even tested it once (I can't code, not even bash scripts, so that's going to be my Achilles heel and where I make more mistakes...), but I have designed the requirements for geo-dispersed cloud control planes used by hundreds of thousands of customers and millions of users - my devs hate me for thinking through all the edge-case failures, hahaha.

Next up: testing the same methodology with UrBackup and Backrest to figure out what I want to do for general data backups from local machines > TrueNAS and TrueNAS > cloud.

I have been using PBS to back up some Raspberry Pis and my Ceph volumes - it's worked well so far... I think PBS is an exceptional piece of tech that I would love to see expanded to more use cases natively / with a GUI, etc.
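For anyone wondering what that looks like on a Pi: once proxmox-backup-client is installed (you need an arm64 build for a Raspberry Pi), a host backup is a one-liner. The repository, user and datastore below are made-up examples:

Bash:
# back up the Pi's root filesystem as a host backup; other mounted filesystems
# are skipped unless you add them explicitly with --include-dev
export PBS_REPOSITORY='backup@pbs@192.168.1.50:tank'
export PBS_PASSWORD='...'   # password or API token secret
proxmox-backup-client backup root.pxar:/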
 
  • Like
Reactions: Johannes S
It would be great to have native cloud storage support in PBS; my preference would be for it to also support Azure (I have free credits in Azure each year and never use them all up).

The approach I took is generic; it uses rclone. The only things unique to the script are the incus commands to stop the PBS services, a tweak to the PBS service to stop it being socket activated (I need to think about that one more), and how I extract the Azure creds from the TrueNAS middleware layer rather than storing them in the script (though I don't like ChatGPT's suggestion of writing them to a temp file and will rework that when I have time).


FWIW, here is (I think) the current version (I can't check at the moment as my BMC failed, leaving the server unusable).
(The emoji are hilarious evidence of using ChatGPT.)

It would be great if I didn't have to write and maintain all of this and PBS just did the orchestration without ever going offline; I am sure if the Proxmox folks did it, they would know how best to manage the sync to cloud storage.

Code:
#!/bin/bash
# (output is redirected to the log file below, once LOGFILE is defined)

set -euo pipefail

# === CONFIG ===
INCUS_CNAME="pbs1"
ZFS_DATASET="rust/local-backups/pbs"
SNAP_NAME="cloudbackup-$(date +%Y%m%d-%H%M)"
SNAP_MNT="/mnt/pbs-snapshot-${SNAP_NAME}"
AZURE_CONTAINER="pbs"
LOGFILE="/var/log/pbs-cloud-backup.log"
TASK_ID=2   # TrueNAS Cloud Sync task whose Azure credentials are reused in STEP 6

MAX_WAIT_MINUTES=15
WAIT_INTERVAL=30
MAX_ATTEMPTS=$(( MAX_WAIT_MINUTES * 60 / WAIT_INTERVAL ))
ATTEMPT=0

# === Logging Setup ===
LOG_TAG="PBSCloudBackup"
log_info()   { local msg="[$(date '+%F %T')] ℹ️  $1"; echo "$msg"; logger -t "$LOG_TAG" "$msg"; }
log_warn()   { local msg="[$(date '+%F %T')] ⚠️  $1"; echo "$msg"; logger -p user.warn -t "$LOG_TAG" "$msg"; }
log_error()  { local msg="[$(date '+%F %T')] ❌ $1"; echo "$msg"; logger -p user.err -t "$LOG_TAG" "$msg"; }

# === Redirect all stdout/stderr to log file + terminal ===
exec > >(tee -a "$LOGFILE") 2>&1

log_info " Starting PBS snapshot + Azure backup job"

# === STEP 0: Confirm container exists ===
if ! /usr/bin/incus list --format json | jq -e '.[] | select(.name == "'"$INCUS_CNAME"'")' >/dev/null; then
  log_error "Container '$INCUS_CNAME' not found. Aborting."
  exit 1
fi

# === STEP 1: Wait for PBS to be idle ===
log_info "⏳ Checking for running PBS tasks..."
while /usr/bin/incus exec "$INCUS_CNAME" -- \
    proxmox-backup-manager task list --output-format json \
    | jq -e '.[] | select(has("endtime") | not)' >/dev/null; do

    if (( ATTEMPT++ >= MAX_ATTEMPTS )); then
        log_error "Timeout: PBS still has running tasks after $MAX_WAIT_MINUTES minutes"
        exit 1
    fi

    log_info "PBS busy... retrying ($ATTEMPT/$MAX_ATTEMPTS)"
    sleep "$WAIT_INTERVAL"
done
log_info "✅ PBS is idle. Proceeding with backup"

# === STEP 2: Stop PBS inside container ===
log_info " Stopping PBS server..."
if ! /usr/bin/incus exec "$INCUS_CNAME" -- systemctl stop proxmox-backup; then
  log_error "Failed to stop PBS in container '$INCUS_CNAME'"
  exit 1
fi

# === STEP 3: Snapshot ===
log_info " Taking ZFS snapshot: ${SNAP_NAME}"
zfs snapshot "${ZFS_DATASET}@${SNAP_NAME}"

# === STEP 4: Restart PBS ===
log_info " Restarting PBS server..."
if ! /usr/bin/incus exec "$INCUS_CNAME" -- systemctl start proxmox-backup; then
  log_error "Failed to start PBS in container '$INCUS_CNAME'"
  exit 1
fi

# === STEP 5: Mount Snapshot ===
log_info " Mounting snapshot: ${SNAP_MNT}"
mkdir -p "$SNAP_MNT"
if ! mount -t zfs -o ro "${ZFS_DATASET}@${SNAP_NAME}" "$SNAP_MNT"; then
  log_error "Failed to mount snapshot. Cleaning up."
  zfs destroy "${ZFS_DATASET}@${SNAP_NAME}" || true
  rmdir "$SNAP_MNT"
  exit 1
fi

# === STEP 6: Get Azure credentials ===
log_info " Extracting Azure credentials from Cloud Sync task ID ${TASK_ID}..."
CRED_ID=$(midclt call cloudsync.query | jq ".[] | select(.id == ${TASK_ID}) | .credentials.id")

# '|| true' so an empty result doesn't trip errexit before the check below
read -r ACCOUNT KEY ENDPOINT < <(
  midclt call cloudsync.credentials.query \
    | jq -r ".[] | select(.id == ${CRED_ID}) | .provider | [.account, .key, .endpoint] | @tsv"
) || true

if [[ -z "$ACCOUNT" || -z "$KEY" || -z "$ENDPOINT" ]]; then
  log_error "❌ Failed to extract Azure credentials. Aborting."
  umount "$SNAP_MNT"
  zfs destroy "${ZFS_DATASET}@${SNAP_NAME}"
  rmdir "$SNAP_MNT"
  exit 1
fi

CONFIG_FILE=$(mktemp)
cat <<EOF > "$CONFIG_FILE"
[azure]
type = azureblob
account = ${ACCOUNT}
key = ${KEY}
endpoint = ${ENDPOINT}
access_tier = Cool
EOF

# === STEP 7: Sync to Azure ===
log_info "☁️ Syncing snapshot to Azure container: '$AZURE_CONTAINER'"
if ! rclone --config "$CONFIG_FILE" sync "$SNAP_MNT" "azure:${AZURE_CONTAINER}" \
  --transfers=32 \
  --checkers=16 \
  --azureblob-chunk-size=100M \
  --buffer-size=265M \
  --log-file="$LOGFILE" \
  --log-level INFO \
  --stats=10s \
  --stats-one-line \
  --stats-one-line-date \
  --stats-log-level NOTICE \
  --create-empty-src-dirs; then
  log_error "❌ Rclone sync failed"
  exit 1
fi

# === STEP 8: Cleanup ===
log_info " Cleaning up snapshot and temp files"
rm -f "$CONFIG_FILE"
umount "$SNAP_MNT"
zfs destroy "${ZFS_DATASET}@${SNAP_NAME}"
rmdir "$SNAP_MNT"

log_info "✅ Backup complete!"
 
  • Like
Reactions: Johannes S
As an update, the only thing that isn't working as expected (within reason) is the PBS benchmarking tool. It tries to reference 127.0.0.1 for testing instead of the actual IP of the PBS instance.

Other than that, still working like a champ!
 
As an update, the only thing that isn't working as expected (within reason) is the PBS benchmarking tool. It tries to reference 127.0.0.1 for testing instead of the actual IP of the PBS instance.

Other than that, still working like a champ!
Tell me how to test and I will see if I have the same issue.
 
Tell me how to test and I will see if I have the same issue.
Run
Bash:
proxmox-backup-client benchmark --repository NameOfPBSstorage
On one of your PVE hosts that has the PBS storage attached.

I get this back
(screenshot of the benchmark output)
 
@PwrBank
That's normal behaviour and nothing to do with running PBS in an LXC on TrueNAS.

You didn't specify a valid PBS datastore URL, so it assumed you meant the local host; this is by design (see the PBS client docs).
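For reference, a fully qualified repository string has the form user@realm@host:datastore, so something like the following (user, host and datastore are made-up examples) benchmarks against the remote PBS instead of localhost:

Bash:
proxmox-backup-client benchmark --repository 'root@pam@192.168.1.50:tank'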
 
  • Like
Reactions: Johannes S
Ahh, you are right. I put the connection info in from PBS and it worked.

glad you got it working :-)

This is mine; I need to go find out if the TLS speed is because of the 1 Gbps connection on my Proxmox node :-) (the server has 25 GbE to the switch; 110.65 MB/s is roughly 885 Mbit/s, about what a saturated 1 Gbps link delivers).

Code:
Uploaded 140 chunks in 5 seconds.
Time per request: 37906 microseconds.
TLS speed: 110.65 MB/s   
SHA256 speed: 1844.55 MB/s   
Compression speed: 621.61 MB/s   
Decompress speed: 902.09 MB/s   
AES256/GCM speed: 5035.04 MB/s   
Verify speed: 609.42 MB/s   
┌───────────────────────────────────┬─────────────────────┐
│ Name                              │ Value               │
╞═══════════════════════════════════╪═════════════════════╡
│ TLS (maximal backup upload speed) │ 110.65 MB/s (9%)    │
├───────────────────────────────────┼─────────────────────┤
│ SHA256 checksum computation speed │ 1844.55 MB/s (91%)  │
├───────────────────────────────────┼─────────────────────┤
│ ZStd level 1 compression speed    │ 621.61 MB/s (83%)   │
├───────────────────────────────────┼─────────────────────┤
│ ZStd level 1 decompression speed  │ 902.09 MB/s (75%)   │
├───────────────────────────────────┼─────────────────────┤
│ Chunk verification speed          │ 609.42 MB/s (80%)   │
├───────────────────────────────────┼─────────────────────┤
│ AES256 GCM encryption speed       │ 5035.04 MB/s (138%) │
└───────────────────────────────────┴─────────────────────┘
 
  • Like
Reactions: Johannes S