Aug 8, 2023
Milan, Italy

Sorry if it's a stupid question but I can't solve it by myself.
A couple of weeks ago I set up a new PBS server (version 3.0.3), a physical machine with 1 TB of storage (single SSD).

I set the automatic prune job to keep the last 7 days of backups, and we hovered around 90% of space used for some days. Unfortunately, last night PBS ran out of space and all backups failed.

Now I see PBS disk usage at 100.00% and I don't know how to get out of this situation. The plan was to reduce backup retention to only 5 days. I tried, but how can I manually delete days 6 and 7?

I tried datastore -> name_of_my_datastore -> content -> prune all -> Keep last 5 -> Prune, but I receive the error "ENOSPC: No space left on device".
I tried to manually delete some less important backups using the red recycle bin icon. It works, but PBS is still at 100% usage, so no files were actually deleted.
I tried to manually start garbage collection, but I receive the error "unable to start garbage collection job on datastore pbs - ENOSPC: No space left on device".
I can't even connect to the shell because of the full disk.

How can I free up some space?

Can nobody help us?
Now it's difficult even to use the web GUI. I keep getting popups saying PBS has no space for tmp files.
I assume I shouldn't manually delete backups from the file system, to avoid corruption.

Any help is appreciated. Thanks
if your datastore, etc. is on zfs and it is 100% full, the only solution is to delete files from it to free up space or to extend the pool to gain free space
garbage collection won't work since that needs to update the 'atime' of files, which needs a bit of free space on zfs to work
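
To make the "update the atime" step concrete, here is a minimal Python sketch of what touching a file means: the access time is set to now and the modification time is left alone. This is only an illustration, not PBS's actual code, and the helper name touch_atime is made up for this example. On a copy-on-write filesystem such as ZFS even a small metadata update like this has to be written to newly allocated space, which would explain why it fails on a completely full pool.

```python
import os
import tempfile
import time


def touch_atime(path: str) -> None:
    """Set the access time to 'now' and leave the modification time unchanged."""
    st = os.stat(path)
    os.utime(path, ns=(time.time_ns(), st.st_mtime_ns))


# Demo on a throwaway temporary file (not a real chunk).
fd, demo_path = tempfile.mkstemp()
os.close(fd)

# Backdate both timestamps by two days, using whole seconds.
two_days_ago_ns = (int(time.time()) - 2 * 24 * 3600) * 10**9
os.utime(demo_path, ns=(two_days_ago_ns, two_days_ago_ns))

touch_atime(demo_path)
after = os.stat(demo_path)
print("atime refreshed:", after.st_atime_ns > two_days_ago_ns)
print("mtime kept:", after.st_mtime_ns == two_days_ago_ns)
os.remove(demo_path)
```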

in general: i'd recommend never letting your storage (regardless of type) run full.
Sorry, I don't want to be rude, but this was already clear.
Yes, the file system is ZFS. How can I manually delete files?
Directly on the file system, deleting only the older folders in /mnt/.chunks? Or deleting the entire content of /mnt/.chunks and rebuilding all backups from scratch?
As I wrote, I can't do anything in the web GUI.

Directly on file system, deleting only older folder in /mnt/.chunks ?
that's one option, but be aware that if you delete random chunks, it will corrupt every backup that references those chunks (and make those backups unrestorable)

if you don't have any other data on the pool, i'd start by deleting snapshots (not the chunks) that you don't need anymore and see if that frees enough space to let the garbage collection run through
only if you don't have any other choice, would i delete random chunks (or the whole datastore)

the other option is to attach a disk to the zpool temporarily, let the gc run and remove it again afterwards with 'zpool remove' (though i must admit i did not need this yet, so i would test that procedure beforehand, e.g. in a vm, so you can be sure it works the way you want)
OK, so I decided to manually delete some old chunks (from 10th October), and I immediately regained control of the web GUI and the command line, which is a good first step.

However, I tried a manual garbage collection and discovered that some of the files I deleted were used by other backups, even ones made after 10th October. Probably because of deduplication, as far as I understand.
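
The effect of deduplication described here can be sketched with a toy model in Python. Each snapshot is modeled as the set of chunk digests it references, and a deleted chunk breaks every snapshot that shares it, regardless of the snapshot's date. The snapshot names and the short digests below are invented for illustration, and the real index format PBS uses is not modeled.

```python
def broken_snapshots(index: dict[str, set[str]], deleted: set[str]) -> list[str]:
    """Return the snapshots that reference at least one deleted chunk."""
    return sorted(name for name, chunks in index.items() if chunks & deleted)


def unreferenced_chunks(index: dict[str, set[str]], store: set[str]) -> set[str]:
    """Return the chunks in the store that no snapshot references anymore."""
    referenced = set().union(*index.values())
    return store - referenced


# Hypothetical data: the older and the newer snapshot share chunk "bb".
index = {
    "snapshot-old": {"aa", "bb"},
    "snapshot-new": {"bb", "cc"},
}

# Deleting a chunk that was first written by the old snapshot
# also breaks the newer snapshot that reuses it.
print(broken_snapshots(index, deleted={"bb"}))

# After removing the old snapshot, only "aa" is safe to delete.
del index["snapshot-old"]
print(unreferenced_chunks(index, store={"aa", "bb", "cc"}))
```

This is also why removing whole snapshots and letting garbage collection decide which chunks are unreferenced is safer than deleting chunk files by date.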

So I decided to delete all the backups and recreate them from scratch to avoid confusion. I used the red recycle bin icon next to every VM, then I ran a garbage collection. What I see now is absolutely zero backups but still 920 GB in use.

Why are there 856 GB of pending removals? How can I force this deletion so I can start recreating the backups?



Found the answer myself in the guide:

The garbage collection will only remove chunks that haven't been used for at least one day (exactly 24h 5m). This grace period is necessary because chunks in use are marked by touching the chunk which updates the atime (access time) property. Filesystems are mounted with the relatime option by default. This results in a better performance by only updating the atime property if the last access has been at least 24 hours ago. The downside is that touching a chunk within these 24 hours will not always update its atime property.
Chunks in the grace period will be logged at the end of the garbage collection task as Pending removals.
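
The grace period rule quoted above can be sketched in a few lines of Python. This is a simplified model under stated assumptions: every chunk passed in is already unreferenced, only the 24h 5m cutoff from the guide is applied, and the function name classify_unreferenced and the digests are made up for the example. The exact checks PBS performs are not shown in this thread.

```python
from datetime import datetime, timedelta

# Grace period as stated in the quoted guide text.
GRACE = timedelta(hours=24, minutes=5)


def classify_unreferenced(
    atimes: dict[str, datetime], gc_start: datetime
) -> tuple[list[str], list[str]]:
    """Split unreferenced chunks into (removable, pending) by access time."""
    cutoff = gc_start - GRACE
    removable = sorted(d for d, atime in atimes.items() if atime < cutoff)
    pending = sorted(d for d, atime in atimes.items() if atime >= cutoff)
    return removable, pending


gc_start = datetime(2023, 1, 2, 12, 0)
atimes = {
    "chunk-a": gc_start - timedelta(hours=25),  # older than the grace period
    "chunk-b": gc_start - timedelta(hours=1),   # touched recently
}
removable, pending = classify_unreferenced(atimes, gc_start)
print("removable:", removable)
print("pending:", pending)
```

Under this model, chunks listed as pending stay on disk and keep occupying space until a later garbage collection run finds them older than the cutoff.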

So my question now is: can I start creating the new backups, or do I have to wait 24 h? Do the deleted chunks still count as occupied space? Is there no option to force the garbage collector to run immediately?

I tried to launch some backups, and they all fail with "backup write data failed: command error: write_data upload error: pipelined request failed: inserting chunk on store 'pbs' failed for 487f8a49d3d5bc5a285954461f458c8c30710957493575099bf55ce3849b1298 - mkstemp "/mnt/.chunks/487f/487f8a49d3d5bc5a285954461f458c8c30710957493575099bf55ce3849b1298.tmp_XXXXXX" failed: ENOENT: No such file or directory"
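
An ENOENT from mkstemp usually means the parent directory of the temporary file does not exist, which would fit with folders under /mnt/.chunks having been deleted by hand. The Python sketch below is a read-only diagnostic that lists missing prefix folders. It assumes, based only on the path in the error message, that each chunk lives in a folder named after the first four hex digits of its digest. The helper names are hypothetical, and the sketch deliberately does not recreate anything, since ownership and permissions of the datastore are not covered here.

```python
import os
from typing import Iterable, Optional


def chunk_dir_for(chunk_root: str, digest: str) -> str:
    """Folder expected to hold a chunk: the first four hex digits of its digest."""
    return os.path.join(chunk_root, digest[:4])


def missing_prefix_dirs(
    chunk_root: str, prefixes: Optional[Iterable[str]] = None
) -> list[str]:
    """List expected prefix folders that do not exist under chunk_root."""
    if prefixes is None:
        # Assumption: one folder per four-digit hex prefix, 0000 through ffff.
        prefixes = (f"{i:04x}" for i in range(0x10000))
    return [p for p in prefixes if not os.path.isdir(os.path.join(chunk_root, p))]


if __name__ == "__main__":
    digest = "487f8a49d3d5bc5a285954461f458c8c30710957493575099bf55ce3849b1298"
    print("expected folder:", chunk_dir_for("/mnt/.chunks", digest))
    print("missing folders:", len(missing_prefix_dirs("/mnt/.chunks")))
```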

So I deleted and recreated the datastore. Now I'm rebuilding all the backups and the system seems to work.

Best solution? NO.
If I had important data in those backups, I couldn't simply delete the whole datastore to solve a "stupid" problem like full storage. However, it seems OK now. I'm surprised PBS doesn't have some sort of self-protection mechanism to halt all backups when 99.5% of the space is used. Backups would fail anyway, but at least we would keep the ability to use the GUI and run commands to free up space.
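
Until something like that exists, the same idea can be approximated from outside PBS. The Python sketch below is a hypothetical external check, not a PBS feature: it reads the usage of a mount point with shutil.disk_usage and reports whether the 99.5% limit mentioned above has been reached. The path /mnt and the function names are assumptions for illustration, and on ZFS the reported numbers are those of the dataset mounted there.

```python
import shutil


def should_halt_backups(used_bytes: int, total_bytes: int, limit: float = 0.995) -> bool:
    """Return True once the used fraction reaches the safety limit."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes >= limit


def check_mount(path: str, limit: float = 0.995) -> bool:
    """Apply the limit to the filesystem that contains 'path'."""
    usage = shutil.disk_usage(path)
    return should_halt_backups(usage.used, usage.total, limit)


if __name__ == "__main__":
    # "/mnt" is only an example location for the datastore.
    if check_mount("/mnt"):
        print("Datastore almost full: stop backup jobs and free up space first.")
    else:
        print("Enough free space left.")
```

Such a check could be run periodically with a lower limit, for example 0.90, so that a warning arrives well before the pool is completely full.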

I had the same problem too, and I agree with the request for a configurable safety margin, so that recovery is still possible when the system runs out of space.


