I'll circle back to square one.
if the filesystem behind your datastore is 100% full, you do need to do the following:
- prevent new backups from being made (else it will be full again)
- free up or add additional space
- then run GC (a rough CLI sketch follows below)
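for reference, a rough sketch of how that could look on the CLI (the datastore name "store1" and the path are placeholders, adjust them to your setup):
Code:
# check how full the filesystem behind the datastore really is (blocks and inodes)
df -h /path/to/datastore
df -i /path/to/datastore

# after freeing up or adding space, start GC and check on it
proxmox-backup-manager garbage-collection start store1
proxmox-backup-manager garbage-collection status store1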
if you just run GC, it will fail, since GC requires write access to the datastore (for locking, for marking chunks, ...). it doesn't need much space, but enough that write operations don't fail.
the GC will calculate a cutoff; only chunks older than this cutoff and not referenced by any index are considered for removal. any chunks newer than the cutoff and not referenced are counted as "pending". the cutoff is calculated like this: max(24h + 5min, age of the oldest running task).
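to make that formula concrete, here's a small shell sketch (the 2h task age is just a made-up example value, PBS does this internally, not via a script):
Code:
# cutoff = now - max(24h + 5min, age of the oldest running task)
now=$(date +%s)
oldest_task_age=$(( 2 * 3600 ))       # example: oldest running worker started 2h ago
min_age=$(( 24 * 3600 + 5 * 60 ))     # 24h + 5min
max_age=$(( oldest_task_age > min_age ? oldest_task_age : min_age ))
date -d "@$(( now - max_age ))"       # unreferenced chunks last touched before this are removed, newer ones count as "pending"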
like I said, if you run GC multiple times with long enough gaps in between and your pending count doesn't go down, there are two likely causes:
- there's a worker running across all the GC runs (if you rebooted the node, that seems unlikely); you can check that with the task list sketched below
- the pending chunks have a wrong timestamp that is in the future
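the first one you can rule out by checking the task list (assuming the standard PBS CLI), e.g.:
Code:
# list tasks currently running on the node - a long-running worker (sync, verify, backup, ...) keeps pushing the cutoff back
proxmox-backup-manager task list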
the second one is easy to verify: just run find with the appropriate parameters. for example (on a datastore that has no operations running!):
Code:
touch /tmp/reference
find /path/to/datastore/.chunks -type f \( -newermm /tmp/reference -or -neweraa /tmp/reference \)
if there are any chunks with an atime or mtime in the future, their paths will be printed.
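if it does print something and you want to see how far off the timestamps actually are, something like this shows them explicitly (same placeholder path as above):
Code:
find /path/to/datastore/.chunks -type f \( -newermm /tmp/reference -or -neweraa /tmp/reference \) \
    -exec stat --format '%n  atime=%x  mtime=%y' {} +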
another possible cause:
- something messes with your chunk store (e.g., by updating the timestamps every hour so that chunks never expire even if they are not referenced)
if you have any scripts/cron jobs/.. that might touch the chunk store, disable them!
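on a standard Debian-based install, a quick way to hunt for such jobs would be something like this (just a sketch, adjust the datastore path):
Code:
# look for cron jobs or systemd timers that reference the datastore
grep -r "/path/to/datastore" /etc/cron* /var/spool/cron 2>/dev/null
systemctl list-timers --all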