[SOLVED] remove chunks from aborted partial pxar backup

msproact

Because excludes initially did not work as I expected (see the thread ".pxarexclude not working as expected"), I aborted the first proxmox-backup-client backups.

However, before I did so, it had already stored a lot of chunks of VM image files and other data that was not supposed to be backed up.

When I aborted the backup, it disappeared from the datastore content list in PBS. So I thought that clicking "Start GC" on the datastore would free the chunks it had used, as long as they had not been reused by one of the subsequent backups with working excludes.

However, it does not seem to be the case:

Code:
2020-07-15T11:10:45+02:00: starting garbage collection on store usb
2020-07-15T11:10:45+02:00: Start GC phase1 (mark used chunks)
2020-07-15T11:10:47+02:00: Start GC phase2 (sweep unused chunks)
2020-07-15T11:10:47+02:00: percentage done: 1, chunk count: 931
2020-07-15T11:10:47+02:00: percentage done: 2, chunk count: 1904
2020-07-15T11:10:47+02:00: percentage done: 3, chunk count: 2824
2020-07-15T11:10:47+02:00: percentage done: 4, chunk count: 3803
[…]
2020-07-15T11:10:48+02:00: percentage done: 97, chunk count: 93784
2020-07-15T11:10:48+02:00: percentage done: 98, chunk count: 94761
2020-07-15T11:10:48+02:00: percentage done: 99, chunk count: 95736
2020-07-15T11:10:48+02:00: Removed bytes: 0
2020-07-15T11:10:48+02:00: Removed chunks: 0
2020-07-15T11:10:48+02:00: Pending removals: 67004125560 bytes (41642 chunks)
2020-07-15T11:10:48+02:00: Original data bytes: 2121239972535
2020-07-15T11:10:48+02:00: Disk bytes: 108477348693 (5 %)
2020-07-15T11:10:48+02:00: Disk chunks: 55040
2020-07-15T11:10:48+02:00: Average chunk size: 1970882
2020-07-15T11:10:48+02:00: TASK OK

However, I see Pending removals: 67004125560 bytes (41642 chunks) in there. How can I make it remove those chunks?

As the partial backup did not show up, I suspect it was deleted because it was incomplete, so there are some chunks that are no longer referenced. I thought those would be deleted straight away without me having to prune something first.
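
To put the pending figure in perspective, here is a minimal sketch that pulls the byte count out of a GC task log and converts it to GiB. The sample line is copied from the log above; the helper name pending_gib is made up for this example and is not part of any Proxmox tool.

```shell
#!/bin/sh
# Hypothetical helper: read a GC task log on stdin and print the
# "Pending removals" byte count converted to GiB (1 GiB = 1024^3 bytes).
pending_gib() {
    LC_ALL=C awk '/Pending removals:/ {
        # Line layout: <timestamp> Pending removals: <bytes> bytes (<n> chunks)
        for (i = 1; i < NF; i++)
            if ($i == "removals:")
                printf "%.1f GiB\n", $(i + 1) / (1024 * 1024 * 1024)
    }'
}

# Sample line taken from the log above.
echo '2020-07-15T11:10:48+02:00: Pending removals: 67004125560 bytes (41642 chunks)' | pending_gib
```

For the log above this comes to roughly 62 GiB waiting to be removed.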
 
We use a grace window of one day (to be specific 24 hours and 5 minutes) for removing pending chunks.

This mainly comes from two things: garbage collection can cross a day boundary between the mark phase (check all backup indexes for which chunks are in use and touch them) and the sweep phase (remove the chunks that were not touched), and access times behave in a particular way under the relative access-time mount option (relatime), which is enabled by default:
relatime:
...
Since Linux 2.6.30, the kernel defaults to the behavior provided by this option (unless noatime was specified), and the strictatime option is required to obtain traditional semantics. In addition, since Linux 2.6.30, the file's last access time is always updated if it is more than 1 day old.
-- https://manpages.debian.org/buster/mount/mount.8.en.html#FILESYSTEM-INDEPENDENT_MOUNT_OPTIONS

So we cannot trust the access times of chunks that were touched within the last day, and enabling strictatime could be a big performance hit.
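
As a rough illustration of the grace window, the sketch below checks whether a file's access time is older than 24 hours and 5 minutes. It is a simplification under stated assumptions, not the actual PBS logic: the helper name is_past_grace is made up, the real cutoff calculation may take other things into account, and `stat -c %X` and `touch -d` assume GNU coreutils. The demo runs on throwaway files, not on a real datastore.

```shell
#!/bin/sh
# Grace window from the post above: 24 hours + 5 minutes, in seconds.
GRACE_SECONDS=$((24 * 3600 + 5 * 60))

# Hypothetical helper: succeeds (exit 0) if the file's access time is
# older than the grace window, i.e. a sweep could remove it provided the
# mark phase did not touch it.
# Assumes GNU stat: `stat -c %X` prints the atime as epoch seconds.
is_past_grace() {
    now=$(date +%s)
    atime=$(stat -c %X "$1")
    [ $((now - atime)) -gt "$GRACE_SECONDS" ]
}

# Demo on temporary files only.
demo_dir=$(mktemp -d)
touch "$demo_dir/fresh"              # atime = now
touch -d '-2 days' "$demo_dir/old"   # atime and mtime = two days ago
for f in fresh old; do
    if is_past_grace "$demo_dir/$f"; then
        echo "$f: past grace window"
    else
        echo "$f: still protected"
    fi
done
rm -rf "$demo_dir"
```

A chunk written by the aborted backup a few minutes ago behaves like the "fresh" file here, which is why it shows up under pending removals instead of being deleted.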
 
Thank you so much for this information. I had a couple of partial backups that failed due to insufficient space. I managed to increase the storage and ran the backup again last night. It's still running now.
Anyway, knowing about the 24-hour grace period helped a lot. I just initiated the GC on that drive and freed up some 7% of space.

Thank you again.
Is there a way to open a shell from the GUI? I just figured I could use SSH, but that is an extra step.
 
As I understand it, there is no way to force it to remove these chunks other than waiting 24h5m? I guess I'll just have to advance the system time and revert it after the removal, then. I suspect that hack will break something else? But I need that space, like, NOW :/

 
No, there's no built-in way.

For testing purposes (I do not want to guarantee anything for that), you could stop the backup server daemons, assemble a one-liner that iterates over all files in the datastore's .chunks folder and sets their timestamps to two days ago (e.g., touch -d '-2 days', which changes both access and modification time), which can take quite a long time, and after that start the server daemons again and re-run GC.
 
It is better to change system date to '+2 days'. It works.
 
I'd really not recommend that. Messing with the whole system time can break lots of things, and if an NTP service is set up, which should be the case, it will start to resync the time immediately, which is mostly done in rather careful steps to avoid issues.
If it worked for you or this was a test system, great, but for others reading this: please avoid messing with the system time, especially on production systems.
 
Hi Thomas, you are right. But your solution with touch -d '-2 days' doesn't work. I think there is no other way to clean up the disk manually.
 
It is better to change system date to '+2 days'. It works.
This works perfectly. All backups were failing due to no space, and there was no time to wait one day + 5 minutes.
I tried your solution: changed the system date to +2 days, ran garbage collection manually, and changed the date back.
Wonderful results! Now I got 50% of the space back. Cheers!
 
Again, moving the system time to the future is an invasive change that can be quite problematic.
But your solution with touch -d '-2 days' doesn't work.
It does if done correctly, e.g.:

find /path/to/datastore/.chunks -type f -print0 | xargs -0 touch -d "-2 days"

Just adapt the first /path/... argument of the find command.
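
Putting the steps from this thread together, the whole procedure could look roughly like the sketch below. Treat it as an untested outline with no guarantees: the function force_pending_removal and the run wrapper are made up for this example, the datastore name usb is taken from the GC log earlier in the thread, and the systemd unit names are the ones I would expect on a standard PBS install. It uses find with -exec ... + instead of the find | xargs pipeline, which should have the same effect. By default it only prints the commands (the echo drops the quotes around -2 days); set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# DRY_RUN=1 (default) only prints the commands; DRY_RUN=0 executes them.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it unchanged.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical wrapper. $1: datastore path, $2: datastore name.
force_pending_removal() {
    store_path=$1
    store_name=$2
    # Stop the daemons so that no backup writes chunks in the meantime.
    run systemctl stop proxmox-backup-proxy.service proxmox-backup.service
    # Set the timestamps of every chunk to two days ago; this can take
    # a long time on large datastores.
    run find "$store_path/.chunks" -type f -exec touch -d '-2 days' {} +
    run systemctl start proxmox-backup.service proxmox-backup-proxy.service
    # Re-run GC: phase 1 touches the chunks still in use, phase 2 then
    # sweeps everything that still carries the old timestamp.
    run proxmox-backup-manager garbage-collection start "$store_name"
}

force_pending_removal /path/to/datastore usb
```

Keeping the dry run as the default makes it harder to age the chunks by accident while backups are still running.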
 
Yes, I can confirm it works.
 
