Hello,
I would like to discuss the best/fastest way to update the content of an external datastore.
Environment:
We use a PBS with a big datastore which stores all our backups.
We also have a bunch of external hard drives, one for each day of the week. Every day we plug one in, free its space (prune & GC) and then sync a part of the data to it. After that we remove that external datastore / hard drive and put it into a fireproof safe.
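For reference, the cycle for one drive currently looks roughly like this (only a sketch: the datastore name "external", the remote "local-pbs" pointing back at the PBS itself, and the group "vm/100" are placeholders for our real names):

  # prune away last week's snapshots, group by group (shown for one group;
  # in reality we clear everything on the drive)
  proxmox-backup-client prune vm/100 --repository localhost:external --keep-last 1

  # run garbage collection so the now-unreferenced chunks are actually deleted
  proxmox-backup-manager garbage-collection start external

  # then pull this day's share of the data from the main datastore
  proxmox-backup-manager pull local-pbs mainstore external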
This way we have 7 disks, one for each day of the week, each containing the latest snapshots of that day, kept away from the PBS.
The external datastores are filled almost to the maximum (more than 90%), so I cannot just sync the new content onto them and then delete (prune & GC) the content from the week before; I have to clear them (prune & GC) completely before starting to sync.
This way both the GC and the sync take really long, as the sync cannot use the deduplication mechanism / cannot reuse the chunks that already exist from the previous week's backups.
I now wonder if there is a better way to do that? Partially pruning & GC'ing, then syncing, then pruning & GC'ing again is also not possible because of the ~24h that GC requires between runs (GC only removes chunks that have not been touched for roughly 24 hours, and its mark phase touches every chunk that is still referenced). In fact I only have 24h per external datastore, as I swap them every day. So when I insert an external datastore I can prune & GC it (the last time this was done was a week before) and then sync it. If the free space is not sufficient I could do the same on the next day, but then I need to replace that drive with the next one for the next day.
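To illustrate the timing problem with a split cycle (again just a sketch with the same placeholder names):

  # pass 1: prune part of last week's data, then GC
  proxmox-backup-client prune vm/100 --repository localhost:external --keep-last 1
  proxmox-backup-manager garbage-collection start external
  # -> this frees the chunks pruned above (their atime is a week old),
  #    but the mark phase also touches every chunk that is still referenced

  # sync the first part of today's data
  proxmox-backup-manager pull local-pbs mainstore external

  # pass 2: prune the rest and GC again -- this frees almost nothing,
  # because the remaining old chunks were just touched by the GC in pass 1
  # and GC only removes chunks untouched for ~24h
  proxmox-backup-client prune vm/200 --repository localhost:external --keep-last 1
  proxmox-backup-manager garbage-collection start external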
Another question: if there is no way to reuse those chunks from the week before, what is the fastest way to clear that external datastore completely? That way I could at least save some time (and mechanical wear on the drives) by not having to prune & GC the whole content of that external drive.
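The only alternative I can think of so far is to not delete chunk by chunk at all, but to throw the whole datastore away and recreate it. Roughly like this (a sketch; the device path /dev/sdX1 and the mountpoint are examples, and I am not sure whether this is a sane approach):

  # remove the datastore from the PBS config (this does not touch the data itself)
  proxmox-backup-manager datastore remove external

  # recreate the filesystem instead of deleting millions of chunk files one by one
  umount /mnt/external
  mkfs.ext4 /dev/sdX1
  mount /dev/sdX1 /mnt/external

  # register it again as an empty datastore
  proxmox-backup-manager datastore create external /mnt/external

Would that be faster / gentler on the disks than a full prune & GC?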