PBS storage suddenly (like overnight) full!?!

proxwolfe

Hi,

I have had a PBS running for a while (two years?) now, with approx. 25TB worth of storage.

I regularly check how much space is left and when the datastore is estimated to be full. A couple of days ago, it said that it would be full in about 50 days. So I decided to make room: I combed through all the backups and deleted a bunch. The GC logs show that 2.4TB worth of data was removed two days ago.

Today, it says that storage is full. I did not add any new machines to be backed up and (see above) I actively removed data. And yet, suddenly, it is full. How can that be?

And, more importantly, how do I free up space now? Because while I can still "delete" stuff, GC won't run anymore and actually free up space.

Thanks!
 
Hi!
Note that the "Estimated Full" indication is not very accurate (it's just a linear regression), especially if you don't back up regularly.
Are you sure the GC log said "Removed Garbage" and not "Pending removals"? Running the GC once won't remove any data; you have to run it another time after 24 hours.
Also, which filesystem do you use, and why can't you run GC anymore?
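To illustrate why a linear-regression estimate can jump around, here is a minimal Python sketch. It is not PBS's actual code; the function name, the sample values, and the choice of an ordinary least-squares fit are assumptions made for illustration only.

```python
from typing import Optional, Sequence


def estimate_days_until_full(
    days: Sequence[float],
    used_tb: Sequence[float],
    capacity_tb: float,
) -> Optional[float]:
    """Fit a least-squares line through (day, used) samples and return the
    number of days after the last sample at which the line hits capacity.
    Returns None if usage is flat or shrinking (no meaningful estimate)."""
    n = len(days)
    if n < 2 or n != len(used_tb):
        raise ValueError("need at least two matching samples")
    mean_x = sum(days) / n
    mean_y = sum(used_tb) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    if sxx == 0:
        raise ValueError("all samples were taken on the same day")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, used_tb))
    slope = sxy / sxx                      # growth in TB per day
    intercept = mean_y - slope * mean_x
    if slope <= 0:
        return None
    full_day = (capacity_tb - intercept) / slope
    return max(0.0, full_day - days[-1])


# Hypothetical history: steady growth of 0.1 TB/day on a 25 TB datastore.
steady_days = list(range(10))
steady_used = [19.0 + 0.1 * d for d in steady_days]
print(estimate_days_until_full(steady_days, steady_used, 25.0))

# The same history plus one unusually large backup delta on day 10.
spiky_days = steady_days + [10]
spiky_used = steady_used + [22.9]
print(estimate_days_until_full(spiky_days, spiky_used, 25.0))
```

With these made-up numbers, a single large delta pulls the fitted line up and cuts the estimate from several weeks to well under three, which is the kind of swing described above.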
 
And yet, suddenly, it is full. How can that be?

Assuming the datastore is not shared with other, non-PBS usage: was the backup delta bigger than before, so that the datastore filled up as a result? The estimate can only ever be an estimate; there is no crystal ball that tells PBS how much space future backups will need.

You need to free up enough space by pruning (or by adding additional space, if that is an option) to allow GC to complete. I would highly recommend ensuring no new backups are made while you are trying to free up space, else you will have to start over.
 
Are you sure the GC log said "Removed Garbage" and not "Pending removals"? Running the GC once won't remove any data; you have to run it another time after 24 hours.
Hmm, it wasn't the log but rather the status line. Looking at it again, maybe this is the sum total of all collected garbage ever?

In any case, GC ran and did something.

Also which filesystem do you use and why can't you run GC anymore?
The datastore is on ZFS.

Why GC can't run anymore, I don't know. It complained about the disk being full or no space being left. This would suggest that GC needs some space to run but I don't know how it works.
 
Assuming the datastore is not shared with other, non-PBS usage: was the backup delta bigger than before, so that the datastore filled up as a result? The estimate can only ever be an estimate; there is no crystal ball that tells PBS how much space future backups will need.
Theoretically, I agree. But in reality there was nothing that should cause a big backup delta. No new VMs, no new drives in VMs, no large data changes on drives in VMs...

you need to free up enough space by pruning
That's what didn't work. It is my understanding that pruning doesn't remove data but only marks it as removable and GC is the one that actually should remove that data. But that did not work (seemingly because it needs some free space to operate).

(or adding additional space, if that is an option)
Yeah, that's what I ended up doing. I replaced two drives (one vdev) with larger drives. Took a couple of days to complete but now I've got a couple of TBs of free space again.
I would highly recommend ensuring no new backups are made while you are trying to free up space, else you will have to start over.
Yes, I suspended all backup jobs to this PBS and set up an interim PBS inside my PVE cluster with a spare drive I had lying around.
 
Theoretically, I agree. But in reality there was nothing that should cause a big backup delta. No new VMs, no new drives in VMs, no large data changes on drives in VMs...

No large data changes doesn't mean that the backup delta hasn't changed, e.g., if trim is not set up or not working. Windows VMs especially have acted up in the past with regard to that.


That's what didn't work. It is my understanding that pruning doesn't remove data but only marks it as removable and GC is the one that actually should remove that data. But that did not work (seemingly because it needs some free space to operate).

Pruning does free up some space because it will delete metadata. Whether that is enough to matter depends on how big your backups are, and how aggressively you are willing to prune ;)
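The split between pruning and GC discussed here can be sketched with a toy model in Python. This is not PBS's implementation: the class and method names are hypothetical, chunk sizes are arbitrary units, and the 24-hour grace period is simply taken from the earlier reply in this thread.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

GRACE_SECONDS = 24 * 3600  # grace period as mentioned earlier in the thread


@dataclass
class Chunk:
    size: int
    atime: float  # last time the chunk was written or marked as in use


@dataclass
class ToyDatastore:
    chunks: Dict[str, Chunk] = field(default_factory=dict)
    snapshots: Dict[str, Set[str]] = field(default_factory=dict)

    def backup(self, name: str, data: Dict[str, int], now: float) -> None:
        """Store a snapshot; chunks that already exist are reused (dedup)."""
        for digest, size in data.items():
            if digest in self.chunks:
                self.chunks[digest].atime = now
            else:
                self.chunks[digest] = Chunk(size, now)
        self.snapshots[name] = set(data)

    def prune(self, name: str) -> None:
        """Pruning only drops the snapshot's index; chunk data stays."""
        del self.snapshots[name]

    def garbage_collect(self, now: float) -> Tuple[int, int]:
        """Mark chunks still referenced, then sweep old unreferenced ones.
        Returns (removed, pending) in the same units as the chunk sizes."""
        for digests in self.snapshots.values():      # phase 1: mark
            for digest in digests:
                self.chunks[digest].atime = now
        cutoff = now - GRACE_SECONDS
        removed = pending = 0
        for digest, chunk in list(self.chunks.items()):  # phase 2: sweep
            if chunk.atime < cutoff:
                removed += chunk.size
                del self.chunks[digest]
            elif chunk.atime < now:
                pending += chunk.size  # unreferenced, but too recent to remove
        return removed, pending


ds = ToyDatastore()
ds.backup("vm100-day1", {"a": 4, "b": 4}, now=0)
ds.backup("vm100-day2", {"b": 4, "c": 4}, now=3600)
ds.prune("vm100-day1")
print(ds.garbage_collect(now=7200))               # chunk "a" is only pending
print(ds.garbage_collect(now=7200 + 86400 + 1))   # now "a" is removed
```

In this model, pruning alone frees nothing except the small index entry, and the first GC run only reports chunk "a" as pending; the space comes back on a GC run after the grace period has passed.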
 
@proxwolfe remember to set a ZFS quota on the datastore, so you won't run into the same problem again!
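As a rough sketch of that advice, the Python helper below only builds a `zfs set quota=...` command and prints it; it does not run anything. The dataset name `tank/pbs-datastore`, the 25 TB pool size, and the 10% headroom are assumptions for illustration, so substitute your own values. The idea is that a quota below the pool size leaves reserve space that can be released later by raising the quota.

```python
from typing import List


def zfs_quota_command(
    dataset: str,
    pool_size_bytes: int,
    headroom_percent: int = 10,
) -> List[str]:
    """Build a `zfs set quota=<bytes> <dataset>` command that keeps
    `headroom_percent` of the pool unused by this dataset."""
    if not 0 < headroom_percent < 100:
        raise ValueError("headroom_percent must be between 1 and 99")
    # Integer arithmetic avoids floating-point rounding in the byte count.
    quota_bytes = pool_size_bytes * (100 - headroom_percent) // 100
    return ["zfs", "set", f"quota={quota_bytes}", dataset]


# Hypothetical dataset name and pool size; adjust to your setup.
cmd = zfs_quota_command("tank/pbs-datastore", 25 * 10**12)
print(" ".join(cmd))
```

If the datastore ever hits the quota, raising it temporarily should give GC room to finish without having to add disks first.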
 
Pruning does free up some space because it will delete metadata. Whether that is enough to matter depends on how big your backups are, and how aggressively you are willing to prune
Right. Well, I have tried twice and what little space pruning released was not enough to let GC run.

And I didn't want to prune everything in order to let GC then remove it all, because then what would have been the point of trying to save the datastore? But I get what you're saying.
 