Free space BUG?

Nov 27, 2020
I think this is a bug in the Backup Server, caused by zfs list and zpool list reporting different free space.
Below is the Backup Server view and then the real usage; both show the same storage.
So the Backup Server does not really know when the ZFS storage is full.

zpool list is
NAME     SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
px2zfs  50.9T  13.0T  38.0T        -         -     1%    25%  1.00x    ONLINE  -

zfs list
NAME     USED  AVAIL  REFER  MOUNTPOINT
px2zfs  13.1T  27.4T  2.94T  /px2zfs

It would also be very helpful if the Backup Server showed how much space each day's chunks actually use on the hard drive,
not just the full size every time; the current view is not very helpful.
1606473121247.png
1606473113841.png

BackupServer View

1606472896285.png

Proxmox PVE View

1606472545475.png
1606472471179.png
 
zfs list
NAME     USED  AVAIL  REFER  MOUNTPOINT
px2zfs  13.1T  27.4T  2.94T  /px2zfs

seems correct to me?
it's a bit weird that the reported size shrinks when space is used elsewhere in the pool, outside of the dataset, but the % of the dataset is still correct: used / (used + avail)
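As a rough illustration of the two percentages being compared, here is a small Python sketch. The numbers are copied from the zfs list and zpool list output in this thread; the helper name usage_percent is made up for this example and is not part of any Proxmox or ZFS tool.

```python
def usage_percent(used, total):
    """Return used space as a percentage of the given total."""
    return 100.0 * used / total


# Values in TiB, copied from the command output quoted in this thread.
dataset_used, dataset_avail = 13.1, 27.4   # zfs list: USED, AVAIL
pool_alloc, pool_size = 13.0, 50.9         # zpool list: ALLOC, SIZE

# Dataset view: used / (used + avail)
dataset_pct = usage_percent(dataset_used, dataset_used + dataset_avail)

# Pool view: alloc / size
pool_pct = usage_percent(pool_alloc, pool_size)

print(f"dataset: {dataset_pct:.2f}%  pool: {pool_pct:.2f}%")
```

The two results differ because they divide by different totals, not because either command miscounts the used space.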

It would also be very helpful if the Backup Server showed how much space each day's chunks actually use on the hard drive,
not just the full size every time; the current view is not very helpful.
there are no 'actually daily chunks' since all backups share the deduplicated chunks
so that size is the only 'true' usage (namely the size of the data that can be restored)

which other size would you like to display here?
 
Well, IMHO the important thing in the Backup Server view is knowing when the disk is nearly full. At the moment the real data occupies 32.24%, but px2pbs, which is the backup server, shows 9.67%, which is wrong.
 
it is correct as in 'it uses 9.67% of the available space of that dataset'

you can of course always use a whole pool (then it is the same) or set quotas on the zfs datasets (this way you can limit what the specific datasets consume)

trying to find out how much of the disk is full may seem easy, but is not that simple

just imagine someone has an ext4 on lvmthin, now we would have to somehow find out the thinpool and query that

it is always best if you use proper monitoring for such things as disk space
 
I would also like to see the amount of data that was actually backed up in each backup session, not the total disk size, because that makes no sense. For example, I have a 10TB disk assigned to a VM, and it has 20% disk usage. The first full backup is then 2TB in size, and thanks to deduplication maybe 1.6TB of backup storage is required. Each incremental backup is the delta from this full backup and can be 100GB or 1.5TB in size. But not every session is the full 10TB disk. So showing 10TB for each backup session is useless.
 
Each incremental backup is the delta from this full backup and can be 100GB or 1.5TB in size. But not every session is the full 10TB disk. So showing 10TB for each backup session is useless.
since the backups are not incremental but deduplicated chunks, there is no 'incremental size'

for example:

if you have a disk with 2 GB
the first backup will take 2GB
then you change 1GB (for simplicity, assume the change is aligned with chunk boundaries)
now the backup will only take an additional 1GB
but!

if you now delete the first backup (and wait a day and run garbage collection), your second backup will still reference 2GB and thus 'take' 2GB

it is even more complicated since the chunks are deduplicated across the whole datastore, meaning that if another vm has the exact same data, the second backup in the example above would not have added any usage
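The example above can be sketched as a toy model in Python. This is only an illustration under simplifying assumptions: the ChunkStore class, its method names, and the 1GB "chunks" are invented for this sketch and do not reflect how the Backup Server is actually implemented.

```python
class ChunkStore:
    """Toy deduplicating store: each chunk is kept once, keyed by digest."""

    def __init__(self):
        self.chunks = {}      # digest -> size in GB
        self.snapshots = {}   # snapshot name -> list of digests it references

    def backup(self, name, chunks):
        """Store a snapshot and return how much new space it added."""
        added = 0
        for digest, size in chunks:
            if digest not in self.chunks:
                self.chunks[digest] = size
                added += size
        self.snapshots[name] = [digest for digest, _ in chunks]
        return added

    def forget(self, name):
        """Delete a snapshot; its chunks stay until garbage collection."""
        del self.snapshots[name]

    def garbage_collect(self):
        """Drop chunks no snapshot references and return the space freed."""
        referenced = {d for digests in self.snapshots.values() for d in digests}
        freed = 0
        for digest in list(self.chunks):
            if digest not in referenced:
                freed += self.chunks.pop(digest)
        return freed

    def referenced_size(self, name):
        """Size of the data that can be restored from a snapshot."""
        return sum(self.chunks[d] for d in self.snapshots[name])

    def used(self):
        """Total space taken by all stored chunks."""
        return sum(self.chunks.values())


store = ChunkStore()
first = store.backup("vm100/first", [("A", 1), ("B", 1)])    # 2GB disk
second = store.backup("vm100/second", [("A", 1), ("C", 1)])  # 1GB changed
store.forget("vm100/first")
freed = store.garbage_collect()
other = store.backup("vm200/first", [("A", 1), ("C", 1)])    # identical data

print(first, second, freed, other, store.referenced_size("vm100/second"))
```

In this model the second backup only adds 1GB, but once the first one is gone it is the sole owner of 2GB, and the other VM's backup adds nothing at all.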

there are a few 'sizes' that would kinda make sense
* show the total size of all chunks that are referenced (like we are doing now)
* show the total size of all unique chunks of that backup
  this would not count duplicated chunks (e.g. 0 chunks),
  the problem here is that the backup now has to have a map of which chunks were already sent,
  which is potentially kinda memory intensive (if you have large vms/cts) and would still not be the 'incremental size'
* show the size of the datastore unique chunks the backup uses
  i guess this is the most similar to your request, but this is hard and expensive to compute, since
  this changes with every prune/delete/backup operation (for any number of snapshots)
  and we would have to have a complete mapping of chunks <-> snapshots which is very memory/disk and compute intensive
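To make the three candidate sizes in the list above concrete, here is a hedged Python sketch. The function size_metrics, its field names, and the sample digests are all hypothetical; it assumes fixed-size chunks and a complete in-memory snapshot-to-chunk mapping, which is exactly the expensive part described above.

```python
from collections import Counter


def size_metrics(snapshots, chunk_size):
    """Compute three candidate 'sizes' per snapshot.

    snapshots maps a snapshot name to the list of chunk digests it
    references, in order (so repeated chunks appear more than once).
    """
    # Count how many snapshots reference each digest at least once.
    refs = Counter()
    for digests in snapshots.values():
        refs.update(set(digests))

    metrics = {}
    for name, digests in snapshots.items():
        unique = set(digests)
        exclusive = [d for d in unique if refs[d] == 1]
        metrics[name] = {
            # every referenced chunk, duplicates included
            "referenced": len(digests) * chunk_size,
            # duplicates inside this one backup counted once
            "unique_in_backup": len(unique) * chunk_size,
            # chunks that no other snapshot in the datastore uses
            "exclusive_in_datastore": len(exclusive) * chunk_size,
        }
    return metrics


# Hypothetical digests; "zero" stands for a repeated all-zero chunk.
example = {
    "snap1": ["zero", "zero", "x", "y"],
    "snap2": ["zero", "zero", "x", "w"],
}
for name, sizes in size_metrics(example, chunk_size=4).items():
    print(name, sizes)
```

Note that the last value for every snapshot has to be recomputed whenever any snapshot is added or removed, since the reference counts change.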

so in conclusion, there may be a 'better' size metric, but none which are really feasible to implement and the 'cheapest' and still correct one is the one we already show
 
During a backup task the output is for example:
2020-12-01T03:03:07+01:00: Size: 1099511627776
2020-12-01T03:03:07+01:00: Chunk count: 262144
2020-12-01T03:03:07+01:00: Upload size: 157403840512 (14%)
2020-12-01T03:03:07+01:00: Duplicates: 224616+1 (85%)
2020-12-01T03:03:07+01:00: Compression: 88%
Size is the total disk size of the VM.
Upload size is the amount of data uploaded by the backup: 14% of the total VM disk size.
This resulted in 85% duplicates and 88% compression.
Why not show these numbers in the view as well, next to the Size column? Or make it configurable which columns are shown.
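For anyone who wants these numbers today, they can be pulled out of the task log by hand. Below is a minimal Python sketch that parses the lines quoted above. The function names are made up for this example, the line format is assumed from this single log excerpt, and treating "224616+1" as two counts to be added is my own reading rather than something documented here.

```python
import re

LOG = """\
2020-12-01T03:03:07+01:00: Size: 1099511627776
2020-12-01T03:03:07+01:00: Chunk count: 262144
2020-12-01T03:03:07+01:00: Upload size: 157403840512 (14%)
2020-12-01T03:03:07+01:00: Duplicates: 224616+1 (85%)
2020-12-01T03:03:07+01:00: Compression: 88%
"""

# "<timestamp>: <key>: <value>", as seen in the excerpt above.
LINE = re.compile(r"^\S+: (?P<key>[^:]+): (?P<value>.+)$")


def parse_task_log(text):
    """Return a dict of key -> raw value string for each matching line."""
    fields = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            fields[match.group("key")] = match.group("value")
    return fields


def summarize(fields):
    """Derive chunk size and integer percentages from the parsed fields."""
    size = int(fields["Size"])
    chunk_count = int(fields["Chunk count"])
    upload = int(fields["Upload size"].split()[0])
    # Assumption: "224616+1" is two duplicate counts that can be summed.
    duplicates = sum(int(p) for p in fields["Duplicates"].split()[0].split("+"))
    return {
        "chunk_size": size // chunk_count,
        "upload_percent": upload * 100 // size,
        "duplicate_percent": duplicates * 100 // chunk_count,
    }


print(summarize(parse_task_log(LOG)))
```

The integer percentages computed this way agree with the 14% and 85% printed in the log itself.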

And yes, I do agree with you that after Prune & GC the deduplication etc. can be different. But at the time the backup session ran, it uploaded 157GB of data. That gives me more information than showing the total disk size everywhere.
 
And yes, I do agree with you that after Prune & GC the deduplication etc. can be different. But at the time the backup session ran, it uploaded 157GB of data. That gives me more information than showing the total disk size everywhere.
but that's only half the story... yes, the client has uploaded that data, but the chunks may have already existed on the server and not taken any additional space...
and as soon as you do pruning, a new backup etc., this size has no meaning anymore...

but maybe we can show this data (if we still have it with the snapshot, i am not sure we save it there too)
 
