Backup size

Kirani

New Member
Jul 13, 2022
Hi, I am currently running a backup job from two PVE 7.2-11 nodes to a PBS 2.2-6 server.

As an example, one VM is running Windows Server with one 35GB disk and one 300GB disk. The 300GB disk is formatted but empty.

Storage on the PVE node is LVM-thin and shows minimal usage, in line with the actual data on the disks. However, a backup of the above VM is 360GB in size.

The 300GB disk has SSD emulation and Discard enabled, and is running Windows 10.

From what I have read I shouldn't need to run any manual trim commands; why is the backup still so large? Or is that size not the true backup size?

This is the backup configuration:

Code:
agent: 1
boot: order=scsi0;net0
cores: 2
memory: 4092
meta: creation-qemu=6.2.0,ctime=1657416664
name: DELTA
net0: virtio=CA:15:1C:4C:B5:CD,bridge=vmbr0,firewall=1
numa: 0
onboot: 1
ostype: win10
scsi0: local-lvm:vm-105-disk-0,discard=on,format=raw,size=35G,ssd=1
scsi1: local-lvm:vm-105-disk-2,discard=on,size=300G,ssd=1
scsihw: virtio-scsi-pci
smbios1: uuid=f4047ad8-2e28-4e6b-88ff-55c4f1161caa
sockets: 2
vmgenid: 2a0548d4-edac-447f-8a47-6087eb2d704e
#qmdump#map:scsi0:drive-scsi0:local-lvm:raw:
#qmdump#map:scsi1:drive-scsi1:local-lvm:raw:

Thanks for any help!
 
Where are you getting those numbers? PBS doesn't know how big an individual backup is; it only shows the size of the raw disks you backed up. It only knows the usage of the complete datastore, because it can't fully tell which data belongs to which guest: everything is chopped into 4MB chunk files, and those chunks are deduplicated across guests. So multiple guests can share the same data.

For a rough estimate, have a look at the log of the backup job after backing up a VM. It should tell you how much of the data didn't have to be sent to the PBS because it is already stored there (deduplication), and how much of the data was empty data/zero blocks, which won't use up any space.
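To illustrate why the reported size and the stored size can differ so much, here is a toy sketch of fixed-size chunking with deduplication. It is not PBS's actual implementation: the function name `dedup_stats` and the tiny example "disk" are made up for illustration, and the only figure taken from the post above is the 4MB chunk size.

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB, the chunk size mentioned above


def dedup_stats(image: bytes, chunk_size: int = CHUNK_SIZE) -> dict:
    """Split an image into fixed-size chunks and count unique ones by digest.

    Toy model only: identical chunks (for example all-zero regions)
    collapse into a single stored chunk.
    """
    seen = set()
    total = 0
    for offset in range(0, len(image), chunk_size):
        chunk = image[offset:offset + chunk_size]
        seen.add(hashlib.sha256(chunk).digest())
        total += 1
    return {"logical_chunks": total, "unique_chunks": len(seen)}


if __name__ == "__main__":
    # Hypothetical tiny "disk": two chunks of data followed by six empty chunks.
    # A small chunk size is used here only to keep the example fast.
    size = 1024
    disk = b"A" * size + b"B" * size + bytes(6 * size)
    print(dedup_stats(disk, chunk_size=size))
```

In this model the "disk" is 8 chunks long, but only 3 distinct chunks would need storing, because the six empty chunks are all identical. A mostly empty 300GB disk behaves the same way: the listed size stays 300GB while the space actually consumed is far smaller.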
 
I was looking at the backup in PVE Datacenter > Node > VM > Backup > Size

My datastore usage in PBS seems high for the data I have on the VMs, so I'm trying to narrow down how it is being used.
 
You can ignore all those size numbers in the webUI except for the total datastore size.

You should check whether discard is working and whether prune and GC jobs are run regularly. Keep in mind that PBS's GC task will only delete data that was pruned at least 24 hours and 5 minutes ago, so a GC right after a prune won't free up any space.
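As a small worked example of that timing rule, the sketch below takes the 24 hours and 5 minutes figure at face value and computes the earliest moment a GC run could free space after a prune. The function names and the example timestamps are hypothetical, and this does not query PBS in any way.

```python
from datetime import datetime, timedelta

# Grace period quoted above; treated here as a plain assumption.
GC_GRACE = timedelta(hours=24, minutes=5)


def earliest_effective_gc(pruned_at: datetime) -> datetime:
    """Earliest time at which a GC run could remove data pruned at `pruned_at`."""
    return pruned_at + GC_GRACE


def gc_can_free(pruned_at: datetime, gc_started_at: datetime) -> bool:
    """True if a GC started at `gc_started_at` is past the grace period."""
    return gc_started_at >= earliest_effective_gc(pruned_at)


if __name__ == "__main__":
    pruned = datetime(2022, 11, 1, 20, 0)  # hypothetical prune time
    print(earliest_effective_gc(pruned))
    print(gc_can_free(pruned, datetime(2022, 11, 1, 21, 0)))
    print(gc_can_free(pruned, datetime(2022, 11, 2, 21, 0)))
```

So a prune at 20:00 followed by a GC an hour later would not be expected to free anything, while a GC the following evening would.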
 
What would be the best way to ensure discard is working? I can't see where I can get an accurate measurement of how much disk space is being used for the VMs either in PVE or PBS.

If I total all those webUI numbers for the VMs being backed up, it comes to around 1.2TB, and my PBS datastore shows 1.07TB used, so not far off. I know you mentioned that they should be ignored; I was just looking to compare.

If I total all the space used inside the VMs - either Windows or Linux - it is only 525GB. I have set retention to only a single backup for the time being, too, to try and help analyze.

From the numbers it looks like the entire raw disk is being backed up for each VM instead of just the data used. I did try running optimize in Windows on the server with the largest disks to try and trim, but it didn't make a difference.

I did remove unneeded disks from one backup last night, so I wonder if the 24 hour wait for GC deletion is playing a part, and I need to wait for that.
 
I assume this shows discard is working correctly:

Code:
#  lvs -a | egrep 'LV|vm-105-disk-2'
  LV              VG  Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  vm-105-disk-2   pve Vwi-aotz-- 300.00g data        0.03
 
Doing some more digging on this, I checked the LV information for each of the disks being backed up:

Code:
LV              VG  Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  vm-104-disk-0   pve Vwi-aotz--  32.00g data        69.14                                 
  vm-105-disk-0   pve Vwi-aotz--  35.00g data        79.43                                 
  vm-105-disk-2   pve Vwi-aotz-- 300.00g data        0.39                                   
  vm-110-disk-0   pve Vwi-aotz--  64.00g data        64.47
  vm-102-disk-0   pve Vwi-aotz-- 155.00g data        70.02                                 
  vm-102-disk-1   pve Vwi-aotz-- 305.00g data        57.61                                 
  vm-102-disk-2   pve Vwi-aotz-- 305.00g data        50.90

Calculating those data percentages, this totals to around 530GB, which is the same total I get from looking inside each guest VM as the space used. This suggests to me that discard is working correctly.
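For anyone wanting to check the arithmetic, here is a minimal sketch that multiplies each volume's LSize by its Data% from the `lvs` output above. The values are copied by hand from that listing (nothing is read from the system), and `allocated_gib` is just an illustrative helper name.

```python
# (LSize in GiB, Data% as reported by lvs), copied from the listing above.
VOLUMES = {
    "vm-104-disk-0": (32.00, 69.14),
    "vm-105-disk-0": (35.00, 79.43),
    "vm-105-disk-2": (300.00, 0.39),
    "vm-110-disk-0": (64.00, 64.47),
    "vm-102-disk-0": (155.00, 70.02),
    "vm-102-disk-1": (305.00, 57.61),
    "vm-102-disk-2": (305.00, 50.90),
}


def allocated_gib(lsize_gib: float, data_percent: float) -> float:
    """Space actually allocated in the thin pool for one volume."""
    return lsize_gib * data_percent / 100.0


def total_allocated_gib(volumes: dict) -> float:
    """Sum of allocated space across all listed volumes."""
    return sum(allocated_gib(size, pct) for size, pct in volumes.values())


if __name__ == "__main__":
    for name, (size, pct) in VOLUMES.items():
        print(f"{name}: {allocated_gib(size, pct):7.2f} GiB")
    print(f"total: {total_allocated_gib(VOLUMES):.2f} GiB")
```

The total comes out at roughly 532, in line with the "around 530GB" figure.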

I have one backup job that has retention set to keep-last=1, but my repository is showing as 1.12TB used.

If I look at the backup job configuration for each VM and combine the disk information:

Code:
scsi0: local-lvm:vm-104-disk-0,discard=on,format=raw,size=32G
scsi0: local-lvm:vm-105-disk-0,discard=on,format=raw,size=35G,ssd=1
scsi1: local-lvm:vm-105-disk-2,discard=on,size=300G,ssd=1
scsi0: local-lvm:vm-110-disk-0,discard=on,size=64G,ssd=1
scsi0: local-lvm:vm-102-disk-1,discard=on,size=305G,ssd=1
scsi1: local-lvm:vm-102-disk-2,discard=on,size=305G,ssd=1
scsi5: local-lvm:vm-102-disk-0,discard=on,size=155G,ssd=1

This totals 1.2TB.
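The same sum can be reproduced with a short sketch that pulls the `size=` values out of the disk lines above. The lines are pasted in as a string, the regular expression assumes every size is given in whole gigabytes with a `G` suffix (true for these lines), and `total_size_gb` is an illustrative name.

```python
import re

# Disk lines copied from the VM configurations above.
DISK_LINES = """\
scsi0: local-lvm:vm-104-disk-0,discard=on,format=raw,size=32G
scsi0: local-lvm:vm-105-disk-0,discard=on,format=raw,size=35G,ssd=1
scsi1: local-lvm:vm-105-disk-2,discard=on,size=300G,ssd=1
scsi0: local-lvm:vm-110-disk-0,discard=on,size=64G,ssd=1
scsi0: local-lvm:vm-102-disk-1,discard=on,size=305G,ssd=1
scsi1: local-lvm:vm-102-disk-2,discard=on,size=305G,ssd=1
scsi5: local-lvm:vm-102-disk-0,discard=on,size=155G,ssd=1
"""


def total_size_gb(config_text: str) -> int:
    """Sum every size=<n>G value found in the given config text."""
    return sum(int(n) for n in re.findall(r"size=(\d+)G", config_text))


if __name__ == "__main__":
    print(f"{total_size_gb(DISK_LINES)}G provisioned in total")
```

That gives 1196G of provisioned disk, i.e. the 1.2TB quoted above, compared with roughly 530GB actually in use.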

It looks to me as if the full disk is being backed up, regardless of discard being configured and working. Is this possibly the issue?
 
