PBS backs up deleted files

Joao Correa

Member
Nov 20, 2017
Hello!
I noticed that files deleted before a backup runs are still counted and backed up.
This is a problem, especially when working with large files.

The problem is easy to isolate and reproduce.
For example:
1. Create a new virtual disk on a virtual machine
2. Check the backup option, only for this disk
3. On the virtual machine, prepare the file system (I tested it with ext4 and xfs on a Debian 10 VM)
4. Copy a file to the new disk (an ISO of about 200 MB, for example)
5. Make a backup (this will be the initial full backup)
6. Make a new backup (a very small incremental backup, because there is no new data)
7. Copy a new file to the disk (an ISO of about 350 MB, for example)

Seleção_582.png

8. Delete the newly copied 350 MB file (rm -rf file)
9. Make a new incremental backup
At this point you will notice that the amount of data copied corresponds to the size of the deleted file.

Seleção_580.png

As I said, I tested with ext4 and xfs. I also tested some cache options on the virtual disk, but the problem remained.

I am using the most current version of PBS and PVE.

Code:
proxmox-ve: 6.3-1 (running kernel: 5.4.78-2-pve)
pve-manager: 6.3-3 (running version: 6.3-3/eee5f901)
pve-kernel-5.4: 6.3-3
pve-kernel-helper: 6.3-3
pve-kernel-5.3: 6.1-6
pve-kernel-5.4.78-2-pve: 5.4.78-2
pve-kernel-5.4.73-1-pve: 5.4.73-1
pve-kernel-5.4.65-1-pve: 5.4.65-1
pve-kernel-5.4.55-1-pve: 5.4.55-1
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-4.10.17-2-pve: 4.10.17-20
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.7
libproxmox-backup-qemu0: 1.0.2-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.3-2
libpve-guest-common-perl: 3.1-4
libpve-http-server-perl: 3.1-1
libpve-storage-perl: 6.3-4
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.0.6-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-3
pve-cluster: 6.2-1
pve-container: 3.3-2
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.1-3
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.1.0-8
pve-xtermjs: 4.7.0-3
qemu-server: 6.3-3
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.5-pve1

Code:
proxmox-backup: 1.0-4 (running kernel: 5.4.78-2-pve)
proxmox-backup-server: 1.0.6-1 (running version: 1.0.6)
pve-kernel-5.4: 6.3-3
pve-kernel-helper: 6.3-3
pve-kernel-5.4.78-2-pve: 5.4.78-2
pve-kernel-5.4.73-1-pve: 5.4.73-1
pve-kernel-5.4.65-1-pve: 5.4.65-1
pve-kernel-5.4.60-1-pve: 5.4.60-2
pve-kernel-5.4.44-2-pve: 5.4.44-2
ifupdown2: 3.0.0-1+pve3
libjs-extjs: 6.0.1-10
proxmox-backup-docs: 1.0.6-1
proxmox-backup-client: 1.0.6-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-3
pve-xtermjs: 4.7.0-3
smartmontools: 7.1-pve2
zfsutils-linux: 0.8.5-pve1

My backend storage is of type LVM-Thin.

Is there any way to solve this problem?

PS: When I restore the backup, the deleted file does not appear.
 
when backing up virtual machines, the underlying virtual disks get backed up, not the files inside the vm. the emulation layer (qemu) has no idea that a file was deleted. for that you have to call 'fstrim' or the equivalent command from inside your vm (and make sure you check the 'discard' box on the virtual disk)

this tells the emulation layer that those blocks are free again

note that even in that case, the blocks look like they have been changed, so they get read again (but probably only zeroes which get deduplicated anyway)
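To make the explanation above concrete, here is a toy model in Python. It is only a sketch: `VirtualDisk`, `guest_delete`, `discard`, and `scenario` are made-up names, not QEMU or PBS APIs, and the model ignores the few metadata blocks a real filesystem rewrites on delete. It assumes the backup reads every block written since the previous backup.

```python
# Toy model of block-level change tracking (illustrative names, not QEMU APIs).
ZERO = bytes(1)  # a single zero byte stands in for an all-zero block


class VirtualDisk:
    def __init__(self, num_blocks):
        self.blocks = [ZERO] * num_blocks
        self.dirty = set()  # blocks changed since the last backup

    def write(self, index, content):
        self.blocks[index] = content
        self.dirty.add(index)

    def guest_delete(self, indexes):
        # Deleting a file only changes filesystem metadata inside the guest.
        # The data blocks keep their old content and stay marked dirty.
        pass

    def discard(self, indexes):
        # A trim tells the emulation layer that the blocks are free again.
        # They read back as zeroes but still count as changed.
        for i in indexes:
            self.blocks[i] = ZERO
            self.dirty.add(i)

    def backup(self):
        """Return (blocks read, blocks holding real data) and reset tracking."""
        read = sorted(self.dirty)
        with_data = [i for i in read if self.blocks[i] != ZERO]
        self.dirty.clear()
        return len(read), len(with_data)


def scenario(trim):
    disk = VirtualDisk(100)
    file_blocks = range(10, 20)
    for i in file_blocks:          # copy a file onto the disk
        disk.write(i, b"x")
    disk.guest_delete(file_blocks)  # rm inside the guest
    if trim:
        disk.discard(file_blocks)   # fstrim or the discard mount option
    return disk.backup()


print(scenario(trim=False))  # (10, 10): the deleted file's data is backed up
print(scenario(trim=True))   # (10, 0): blocks are read, but only zeroes remain
```

Without a trim, the backup has no way to tell the deleted file's blocks from live data. With a trim, the same blocks are still read, but they contain nothing worth storing.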
 
Hmm
So by default the image backup procedure will transfer the modified blocks to the Proxmox Backup Server (PBS), even if the data within the file system has been deleted.

However, if I run fstrim, the backup procedure will read the whole disk again (this also happens when I turn the VM off and on) but will not transfer the deleted blocks to PBS.
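One plausible reason the trimmed blocks are read but not transferred is content-addressed deduplication, as hinted at earlier in the thread ("only zeroes which get deduplicated anyway"). The sketch below illustrates that idea and is not PBS code: `ChunkStore`, `backup`, and the 4 KiB chunk size are invented for the demo, and SHA-256 is used simply as a convenient digest.

```python
import hashlib

CHUNK = 4096  # assumed chunk size, kept small for the demo


class ChunkStore:
    """Hypothetical store that keeps each distinct chunk only once."""

    def __init__(self):
        self.chunks = {}
        self.uploaded_bytes = 0

    def put(self, data):
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.chunks:   # only unknown chunks are uploaded
            self.chunks[digest] = data
            self.uploaded_bytes += len(data)
        return digest


def backup(store, image):
    # Split the image into fixed-size chunks and return the list of digests.
    return [store.put(image[o:o + CHUNK]) for o in range(0, len(image), CHUNK)]


store = ChunkStore()
trimmed_region = bytes(CHUNK * 8)       # eight chunks of zeroes
index = backup(store, trimmed_region)
print(len(index), len(store.chunks), store.uploaded_bytes)  # 8 1 4096
```

All eight zero chunks share one digest, so only the first one costs any transfer.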

Seleção_583.png

So as good practice, I should measure which of the two procedures takes less time and choose accordingly.

Am I right?
 
Running fstrim, the backup procedure reads the entire disk, but does not copy the deleted data.

Using the configuration below, the backup reads only the blocks from the deleted file and does not copy them to PBS. \o/

OS: Ubuntu 20.04 (kernel 5.4.0-65)
Disk config:
scsi1: StorLvmT:vm-203-disk-1,discard=on,size=11G,ssd=1

Entry in /etc/fstab inside the VM, with the discard mount option:
/dev/sdb1 /bkp ext4 defaults,discard 0 0
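If you want to double-check that both layers have discard enabled, the two lines above can be parsed with a few lines of Python. This is a hypothetical helper, not a Proxmox tool: `disk_options` and `uses_online_discard` are names invented here, and the parsing assumes the simple `key: volume,opt=value,...` layout and the whitespace-separated fstab layout shown in this post.

```python
def disk_options(config_line):
    """Split a VM disk config line into (bus key, volume, options dict)."""
    key, _, value = config_line.partition(": ")
    volume, *opts = value.split(",")
    return key, volume, dict(o.split("=", 1) for o in opts)


def uses_online_discard(fstab_line):
    """Return True if the fstab entry lists the 'discard' mount option."""
    fields = fstab_line.split()
    if len(fields) < 4 or fields[0].startswith("#"):
        return False  # comment, blank, or malformed line
    return "discard" in fields[3].split(",")


key, volume, opts = disk_options(
    "scsi1: StorLvmT:vm-203-disk-1,discard=on,size=11G,ssd=1"
)
print(key, volume, opts.get("discard"))
print(uses_online_discard("/dev/sdb1 /bkp ext4 defaults,discard 0 0"))
```

Both checks have to pass: the mount option makes the guest send trims, and `discard=on` on the virtual disk lets them reach the storage.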

I don't know what the behavior would be on a VM with a kernel older than 5.x.
 
Very interesting, but many advisories say otherwise:
what do they say differently?

discard/trim in qemu is used to mark blocks as unused to the storage, so it can free them.

the fstrim command will trim all unused blocks, while mounting a fs with discard will (depending on the fs) trim only deleted files
 
Sorry, I mean:

Core Filesystems:

• ext4 – the default extended option is not to discard blocks at filesystem make time, retain this, and do not add the “discard” extended option as some information will tell you to do.


What do you think about this?

Ciao, Diaolin
 
sorry i do not understand...

for physical disks it is recommended to run fstrim periodically, but on virtual systems this will touch all free blocks (which marks them dirty, so pbs will try to back them up again), so a 'discard' mount option is more efficient
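A rough back-of-the-envelope comparison, using the numbers from this thread (an 11G disk, a 200 MB file kept, a 350 MB file deleted). The 4 MiB block size and the helper name `to_blocks` are assumptions for illustration. MB is treated as MiB to keep the arithmetic simple, filesystem metadata is ignored, and fstrim is assumed to touch every free block as described above.

```python
import math

BLOCK_MIB = 4  # assumed change-tracking granularity


def to_blocks(size_mib):
    # Number of blocks needed to hold size_mib, rounded up.
    return math.ceil(size_mib / BLOCK_MIB)


disk_blocks = to_blocks(11 * 1024)   # whole virtual disk
kept_blocks = to_blocks(200)         # file that stays on the disk
deleted_blocks = to_blocks(350)      # file that was removed

fstrim_touched = disk_blocks - kept_blocks   # every free block gets trimmed
discard_touched = deleted_blocks             # only the deleted file's blocks

print(fstrim_touched, discard_touched)  # 2766 88
```

Under these assumptions, the next backup after a full fstrim re-reads roughly thirty times as many blocks as it would with the discard mount option.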
 
The mount man page (on Ubuntu 18.04) says:
Code:
 discard
              Disable/enable the discard mount option.  The  discard  function
              issues  frequent  commands to let the block device reclaim space
              freed by the filesystem.  This is useful for SSD devices, thinly
              provisioned LUNs and virtual machine images, but may have a
              significant performance impact.  (The fstrim command is also
              available to initiate batch trims from userspace.)

So far I'm only using it in a test environment, and I'm concerned about the line: "...but may have a significant performance impact.."
 
the performance impact comes from trimming the moment you delete files. depending on the underlying storage, a trim command may produce additional load, etc.
 
