File level backup for 100.000.000 files

DynFi User

Renowned Member
Apr 18, 2016
dynfi.com
We have a large volume to back up. It contains 100.000.000 files, with a daily delta of about 50.000 files (400GB).

For the time being, this file system is mounted directly on the PBS host using the kernel driver: mount -t ceph ip.srv.1,ip.srv.2,ip.srv.3,ip.srv.4:/ /mnt/mycephfs -o name=myname,secret=xxxxxxxx

We then launch the backup script directly on the PBS host and are able to back up the file system with PBS.

To simulate a real-life scenario, I have set up a file generation script for testing on one of the VMs that is also connected to the CephFS. The script creates 50.000 files: 65% small files (10 to 100 KB) and 35% big files (1 to 50 MB).
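For anyone who wants to reproduce the test, here is a minimal sketch of what such a generator could look like. This is not the script used above: the function names, the directory layout and the `batch_id` parameter are all assumptions made for illustration. Only the 65% / 35% split and the size ranges come from the description.

```python
import os
import random
from pathlib import Path

KB = 1024
MB = 1024 * KB


def pick_size(rng, small_ratio=0.65):
    """Return a size in bytes: 10-100 KB with probability small_ratio, else 1-50 MB."""
    if rng.random() < small_ratio:
        return rng.randint(10 * KB, 100 * KB)
    return rng.randint(1 * MB, 50 * MB)


def generate_batch(target_dir, count, rng, batch_id=0, files_per_dir=1000,
                   size_fn=pick_size):
    """Write `count` files of random content; return the total bytes written.

    The layout (batchNNNN/dNNNNN/fNNNNNNN.bin) is a hypothetical choice that
    keeps directories small and avoids name clashes between batches.
    """
    root = Path(target_dir) / f"batch{batch_id:04d}"
    total = 0
    for i in range(count):
        subdir = root / f"d{i // files_per_dir:05d}"
        subdir.mkdir(parents=True, exist_ok=True)
        size = size_fn(rng)
        with open(subdir / f"f{i:07d}.bin", "wb") as fh:
            remaining = size
            while remaining > 0:
                # Random bytes, written 1 MB at a time, so the test data
                # neither compresses nor deduplicates away.
                n = min(remaining, MB)
                fh.write(os.urandom(n))
                remaining -= n
        total += size
    return total
```

Calling `generate_batch("/mnt/mycephfs/test", 50000, random.Random(), batch_id=n)` once per round would produce one batch. With these ranges the average file is roughly 9 MB, so a batch of 50.000 files lands in the same order of magnitude as the 400GB daily delta.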

For the time being we have about 450.000 files, so we are still far from the 100.000.000 files.

We then trigger the backup and collect performance data (if someone is interested, I could share this with users of the forum).
I am trying to see whether we can guarantee the performance of such a backup, i.e. whether the backup time grows linearly with the number of files and a full run won't take months.

The process today is the following:
launch script for file generation >>> generate 50.000 files on CephFS >>> launch backup script from PBS >>> finish backup >>> (then back to the first step)

I am trying to evaluate the following:
  • Is it realistic to back up 100.000.000 files using PBS with file-level backup?
  • Is mounting the CephFS inside the PBS host a solution that can be used without problems?
  • What would be your advice to speed up such a backup (if any)?
    • most of the time seems to be spent calculating the index of files
    • we are seeing an increase in backup time with every new batch of 50.000 files
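One way to check whether the increase is linear is to fit a straight line to (total file count, backup duration) pairs and look at how far the measurements deviate from it. The sketch below is only an illustration: the function names are made up, and it contains no measured data. A fit based on 450.000 files may well not hold at 100.000.000, so the extrapolation is a rough indication at best.

```python
def fit_line(file_counts, durations):
    """Least-squares fit of duration = slope * file_count + intercept."""
    if len(file_counts) != len(durations) or len(file_counts) < 2:
        raise ValueError("need at least two (file_count, duration) pairs")
    n = len(file_counts)
    mean_x = sum(file_counts) / n
    mean_y = sum(durations) / n
    sxx = sum((x - mean_x) ** 2 for x in file_counts)
    if sxx == 0:
        raise ValueError("file counts must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(file_counts, durations))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(slope, intercept, file_count):
    """Extrapolate the fitted line to another file count."""
    return slope * file_count + intercept


def worst_relative_error(file_counts, durations, slope, intercept):
    """Largest relative gap between the fit and a measurement (durations > 0)."""
    return max(abs(predict(slope, intercept, x) - y) / y
               for x, y in zip(file_counts, durations))
```

If `worst_relative_error` stays small as more batches are added, the growth looks linear over the measured range, and `predict(slope, intercept, 100_000_000)` gives a first estimate for the full data set. If the error keeps growing, a straight line is the wrong model and the estimate should not be trusted.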
 
So for now, PBS always needs to read all files when doing a file-based backup (in contrast to VM backups, which happen at block level), so the more files you have, the more reads it will do. It still only uploads the chunks that are new compared to the previous backup, and still only writes new chunks to the datastore.

We are looking into improving that, but there is nothing concrete so far.
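To illustrate why the read cost grows with the number of files while the upload stays small, here is a simplified sketch of hash-based chunk deduplication. It is not PBS code: splitting each file into fixed-size chunks, the in-memory `files` dict and all names are assumptions made for the example, and PBS chunks its archives differently.

```python
import hashlib


def split_chunks(data, chunk_size):
    """Split bytes into fixed-size chunks (a simplification for this sketch)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def backup(files, known_digests, chunk_size=4 * 1024 * 1024):
    """Simulate one backup run over `files` (a dict of path -> bytes).

    Returns (index, uploaded, bytes_read):
      index      path -> list of chunk digests, enough to restore the file
      uploaded   digest -> chunk, only for chunks missing from known_digests
      bytes_read total bytes read, which always covers every file
    """
    index = {}
    uploaded = {}
    bytes_read = 0
    for path, data in files.items():
        bytes_read += len(data)  # every file is read and hashed on every run
        digests = []
        for chunk in split_chunks(data, chunk_size):
            digest = hashlib.sha256(chunk).hexdigest()
            digests.append(digest)
            if digest not in known_digests and digest not in uploaded:
                uploaded[digest] = chunk  # only unseen chunks are sent
        index[path] = digests
    return index, uploaded, bytes_read
```

Running `backup` twice on nearly identical input shows the effect: `bytes_read` is the same both times, while `uploaded` on the second run only contains the chunks that changed.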
 
Just as a quick sanity check, does this behavior apply when backing up from a ZFS file system?

Alternatively, has there been any progress on this since June?

I'm trying to read up in the forums and plan before moving my PVE homelab backups onto PBS and found this post. One of my LXC systems has a 4TB mount which rarely has any file changes. The idea of it spending hours generating checksums during every backup just to send almost no data is extra load on the server that I'd rather avoid. My old backups had been using ZFS snapshots and incremental sends, but I was hoping to simplify things while gaining remote-encryption by moving to PBS.

I imagine there are some tricks you could do, like retaining ZFS snapshots locally and diffing between them to create a shortlist of files to re-hash, but that's probably not nearly as simple as it looks to me from the outside...
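As a rough sketch of the shortlist idea, the snippet below turns the output of `zfs diff -H` between two snapshots into a list of paths to re-hash and a list of removed paths. It assumes the tab-separated `-H` format from the zfs-diff man page (change type, path, and a second path for renames). The function names are made up, and nothing in this thread shows that PBS can consume such a list, so this only covers the diffing half.

```python
import subprocess


def parse_zfs_diff(output):
    """Parse `zfs diff -H` output into (to_rehash, deleted) path lists.

    Directories show up as entries too, so a real tool would still have to
    filter them. Special characters in paths are escaped by zfs and are
    left as-is here.
    """
    to_rehash, deleted = [], []
    for line in output.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        change = fields[0]
        if change in ("M", "+"):      # modified or newly created
            to_rehash.append(fields[1])
        elif change == "-":           # removed
            deleted.append(fields[1])
        elif change == "R":           # renamed: old path gone, new path present
            deleted.append(fields[1])
            to_rehash.append(fields[2])
    return to_rehash, deleted


def zfs_diff(old_snapshot, new_snapshot):
    """Run `zfs diff -H` on two snapshots (needs ZFS and enough privileges)."""
    result = subprocess.run(
        ["zfs", "diff", "-H", old_snapshot, new_snapshot],
        capture_output=True, text=True, check=True,
    )
    return parse_zfs_diff(result.stdout)
```

For a 4TB mount that rarely changes, the shortlist would usually be tiny compared to the full tree, which is where the potential saving comes from.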
 
File-level backup is not performant in PBS.
PBS is good for backing up VMs.

If your files are hosted on a ZFS system, you would be better off using some scripts plus ZFS snapshots, which are native to your OS.
Both your source system and the target need to be on ZFS, though.
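A minimal sketch of such a script, assuming SSH access to the target and an earlier snapshot that already exists on both sides. The dataset, snapshot and host names are placeholders, and there is no retention or error recovery here.

```python
import subprocess


def incremental_send_commands(dataset, prev_snap, new_snap,
                              target_host, target_dataset):
    """Build the three commands for one snapshot + incremental send round."""
    snapshot = ["zfs", "snapshot", f"{dataset}@{new_snap}"]
    send = ["zfs", "send", "-i",
            f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]
    receive = ["ssh", target_host, "zfs", "receive", target_dataset]
    return snapshot, send, receive


def run_incremental_send(dataset, prev_snap, new_snap,
                         target_host, target_dataset):
    """Take the new snapshot, then pipe `zfs send -i` into a remote `zfs receive`."""
    snapshot, send, receive = incremental_send_commands(
        dataset, prev_snap, new_snap, target_host, target_dataset)
    subprocess.run(snapshot, check=True)
    sender = subprocess.Popen(send, stdout=subprocess.PIPE)
    try:
        subprocess.run(receive, stdin=sender.stdout, check=True)
    finally:
        sender.stdout.close()
    if sender.wait() != 0:
        raise RuntimeError("zfs send failed")
```

Only the blocks that changed between the two snapshots are transferred, so a data set that rarely changes produces a very small stream without any file having to be re-read.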
 
