Using proxmox-backup-client to back up 1.8 TB of Nextcloud user files took 22h!

rakurtz

Member
Jan 23, 2021
Germany
Hello dear Proxmox,

For our university's institute (law...) I had to build a new server infrastructure for VMs. Since we have been using Proxmox successfully for a couple of years now, I of course built the new infrastructure with Proxmox again:

Our setup:
2 Proxmox VE nodes with local zfs (uses ssd for ZIL and Cache) and storage replication to each other
1 Proxmox Backup Server with local zfs storage (uses ssd for ZIL and Cache)
1 older netapp storage we use through nfs

Backup speed benchmarks
Each system's speed seems to be quite sufficient:

Code:
# proxmox-backup-client benchmark on backup-server directly:

Time per request: 4962 microseconds.
TLS speed: 845.12 MB/s
SHA256 speed: 419.69 MB/s
Compression speed: 612.28 MB/s
Decompress speed: 984.28 MB/s
AES256/GCM speed: 2524.01 MB/s
Verify speed: 294.12 MB/s

From within a VM in the same LAN (but a different VLAN) it looks a bit different, but still OK:


Code:
# proxmox-backup-client benchmark from VM:

Time per request: 35574 microseconds.
TLS speed: 117.90 MB/s
SHA256 speed: 346.01 MB/s
Compression speed: 420.34 MB/s
Decompress speed: 658.98 MB/s
AES256/GCM speed: 1557.17 MB/s
Verify speed: 233.52 MB/s

Ping is around 0.3 ms from that VM to the proxmox-backup-server.
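As a rough sanity check on the TLS figure from the VM, the benchmark value can be converted into a line rate. The snippet below is only a sketch: it takes the benchmark's "MB" as 10^6 bytes and assumes a 1 Gbit/s network path, neither of which is stated in this post.

```python
# Convert the VM benchmark's TLS throughput into a network line rate.
tls_mb_per_s = 117.90                 # "TLS speed" from the VM benchmark above
line_rate_mbit = tls_mb_per_s * 8     # megabits per second, taking 1 MB = 10**6 bytes

assumed_link_mbit = 1000.0            # ASSUMPTION: 1 Gbit/s path, not stated in the post
utilisation = line_rate_mbit / assumed_link_mbit

print(f"{line_rate_mbit:.1f} Mbit/s, {utilisation:.0%} of an assumed 1 Gbit/s link")
```

If that assumption holds, the TLS number would simply reflect a nearly saturated gigabit link rather than a software limit.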

The issue
I am not sure whether the above TLS speed of roughly 120 MB/s is related to bug 2983 (I checked all backup server and client versions; we are on 1.0.6), and I don't necessarily think that 120 MB/s is too slow. After the initial 22h backup of our 1.8 TB of Nextcloud user files (thousands upon thousands of typical small files like .docx and PDFs...), I wouldn't be too worried about a huge amount of changed data being sent over the LAN for each follow-up backup.

The problem is that a follow-up backup takes almost as long, because it scans through all those files for actual changes. This would mean that the VM would be in a constant backup loop if I wanted daily backups.

The Nextcloud user files are on an NFS share (NetApp) mounted in the VM that runs the backup client. I am planning to move the files off the NetApp onto the nodes' local ZFS. For the actual migration I wanted to temporarily stop our Nextcloud service, create a final backup, and then restore the files into a freshly built VM on a node's local storage. But creating a follow-up backup takes too long, and stopping our Nextcloud service for the whole duration isn't acceptable.

Any ideas on how to speed up the whole process? Where do you think the bottleneck is? Do you think backing up all these files will become a lot faster once the data is moved to the local ZFS?


Thank you for your thoughts on this in advance,

Martin


PS: This is some speed information on a 320 GB VM backup from one of the nodes to the backup-server:

Bildschirmfoto 2021-01-23 um 18.28.58.png

Code:
# above backup job's results:
101: 2021-01-23 18:33:38 INFO: backup is sparse: 145.35 GiB (45%) total zero data
101: 2021-01-23 18:33:38 INFO: backup was done incrementally, reused 145.35 GiB (45%)
101: 2021-01-23 18:33:38 INFO: transferred 320.00 GiB in 2137 seconds (153.3 MiB/s)
101: 2021-01-23 18:33:38 INFO: Finished Backup of VM 101 (00:35:38)


on the backup server:
Bildschirmfoto 2021-01-23 um 18.28.17.png

Bildschirmfoto 2021-01-23 um 18.28.25.png
 
A file-level backup needs to read the data again on each backup due to the way we create the archive. See https://lists.proxmox.com/pipermail/pbs-devel/2020-December/001709.html
for some preliminary documentation of why that is.

For VMs, we are able to use QEMU's dirty-bitmap feature, which tracks the changed blocks of a VM's disk (meaning we only have to read the changed blocks).
Sadly, there is no easy way to speed up file-level backups at this time besides speeding up the source storage.
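To illustrate the idea of change tracking, here is a toy model of a dirty bitmap. It is only a sketch of the concept, not QEMU's implementation; `ToyDisk` and `BLOCK_COUNT` are made-up names for this example.

```python
BLOCK_COUNT = 8  # hypothetical tiny disk with 8 blocks


class ToyDisk:
    """Toy disk that records which blocks were written since the last backup."""

    def __init__(self, block_count):
        self.blocks = [b""] * block_count
        # Everything counts as dirty at first, so the first backup reads all blocks.
        self.dirty = [True] * block_count

    def write(self, index, data):
        self.blocks[index] = data
        self.dirty[index] = True  # remember that this block changed

    def backup(self):
        """Return {index: data} for dirty blocks only, then clear the bitmap."""
        changed = {i: self.blocks[i] for i, is_dirty in enumerate(self.dirty) if is_dirty}
        self.dirty = [False] * len(self.blocks)
        return changed


disk = ToyDisk(BLOCK_COUNT)
full = disk.backup()             # first run: all 8 blocks are read
disk.write(3, b"new data")
incremental = disk.backup()      # second run: only block 3 is read
print(len(full), sorted(incremental))
```

A file-level backup has no such record of what changed, which is why it has to walk and read the source data again on every run.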
 
Thx for your quick reply!
In your experience: would you say that 22h for 1.8 TB of small files is an "OK" value, or do you instinctively think there could be much room for improvement, e.g. by moving those files from NFS storage to a local ZFS?

I know it's difficult to answer this when you don't know the specs and the architecture of the system...

thx in advance!
 
1.8 TiB in 22h is ~24 MiB/s.
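The arithmetic behind that figure, as a short sketch. The second calculation assumes the "1.8 TB" from the original post is meant in decimal units, which the thread does not say.

```python
TIB = 1024 ** 4
MIB = 1024 ** 2

duration_s = 22 * 3600                       # 22 hours in seconds

# Reading the size as binary units (1.8 TiB):
mib_per_s_binary = 1.8 * TIB / duration_s / MIB

# ASSUMPTION: reading the size as decimal units instead (1.8 * 10**12 bytes):
mib_per_s_decimal = 1.8e12 / duration_s / MIB

print(f"{mib_per_s_binary:.1f} MiB/s (TiB), {mib_per_s_decimal:.1f} MiB/s (TB)")
```

Either way the average lands in the low twenties of MiB/s.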

The basic bottleneck with file backups of many small files is the random seek time of your storage.

For every file, the storage must seek to its position and then (mostly; it depends) read the content sequentially.
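A minimal model of that behaviour: one seek per file plus a sequential read of all the data. The average file size, seek time, and sequential rate below are hypothetical placeholders, not measurements from this thread.

```python
TIB = 1024 ** 4
MIB = 1024 ** 2
KIB = 1024


def estimated_backup_seconds(file_count, total_bytes, seek_s, seq_bytes_per_s):
    """Seek once per file, then read everything sequentially."""
    return file_count * seek_s + total_bytes / seq_bytes_per_s


total_bytes = 1.8 * TIB
avg_file_bytes = 500 * KIB                   # HYPOTHETICAL average file size
file_count = total_bytes / avg_file_bytes

seconds = estimated_backup_seconds(
    file_count,
    total_bytes,
    seek_s=0.010,                            # HYPOTHETICAL 10 ms per seek
    seq_bytes_per_s=150 * MIB,               # HYPOTHETICAL sequential read rate
)
print(f"{file_count:,.0f} files, roughly {seconds / 3600:.1f} h")
```

With many small files the `file_count * seek_s` term dominates, so the seek time of the source storage matters far more than its sequential throughput.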

There is some basic math in our repo (https://git.proxmox.com/?p=proxmox-...06b643df542e2664b08009fe;hb=refs/heads/master )
It relates to restore speed and chunk size, but it can also be applied to the backup of small files.

Regarding "I know it's difficult to answer this when you don't know the specs and the architecture of the system...":
it is actually impossible to say without more info.

For spinning disks with a max file size of 1 KiB, this speed would be fantastic.
For SSDs with files >1 GiB, it would be *very* bad.
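To make the two extremes concrete, the observed rate can be expressed as files per second for each file size. This is just a unit conversion of the ~24 MiB/s figure, not a statement about any particular hardware.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

observed_bytes_per_s = 24 * MIB              # ~24 MiB/s average from this thread

files_per_s_small = observed_bytes_per_s / 1024   # if every file were 1 KiB
seconds_per_large = GIB / observed_bytes_per_s    # if every file were 1 GiB

print(f"1 KiB files: {files_per_s_small:.0f} files/s")
print(f"1 GiB files: {seconds_per_large:.1f} s per file")
```

Tens of thousands of 1 KiB files per second is far more than a seek-bound spinning disk can typically deliver, while needing most of a minute per 1 GiB file is far below what sequential reads from an SSD usually achieve.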
 
Well, I guess you're absolutely right: I checked the (random) read performance of that old NetApp NFS storage with fio.

"Unterirdisch" (abysmal), as we German speakers would say. Less than 1 MB/s for 4K chunks...

Since the new location for the files delivers at least 179 MB/s (also with 4K chunks, and getting better with bigger chunks), backing up via proxmox-backup-client is very suitable for us now.
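For anyone who wants a quick feel for this kind of measurement without fio, a minimal sketch of a random 4 KiB read test follows. It is not a replacement for fio: it goes through the page cache, so numbers on a freshly written file will be far too optimistic, and `os.pread` is only available on Unix-like systems. The function name and the throwaway demo file are made up for this example.

```python
import os
import random
import tempfile
import time


def random_read_mb_per_s(path, block_size=4096, reads=2000, seed=0):
    """Time block-aligned random reads of `path`; return MB/s (1 MB = 10**6 bytes)."""
    block_count = os.path.getsize(path) // block_size
    rng = random.Random(seed)
    fd = os.open(path, os.O_RDONLY)
    try:
        total = 0
        start = time.perf_counter()
        for _ in range(reads):
            offset = rng.randrange(block_count) * block_size
            total += len(os.pread(fd, block_size, offset))
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return total / max(elapsed, 1e-9) / 1e6


# Demo on a throwaway 4 MiB file; point `path` at a large file on the
# storage under test for a more meaningful (but still cache-affected) number.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(4 * 1024 * 1024))
    demo_path = tmp.name
try:
    demo_speed = random_read_mb_per_s(demo_path)
    print(f"{demo_speed:.1f} MB/s for 4 KiB random reads")
finally:
    os.remove(demo_path)
```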

Anyways - thx for the great introduction into the science of storage performance ;)
 
