Proxmox Backup Server vs Rsync

eagleman04

Apr 16, 2024
I'm hoping the good folks here can educate me on something, as Proxmox Backup Server isn't behaving how I would expect. Currently, I have two physical machines. The first is my Proxmox server (v8.1.10). It has 4x16TB HDDs in a RAID10 zpool for my media, a separate 2TB mirrored SSD pool for pictures, and a final 1TB mirrored SSD pool for VMs and other random data. I am currently running 3 Open Media Vault VMs to manage these files. The zpools were created on the host and shared with the OMV VMs, where I created an XFS dataset for my media (using the HDD pool) and EXT4 datasets for my pictures and VM data on the two SSD zpools.

The second physical machine is my Proxmox Backup Server (v3.1-5). It currently has a 4x16TB RAID10 zpool with a 1TB mirrored SSD special device to help with IOPS.

The backup server was just built over the weekend, and I manually ran a backup of each of the VMs (and their underlying datasets). As expected, it took about 12 hours for my media to come over and about 2 hours for my pictures. The VM data took approximately 30 minutes.

I assumed that Proxmox Backup Server would be like a souped-up version of rsync, meaning that after the first backup it would pull only the incremental changes in subsequent backups and therefore be very fast. However, I manually ran another backup of the OMV VM managing my picture dataset this evening and was surprised that the job took 2 hours again, even though there have been only minimal additions/changes to that folder. What am I missing here? Is this expected behavior?

My concern is that I was going to set up an offsite/remote PBS as well. I was going to populate this server locally first and then take it offsite so that only the incremental changes would have to go over the wire moving forward. However, if PBS is trying to send all the data every time, this isn't feasible, as I'm quite sure my residential ISP would flag me if I were to send TBs of data over the wire every week (much less every night).

All that said, I'd appreciate any guidance on what I'm misunderstanding or may be doing wrong. Thank you in advance.
 
I am not sure how you are doing the backups or what you are backing up.

VMs backed up using PVE's integrated backup feature have multiple "levels" of deduplication (all of them work together):

- if the VM was previously backed up to the same target PBS, that previous snapshot still exists and the VM has not been powered off since, a "fast incremental" mode is used, where Qemu can inform the backup process about which parts of the virtual disk have changed, and only the changed parts are read
- if the VM was previously backed up to the same target PBS and that previous snapshot still exists, any chunk that is created at the client side that is also referenced by the previous snapshot is not re-uploaded to the server, but the server is just told to include it
- the server will never store a chunk twice - so if the client uploads a chunk that it already has, it will be de-duplicated server-side using the content-addressable storage

the last two also happen for container and host backups (file-based, for example if you run proxmox-backup-client backup .. yourself inside a VM or on a baremetal Linux system), the first can only happen for VMs, since it requires support by Qemu to track changes.

so traffic between the client and the server should only happen
- for the first snapshot in a group (the client cannot do any deduplication in that case, only the server can)
- for subsequent snapshots in a group limited only to the changed/new chunks
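
The client-side and server-side levels described above can be sketched roughly as follows. This is only a toy model and not PBS's actual implementation: the names `ChunkStore`, `chunk_digests` and `backup` are made up for illustration, SHA-256 over fixed-size chunks is used as a stand-in for the chunk digest, and the Qemu "fast incremental" level is not modelled at all.

```python
import hashlib

# Assumption for this sketch: fixed-size 4 MiB chunks, as mentioned for VM images.
CHUNK_SIZE = 4 * 1024 * 1024


def chunk_digests(data, chunk_size=CHUNK_SIZE):
    """Split data into fixed-size chunks and yield (digest, chunk) pairs."""
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        yield hashlib.sha256(chunk).hexdigest(), chunk


class ChunkStore:
    """Toy server-side store: every chunk is kept once, keyed by its digest."""

    def __init__(self):
        self.chunks = {}

    def insert(self, digest, chunk):
        """Store a chunk; return True only if it was not already present."""
        if digest in self.chunks:
            return False
        self.chunks[digest] = chunk
        return True


def backup(data, previous_index, store, chunk_size=CHUNK_SIZE):
    """Back up data and return (new index, bytes sent to the server).

    previous_index is the list of digests of the previous snapshot in the
    same group (empty for the first snapshot).
    """
    known = set(previous_index)
    index = []
    uploaded = 0
    for digest, chunk in chunk_digests(data, chunk_size):
        index.append(digest)
        if digest in known:
            # Client-side level: the previous snapshot already references
            # this chunk, so only the reference is recorded, no upload.
            continue
        uploaded += len(chunk)
        # Server-side level: the store keeps a given chunk only once.
        store.insert(digest, chunk)
    return index, uploaded


if __name__ == "__main__":
    store = ChunkStore()
    # Tiny 4-byte chunks keep the demo readable.
    first_index, first_sent = backup(b"aaaabbbbcccc", [], store, chunk_size=4)
    second_index, second_sent = backup(b"aaaaXXXXcccc", first_index, store, chunk_size=4)
    print(first_sent, second_sent)  # 12 bytes for the first snapshot, 4 for the second
```

Note that in this model the whole source is still read and hashed on every run; only the upload is skipped. That matches the long-running but mostly `write: 0 B/s` behaviour you would expect when the "fast incremental" level is not available.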

maybe you can post a log of an incremental run?

for file-based backups, there is ongoing work to also skip reading file contents based on metadata checking (similar to what rsync does). this is not yet finished and shipped though, see https://bugzilla.proxmox.com/show_bug.cgi?id=3174 (edit: thanks chris!)
 
Thank you. Apologies if this is oversimplifying but to initiate the backup I simply went into the OMV VM that manages my picture storage, navigated to the Backup tab, and hit "Backup Now" using Snapshot mode. This particular dataset is all pictures.

fabian, based on your comment "...previous snapshot still exists and the VM has not been powered off since..." I believe the issue is that I rebooted the Proxmox server (and so therefore also the underlying VMs) sometime between the first and second backup attempts. To confirm this was a factor, I ran a third backup of that same VM again this morning and it only took a few seconds. This is more in line with what I was expecting.

This leaves me with two questions then. The first is, is there any way to preserve the snapshot between reboots so that fast incremental mode can always be used?

Assuming no to that question, my second question is regarding what is actually sent to the backup server during these longer backup times. I notice when running the backup after a reboot that the task log (snippet below) is showing primarily 0 B/s writes and at the end of the task it shows "backup was done incrementally, reused 3.46 TiB (99%)". Does this mean that only the 1% change is what was sent to the Backup server? If this is the case, having a remote server is still feasible since only the 1% of changes is what was actually sent over the wire to the Backup server.

Thank you again for your help.

Code:
INFO: starting new backup job: vzdump 102 --remove 0 --notes-template '{{guestname}}' --node prox --notification-mode auto --storage LocalBU --mode snapshot
INFO: Starting Backup of VM 102 (qemu)
INFO: Backup started at 2024-04-16 07:18:02
INFO: status = running
INFO: VM Name: OMV-Photos
INFO: include disk 'scsi0' 'local-zfs:vm-102-disk-0' 40G
INFO: include disk 'scsi1' 'Photos:vm-102-disk-0' 3500G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/102/2024-04-16T11:18:02Z'
INFO: enabling encryption
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task ##############################
INFO: resuming VM again
INFO: scsi0: dirty-bitmap status: created new
INFO: scsi1: dirty-bitmap status: created new
INFO:   0% (3.1 GiB of 3.5 TiB) in 3s, read: 1.0 GiB/s, write: 6.7 MiB/s
INFO:   1% (36.0 GiB of 3.5 TiB) in 36s, read: 1020.4 MiB/s, write: 2.5 MiB/s
INFO:   2% (71.0 GiB of 3.5 TiB) in 1m 10s, read: 1.0 GiB/s, write: 361.4 KiB/s
INFO:   3% (106.5 GiB of 3.5 TiB) in 1m 43s, read: 1.1 GiB/s, write: 0 B/s
INFO:   4% (146.4 GiB of 3.5 TiB) in 1m 55s, read: 3.3 GiB/s, write: 682.7 KiB/s
INFO:   5% (181.6 GiB of 3.5 TiB) in 2m, read: 7.0 GiB/s, write: 0 B/s
INFO:   6% (216.8 GiB of 3.5 TiB) in 2m 5s, read: 7.0 GiB/s, write: 0 B/s
INFO:   7% (251.5 GiB of 3.5 TiB) in 2m 10s, read: 6.9 GiB/s, write: 0 B/s
INFO:   8% (286.6 GiB of 3.5 TiB) in 2m 15s, read: 7.0 GiB/s, write: 0 B/s
INFO:   9% (321.9 GiB of 3.5 TiB) in 2m 20s, read: 7.1 GiB/s, write: 0 B/s
INFO:  10% (357.1 GiB of 3.5 TiB) in 2m 25s, read: 7.0 GiB/s, write: 0 B/s
INFO:  11% (392.5 GiB of 3.5 TiB) in 2m 30s, read: 7.1 GiB/s, write: 0 B/s
INFO:  12% (427.5 GiB of 3.5 TiB) in 2m 35s, read: 7.0 GiB/s, write: 0 B/s
INFO:  13% (462.4 GiB of 3.5 TiB) in 2m 40s, read: 7.0 GiB/s, write: 0 B/s
INFO:  14% (497.6 GiB of 3.5 TiB) in 2m 45s, read: 7.0 GiB/s, write: 0 B/s
INFO:  15% (532.8 GiB of 3.5 TiB) in 2m 50s, read: 7.0 GiB/s, write: 0 B/s

INFO: backup is sparse: 2.99 TiB (86%) total zero data
INFO: backup was done incrementally, reused 3.46 TiB (99%)
 
This leaves me with two questions then. The first is, is there any way to preserve the snapshot between reboots so that fast incremental mode can always be used?
https://bugzilla.proxmox.com/show_bug.cgi?id=3233 tracks that one
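
As a rough illustration of why a restart shows up as `dirty-bitmap status: created new` in the log and forces a full read: the sketch below models a dirty bitmap as a plain in-memory set of chunk numbers. It assumes the bitmap only lives for as long as the running VM process does, which is consistent with the "has not been powered off since" condition above. The `DirtyBitmap` class and its methods are hypothetical and are not Qemu's API.

```python
# Assumption for this sketch: changes are tracked at 4 MiB granularity.
CHUNK_SIZE = 4 * 1024 * 1024


class DirtyBitmap:
    """Toy in-memory change tracker for one virtual disk."""

    def __init__(self, disk_size, chunk_size=CHUNK_SIZE):
        self.chunk_size = chunk_size
        n_chunks = (disk_size + chunk_size - 1) // chunk_size
        # A freshly created bitmap knows nothing about earlier backups,
        # so every chunk has to be treated as changed once.
        self.dirty = set(range(n_chunks))

    def record_write(self, offset, length):
        """Mark every chunk touched by a guest write as dirty."""
        if length <= 0:
            return
        first = offset // self.chunk_size
        last = (offset + length - 1) // self.chunk_size
        self.dirty.update(range(first, last + 1))

    def take(self):
        """Return the chunks a backup must read, then reset the bitmap."""
        chunks = sorted(self.dirty)
        self.dirty.clear()
        return chunks


if __name__ == "__main__":
    bitmap = DirtyBitmap(16 * CHUNK_SIZE)
    print(len(bitmap.take()))       # first backup reads all 16 chunks
    bitmap.record_write(5 * CHUNK_SIZE + 10, 100)
    print(bitmap.take())            # next backup reads only chunk 5
    bitmap = DirtyBitmap(16 * CHUNK_SIZE)  # VM stopped and started: state is gone
    print(len(bitmap.take()))       # all 16 chunks are read again
```

Even when everything is read again, the chunk-level deduplication still applies, so reading everything does not mean uploading everything.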

Assuming no to that question, my second question is regarding what is actually sent to the backup server during these longer backup times. I notice when running the backup after a reboot that the task log (snippet below) is showing primarily 0 B/s writes and at the end of the task it shows "backup was done incrementally, reused 3.46 TiB (99%)". Does this mean that only the 1% change is what was sent to the Backup server? If this is the case, having a remote server is still feasible since only the 1% of changes is what was actually sent over the wire to the Backup server.
basically, yes. although in this case, since the backup was mostly chunks filled with zero bytes anyway, the actual transfer was also a lot less than the total size for the initial backup snapshot ;)

you have to keep in mind that the delta on the chunk level and the delta inside the VM on the file level do not correspond 1:1 - you can have a sort of "amplification" between the two, since the granularity of the backup is 4M, and the file system normally operates with a lot smaller blocks.
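
To put rough numbers on that amplification, here is a small sketch. It assumes 4 MiB backup chunks and a hypothetical 4 KiB file system block size, and it treats every touched chunk as changed, so the result is an upper bound on what is re-sent (a chunk whose content ends up identical would still deduplicate). The function names are made up for illustration.

```python
CHUNK = 4 * 1024 * 1024   # backup granularity (4M, as stated above)
BLOCK = 4 * 1024          # assumed file system block size for this sketch


def dirty_chunk_bytes(write_offsets, write_size=BLOCK, chunk_size=CHUNK):
    """Bytes of backup chunks touched by a list of guest write offsets."""
    chunks = set()
    for offset in write_offsets:
        first = offset // chunk_size
        last = (offset + write_size - 1) // chunk_size
        chunks.update(range(first, last + 1))
    return len(chunks) * chunk_size


def amplification(write_offsets, write_size=BLOCK, chunk_size=CHUNK):
    """Ratio of touched chunk bytes to bytes written inside the guest.

    Assumes the writes do not overlap each other.
    """
    written = len(write_offsets) * write_size
    if written == 0:
        return 0.0
    return dirty_chunk_bytes(write_offsets, write_size, chunk_size) / written


if __name__ == "__main__":
    # 100 adjacent 4 KiB writes (400 KiB) all land in one 4 MiB chunk.
    contiguous = [i * BLOCK for i in range(100)]
    # 100 writes of 4 KiB, each landing in a different 4 MiB chunk.
    scattered = [i * CHUNK for i in range(100)]
    print(amplification(contiguous))  # 10.24
    print(amplification(scattered))   # 1024.0
```

So the same 400 KiB of changes inside the VM can mean anywhere from 4 MiB to 400 MiB of changed chunks, depending on how the writes are spread across the virtual disk.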
 
Understood. Really appreciate you taking the time to explain this - and for confirming that my grand plans for a remote backup are still feasible!
 