PBS incremental for containers

EuroDomenii

For virtual machines, incremental backup is very fast (based on QEMU dirty bitmaps, a matter of seconds).

For LXC containers, it seems that there isn't any incremental implementation...
LVM-Thin storage, backup mode: snapshot

1) initial backup

Code:
INFO: starting new backup job: vzdump 104 --node rise1rbx --storage pb --remove 0 --mode snapshot
INFO: Starting Backup of VM 104 (lxc)
INFO: Backup started at 2020-07-16 11:17:39
INFO: status = running
INFO: CT Name: bio
INFO: including mount point rootfs ('/') in backup
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: create storage snapshot 'vzdump'
INFO: creating Proxmox Backup Server archive 'ct/104/2020-07-16T15:17:39Z'
INFO: run: lxc-usernsexec -m u:0:100000:65536 -m g:0:100000:65536 -- /usr/bin/proxmox-backup-client backup --crypt-mode=none pct.conf:/var/tmp/vzdumptmp13900/etc/vzdump/pct.conf root.pxar:/mnt/vzsnap0 --include-dev /mnt/vzsnap0/./ --skip-lost-and-found --backup-type ct --backup-id 104 --backup-time 1594912659 --repository root@pam@localhost:store2
INFO: Starting backup: ct/104/2020-07-16T15:17:39Z
INFO: Client name: rise1rbx
INFO: Starting protocol: 2020-07-16T11:17:39-04:00
INFO: Upload config file '/var/tmp/vzdumptmp13900/etc/vzdump/pct.conf' to 'BackupRepository { user: Some("root@pam"), host: Some("localhost"), store: "store2" }' as pct.conf.blob
INFO: Upload directory '/mnt/vzsnap0' to 'BackupRepository { user: Some("root@pam"), host: Some("localhost"), store: "store2" }' as root.pxar.didx
INFO: root.pxar.didx: Uploaded 30542429611 bytes as 8531 chunks in 342 seconds (85 MB/s).
INFO: root.pxar.didx: Average chunk size was 3580169 bytes.
INFO: root.pxar.didx: Time per request: 40153 microseconds.
INFO: catalog.pcat1.didx: Uploaded 1104138 bytes as 4 chunks in 342 seconds (0 MB/s).
INFO: catalog.pcat1.didx: Average chunk size was 276034 bytes.
INFO: catalog.pcat1.didx: Time per request: 85639779 microseconds.
INFO: Upload index.json to 'BackupRepository { user: Some("root@pam"), host: Some("localhost"), store: "store2" }'
INFO: Duration: PT342.594309214S
INFO: End Time: 2020-07-16T11:23:22-04:00
INFO: remove vzdump snapshot
  Logical volume "snap_vm-104-disk-0_vzdump" successfully removed
INFO: Finished Backup of VM 104 (00:05:44)
INFO: Backup finished at 2020-07-16 11:23:23
INFO: Backup job finished successfully
TASK OK

2) subsequent backup, after running git clone https://github.com/torvalds/linux.git


Code:
INFO: starting new backup job: vzdump 104 --node rise1rbx --storage pb --mode snapshot --remove 0
INFO: Starting Backup of VM 104 (lxc)
INFO: Backup started at 2020-07-16 11:30:52
INFO: status = running
INFO: CT Name: bio
INFO: including mount point rootfs ('/') in backup
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: create storage snapshot 'vzdump'
  WARNING: You have not turned on protection against thin pools running out of space.
  WARNING: Set activation/thin_pool_autoextend_threshold below 100 to trigger automatic extension of thin pools before they get full.
  Logical volume "snap_vm-104-disk-0_vzdump" created.
  WARNING: Sum of all thin volume sizes (1.66 TiB) exceeds the size of thin pool vmdata/vmstore and the size of whole volume group (343.46 GiB).
INFO: creating Proxmox Backup Server archive 'ct/104/2020-07-16T15:30:52Z'
INFO: run: lxc-usernsexec -m u:0:100000:65536 -m g:0:100000:65536 -- /usr/bin/proxmox-backup-client backup --crypt-mode=none pct.conf:/var/tmp/vzdumptmp22961/etc/vzdump/pct.conf root.pxar:/mnt/vzsnap0 --include-dev /mnt/vzsnap0/./ --skip-lost-and-found --backup-type ct --backup-id 104 --backup-time 1594913452 --repository root@pam@localhost:store2
INFO: Starting backup: ct/104/2020-07-16T15:30:52Z
INFO: Client name: rise1rbx
INFO: Starting protocol: 2020-07-16T11:31:02-04:00
INFO: Upload config file '/var/tmp/vzdumptmp22961/etc/vzdump/pct.conf' to 'BackupRepository { user: Some("root@pam"), host: Some("localhost"), store: "store2" }' as pct.conf.blob
INFO: Upload directory '/mnt/vzsnap0' to 'BackupRepository { user: Some("root@pam"), host: Some("localhost"), store: "store2" }' as root.pxar.didx
INFO: root.pxar.didx: Uploaded 34743929478 bytes as 9598 chunks in 358 seconds (92 MB/s).
INFO: root.pxar.didx: Average chunk size was 3619913 bytes.
INFO: root.pxar.didx: Time per request: 37400 microseconds.
INFO: catalog.pcat1.didx: Uploaded 2639555 bytes as 8 chunks in 358 seconds (0 MB/s).
INFO: catalog.pcat1.didx: Average chunk size was 329944 bytes.
INFO: catalog.pcat1.didx: Time per request: 44873615 microseconds.
INFO: Upload index.json to 'BackupRepository { user: Some("root@pam"), host: Some("localhost"), store: "store2" }'
INFO: Duration: PT359.011667856S
INFO: End Time: 2020-07-16T11:37:01-04:00
INFO: remove vzdump snapshot
  Logical volume "snap_vm-104-disk-0_vzdump" successfully removed
INFO: Finished Backup of VM 104 (00:06:09)
INFO: Backup finished at 2020-07-16 11:37:01
INFO: Backup job finished successfully
TASK OK
 
All backups made with PBS are incremental in nature. It is correct, however, that containers do not support a technology similar to dirty bitmap tracking, so we always have to read all the data. Only changed chunks are transferred over the network, though.
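
To make the "read everything, but only upload what changed" behaviour concrete, here is a minimal sketch of chunk-level deduplication. This is not PBS code: it uses fixed-size chunks and an in-memory digest set for simplicity, whereas PBS uses content-defined chunking with SHA-256 digests and a chunk store on the server side.

Code:
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # ~4 MiB, close to the average chunk size in the logs above

def backup(path, known_digests):
    """Read the whole file, but only 'upload' chunks the server does not have yet."""
    uploaded = reused = 0
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK_SIZE):           # every byte still has to be read...
            digest = hashlib.sha256(chunk).hexdigest()
            if digest in known_digests:              # ...but known chunks are only referenced
                reused += 1
            else:
                known_digests.add(digest)            # a new chunk would be sent to the server
                uploaded += 1
    return uploaded, reused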

For more information see our documentation:
https://pbs.proxmox.com/docs/
 
no, for container/host backups the whole directory tree will be read; the "empty"/"unused" space of the underlying disk (image) is ignored. for VM backups (which are block based) the empty space is read as well - but VMs with a dirty bitmap can skip parts which are unchanged since the last backup (which applies to both "used" unchanged parts and "empty" unchanged parts - there is no such difference at the level where VM backups happen ;)).
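
As a rough illustration (not actual QEMU or PBS code) of what a dirty bitmap buys a block-based backup, the sketch below skips unchanged blocks without even reading them, which is exactly what a file-based container backup cannot do. The block size, the bitmap, and the upload callback are placeholders.

Code:
def incremental_vm_backup(disk, bitmap, upload, block_size=4 * 1024 * 1024):
    """disk: file-like object, bitmap: list of bools maintained by the hypervisor
    on every guest write, upload: callable(block_no, data)."""
    for block_no, dirty in enumerate(bitmap):
        if not dirty:
            continue                             # unchanged block: skipped, not even read
        disk.seek(block_no * block_size)
        upload(block_no, disk.read(block_size))  # only changed blocks are read and sent
    bitmap[:] = [False] * len(bitmap)            # reset after a successful backup

# A container backup has no such bitmap: the whole file tree is walked and read,
# and deduplication only reduces upload volume, not read time.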
 
I'm getting frustrated at my LXCs... I have a container with a terabyte of photos that's rarely touched, no new files are added to it, and I have a scheduled backup with PBS every night.

Nevertheless, the PBS logs show a lot, A LOT, of chunks being uploaded to PBS every night. It should just read the files, if what fabian said is true, but instead every night I have a 4 hour and 40 minute backup job even if the container wasn't accessed for a whole week.

Thank god the dirty bitmap on my VMs works, at least.
 
please post the full (client) log... that the backup takes >4h could also just be because reading the data takes that long. or it could be that something does in fact touch the files in a way that causes the chunks to be invalidated, but I'd think you'd quickly notice if your PBS server was adding a terabyte of chunks for each backup.
 
I can show you what's happening right now since the backup job is running from 6:40 to around 11:20...

from PVE:
[screenshot: PVE backup task log]

from PBS:

[screenshot: PBS server log]
 
yeah, those messages just say that the client tells the server "this backup contains this chunk at this offset", not that those chunks are freshly uploaded..

Code:
2023-03-15T08:59:46+01:00: POST /dynamic_chunk
2023-03-15T08:59:46+01:00: upload_chunk done: 1530143 bytes, 6a783db9f8e9df5b4750b15bfaabd1b1fd4242c944aeeb5576daa58e51fb4157

that's what actually uploaded chunks look like in the server-side log, but like I said, the client will tell you at the end how much it had to upload and how much was re-used from the previous snapshot.
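
If you want to double-check this on your own server, a few lines like the following can sum up the upload_chunk entries from a saved copy of the task log. The file name is just a placeholder; the regex matches the log format quoted above.

Code:
import re

total_bytes = chunks = 0
with open("backup-task.log") as log:      # save the PBS task log to a file first
    for line in log:
        m = re.search(r"upload_chunk done: (\d+) bytes", line)
        if m:
            chunks += 1
            total_bytes += int(m.group(1))

print(f"{chunks} chunks actually uploaded, {total_bytes / 1024**2:.1f} MiB in total")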
 
That's from yesterday's backup log:
[screenshot: client backup log summary]


I have a hard time decoding the log messages; what I thought was that "upload size ... (100%)" meant PBS uploaded 100% of the LXC.
 
the main archive (root.pxar.didx) is above it, and there it says:

Size: 2105618544997 (roughly 1.9 TB, this is the input data)
Chunk count 555341
Upload size 55979292 (roughly 50MB!, this is what got uploaded, and it's only 0.00265% of the input data)
Duplicates: 555331+0 (the first one is the number of chunks the client deduplicated and didn't upload, the second one indicates that the server didn't discard any *uploaded* chunks that the client didn't know the server already had - this can happen if you back up the same data in different backup groups)

so like I said, the time is basically only spent reading the data to determine it is already backed up; there were only 10 (chunk count minus duplicates) new chunks that were actually uploaded, totalling roughly 50MB

the last part is the catalog file - it consisted of 7 chunks, 6 of those were discarded by the server as being already there, 1 was new. I am not sure whether we even deduplicate the catalog client side - it's very small. even for this big backup, the whole catalog is only ~2MB.
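
For anyone who wants to redo the arithmetic, the same numbers from the screenshot plugged into a few lines of Python:

Code:
size        = 2_105_618_544_997   # bytes read from the snapshot (~1.9 TiB)
upload_size = 55_979_292          # bytes actually sent (~53 MiB)
chunk_count = 555_341
duplicates  = 555_331             # chunks the client already knew the server had

new_chunks = chunk_count - duplicates     # -> 10
pct_sent   = upload_size / size * 100     # -> ~0.0027 %

print(f"{new_chunks} new chunks, {pct_sent:.4f}% of the read data was uploaded")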
 
Thank you fabian for your reply and for the time you dedicated to me. It's all much clearer now.

I think I have to figure out how to organise my data, because I have several "big" containers and the backup jobs last hours for what's basically just reading the disk.

Disks that get stressed for hours every night, at this point.

Do you suggest moving all my containers that have >1TB to a VM so that I can take advantage of the dirty bitmap, or is it a futile exercise?
 
that would work, provided the VMs are continuously running. if you know the data rarely changes, you could also reduce the frequency of backups.

having a local metadata cache and some mechanism to leverage that for incremental backups is being planned, but it's tricky, and we don't want to make breaking changes to the archive formats for obvious reasons, so it needs to be done very carefully and thought through. this is being tracked in https://bugzilla.proxmox.com/show_bug.cgi?id=3174 in case you want to be CCed ;)
 
Hello @pascalbrax, if you use ZFS, you can use the incremental send/recv feature. I use this for my big LXCs and VMs; it is always incremental (no reading phase for LXCs, and no full backup if a VM is rebooted). For the others I use PBS, and, rarely, a monthly copy of the big machines just for extra safety.
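
For reference, a rough sketch of that snapshot + incremental send/recv workflow, driven from Python here to match the earlier snippets; in practice you would usually run the zfs commands directly or use a tool such as pve-zsync or syncoid. The dataset, snapshot, and host names below are made up.

Code:
import subprocess

dataset   = "rpool/data/subvol-104-disk-0"        # hypothetical container dataset
prev, cur = "backup-2023-06-19", "backup-2023-06-20"
target    = "backup-host"                         # hypothetical remote ZFS box

# Take today's snapshot (cheap, copy-on-write).
subprocess.run(["zfs", "snapshot", f"{dataset}@{cur}"], check=True)

# Incremental stream: only the blocks that changed between the two snapshots
# are read and sent - there is no full read of the dataset.
send = subprocess.Popen(
    ["zfs", "send", "-i", f"{dataset}@{prev}", f"{dataset}@{cur}"],
    stdout=subprocess.PIPE,
)
subprocess.run(
    ["ssh", target, "zfs", "receive", "-F", "tank/backups/subvol-104-disk-0"],
    stdin=send.stdout,
    check=True,
)
send.stdout.close()
send.wait()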
 
