LXC Containers Backing Up Incredibly Slow

no, the backup client is file-system agnostic - it uses regular file/directory operations, and there is no diffing at that level.
And this is what I meant by "enveloping different logic within switch case" - it could be made NOT fs-agnostic. But I guess the Proxmox folks would need to weigh the customer benefit of performance vs the extra work of implementing and maintaining this. And no, after 20 years of software engineering I'm not going to say "it will be very easy, just do X Y Z" ... I know there will be plenty of corner cases and headaches. Still, it's a decision that could benefit customers greatly, especially considering most of the supported FSes will be snapshot-friendly (ZFS does snapshots, the planned Btrfs does too, XFS is around the corner; only good old ext4 is out in the woods).
if you verify a single snapshot, all the chunks of that snapshot will be verified. a snapshot is always complete - there are no incremental or full snapshots, all snapshots are "equal". only the uploading part is incremental, in the sense that it skips re-uploading chunks that are already there on the server; the chunk store used by the datastore takes care of the deduplication. so yes, in your case, (re)verifying a single snapshot will read and verify 4TB of data. (re)verifying ten snapshots of the same CT with little churn between them will cause only a little more load than verifying a single one, since chunks already verified within a single verification task will not be verified a second time, for obvious reasons.

TL;DR "verify after backup" is not necessary unless you are really paranoid. scheduled verification with sensible re-verification settings is the right choice for most setups to reduce the load caused by verification while retaining almost all the benefits.
What I think you're referring to is that a periodic verify behaves like a group verify (this one actually happens rather fast). First, this could be better communicated in the documentation; secondly, for some odd reason a scheduled verify for me seems to behave like a singular verify and takes a very long time. BUT let me test that and come back to you on this one (a few days :( ).
 
And this is what I meant by "enveloping different logic within switch case" - it could be made NOT fs-agnostic. But I guess the Proxmox folks would need to weigh the customer benefit of performance vs the extra work of implementing and maintaining this. And no, after 20 years of software engineering I'm not going to say "it will be very easy, just do X Y Z" ... I know there will be plenty of corner cases and headaches. Still, it's a decision that could benefit customers greatly, especially considering most of the supported FSes will be snapshot-friendly (ZFS does snapshots, the planned Btrfs does too, XFS is around the corner; only good old ext4 is out in the woods).
we do want to implement metadata-based skipping of unnecessary data reads (similar to what other backup tools use), but it's not that easy to scale/integrate into the PBS architecture (supporting many users and backup sources in a single datastore, client/server, ...).
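for illustration, this is the general shape of such metadata-based skipping as other file-level tools do it (NOT how PBS currently works - the client reads and chunks all file data on every run; `previous_meta` is a hypothetical catalog from the last backup):
Code:
import os

def needs_reread(path, previous_meta):
    """previous_meta: path -> (size, mtime_ns) recorded during the last backup"""
    try:
        st = os.stat(path, follow_symlinks=False)
    except FileNotFoundError:
        return False                    # gone since the last run, nothing to read
    old = previous_meta.get(path)
    if old is None:
        return True                     # new file -> must be read and chunked
    # unchanged size + mtime -> assume unchanged content, reuse old chunk refs
    return (st.st_size, st.st_mtime_ns) != old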
What I think you're referring to is that a periodic verify behaves like a group verify (this one actually happens rather fast). First, this could be better communicated in the documentation; secondly, for some odd reason a scheduled verify for me seems to behave like a singular verify and takes a very long time. BUT let me test that and come back to you on this one (a few days :( ).
there are basically two sources of "speedup" for verification:
- snapshots which don't match the reverification criteria are not reverified at all (e.g., recently verified within threshold)
- chunks are only ever verified once in a single task -> more snapshots, more duplicate chunks, less work

so yes, verifying as many snapshots as possible in a single task, and not re-verifying too often, reduces the load. but any snapshot that hasn't been verified at all will still need to be verified, so if you have 9 snapshots that are already verified (in previous verification tasks, not the current one) and 1 unverified snapshot referencing mostly the same data, you'll still see load for all the chunks referenced by that unverified snapshot.
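the snapshot-level rule boils down to something like this (a sketch only - the function name and `outdated_after_days` parameter are invented, though verify jobs have a comparable re-verification setting):
Code:
from datetime import datetime, timedelta, timezone

def should_reverify(last_verified, outdated_after_days):
    """True if the snapshot must be (re)verified in this task."""
    if last_verified is None:
        return True                     # never verified -> full cost
    age = datetime.now(timezone.utc) - last_verified
    return age > timedelta(days=outdated_after_days)

# 9 snapshots verified yesterday are skipped outright, but the one unverified
# snapshot still causes reads for every chunk it references
print(should_reverify(None, 30))                                            # True
print(should_reverify(datetime.now(timezone.utc) - timedelta(days=1), 30))  # False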

note that if you use a storage that already protects against bitrot (e.g., redundant zpool) you could opt for not verifying at all - a garbage collection operation will take care of noticing (logically) missing chunks that are referenced by any snapshots, and a chunk written once by a backup task is never touched by PBS unless it is detected as corrupt later on.
 
there are basically two sources of "speedup" for verification:
- snapshots which don't match the reverification criteria are not reverified at all (e.g., recently verified within threshold)
- chunks are only ever verified once in a single task -> more snapshots, more duplicate chunks, less work

so yes, verifying as many snapshots as possible in a single task, and not re-verifying too often, reduces the load. but any snapshot that hasn't been verified at all will still need to be verified, so if you have 9 snapshots that are already verified (in previous verification tasks, not the current one) and 1 unverified snapshot referencing mostly the same data, you'll still see load for all the chunks referenced by that unverified snapshot.
So initial feedback is this: I tested your theory, and when I trigger the group verification of the whole CT tree (image below), PBS STILL tries to verify the whole snapshot - I'm right now an hour into it, trying to verify the newest unverified snapshot - I would have assumed that it should start verification from the oldest unverified one ... but again, that was a silly assumption. So, no ... regardless of how the verify is triggered, when dealing with a CT, PBS will perform a full verification of the 3.26TB (TB or TiB?) even if there are previously verified snapshots.

Here is what log says:
Code:
2022-08-22T14:04:24+01:00: verify group storage_cold:ct/113 (10 snapshots)
2022-08-22T14:04:24+01:00: verify storage_cold:ct/113/2022-08-22T03:00:28Z
2022-08-22T14:04:24+01:00:   check pct.conf.blob
2022-08-22T14:04:24+01:00:   check root.pxar.didx
And it has been stuck there for the last hour (screenshot attached).


Now when I triggered a group verification of VM 141, this is what I've got:

Code:
2022-08-22T11:01:16+01:00: verify group storage_cold:vm/141 (18 snapshots)
2022-08-22T11:01:16+01:00: verify storage_cold:vm/141/2022-08-22T09:26:40Z
2022-08-22T11:01:16+01:00:   check qemu-server.conf.blob
2022-08-22T11:01:16+01:00:   check drive-tpmstate0-backup.img.fidx
2022-08-22T11:01:16+01:00:   verified 0.00/4.00 MiB in 0.03 seconds, speed 0.15/149.22 MiB/s (0 errors)
2022-08-22T11:01:16+01:00:   check drive-scsi0.img.fidx
2022-08-22T11:02:54+01:00:   verified 7490.60/15424.00 MiB in 97.77 seconds, speed 76.62/157.77 MiB/s (0 errors)
2022-08-22T11:02:54+01:00:   check drive-efidisk0.img.fidx
2022-08-22T11:02:54+01:00:   verified 0.01/0.52 MiB in 0.01 seconds, speed 0.59/36.35 MiB/s (0 errors)
2022-08-22T11:02:54+01:00: percentage done: 5.56% (1/18 snapshots)
2022-08-22T11:02:54+01:00: verify storage_cold:vm/141/2022-08-21T21:03:31Z
2022-08-22T11:02:54+01:00:   check qemu-server.conf.blob
2022-08-22T11:02:54+01:00:   check drive-tpmstate0-backup.img.fidx
2022-08-22T11:02:54+01:00:   verified 0.00/4.00 MiB in 0.03 seconds, speed 0.13/128.11 MiB/s (0 errors)
2022-08-22T11:02:54+01:00:   check drive-scsi0.img.fidx
2022-08-22T11:02:55+01:00:   verified 53.92/176.00 MiB in 1.00 seconds, speed 54.06/176.45 MiB/s (0 errors)
2022-08-22T11:02:55+01:00:   check drive-efidisk0.img.fidx
2022-08-22T11:02:55+01:00:   verified 0.01/0.52 MiB in 0.01 seconds, speed 0.59/36.63 MiB/s (0 errors)
2022-08-22T11:02:55+01:00: percentage done: 11.11% (2/18 snapshots)
2022-08-22T11:02:55+01:00: verify storage_cold:vm/141/2022-08-20T21:42:21Z
2022-08-22T11:02:55+01:00:   check qemu-server.conf.blob
2022-08-22T11:02:55+01:00:   check drive-tpmstate0-backup.img.fidx
2022-08-22T11:02:55+01:00:   verified 0.00/4.00 MiB in 0.03 seconds, speed 0.13/130.42 MiB/s (0 errors)
2022-08-22T11:02:55+01:00:   check drive-scsi0.img.fidx
2022-08-22T11:02:57+01:00:   verified 59.05/180.00 MiB in 1.31 seconds, speed 45.07/137.37 MiB/s (0 errors)
2022-08-22T11:02:57+01:00:   check drive-efidisk0.img.fidx
2022-08-22T11:02:57+01:00:   verified 0.01/0.52 MiB in 0.01 seconds, speed 0.88/54.24 MiB/s (0 errors)
2022-08-22T11:02:57+01:00: percentage done: 16.67% (3/18 snapshots)
2022-08-22T11:02:57+01:00: verify storage_cold:vm/141/2022-08-19T09:20:50Z
2022-08-22T11:02:57+01:00:   check qemu-server.conf.blob
2022-08-22T11:02:57+01:00:   check drive-tpmstate0-backup.img.fidx
2022-08-22T11:02:57+01:00:   verified 0.00/4.00 MiB in 0.03 seconds, speed 0.12/120.28 MiB/s (0 errors)
2022-08-22T11:02:57+01:00:   check drive-scsi0.img.fidx
2022-08-22T11:02:59+01:00:   verified 144.74/392.00 MiB in 2.36 seconds, speed 61.24/165.85 MiB/s (0 errors)
2022-08-22T11:02:59+01:00:   check drive-efidisk0.img.fidx
2022-08-22T11:02:59+01:00:   verified 0.01/0.52 MiB in 0.01 seconds, speed 1.43/88.06 MiB/s (0 errors)
2022-08-22T11:02:59+01:00: percentage done: 22.22% (4/18 snapshots)
2022-08-22T11:02:59+01:00: verify storage_cold:vm/141/2022-08-18T12:08:21Z
2022-08-22T11:02:59+01:00:   check qemu-server.conf.blob
2022-08-22T11:02:59+01:00:   check drive-tpmstate0-backup.img.fidx
2022-08-22T11:02:59+01:00:   verified 0.00/4.00 MiB in 0.03 seconds, speed 0.14/136.33 MiB/s (0 errors)
2022-08-22T11:02:59+01:00:   check drive-scsi0.img.fidx
2022-08-22T11:03:02+01:00:   verified 172.81/488.00 MiB in 3.15 seconds, speed 54.79/154.73 MiB/s (0 errors)
2022-08-22T11:03:02+01:00:   check drive-efidisk0.img.fidx
2022-08-22T11:03:02+01:00:   verified 0.01/0.52 MiB in 0.01 seconds, speed 1.20/74.41 MiB/s (0 errors)
2022-08-22T11:03:02+01:00: percentage done: 27.78% (5/18 snapshots)
2022-08-22T11:03:02+01:00: SKIPPED: verify storage_cold:vm/141/2022-08-18T09:27:18Z (recently verified)
2022-08-22T11:03:02+01:00: percentage done: 33.33% (6/18 snapshots)
2022-08-22T11:03:02+01:00: SKIPPED: verify storage_cold:vm/141/2022-08-17T09:53:02Z (recently verified)
2022-08-22T11:03:02+01:00: percentage done: 38.89% (7/18 snapshots)
2022-08-22T11:03:02+01:00: SKIPPED: verify storage_cold:vm/141/2022-08-16T22:45:00Z (recently verified)
2022-08-22T11:03:02+01:00: percentage done: 44.44% (8/18 snapshots)
2022-08-22T11:03:02+01:00: SKIPPED: verify storage_cold:vm/141/2022-08-16T22:38:48Z (recently verified)
2022-08-22T11:03:02+01:00: percentage done: 50.00% (9/18 snapshots)
2022-08-22T11:03:02+01:00: SKIPPED: verify storage_cold:vm/141/2022-08-16T22:32:59Z (recently verified)
2022-08-22T11:03:02+01:00: percentage done: 55.56% (10/18 snapshots)
2022-08-22T11:03:02+01:00: SKIPPED: verify storage_cold:vm/141/2022-08-11T01:24:22Z (recently verified)
2022-08-22T11:03:02+01:00: percentage done: 61.11% (11/18 snapshots)
2022-08-22T11:03:02+01:00: SKIPPED: verify storage_cold:vm/141/2022-08-10T11:43:05Z (recently verified)
2022-08-22T11:03:02+01:00: percentage done: 66.67% (12/18 snapshots)
2022-08-22T11:03:02+01:00: SKIPPED: verify storage_cold:vm/141/2022-08-10T09:18:58Z (recently verified)
2022-08-22T11:03:02+01:00: percentage done: 72.22% (13/18 snapshots)
2022-08-22T11:03:02+01:00: SKIPPED: verify storage_cold:vm/141/2022-08-09T22:09:12Z (recently verified)
2022-08-22T11:03:02+01:00: percentage done: 77.78% (14/18 snapshots)
2022-08-22T11:03:02+01:00: SKIPPED: verify storage_cold:vm/141/2022-08-09T09:26:16Z (recently verified)
2022-08-22T11:03:02+01:00: percentage done: 83.33% (15/18 snapshots)
2022-08-22T11:03:02+01:00: SKIPPED: verify storage_cold:vm/141/2022-08-08T12:14:00Z (recently verified)
2022-08-22T11:03:02+01:00: percentage done: 88.89% (16/18 snapshots)
2022-08-22T11:03:02+01:00: SKIPPED: verify storage_cold:vm/141/2022-08-08T10:51:57Z (recently verified)
2022-08-22T11:03:02+01:00: percentage done: 94.44% (17/18 snapshots)
2022-08-22T11:03:02+01:00: SKIPPED: verify storage_cold:vm/141/2022-08-08T10:20:26Z (recently verified)
2022-08-22T11:03:02+01:00: percentage done: 100.00% (18/18 snapshots)
2022-08-22T11:03:02+01:00: TASK OK

So, something doesn't add up here. Please advise.
 

Attachments

  • Screenshot 2022-08-22 at 15.09.45.png (68.7 KB)
  • Screenshot 2022-08-22 at 15.11.51.png (205.7 KB)
Just to add: CT 113 is based on the TurnKey fileserver template and has a cron job that automatically updates packages; that is the only difference between each of those backups. For the most recent backup job I can't even scroll through the window because there are so many lines about the chunks - if you could tell me how to obtain the log on the console, I could copy & paste the backup log.
 
I tried to explain it multiple times already - a verification will always verify the full snapshot :) chunks are only skipped if they have already been verified as part of the same verification task (so if you verify a group, all the "shared" chunks are only verified once, when they are first encountered, and not again for each snapshot that shares them). snapshots are skipped based on the criteria you give. there is no way to say "only verify new chunks", as verification status is tracked on the snapshot level, not the chunk level (we can't say the new snapshot is verified if we only checked 1% of its chunks - that would be misleading!).
 
chunks are only skipped if they have already been verified as part of the same verification task
OK, this part was not explicit, @fabian, and I would say it's a rather important detail (but then again, maybe it's just me being dumb and blind and missing it).

Side note, from further investigation: it seems that there is lots of CPU overhead from compressing / decompressing data on the fly while doing backups (not even talking about verification). Is there a way to globally disable compression while backing up to PBS? I mean, storage is cheap, and having a 10Gb link bottlenecked by the CPU when backing up (130 ~ 150MB/s) seems not great. Makes me wonder whether the verification is also impeded by any underlying compression.
 
no, compression is done automatically atm (the uncompressed chunk variant is stored and used if compression doesn't provide any benefit, so for chunks that are not compressible the effort is only spent once, at backup creation, to determine this fact). it's usually not the bottleneck either (hashing, storage and network are), although it can of course cause additional CPU load. while storage might be cheap (financially), I/O is not (load-wise) ;)
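the decision roughly looks like this (a sketch with an arbitrary saving threshold; PBS uses zstd for chunk compression, zlib is used here only to keep the example self-contained):
Code:
import zlib

def encode_chunk(data: bytes, min_saving: float = 0.05):
    """return (payload, is_compressed) for a chunk about to be stored"""
    compressed = zlib.compress(data, 1)                # cheap/fast level
    if len(compressed) <= len(data) * (1 - min_saving):
        return compressed, True                        # worth storing compressed
    return data, False                                 # incompressible -> store as-is

# the decision is made once at backup time; later reads just use whatever was stored
payload, is_comp = encode_chunk(b"\x00" * (4 * 1024 * 1024))   # zeros compress well
print(is_comp, len(payload))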
 
Storage is also not cheap when following the enterprise SSD-only hardware recommendation ;)
 
yes, but not for backup ... spinning rust and tapes are still the king of backup and archive.
Not for PBS. There it is recommended not to use any HDDs for the datastore: https://pbs.proxmox.com/docs/installation.html#recommended-server-system-requirements
Recommended Server System Requirements

Backup storage:
  • Use only SSDs, for best results
  • If HDDs are used: Using a metadata cache is highly recommended, for example, add a ZFS special device mirror.
HDDs for data + an SSD for metadata is just a (worse) workaround for when you can't afford SSDs.
 
Not for PBS. There it is recommended not to use any HDDs for the datastore:
I'll give you that. And also I'm starting to see why.

Anyway, back to the backup. I still believe that the backup process is CPU-bottlenecked (even if it's not compression / decompression). It's visible on my scrap test setup - makes me wonder how much horsepower a PBS server will need in production.
 
in case you haven't already - proxmox-backup-client benchmark gives you a rough ballpark of how expensive the various CPU-heavy parts are on a given system.

edit: well, not how "expensive" but how fast, but you get the picture ;) it should show you which part is a potential bottleneck speed-wise.
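if you just want a very rough feel for raw hashing speed without setting anything up, something like the snippet below (not the official benchmark, just a crude stdlib-only measurement) gives a number to compare against the sha256 figure the benchmark prints:
Code:
import hashlib, os, time

def sha256_throughput(total_mib=1024, block_mib=4):
    block = os.urandom(block_mib * 1024 * 1024)        # one 4 MiB block, hashed repeatedly
    start = time.perf_counter()
    done = 0
    while done < total_mib:
        hashlib.sha256(block).digest()
        done += block_mib
    return done / (time.perf_counter() - start)        # MiB/s

print(f"sha256: {sha256_throughput():.0f} MiB/s")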
 
Did you run a benchmark to see where the bottleneck is?: https://pbs.proxmox.com/docs/backup-client.html#benchmarking
No, I didn't, because right now I'm not testing on anything close to production hardware ... but it looks like soon I'll have to zone out a rack and start real testing (at least now I know that CTs have some backup limitations, but I will need to compare production performance vs VMs).

What I have observed anyway (this is for a dummy storage VM, not even touching the CT problems): a VM backup is usually bottlenecked by PBS operations while sending real data (PBS is maxing out its CPU, while transfers on 10Gb are relatively slow, the highest observed was 120MB/s), while for the part of the VM volume that has holes punched out (discard) the transfer maxes out at PVE storage speed (in my case 4.3 gigabytes/s, because it's on PCIe Gen 4 NVMe), with the PVE CPU sitting at about 30% on a 32-thread CPU and PBS relatively unoccupied (CPU- and IO-wise).

Another observation is that there seems to be no serialisation of backups - multiple machines can dump their backups to a single PBS at the same time (this is why the advice on SSDs makes sense, since parallel backups may starve the IOPS).
 
Another observation is that there seems to be no serialisation of backups - multiple machines can dump their backups to a single PBS at the same time (this is why the advice on SSDs makes sense, since parallel backups may starve the IOPS).
Also keep in mind that IO will be more random and less sequential because of deduplication. Let's say you've got a VM with a 2TB virtual disk; that will probably result in around 1 million 2MB files spread across the disk. So when restoring or verifying that backup, the HDD's seek time can become a problem when it needs to read those 1 million files.
 
Also keep in mind that IO will be more random and less sequential because of deduplication. Let's say you've got a VM with a 2TB virtual disk; that will probably result in around 1 million 2MB files spread across the disk. So when restoring or verifying that backup, the HDD's seek time can become a problem when it needs to read those 1 million files.
That is maybe one VM in prod. Everything else is filled with files starting at one gig. And the test CT / VM was filled with similar-size files. The bottom line is my prod doesn't compress or dedup (all pools have compression disabled because there is no point to it). I know that my prod might be an outlier, but hey, that's the stuff that I'm dealing with.
 
feel free to file an enhancement request allowing compression to be skipped (like you say, there might be environments where the overhead of attempting compression is not worth the little potential gain, e.g. if you are backing up massive archives of already-compressed data).
 
Everything else is filled with files starting at one gig
PBS will chop everything bigger than 4MB into small chunk files of max 4MB in size. So backing up a 1GB file will not result in a single big 1GB file on the PBS but in at least 250x 4MB files. So you will always end up with millions upon millions of small chunk files, no matter what your data looks like. So restoring/verifying that 1GB file will not sequentially read a single 1GB file but random-read 250x 4MB files. Because of that, HDDs can easily become the bottleneck with their terrible IOPS performance.
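Some back-of-the-envelope numbers for that, assuming ~4 MiB chunks and 100-200 random-read IOPS for a single HDD (both are assumptions, adjust for your setup):
Code:
CHUNK_MIB = 4
file_chunks = (1 * 1024) // CHUNK_MIB          # 1 GiB file  -> 256 chunks
vm_chunks = (2 * 1024 * 1024) // CHUNK_MIB     # 2 TiB disk  -> 524,288 chunks
print(file_chunks, vm_chunks)

# per-chunk seek overhead alone, ignoring the sequential read inside each chunk
for iops in (100, 200):
    print(f"{iops} IOPS -> {vm_chunks / iops / 3600:.1f} h of seeks for the 2 TiB disk")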
 
So you will always end up with millions upon millions of small chunk files, no matter what your data looks like
First, @Dunuin, thanks for confirming that. Second, that sounds sub-optimal: when backing up a CT (where I assume PVE and PBS are aware of the internal file structure) there could be a master hash per file, stored somewhere - and when a backup is performed, PBS only needs to provide PVE with a list of files and hashes -> if a hash doesn't match, do the file-chop thing (it could even hold a list of hashes for each 4MB chunk offline, so those don't get recomputed on every backup).
Of course the hash should be something like SHA-1/2 (in terms of fingerprint length and lack of collisions) ... or configurable.
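To make the idea concrete, here is a rough sketch (purely hypothetical - nothing PBS actually implements, see fabian's replies above; the `catalog` structure and function names are made up):
Code:
import hashlib
from pathlib import Path

def file_digest(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for block in iter(lambda: f.read(1024 * 1024), b""):
            h.update(block)
    return h.hexdigest()

def backup_file(path: Path, catalog: dict) -> None:
    """catalog: str(path) -> (file_hash, [chunk_digests]) kept from the last run"""
    digest = file_digest(path)      # note: this still reads every byte, so the win
                                    # is skipping chunking/upload, not the read itself,
                                    # unless the source side maintains hashes incrementally
    old = catalog.get(str(path))
    if old and old[0] == digest:
        return                      # unchanged -> reuse the old chunk references
    chunks = []                     # "the file chop thing" is omitted in this sketch
    catalog[str(path)] = (digest, chunks)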
feel free to file an enhancement request allowing compression to be skipped (like you say, there might be environments where the overhead of attempting compression is not worth the little potential gain, e.g. if you are backing up massive archives of already-compressed data).
That's a great suggestion! However, Dunuin's revelation gave me an idea for accelerating the CT backup before any compression actually takes place. And it's also storage-agnostic - stuff you guys care about.
 
