Abysmally slow restore from backup

That's a big patch! Your binary is 187 MB where the original is 6.8 MB, which put me off trying it.
Ah, good catch. That's the binary with debug info and without stripping. I noticed there is also a path, libproxmox-backup-qemu0-1.5.1/debian/libproxmox-backup-qemu0/usr/lib/libproxmox_backup_qemu.so.0, which seems to contain a binary of the size you describe; I was taking the one at libproxmox-backup-qemu0-1.5.1/target/release/libproxmox_backup_qemu.so.

Here is the smaller [6.8 MB] version without debug info and stripped: https://notnullmakers.com/public/media/libproxmox_backup_qemu.so_09062025_mini
The SHA256 is: b6c27033e46b965d7b5ce8aaab1208fa76d6601697f72e8a87ba118ec2bed9f3

Again, I of course recommend you make a backup of the original /usr/lib/libproxmox_backup_qemu.so.0 binary and check the new file with, e.g., VirusTotal or other tools.
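For example, a minimal backup-and-verify sketch (the .orig path is just an example; the hash and filename are as above):
Bash:
# keep a copy of the original library
cp /usr/lib/libproxmox_backup_qemu.so.0 /root/libproxmox_backup_qemu.so.0.orig
# check the download against the published SHA256 before installing it
echo "b6c27033e46b965d7b5ce8aaab1208fa76d6601697f72e8a87ba118ec2bed9f3  libproxmox_backup_qemu.so_09062025_mini" | sha256sum -c -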

Thank you for spotting this. If somebody from the Proxmox team could clarify which binary is which, that would be nice.
 
Can local restore be improved by your patch?
I mean PBS installed alongside PVE, where the network is over localhost?
I don't think you will see a big jump in performance; the latency is usually much lower on the same machine. Of course, you can try to launch PBS in a different network namespace and add some latency on the veth connection between the namespaces, or you can probably just add the latency outright on the interface PBS is using (see the sketch below). This should give you an idea of what I mean: https://github.com/NOT-NULL-Makers/resources/blob/main/Installfest2024/krejzy-veci-s-SSH.md
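A minimal sketch of the second option with tc netem (the interface name and delay are placeholders, not from a tested setup):
Bash:
# add 5 ms of artificial latency on the interface the PBS traffic uses
tc qdisc add dev eth0 root netem delay 5ms
# ...run a test restore, then remove the artificial latency again...
tc qdisc del dev eth0 root netem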
 
That's a big patch! Your binary is 187 MB where the original is 6.8 MB, which put me off trying it.
I also find it highly off-putting that he doesn't publish his source patch somewhere. That would allow reviewing the changes before applying them, instead of just hoping that no bad actor performed a man-in-the-middle attack on his webpage. One might trust @kaliszad (he seems helpful and nice to be around), but there might still be a risk he isn't even aware of or responsible for (a man-in-the-middle attack, a compromise of the server hosting the binary, etc.).


But even more off-putting is that part of the community wants to try it nonetheless. I hope they use the patched binary only on test data and nothing important. Seriously, folks, it's not like the Veeam or VMware forums, where a binary file from a staff account can be trusted to a certain extent (as long as the company isn't compromised).
With open-source software, in contrast, the only way to know whether you can trust a patch from a community member is to have the source code.
If (like me) you are not much of a programmer, you could still verify it if the software allowed reproducible builds (which sadly is often not the case, although things are a lot better than five or ten years ago): https://reproducible-builds.org/

Edit: I see now that @fabian actually already hinted at this. But I'm wondering now whether the Proxmox developers have already thought about providing support for reproducible builds of the software (I mean the part developed by Proxmox Server Solutions; for the rest they depend on upstream: Debian, Ceph, etc.).
 
Edit: I see now that @fabian actually already hinted at this. But I'm wondering now whether the Proxmox developers have already thought about providing support for reproducible builds of the software (I mean the part developed by Proxmox Server Solutions; for the rest they depend on upstream: Debian, Ceph, etc.).

It would require some infrastructure changes that are on our to-do list but haven't been implemented yet. We do make use of the fact that most of our packages are (mostly) reproducible already when doing packaging changes :)
 
It would require some infrastructure changes that are on our to-do list but haven't been implemented yet. We do make use of the fact that most of our packages are (mostly) reproducible already when doing packaging changes :)
That's great news, thanks for the insights. I know such things aren't as easy as they sound, having watched the talks from the Debian folks on reproducible builds, so I highly appreciate that you have this on your agenda.
 
Your intentions are probably/hopefully good, but I can only strongly discourage everybody from installing random binary artifacts that users post here.

@kaliszad if you want to get your changes reviewed/integrated/.., please submit them as patches:

https://pve.proxmox.com/wiki/Developer_Documentation#Preparing_Patches
https://pve.proxmox.com/wiki/Developer_Documentation#Software_License_and_Copyright

https://bugzilla.proxmox.com/show_bug.cgi?id=3163
Yes, we might publish the patches if the client on whose behalf we have done the changes agrees with that/is interested in that, and is willing to sponsor additional time.

I would also like to note that nobody bothered to do proper performance debugging since at least November 2020 (considering when Nico Fellner posted the bug), instead jumping to false conclusions. Even the Proxmox support our client contacted wasn't helpful in this regard, instead looking for excuses and talking about a possible migration to HTTP/3, which wasn't the problem at all.

So for now you have a large binary with debug info and a small one without, and that's what people can use for their own testing, in their own testing infrastructure, on test data.
 
Ah, good catch. That's the binary with debug info and without stripping. ...
Here is the smaller [6.8 MB] version without debug info and stripped: https://notnullmakers.com/public/media/libproxmox_backup_qemu.so_09062025_mini
...

Applied the patch and didn't see any difference. Maybe I installed it wrong?
Bash:
# back up the original library
cp /usr/lib/libproxmox_backup_qemu.so.0 proxmox_backup.bak
# download the patched build
wget https://notnullmakers.com/public/media/libproxmox_backup_qemu.so_09062025_mini
# overwrite the system library with the patched one
cp libproxmox_backup_qemu.so_09062025_mini /usr/lib/libproxmox_backup_qemu.so.0

Also verified the SHA-1 was the same after install.
[screenshot]

Pre-install
[screenshot]

Post-install
[screenshot]

I've confirmed the storage can write at over 3.5 GB/s, so no issue there.
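For reference, a write-throughput figure like that can be sanity-checked with something like fio (target directory and size here are placeholders):
Bash:
# sequential 1 MiB writes with direct I/O, bypassing the page cache
fio --name=seqwrite --rw=write --bs=1M --size=10G --direct=1 \
    --directory=/var/lib/vz --numjobs=1 --group_reporting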
 
Applied the patch and didn't see any difference. Maybe I installed it wrong?
...
I've confirmed the storage can write at over 3.5 GB/s, so no issue there.
I am at a conference until Friday, so I have no chance to look into this in detail. The speedup should be visible if your backups are on a remote Proxmox Backup Server and you restore from the Proxmox Virtual Environment side. That triggers the qmrestore tool, which under the hood calls pbs-restore.
 
You are seriously using a "patch" from some third party on the internet without any ability to verify that it doesn't contain malicious code or a backdoor? I hope you don't do this on systems with real-life data or production loads. I'm also hoping I'm not somehow using your services.
 
You are seriously using a "patch" from some third party on the internet without any ability to verify that it doesn't contain malicious code or a backdoor? ...

Good thing I'm running this in a test environment, yeesh.
The speedup should be visible if your backups are on a remote Proxmox Backup Server and you restore from the Proxmox Virtual Environment side. ...
Yup. A PVE server and a PBS connected over a 25GbE switch.
 
Good thing I'm running this in a test environment, yeesh.

Yup. A PVE server and a PBS connected over a 25GbE switch.
Ok, I am back again.

My run for a VM with ~45 GB of incompressible data out of 50 GB total looks like this:
Code:
root@hel1:~# qmrestore hel2:backup/vm/101/2025-06-09T10:04:25Z 101 --force
Formatting '/rpool/data/hel-test-1/images/101/vm-101-disk-0.raw', fmt=raw size=53687091200 preallocation=off
new volume ID is 'hel-test-1:101/vm-101-disk-0.raw'
restore proxmox backup image: /usr/bin/pbs-restore --repository backup@pbs@hel2:data vm/101/2025-06-09T10:04:25Z drive-scsi0.img.fidx /rpool/data/hel-test-1/images/101/vm-101-disk-0.raw --verbose --format raw --skip-zero
connecting to repository 'backup@pbs@hel2:data'
open block backend for target '/rpool/data/hel-test-1/images/101/vm-101-disk-0.raw'
starting to restore snapshot 'vm/101/2025-06-09T10:04:25Z'
download and verify backup index
progress 5% (read 2780823552 bytes, zeroes = 99% (2776629248 bytes), duration 0 sec)
progress 6% (read 3221225472 bytes, zeroes = 86% (2776629248 bytes), duration 0 sec)
progress 7% (read 3758096384 bytes, zeroes = 73% (2776629248 bytes), duration 0 sec)
progress 8% (read 4294967296 bytes, zeroes = 64% (2776629248 bytes), duration 1 sec)
progress 9% (read 4831838208 bytes, zeroes = 57% (2776629248 bytes), duration 1 sec)
progress 10% (read 5368709120 bytes, zeroes = 51% (2776629248 bytes), duration 2 sec)
progress 11% (read 5905580032 bytes, zeroes = 47% (2776629248 bytes), duration 2 sec)
progress 12% (read 6442450944 bytes, zeroes = 43% (2776629248 bytes), duration 3 sec)
progress 13% (read 6979321856 bytes, zeroes = 39% (2776629248 bytes), duration 3 sec)
progress 14% (read 7516192768 bytes, zeroes = 36% (2776629248 bytes), duration 4 sec)
progress 15% (read 8053063680 bytes, zeroes = 34% (2776629248 bytes), duration 4 sec)
progress 16% (read 8589934592 bytes, zeroes = 32% (2776629248 bytes), duration 5 sec)
progress 17% (read 9126805504 bytes, zeroes = 30% (2776629248 bytes), duration 5 sec)
progress 18% (read 9663676416 bytes, zeroes = 28% (2776629248 bytes), duration 6 sec)
progress 19% (read 10200547328 bytes, zeroes = 27% (2776629248 bytes), duration 6 sec)
progress 20% (read 10737418240 bytes, zeroes = 25% (2776629248 bytes), duration 7 sec)
progress 21% (read 11274289152 bytes, zeroes = 24% (2776629248 bytes), duration 7 sec)
progress 22% (read 11811160064 bytes, zeroes = 23% (2776629248 bytes), duration 7 sec)
progress 23% (read 12348030976 bytes, zeroes = 22% (2776629248 bytes), duration 8 sec)
progress 24% (read 12884901888 bytes, zeroes = 21% (2776629248 bytes), duration 8 sec)
progress 25% (read 13421772800 bytes, zeroes = 20% (2776629248 bytes), duration 9 sec)
progress 26% (read 13958643712 bytes, zeroes = 19% (2776629248 bytes), duration 9 sec)
progress 27% (read 14495514624 bytes, zeroes = 19% (2776629248 bytes), duration 10 sec)
progress 28% (read 15032385536 bytes, zeroes = 18% (2776629248 bytes), duration 10 sec)
progress 29% (read 15569256448 bytes, zeroes = 17% (2776629248 bytes), duration 11 sec)
progress 30% (read 16106127360 bytes, zeroes = 17% (2776629248 bytes), duration 11 sec)
progress 31% (read 16642998272 bytes, zeroes = 16% (2776629248 bytes), duration 12 sec)
progress 32% (read 17179869184 bytes, zeroes = 16% (2776629248 bytes), duration 12 sec)
progress 33% (read 17716740096 bytes, zeroes = 15% (2776629248 bytes), duration 13 sec)
progress 34% (read 18253611008 bytes, zeroes = 15% (2776629248 bytes), duration 13 sec)
progress 35% (read 18790481920 bytes, zeroes = 14% (2776629248 bytes), duration 14 sec)
progress 36% (read 19327352832 bytes, zeroes = 14% (2776629248 bytes), duration 14 sec)
progress 37% (read 19864223744 bytes, zeroes = 13% (2776629248 bytes), duration 15 sec)
progress 38% (read 20401094656 bytes, zeroes = 13% (2776629248 bytes), duration 15 sec)
progress 39% (read 20937965568 bytes, zeroes = 13% (2776629248 bytes), duration 16 sec)
progress 40% (read 21474836480 bytes, zeroes = 12% (2776629248 bytes), duration 16 sec)
progress 41% (read 22011707392 bytes, zeroes = 12% (2776629248 bytes), duration 17 sec)
progress 42% (read 22548578304 bytes, zeroes = 12% (2776629248 bytes), duration 17 sec)
progress 43% (read 23085449216 bytes, zeroes = 12% (2776629248 bytes), duration 17 sec)
progress 44% (read 23622320128 bytes, zeroes = 11% (2776629248 bytes), duration 18 sec)
progress 45% (read 24159191040 bytes, zeroes = 11% (2776629248 bytes), duration 18 sec)
progress 46% (read 24696061952 bytes, zeroes = 11% (2776629248 bytes), duration 19 sec)
progress 47% (read 25232932864 bytes, zeroes = 11% (2776629248 bytes), duration 19 sec)
progress 48% (read 25769803776 bytes, zeroes = 10% (2776629248 bytes), duration 20 sec)
progress 49% (read 26306674688 bytes, zeroes = 10% (2776629248 bytes), duration 20 sec)
progress 50% (read 26843545600 bytes, zeroes = 10% (2776629248 bytes), duration 21 sec)
progress 51% (read 27380416512 bytes, zeroes = 10% (2776629248 bytes), duration 21 sec)
progress 52% (read 27917287424 bytes, zeroes = 9% (2776629248 bytes), duration 22 sec)
progress 53% (read 28454158336 bytes, zeroes = 9% (2776629248 bytes), duration 22 sec)
progress 54% (read 28991029248 bytes, zeroes = 9% (2776629248 bytes), duration 23 sec)
progress 55% (read 29527900160 bytes, zeroes = 9% (2776629248 bytes), duration 23 sec)
progress 56% (read 30064771072 bytes, zeroes = 9% (2776629248 bytes), duration 24 sec)
progress 57% (read 30601641984 bytes, zeroes = 9% (2776629248 bytes), duration 24 sec)
progress 58% (read 31138512896 bytes, zeroes = 8% (2776629248 bytes), duration 25 sec)
progress 59% (read 31675383808 bytes, zeroes = 8% (2776629248 bytes), duration 25 sec)
progress 60% (read 32212254720 bytes, zeroes = 8% (2776629248 bytes), duration 25 sec)
progress 61% (read 32749125632 bytes, zeroes = 8% (2776629248 bytes), duration 26 sec)
progress 62% (read 33285996544 bytes, zeroes = 8% (2776629248 bytes), duration 26 sec)
progress 63% (read 33822867456 bytes, zeroes = 8% (2776629248 bytes), duration 27 sec)
progress 64% (read 34359738368 bytes, zeroes = 8% (2776629248 bytes), duration 27 sec)
progress 65% (read 34896609280 bytes, zeroes = 7% (2776629248 bytes), duration 28 sec)
progress 66% (read 35433480192 bytes, zeroes = 7% (2776629248 bytes), duration 28 sec)
progress 67% (read 35970351104 bytes, zeroes = 7% (2776629248 bytes), duration 29 sec)
progress 68% (read 36507222016 bytes, zeroes = 7% (2776629248 bytes), duration 29 sec)
progress 69% (read 37044092928 bytes, zeroes = 7% (2776629248 bytes), duration 30 sec)
progress 70% (read 37580963840 bytes, zeroes = 7% (2776629248 bytes), duration 30 sec)
progress 71% (read 38117834752 bytes, zeroes = 7% (2776629248 bytes), duration 31 sec)
progress 72% (read 38654705664 bytes, zeroes = 7% (2776629248 bytes), duration 31 sec)
progress 73% (read 39191576576 bytes, zeroes = 7% (2776629248 bytes), duration 32 sec)
progress 74% (read 39728447488 bytes, zeroes = 6% (2776629248 bytes), duration 32 sec)
progress 75% (read 40265318400 bytes, zeroes = 6% (2776629248 bytes), duration 33 sec)
progress 76% (read 40802189312 bytes, zeroes = 6% (2776629248 bytes), duration 33 sec)
progress 77% (read 41339060224 bytes, zeroes = 6% (2776629248 bytes), duration 34 sec)
progress 78% (read 41875931136 bytes, zeroes = 6% (2776629248 bytes), duration 34 sec)
progress 79% (read 42412802048 bytes, zeroes = 6% (2776629248 bytes), duration 35 sec)
progress 80% (read 42949672960 bytes, zeroes = 6% (2776629248 bytes), duration 35 sec)
progress 81% (read 43486543872 bytes, zeroes = 6% (2776629248 bytes), duration 36 sec)
progress 82% (read 44023414784 bytes, zeroes = 6% (2776629248 bytes), duration 36 sec)
progress 83% (read 44560285696 bytes, zeroes = 6% (2776629248 bytes), duration 37 sec)
progress 84% (read 45097156608 bytes, zeroes = 6% (2776629248 bytes), duration 37 sec)
progress 85% (read 45634027520 bytes, zeroes = 6% (2776629248 bytes), duration 38 sec)
progress 86% (read 46170898432 bytes, zeroes = 6% (2776629248 bytes), duration 38 sec)
progress 87% (read 46707769344 bytes, zeroes = 5% (2776629248 bytes), duration 39 sec)
progress 88% (read 47244640256 bytes, zeroes = 5% (2776629248 bytes), duration 39 sec)
progress 89% (read 47781511168 bytes, zeroes = 5% (2776629248 bytes), duration 39 sec)
progress 90% (read 48318382080 bytes, zeroes = 5% (2776629248 bytes), duration 40 sec)
progress 91% (read 48855252992 bytes, zeroes = 5% (2776629248 bytes), duration 40 sec)
progress 92% (read 49392123904 bytes, zeroes = 5% (2776629248 bytes), duration 41 sec)
progress 93% (read 49928994816 bytes, zeroes = 5% (2776629248 bytes), duration 41 sec)
progress 94% (read 50465865728 bytes, zeroes = 5% (2776629248 bytes), duration 42 sec)
progress 95% (read 51002736640 bytes, zeroes = 5% (2776629248 bytes), duration 42 sec)
progress 96% (read 51539607552 bytes, zeroes = 5% (2776629248 bytes), duration 43 sec)
progress 97% (read 52076478464 bytes, zeroes = 5% (2776629248 bytes), duration 43 sec)
progress 98% (read 52613349376 bytes, zeroes = 5% (2776629248 bytes), duration 44 sec)
progress 99% (read 53150220288 bytes, zeroes = 5% (2776629248 bytes), duration 44 sec)
progress 100% (read 53687091200 bytes, zeroes = 5% (2776629248 bytes), duration 45 sec)
restore image complete (bytes=53687091200, duration=45.23s, speed=1131.88MB/s)

Yes, the zeroes output is probably broken; I will investigate that over the next few days as I refine the patch.

And my storage is configured like this (from the PVE perspective):

Code:
root@hel1:~# pvesm status
Name              Type     Status           Total            Used       Available        %
hel-test-1         dir     active      1927675648       150918768      1776756880    7.83%
hel2               pbs     active       862694400       662777856       199916544   76.83%
local              dir     active      1777537920          781056      1776756864    0.04%
local-zfs      zfspool     active      1927675748       150918864      1776756884    7.83%

Can you please provide vmstat -antw 1 and mpstat 1 (from the sysstat package) output while the restore runs, on both the PVE and the PBS node?
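For example, captured to files for the duration of the restore (the file names are placeholders):
Bash:
# sample both every second in the background, on PVE and PBS alike
vmstat -antw 1 > vmstat-pve.log 2>&1 &
mpstat 1 > mpstat-pve.log 2>&1 &
# ...run the restore, then stop the samplers...
kill %1 %2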
 
Ok, I am back again.
...
...
Can you please provide vmstat -antw 1 and mpstat 1 (from the sysstat package) output while the restore runs, on both the PVE and the PBS node?

Sure thing. I've attached the logs.

This is how my storage looks, in case it helps:
Code:
Name                   Type     Status           Total            Used       Available        %
der-pbs                 pbs     active     99163627520      6752645888     92410981632    6.81%
linstor_storage        drbd     active      6000451584       418831529      5581620055    6.98%
local                   dir     active        41414656        11588556        29826100   27.98%
local-lvm           lvmthin     active        56545280               0        56545280    0.00%
pure             purestorage     active     25155846144     16127030193      9028815950   64.11%

I created a new VM to try out, as the one I was using for testing would produce much bigger logs, but maybe more on that in a moment.

This is a Windows 11 VM with a 60 GB boot disk, a 60 GB disk that's full of incompressible data, and a 200 GB empty disk. The interesting thing is, it actually did restore at a very reasonable rate, apparently around 1 GB/s.

The restore time went from 195 seconds to 92 seconds.

But the odd thing is, my original test VM, a 300 GB Ubuntu VM full of incompressible data, still takes the same amount of time. I'll see if I can investigate why. I did a quick mpstat for that VM during a restore.

PVE
Code:
root@test-pve1:~# mpstat 1
Linux 6.8.12-10-pve (test-pve1)         06/16/2025      _x86_64_        (32 CPU)

07:07:03 AM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
07:07:04 AM  all    1.04    0.00    1.66    3.96    0.00    0.53    0.00    0.00    0.00   92.81
07:07:05 AM  all    0.91    0.00    1.32    4.91    0.00    0.44    0.00    0.00    0.00   92.42
07:07:06 AM  all    0.82    0.00    1.57    4.24    0.00    0.50    0.00    0.00    0.00   92.87
07:07:07 AM  all    1.13    0.00    2.27    4.28    0.00    0.82    0.00    0.00    0.00   91.50
07:07:08 AM  all    0.95    0.00    2.11    4.10    0.00    0.85    0.00    0.00    0.00   91.99
07:07:09 AM  all    1.01    0.00    2.23    4.06    0.00    0.85    0.00    0.00    0.00   91.86
07:07:10 AM  all    1.13    0.00    2.33    3.94    0.00    0.82    0.00    0.00    0.00   91.77
07:07:11 AM  all    1.10    0.00    2.20    4.24    0.00    1.01    0.00    0.00    0.00   91.45
07:07:12 AM  all    0.98    0.00    1.83    4.25    0.00    0.44    0.00    0.00    0.00   92.50
07:07:13 AM  all    1.07    0.00    1.95    3.68    0.00    0.72    0.00    0.00    0.00   92.57
07:07:14 AM  all    1.07    0.00    2.23    3.96    0.00    0.75    0.00    0.00    0.00   91.99
07:07:15 AM  all    1.07    0.00    1.89    4.10    0.00    0.73    0.00    0.00    0.00   92.21
07:07:16 AM  all    1.32    0.00    2.11    3.75    0.00    0.82    0.00    0.00    0.00   92.01
07:07:17 AM  all    1.07    0.00    2.14    3.65    0.00    0.85    0.00    0.00    0.00   92.29
07:07:18 AM  all    1.13    0.00    1.99    4.16    0.00    0.82    0.00    0.00    0.00   91.90
07:07:19 AM  all    1.13    0.00    1.92    4.27    0.00    0.47    0.00    0.00    0.00   92.21
07:07:20 AM  all    1.01    0.00    1.89    4.02    0.00    0.60    0.00    0.00    0.00   92.49
07:07:21 AM  all    0.94    0.00    1.61    3.94    0.00    0.47    0.00    0.00    0.00   93.04
07:07:22 AM  all    1.10    0.00    2.01    4.21    0.00    0.50    0.00    0.00    0.00   92.18
07:07:23 AM  all    1.07    0.00    1.61    4.41    0.00    0.47    0.00    0.00    0.00   92.45
07:07:24 AM  all    0.88    0.00    1.82    3.71    0.00    0.60    0.00    0.00    0.00   92.99
07:07:25 AM  all    1.03    0.00    1.94    3.57    0.00    0.75    0.00    0.00    0.00   92.69
07:07:26 AM  all    1.00    0.00    1.85    3.80    0.00    0.82    0.00    0.00    0.00   92.53
^CAverage:     all    1.04    0.00    1.93    4.05    0.00    0.68    0.00    0.00    0.00   92.29

PBS
Code:
root@PBS:~# mpstat 1
Linux 6.12.15-production+truenas (PBS)  06/16/2025      _x86_64_        (128 CPU)

07:06:53 AM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
07:06:54 AM  all    0.14    0.00    0.55    0.04    0.00    0.01    0.00    0.00    0.00   99.26
07:06:55 AM  all    0.17    0.01    0.44    0.06    0.00    0.03    0.00    0.00    0.00   99.29
07:06:56 AM  all    0.14    0.00    0.63    0.08    0.00    0.02    0.00    0.00    0.00   99.13
07:06:57 AM  all    0.16    0.00    0.48    0.06    0.00    0.01    0.00    0.00    0.00   99.29
07:06:58 AM  all    0.10    0.00    0.55    0.09    0.00    0.01    0.00    0.00    0.00   99.26
07:06:59 AM  all    0.09    0.02    0.49    0.08    0.00    0.02    0.00    0.00    0.00   99.30
07:07:00 AM  all    0.12    0.00    0.62    0.06    0.00    0.05    0.00    0.00    0.00   99.15
07:07:01 AM  all    0.19    0.01    0.51    0.04    0.00    0.01    0.00    0.00    0.00   99.25
07:07:02 AM  all    0.13    0.00    0.48    0.08    0.00    0.00    0.00    0.00    0.00   99.30
07:07:03 AM  all    0.16    0.00    0.66    0.09    0.00    0.02    0.00    0.00    0.00   99.08
07:07:04 AM  all    0.10    0.00    0.59    0.06    0.00    0.02    0.00    0.00    0.00   99.22
07:07:05 AM  all    0.12    0.00    0.55    0.04    0.00    0.01    0.00    0.00    0.00   99.29
07:07:06 AM  all    0.13    0.00    0.45    0.08    0.00    0.01    0.00    0.00    0.00   99.33
07:07:07 AM  all    0.14    0.02    0.44    0.04    0.00    0.01    0.00    0.00    0.00   99.36
07:07:08 AM  all    0.09    0.00    0.44    0.08    0.00    0.00    0.00    0.00    0.00   99.40
07:07:09 AM  all    0.11    0.00    0.51    0.05    0.00    0.02    0.00    0.00    0.00   99.31
07:07:10 AM  all    0.11    0.00    0.54    0.05    0.00    0.01    0.00    0.00    0.00   99.29
07:07:11 AM  all    0.13    0.01    0.63    0.06    0.00    0.02    0.00    0.00    0.00   99.16
07:07:12 AM  all    0.12    0.00    0.42    0.05    0.00    0.02    0.00    0.00    0.00   99.40
07:07:13 AM  all    0.14    0.00    0.50    0.10    0.00    0.00    0.00    0.00    0.00   99.26
07:07:14 AM  all    0.11    0.01    0.51    0.05    0.00    0.02    0.00    0.00    0.00   99.31
07:07:15 AM  all    0.11    0.00    0.52    0.04    0.00    0.02    0.00    0.00    0.00   99.31
07:07:16 AM  all    0.22    0.01    0.71    0.08    0.00    0.01    0.00    0.00    0.00   98.98
07:07:17 AM  all    0.17    0.00    0.60    0.05    0.00    0.02    0.00    0.00    0.00   99.15
07:07:18 AM  all    0.09    0.00    0.70    0.09    0.00    0.03    0.00    0.00    0.00   99.09
07:07:19 AM  all    0.14    0.00    0.42    0.06    0.00    0.00    0.00    0.00    0.00   99.37
07:07:20 AM  all    0.18    0.00    0.49    0.07    0.00    0.02    0.00    0.00    0.00   99.24
07:07:21 AM  all    0.13    0.00    0.48    0.05    0.00    0.03    0.00    0.00    0.00   99.31
07:07:22 AM  all    0.19    0.00    0.55    0.06    0.00    0.00    0.00    0.00    0.00   99.20
07:07:23 AM  all    0.28    0.00    0.65    0.07    0.00    0.02    0.00    0.00    0.00   98.98
07:07:24 AM  all    0.22    0.02    1.42    0.03    0.00    0.02    0.00    0.00    0.00   98.29
07:07:25 AM  all    0.14    0.01    2.07    0.07    0.00    0.02    0.00    0.00    0.00   97.69
07:07:26 AM  all    0.34    0.00    1.21    0.06    0.00    0.01    0.00    0.00    0.00   98.37
07:07:27 AM  all    0.13    0.00    0.52    0.03    0.00    0.02    0.00    0.00    0.00   99.30
^CAverage:     all    0.15    0.00    0.63    0.06    0.00    0.01    0.00    0.00    0.00   99.15

Let me know if you need any other results. I'll do some more testing to figure out why the original test VM doesn't act the same.
 

Attachments

Okay, I've found something interesting; I'm digging into it now. So, the VM that I was originally testing with does not restore at a higher speed, but an exact clone I made of it does...

Start the restore of the original Ubuntu VM:
Restore took 1158 seconds

Created a clone, created a backup, and restored it:
Restore took 457 seconds

Weird. I'll check some logs to see what's going on. But good news, massive improvement on the clone!
 
Okay, I've found something interesting; I'm digging into it now. So, the VM that I was originally testing with does not restore at a higher speed, but an exact clone I made of it does...
...
Hm, that's interesting. How many snapshots does the original Ubuntu VM have?

If I see that right, the %sys and %iowait numbers are lower on the PBS when you restore the 300 GB Ubuntu VM vs. the Windows VM, or maybe I am just confused. Could you run the numbers for the restore of the original Ubuntu VM and of the clone, ideally with vmstat and mpstat on both the PBS and the PVE, plus the logs as you kindly provided before?

That would be:
- 300 GB Ubuntu VM with bad restore speed:
  - vmstat for PVE and PBS
  - mpstat for PVE and PBS
  - restore log from PVE
- 300 GB Ubuntu VM clone with faster restore speed:
  - vmstat for PVE and PBS
  - mpstat for PVE and PBS
  - restore log from PVE

Is the hardware configuration the same for both the original Ubuntu VM and the clone? Can you please provide the config files for both VMs from /etc/pve/qemu-server/?
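For example (the VMIDs are placeholders):
Bash:
cat /etc/pve/qemu-server/100.conf /etc/pve/qemu-server/101.conf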

Thank you for doing the testing; there seems to be some progress, and I am glad to see the speedup. Worst case, I could add more logging to the lib so we can better understand what's going on with the slow restores. The good news is that we will get more testing infrastructure and development time from our client, so we should be able to bring the patch over the finish line.
 
Hm, that's interesting. How many snapshots does the original Ubuntu VM have?
Zero!
Could you run the numbers for the restore of the original Ubuntu VM and of the clone, ideally with vmstat and mpstat on both the PBS and the PVE...?
I've attached all the logs requested. :)

Is the hardware configuration the same for both the original Ubuntu VM and the clone? Can you please provide the config files for both VMs from /etc/pve/qemu-server/?
Surprisingly, they're identical, other than the disk of course. I've added these to the archive as well.

...The good news is that we will get more testing infrastructure and development time from our client, so we should be able to bring the patch over the finish line.
Glad to hear! Hopefully this can be upstreamed once it's cleaned up. We've been getting a 5-100x reduction in backup time with PBS vs. Veeam, but the restore, obviously, has been the big bottleneck. If this gets all fixed up, there's really no reason to bother with anything but PBS (unless you need application-aware backups).
 

Attachments

Zero!
...
Glad to hear! Hopefully this can be upstreamed once it's cleaned up. ...
Thank you for the logs. If you compare the two mpstat outputs on PVE for the original and the clone, you will notice a significant difference in the %iowait and %sys times. Is it possible the backing storage for these two is set up differently? Perhaps a different block size or something like that was used?

If the LINSTOR storage is backed by ZFS, I would look at the volblocksize (for zvols) or recordsize (for datasets) properties, the compression setting, and atime/relatime.
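Those properties can be read with zfs get; the pool/dataset names below are placeholders:
Bash:
# zvol-backed VM disk
zfs get volblocksize pool/vm-100-disk-0
# dataset-backed (file) storage
zfs get recordsize,compression,atime,relatime pool/data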
 
Here is a new build (the smaller [6.8 MB] version without debug info and stripped): https://www.notnullmakers.com/public/media/libproxmox_backup_qemu.so_19062025_mini
The SHA256 is: 9164af77c3b168874b2805fa7cf895cf8895e4b3937d1d69cde7fd2c8bdec00d

Again, I of course recommend you make a backup of the original /usr/lib/libproxmox_backup_qemu.so.0 binary and check the new file with, e.g., VirusTotal or other tools.

This build logs more information about how much time was spent in the single-threaded section where we write data to storage.

You can check the current state of changes here: https://github.com/NOT-NULL-Makers/...mmit/af01a18e5672b3e72a8b2f876fddd10edf71a975
 