pbs-restore performance optimization (parallelization)

Sep 8, 2019
Hey guys,

thanks for the release of Proxmox Backup Server!

PBS looks very promising with regard to what our company needs:
  • Incremental backups of our VMs, e.g. every 15 minutes
  • Flexible retention cycle, e.g. keep last 8, keep 22 hours, ...
  • One pushing PVE client, several backup servers pull via sync jobs
In case of failure, the operator selects a backup (usually the last one, which is at max 15 minutes old) and initiates the restore process of the VM.

We are currently evaluating hardware for a backup server.
  • The performance of the incremental backups is fine, even with HDDs, no problems here
  • The restore time is too high: on our current host at Hetzner we see restore times of 1h 17m - 1h 21m (average speed of 156 - 165 MB/s) for the bigger VMs of about 750 GB.
  • I want to lower the restore time to roughly 15 - 45 minutes, which requires restore speeds of 300 - 800 MB/s.
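For reference, those target speeds follow directly from the image size: 750 GB in 45 minutes is about 750,000 MB / 2,700 s ≈ 280 MB/s, and 750 GB in 15 minutes is about 750,000 MB / 900 s ≈ 830 MB/s, hence the 300 - 800 MB/s range.
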
Let's focus on the restore speeds of the 750 GB VM. Here are the specs from our current host:

Host: a dedicated host on Hetzner
CPU: Intel(R) Xeon(R) CPU E5-1650 v3 (6 core, 12 threads)
Memory: 256 GB
Then I wondered whether NVMes allow for at least 300 MB/s speeds, so I did some tests on Amazon EC2 and Microsoft Azure.

Host: AWS EC2 i3.metal, dedicated instance
CPU: 72 x Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz (2 Sockets)
Memory: 512 GB
  • Source (PBS chunks): mdadm RAID0 array of 2x 2TB NVMe, target: another RAID0 array of 2x 2TB NVMe. Speed: 140 MB/s
  • Source (PBS chunks): mdadm RAID0 array of 8x 2TB NVMe, target: the same array. Speed: 142 MB/s
Host: Azure L80s_v2
CPU: 80 x AMD EPYC 7551 32-Core Processor (2 Sockets)
Memory: 640 GB
  • Source (PBS chunks): 2TB NVMe, target: another 2TB NVMe. Speed: 204 MB/s
  • Source (PBS chunks): mdadm RAID0 array of 2x 2TB NVMe, target: another RAID0 array of 2x 2TB NVMe. Speed: 229 MB/s
  • Source (PBS chunks): mdadm RAID0 array of 5x 2TB NVMe, target: another RAID0 array of 5x 2TB NVMe. Speed: 208 MB/s
  • The host was capable of downloading the initial VM backup with 1200-2200 MB/s (10-18 Gbit/s), and the fio numbers speak for themselves (see attachment)
In the end, the tests with NVMe disks did not show the expected restore speeds. AWS EC2 even struggled to keep up with the performance of the E5-1650 v3 host with the Toshiba SSDs...

Let's take a look at htop during the restore process:

pbs_restore_aws_and_azure.png

The reason for the low restore speed is obvious:
  • /usr/bin/pbs-restore only keeps one thread active, which results in nearly 100% utilization of a single vCore
  • That single CPU thread is not fast enough to fill the I/O queues of my NVMes
  • With SSDs and NVMes, single-thread CPU performance becomes the bottleneck during the restore.
So I took a look into the proxmox-backup-qemu repository, especially src/restore.rs
  • In the constructor pub fn new(setup: BackupSetup), a proxmox-restore-worker is initialized with builder.max_threads(6) and builder.core_threads(4)
  • but the restore workers don't work together when restoring a VM: in the function pub async fn restore_image I see no parallelism in the loop for pos in 0..index.index_count() (roughly sketched below)
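To illustrate what I mean, this is roughly what that loop does (a simplified paraphrase from memory; names like zero_chunk_digest and the two write callbacks are approximate, and error handling is omitted). Every chunk is fetched, decrypted/verified and written strictly one after the other, so only one core ever does the CPU-heavy work:

    // simplified paraphrase of the sequential loop in restore_image()
    for pos in 0..index.index_count() {
        let digest = index.index_digest(pos).unwrap();
        let offset = (pos * index.chunk_size) as u64;
        if digest == &zero_chunk_digest {
            // sparse case: all-zero chunks are not fetched, only a
            // zero-write callback is invoked for that range
            write_zero_callback(offset, index.chunk_size as u64);
        } else {
            // fetch + decrypt + verify ONE chunk, write it, and only then
            // request the next one - nothing overlaps
            let raw_data = chunk_reader.read_chunk(digest).await?;
            write_data_callback(offset, &raw_data);
        }
    }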
Maybe I am looking at the wrong repository or wrong lines of code, but...


Can you imagine parallelizing pbs-restore for the restore of a single PBS VM backup, so we can take advantage of modern SSDs and NVMe drives?




Attachments

  • fio-benchmarks-azure-aws.txt (5.6 KB)
  • proxmox-backup-client benchmark.PNG (9.8 KB)
mhmm... it seems it would be possible to parallelize the restore of a VM image, but it is not yet implemented (as you saw). You can open a feature request here if you like: https://bugzilla.proxmox.com - but no promises if/when this gets implemented.
 
Looking forward to faster restore performance. Maybe there could be a similar cluster/host option to allow up to x processes, just like with vzdump?!
Restoring is painfully slow, and CPU, NVMe and even HDDs are idling most of the time.

I checked the Bugzilla list but couldn't figure out whether this is actually going anywhere soon. Maybe we can get an update on this. Thanks!
 
I tried parallelization of the restore process - see "parallelize restore.rs" in https://lists.proxmox.com/pipermail/pbs-devel/2020-December/subject.html

But note the restriction described in https://lists.proxmox.com/pipermail/pbs-devel/2020-December/001685.html:

the problem afaics, is that qemu cannot have multiple writers on the same block backend from multiple threads, and it seems that is a general qemu limitation (you can even only have one iothread per image), so i am not sure that it is possible to implement what you want with a qemu layer

ofc you can ask the qemu people on their mailing list, maybe i am simply not seeing how

so we have to synchronize on the writes, which makes the whole patch less interesting since it will not solve your bottleneck, i assume?

the only thing we can parallelize is the download of the chunks, but in my tests here, that did not improve restore speed at all

Indeed I did see an improvement in my benchmarks, see benchmark results at https://lists.proxmox.com/pipermail/pbs-devel/2020-December/001687.html
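
The idea of the patch is basically the only thing left after that restriction: keep a single, strictly ordered writer (because of the QEMU limitation quoted above), but fetch and verify several chunks concurrently, so the CPU-heavy work is spread over more than one core. Very roughly, the pattern looks like this (an illustrative sketch only, not the actual patch; PARALLEL_CHUNKS, read_chunk and write_data are made-up names):

    // Illustrative sketch (NOT the actual patch): fetch up to N chunks
    // concurrently, but keep a single, in-order writer so QEMU only ever
    // sees one writer on the block backend.
    use anyhow::Error;
    use futures::stream::{StreamExt, TryStreamExt};

    const PARALLEL_CHUNKS: usize = 4; // hypothetical tuning knob

    async fn restore_image_parallel<R, W>(
        chunk_count: usize,
        chunk_size: u64,
        read_chunk: R,     // async fetch + decrypt + verify of one chunk
        mut write_data: W, // synchronous write at a given offset
    ) -> Result<(), Error>
    where
        R: Fn(usize) -> futures::future::BoxFuture<'static, Result<Vec<u8>, Error>>,
        W: FnMut(u64, &[u8]) -> Result<(), Error>,
    {
        futures::stream::iter(0..chunk_count)
            // start up to PARALLEL_CHUNKS fetches; buffered() yields the
            // results back in submission order
            .map(|pos| {
                let fetch = read_chunk(pos);
                async move { Ok::<_, Error>((pos, fetch.await?)) }
            })
            .buffered(PARALLEL_CHUNKS)
            // the writes stay strictly sequential - a single writer
            .try_for_each(|(pos, data)| {
                futures::future::ready(write_data(pos as u64 * chunk_size, &data[..]))
            })
            .await
    }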

Anyhow, for most cases Live Restore has now solved my needs.
 
How is live restore an improvement in this case?

Because in case of a problem we can use live-restore, and that way our VM and web server are up and running again after about 7-9 minutes.

With a full restore, even with parallelization, I guess it would be difficult to restore 750 GB in under 10 minutes (that would need roughly 1.25 GB/s sustained).

Of course combining both would be even better: Parallelization of live-restore and we're up in a minute?...
 
Indeed I did see an improvement in my benchmarks, see benchmark results at https://lists.proxmox.com/pipermail/pbs-devel/2020-December/001687.html
That is a bit of an understatement imho.
Those numbers are pretty amazing!
While it still may not utilize the full potential of the underlying storage, if I can get a 2x or 3x faster restore on the same hardware, I'll take it! :D
At restore speeds beyond 1 GB/s, the network would become the bottleneck anyway for lots of people, I guess.

Because in case of a problem we can use live-restore, and that way our VM and web server are up and running again after about 7-9 minutes.
But if the live restore fails for whatever reason, all changes made to the VM in the meantime are gone.
That is a deal breaker for some VMs like file servers, let alone databases.
For a simple webserver that may write to an external database anyway, this is probably fine, though. Not much to risk here.
 
