pbs-restore performance optimization (parallelization)

Sep 8, 2019
Hey guys,

thanks for the release of Proxmox Backup Server!

PBS looks very promising with regard to what our company needs:
  • Incremental backups of our VMs, e.g. every 15 minutes
  • Flexible retention cycle, e.g. keep last 8, keep 22 hours, ...
  • One pushing PVE client, several backup servers pull via sync jobs
In case of failure, the operator selects a backup (usually the last one, which is at max 15 minutes old) and initiates the restore process of the VM.

We are currently evaluating hardware for a backup server.
  • The performance of the incremental backups is fine, even with HDDs, no problems here
  • The restore time is too high: on our current host at Hetzner we see restore times of 1h 17m - 1h 21m (average speed of 156 - 165 MB/s) for the bigger VMs of about 750 GB.
  • I want to lower the restore time to roughly 15 - 45 minutes, which requires restore speeds of 300 - 800 MB/s.
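For reference, those target speeds follow directly from the image size: 750 GB in 45 minutes is about 750,000 MB / 2,700 s ≈ 280 MB/s, and 750 GB in 15 minutes is about 750,000 MB / 900 s ≈ 830 MB/s, hence the 300 - 800 MB/s range.
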
Let's focus on the restore speeds of the 750 GB VM. Here are the specs from our current host:

Host: a dedicated host on Hetzner
CPU: Intel(R) Xeon(R) CPU E5-1650 v3 (6 core, 12 threads)
Memory: 256 GB
Then I wondered whether NVMes allow for at least 300 MB/s speeds, so I did some tests on Amazon EC2 and Microsoft Azure.

Host: AWS EC2 i3.metal, dedicated instance
CPU: 72 x Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz (2 Sockets)
Memory: 512 GB
  • Source (PBS chunks): mdadm RAID0 array of 2x 2TB NVMe, target: another RAID0 array of 2x 2TB NVMe. Speed: 140 MB/s
  • Source (PBS chunks): mdadm RAID0 array of 8x 2TB NVMe, target: the same array. Speed: 142 MB/s
Host: Azure L80s_v2
CPU: 80 x AMD EPYC 7551 32-Core Processor (2 Sockets)
Memory: 640 GB
  • Source (PBS chunks): 2TB NVMe, target: another 2TB NVMe. Speed: 204 MB/s
  • Source (PBS chunks): mdadm RAID0 array of 2x 2TB NVMe, target: another RAID0 array of 2x 2TB NVMe. Speed: 229 MB/s
  • Source (PBS chunks): mdadm RAID0 array of 5x 2TB NVMe, target: another RAID0 array of 5x 2TB NVMe. Speed: 208 MB/s
  • The host was capable of downloading the initial VM backup with 1200-2200 MB/s (10-18 Gbit/s), and the fio numbers speak for themselves (see attachment)
In the end, the tests with NVMe disks did not show the expected restore speeds. AWS EC2 even struggled to keep up with the performance of the E5-1650 v3 host with the Toshiba SSDs...

Let's take a look at htop during the restore process:

pbs_restore_aws_and_azure.png

The reason for the low restore speed is obvious:
  • /usr/bin/pbs-restore only keeps one thread active, which results in nearly 100% utilization of a single vCore
  • That single CPU thread is not fast enough to fill the I/O queues of my NVMes
  • With SSDs and NVMes, single-thread CPU performance becomes the bottleneck during the restore.
So I took a look into the proxmox-backup-qemu repository, especially src/restore.rs
  • In the constructor pub fn new(setup: BackupSetup), a proxmox-restore-worker is initialized with builder.max_threads(6) and builder.core_threads(4)
  • but the restore workers don't work together when restoring a VM: in the function pub async fn restore_image I see no parallelism in the loop for pos in 0..index.index_count() (roughly sketched below)
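To illustrate what I mean, this is roughly what that loop does (a simplified paraphrase from memory; names like zero_chunk_digest and the two write callbacks are approximate, and error handling is omitted). Every chunk is fetched, decrypted/verified and written strictly one after the other, so only one core ever does the CPU-heavy work:

    // simplified paraphrase of the sequential loop in restore_image()
    for pos in 0..index.index_count() {
        let digest = index.index_digest(pos).unwrap();
        let offset = (pos * index.chunk_size) as u64;
        if digest == &zero_chunk_digest {
            // sparse case: all-zero chunks are not fetched, only a
            // zero-write callback is invoked for that range
            write_zero_callback(offset, index.chunk_size as u64);
        } else {
            // fetch + decrypt + verify ONE chunk, write it, and only then
            // request the next one - nothing overlaps
            let raw_data = chunk_reader.read_chunk(digest).await?;
            write_data_callback(offset, &raw_data);
        }
    }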
Maybe I am looking at the wrong repository or wrong lines of code, but...


Can you imagine parallelizing pbs-restore for the restore of a single PBS VM backup, so we can take advantage of modern SSDs and NVMe drives?




Attachments

  • fio-benchmarks-azure-aws.txt (5.6 KB)
  • proxmox-backup-client benchmark.PNG (9.8 KB)
mhmm... it seems it would be possible to parallelize the restore of a VM image, but it is not yet implemented (as you saw). You can open a feature request here if you like: https://bugzilla.proxmox.com - but no promises if/when this gets implemented.
 
Looking forward to faster restore performance. Maybe there could be a similar cluster/host option to allow up to x processes, just like with vzdump?!
Restoring is painfully slow, and CPU, NVMe and even HDDs are idling most of the time.

I checked the Bugzilla list but couldn't figure out whether this is actually going anywhere soon. Maybe we can get an update on this. Thanks!
 
I tried parallelization of the restore process - see "parallelize restore.rs" in https://lists.proxmox.com/pipermail/pbs-devel/2020-December/subject.html

But note the restriction described in https://lists.proxmox.com/pipermail/pbs-devel/2020-December/001685.html:

the problem afaics, is that qemu cannot have multiple writers on the same block backend from multiple threads, and it seems that is a general qemu limitation (you can even only have one iothread per image), so i am not sure that it is possible to implement what you want with a qemu layer

ofc you can ask the qemu people on their mailing list, maybe i am simply not seeing how

so we have to synchronize on the writes, which makes the whole patch less interesting since it will not solve your bottleneck, i assume?

the only thing we can parallelize is the download of the chunks, but in my tests here, that did not improve restore speed at all

Indeed I did see an improvement in my benchmarks, see benchmark results at https://lists.proxmox.com/pipermail/pbs-devel/2020-December/001687.html
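
The idea of the patch is basically the only thing left after that restriction: keep a single, strictly ordered writer (because of the QEMU limitation quoted above), but fetch and verify several chunks concurrently, so the CPU-heavy work is spread over more than one core. Very roughly, the pattern looks like this (an illustrative sketch only, not the actual patch; PARALLEL_CHUNKS, read_chunk and write_data are made-up names):

    // Illustrative sketch (NOT the actual patch): fetch up to N chunks
    // concurrently, but keep a single, in-order writer so QEMU only ever
    // sees one writer on the block backend.
    use anyhow::Error;
    use futures::stream::{StreamExt, TryStreamExt};

    const PARALLEL_CHUNKS: usize = 4; // hypothetical tuning knob

    async fn restore_image_parallel<R, W>(
        chunk_count: usize,
        chunk_size: u64,
        read_chunk: R,     // async fetch + decrypt + verify of one chunk
        mut write_data: W, // synchronous write at a given offset
    ) -> Result<(), Error>
    where
        R: Fn(usize) -> futures::future::BoxFuture<'static, Result<Vec<u8>, Error>>,
        W: FnMut(u64, &[u8]) -> Result<(), Error>,
    {
        futures::stream::iter(0..chunk_count)
            // start up to PARALLEL_CHUNKS fetches; buffered() yields the
            // results back in submission order
            .map(|pos| {
                let fetch = read_chunk(pos);
                async move { Ok::<_, Error>((pos, fetch.await?)) }
            })
            .buffered(PARALLEL_CHUNKS)
            // the writes stay strictly sequential - a single writer
            .try_for_each(|(pos, data)| {
                futures::future::ready(write_data(pos as u64 * chunk_size, &data[..]))
            })
            .await
    }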

Anyhow, for most cases Live Restore has now solved my needs.
 
How is live restore an improvement in this case?

Because in case of a problem we can use live-restore, and that way our VM and web server are up and running again after about 7-9 minutes.

With a full restore, even with parallelization, I guess it would be difficult to restore 750 GB in under 10 minutes (that would need roughly 1.25 GB/s sustained).

Of course combining both would be even better: Parallelization of live-restore and we're up in a minute?...
 
Indeed I did see an improvement in my benchmarks, see benchmark results at https://lists.proxmox.com/pipermail/pbs-devel/2020-December/001687.html
That is a bit of an understatement imho.
Those numbers are pretty amazing!
While it still may not utilize the full potential of the underlying storage, if I can get a 2x or 3x faster restore on the same hardware, I'll take it! :D
At restore speeds beyond 1 GB/s, the network would become the bottleneck anyway for lots of people, I guess.

Because in case of a problem we can use live-restore, and that way our VM and web server are up and running again after about 7-9 minutes.
But if the live restore fails for whatever reason, all changes made to the VM in the meantime are gone.
That is a deal breaker for some VMs like file servers, let alone databases.
For a simple webserver that may write to an external database anyway, this is probably fine, though. Not much to risk here.
 
