Slow restore performance

JustThat

New Member
Jun 3, 2024
We currently have Proxmox VE 8.2 and PBS 3.2; a backup of a 1.65 TiB VM takes around 1 hour (which is pretty fast).

But the restore took us 14 hours... which is very long.

Attached is an image of a PBS benchmark:
[screenshot: 1717405472105.png]

Machine specs:

i9-10940X CPU @ 3.30GHz
128G DDR4 3000MT/s
4TiB P2 NVMe PCIe SSD
Upload speed is ~60MB/s

Below you can find a fio random-read benchmark:

Code:
sudo fio --filename=/dev/nvme0n1 --direct=1 --rw=randread --bs=4k --ioengine=libaio --iodepth=256 --runtime=120 --numjobs=4 --time_based --group_reporting --name=iops-test-job --eta-newline=1 --readonly
...
iops-test-job: (groupid=0, jobs=4): err= 0: pid=9063: Mon Jun  3 10:59:15 2024
  read: IOPS=214k, BW=835MiB/s (875MB/s)(97.8GiB/120008msec)
    slat (nsec): min=1038, max=266104, avg=1849.36, stdev=703.19
    clat (usec): min=783, max=14359, avg=4788.89, stdev=1264.40
     lat (usec): min=785, max=14360, avg=4790.74, stdev=1264.49
    clat percentiles (usec):
     |  1.00th=[ 2507],  5.00th=[ 3228], 10.00th=[ 3425], 20.00th=[ 3687],
     | 30.00th=[ 4047], 40.00th=[ 4490], 50.00th=[ 4621], 60.00th=[ 4817],
     | 70.00th=[ 5014], 80.00th=[ 5342], 90.00th=[ 7046], 95.00th=[ 7439],
     | 99.00th=[ 7963], 99.50th=[ 8160], 99.90th=[ 8979], 99.95th=[ 9372],
     | 99.99th=[10290]
   bw (  KiB/s): min=566744, max=1359152, per=100.00%, avg=855669.28, stdev=34033.89, samples=956
   iops        : min=141686, max=339788, avg=213917.33, stdev=8508.48, samples=956
  lat (usec)   : 1000=0.01%
  lat (msec)   : 2=0.17%, 4=28.91%, 10=70.91%, 20=0.02%
  cpu          : usr=6.05%, sys=14.03%, ctx=15253024, majf=0, minf=1065
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued rwts: total=25645666,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=256


Run status group 0 (all jobs):
   READ: bw=835MiB/s (875MB/s), 835MiB/s-835MiB/s (875MB/s-875MB/s), io=97.8GiB (105GB), run=120008-120008msec


Disk stats (read/write):
  nvme0n1: ios=25611281/202, merge=0/118, ticks=122611533/491, in_queue=122612180, util=99.97%


The remote Proxmox node has the specs shown in the picture:

[screenshot: 1717405287977.png]


When running iotop, the upload speed is around 6 MB/s and the disk read tops out at 28 MB/s at best. What's the issue here?
 
what kind of backup (VM, CT)? what kind of target store? over the network, or are PBS and PVE co-located?

a basic tool like "atop" might already give you a clue where the bottleneck is.
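
For example, something like this run on the PBS while a restore is in flight should show whether the disk, CPU, or network is the limiting factor (the 2-second interval is just an example):

Code:
# refresh every 2 seconds; press 'd' to sort processes by disk activity,
# or 'n' for per-process network usage (requires the netatop kernel module)
atop 2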
 
It's backing up a VM. As for the target store: on Proxmox it's ZFS (NVMe in RAID1), and on PBS it's a single NVMe (no RAID). It's over the network; both machines are in different locations.


I'm not very familiar with ATOP, but here is a screenshot:

[screenshot: 1717408516934.png]

I use btop instead:

[screenshot: 1717408502245.png]

Both are run on the PBS. Aside from the network upload capping at 6 MB/s, I don't see anything else; journalctl shows that it's constantly retrieving chunks via the GET /chunk endpoint.
 
and on the PVE side? can you try running a proxmox-backup-client benchmark with --repository set, so that the communication happens between them and not locally on the PBS (i.e., execute it on PVE and point it at the PBS)?
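
Roughly like this, run on the PVE node (the user, host, and datastore name below are placeholders for your setup):

Code:
# executed on PVE so the traffic actually crosses the WAN link to the PBS
proxmox-backup-client benchmark --repository backupuser@pbs@pbs.example.com:datastore1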
 
From PVE to PBS, the benchmark is as follows:

[screenshot: 1717424291212.png]

ATOP on PVE:

[screenshot: 1717424672758.png]

And BTOP on PVE:

[screenshot: 1717425311080.png]

iotop shows a write speed for pbs-restore of around 30 MB/s.
 
I think the network latency is the bottleneck here. Some time ago we gave up on running PBS with PVE and PBS located in different datacenters - east and west of the EU - even though we had stable 1 Gbit/s connectivity between the servers. PBS uses lots of small chunks, and even several milliseconds of network latency add up a lot.
 
yes, given the benchmark result I'd also say that is the case here. HTTP/2 really suffers when the latency goes up.
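
As a quick check, you can measure the round-trip time between the two sites with a plain ping from the PVE node (the hostname is a placeholder); every chunk request pays at least that latency before any data arrives:

Code:
ping -c 10 pbs.example.com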
 
I can confirm that putting PVE and PBS on the same LAN is way faster.

Just a quick question regarding deduplication: when I start a restore of the 1.65 TiB VM, does it send the whole 1.65 TiB over the network, or only the deduplicated data (which in my case is around 500 GB)?
 
I can confirm that putting PVE and PBS on the same LAN is way faster.

Just a quick question regarding deduplication: when I start a restore of the 1.65 TiB VM, does it send the whole 1.65 TiB over the network, or only the deduplicated data (which in my case is around 500 GB)?
My guess is that the dedupe is on the PBS side and so it goes over the network multiple times. PVE doesn't do deduplication natively so I assume that's how it's implemented.

What you probably want to do is have a PBS at both locations and a sync job between them. That should take advantage of the dedupe, and the restore wouldn't be over the WAN.
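
For reference, a rough sketch of that setup via the CLI, registering one PBS as a remote on the other and pulling from it on a schedule (all names, the host, the fingerprint, and the schedule are placeholders):

Code:
# on the PBS that should receive the copies: register the other PBS as a remote
proxmox-backup-manager remote create primary-pbs --host pbs1.example.com \
    --auth-id sync@pam --password 'SECRET' --fingerprint <primary-fingerprint>

# pull its datastore into a local store once a day
proxmox-backup-manager sync-job create pull-primary --remote primary-pbs \
    --remote-store datastore1 --store offsite-store --schedule 'daily'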
 
PBS really needs to be local to PVE, but PBS's network targets for backup storage can be across the WAN with the likes of NFS (this introduces other issues, but the protocol is much more forgiving). As was already mentioned, latency is why. Your throughput seems to be there, but having network latency above a tolerance level (unknown for PBS, but I would put my finger on 10 ms) will create the performance condition you are experiencing.

You can also have PBS fully built locally for backups (DAS/NAS local to PBS) and replicate the backups off-site; this is what we do and it has worked quite well.
 
I can confirm that putting PVE and PBS on the same LAN is way faster.

Just a quick question regarding deduplication: when I start a restore of the 1.65 TiB VM, does it send the whole 1.65 TiB over the network, or only the deduplicated data (which in my case is around 500 GB)?
it depends. if most of that deduplicated data is empty/zero chunks, those are special-cased in a lot of places. for other chunks, we do keep a chunk cache and handle the most-used chunks as well, so unless every chunk in your case occurs exactly three times, most of the deduplicated data should only be transferred once over the network.

also keep in mind that the chunks are compressed, which of course also reduces the amount of data actually transferred.
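
If you want a rough idea of how much data actually sits on disk after deduplication and compression, you can check the size of the datastore's chunk store on the PBS (the path below is just an example location, and the figure covers every backup in that datastore, not only this VM):

Code:
# on-disk size of all deduplicated, compressed chunks in the datastore
du -sh /mnt/datastore/datastore1/.chunks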
 
Ok, thank you for the clarifications. I still find it strange that LAN-level latency is required for PBS to be effective. It makes sense to me to have PBS in a different physical location than PVE in case of a major incident. Are there plans to improve the protocol used for transfer (HTTP/2)?

Also, I thought that having to make an HTTP request for every 2 MB chunk, combined with a large amount of data to restore, adds a big overhead overall. Am I wrong on this?
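
As a rough upper bound on that overhead (VM images are split into 4 MiB fixed-size chunks by default; the 20 ms round-trip time is an assumed example, and requests are pipelined over HTTP/2, so the real figure is lower):

Code:
# number of 4 MiB chunks in a 1.65 TiB image
echo '1.65 * 1024 * 1024 / 4' | bc      # ~432537 chunks
# pure round-trip waiting if every chunk paid a full 20 ms serially
echo '432537 * 0.020 / 3600' | bc -l    # ~2.4 hours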
 
we will probably evaluate QUIC at some point, but other than that there is no low-hanging fruit at the moment to improve the high-latency experience there, I'm afraid.
 
we will probably evaluate QUIC at some point, but other than that there is no low-hanging fruit at the moment to improve the high-latency experience there, I'm afraid.
IMHO it's in the deployment method. One of the biggest things Veeam did for this was introducing backup proxy servers, where the backup job's heavy lifting gets offloaded from the backup system to a proxy server, and then the database links the backup normally when complete. Maybe that is something that could be looked into for PBS, for those that want to have one PBS per cluster in a stretched cluster? Or redesign the deployment guide based on what is supported (PBS local to the target PVEs, PBS-to-PBS replication for offsites, etc.)? I don't think QUIC is the right answer due to network controls that can/will break it (for example, QUIC is blocked where we deploy).
 
IMHO it's in the deployment method. One of the biggest things Veeam did for this was introducing backup proxy servers, where the backup job's heavy lifting gets offloaded from the backup system to a proxy server, and then the database links the backup normally when complete. Maybe that is something that could be looked into for PBS, for those that want to have one PBS per cluster in a stretched cluster? Or redesign the deployment guide based on what is supported (PBS local to the target PVEs, PBS-to-PBS replication for offsites, etc.)? I don't think QUIC is the right answer due to network controls that can/will break it (for example, QUIC is blocked where we deploy).
you can already set this up - back up to a local instance that has aggressive pruning, and sync to a remote instance that keeps long-term archives. it doesn't solve the issue that the transfer between local and remote over HTTP/2 will be slower than expected over higher-latency links ;)
 
you can already set this up - back up to a local instance that has aggressive pruning, and sync to a remote instance that keeps long-term archives. it doesn't solve the issue that the transfer between local and remote over HTTP/2 will be slower than expected over higher-latency links ;)

What happens if the link is down and so the aggressive pruning kicks in prior to transfer?

Although it might not solve it completely, I suspect the dedupe would work better PBS to PBS compared to a backup/restore job... although it would require extra storage and ongoing traffic for the sync job instead of only on demand, but good chance the backup job will already be at the remote site before you know you need it, which could solve the local and remote over http 2.0 issue...
 
What happens if the link is down and so the aggressive pruning kicks in prior to transfer?
then you'll prune more than you want, but you can delegate the pruning to the PBS doing the sync in this case to avoid that.

Although it might not solve it completely, I suspect the dedupe would work better PBS to PBS compared to a backup/restore job... although it would require extra storage and ongoing traffic for the sync job instead of only on demand, but good chance the backup job will already be at the remote site before you know you need it, which could solve the local and remote over http 2.0 issue...

I am not sure you know how PBS works under the hood, but I can't really parse what you wrote above ;)
 
