PBS backup faster than restore

tonci

Renowned Member
Jun 26, 2010
I installed PBS in a VM and attached one virtio drive for the datastore ... added the datastore through the GUI (Storage / Disks / ...) etc.
Everything works fine except that the backup is about 20% faster than the restore (?!) ... and a (traditional) restore from the NFS backup server (vzdump) is about 3x faster.


[attachment: 1610015281257.png]

and on the PVE host where PBS is hosted we can see a higher I/O wait during the restore than during the backup period (?!):
[attachment: 1610016138168.png]

I kindly ask for some hints on how to optimize and speed up this restore.

Thank you very much in advance

BR
Tonci
 
Here is another test/comparison:
1. This 20 MB/s restore is from PBS (virtualized on hostX) and it stretches across the whole graph.
2. From 13:15 to 13:20 a restore from the Netgear NFS server runs, so it is much faster than from PBS.
3. From 13:40 to 13:50 vzdump archiveX is copied from the Netgear nfsX server to the virtualized Ubuntu NFS server (also on hostX).
4. From 13:50 to 14:00 this archiveX from the nfsX server is restored, again at high speed.

[attachment: 1610544306305.png]

Here we can see the read speed (net out) of the virtualized PBS server on hostX. During the vzdump archiveX transfer, the net-out speed dropped to 6-7 MB/s, which is acceptable and realistic (because of the intensive write).
During the archiveX restore, the net-out speed from PBS dropped to 15 MB/s ... which is also acceptable, because there is another read stream running.
[attachment: 1610544537393.png]
But this generally low PBS restore speed is not acceptable at all.

Conclusion: hostX is capable of sending out parallel restore streams (one from virt-PBS and another from virt-ubuntu-NFS), but the NFS stream runs at the expected speed while the PBS stream does not ... IMHO I would expect PBS to send out data 3-4 times faster than this (60-70 MB/s) ...
So we cannot blame the hostX hardware (and disks) as the bottleneck, but rather the PBS "store" concept as such, I guess ... I'm aware that this cannot be compared 1:1 (in PBS the VM data is spread out through chunk folders), but this must somehow be improvable ... Or maybe we should use faster disks for the backup server than for the PVE host (?!), which does not make that much sense ...
But I still think I misconfigured something, so I kindly ask for support.

Thank you very much in advance

BR

Tonci
 
I'm sure things are not trivial at all ... I can only presume that maybe the number of snapshots affects the restore time ... we are not talking about one archive (vma) file that is read continuously ... here we have a kind of "chaotic mosaic" spread all over the datastore (more than 65000 data-chunk folders), and the restore procedure must take into account all the snapshots made after the full (1st) backup and connect them accordingly, etc. ...
or ?
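Just to illustrate what I mean by that mosaic: a minimal Python sketch of the restore-side access pattern, assuming the documented .chunks/<first 4 hex digits>/<digest> datastore layout, a hypothetical /mnt/datastore path, and a pre-built chunk index (the real restore code in proxmox-backup is written in Rust; this is only a rough picture):

```python
import os
import time

DATASTORE = "/mnt/datastore"              # hypothetical datastore mount point
CHUNK_ROOT = os.path.join(DATASTORE, ".chunks")

def chunk_path(digest_hex: str) -> str:
    # content-addressed layout: .chunks/<first 4 hex chars>/<full sha256 digest>
    return os.path.join(CHUNK_ROOT, digest_hex[:4], digest_hex)

def restore_image(index: list[str], out_file: str) -> None:
    """Reassemble a disk image by reading every referenced chunk in index order.

    Neighbouring chunks of the guest disk usually sit in different
    subdirectories, so on spinning disks this becomes mostly random reads,
    unlike the single sequential read of a vzdump .vma archive."""
    start = time.time()
    total = 0
    with open(out_file, "wb") as out:
        for digest in index:
            with open(chunk_path(digest), "rb") as chunk:
                data = chunk.read()       # ~4 MiB per chunk for VM backups
            out.write(data)
            total += len(data)
    secs = time.time() - start
    print(f"restored {total / 1024**2:.0f} MiB in {secs:.0f} s "
          f"({total / 1024**2 / secs:.1f} MiB/s)")
```

So even with only one snapshot, the reads are scattered over the chunk folders instead of being one long sequential stream.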
 
Thank you for your explanation ... How can we "influence" the chunk size? What kind of setup would be best practice (optimal)? Can this chunk size be managed, or does it depend on the hardware (spinning hard disk / SSD)?
I do have a server with 4 SATA ports available and 4x 8T SATA HDDs ... What would be the best scenario?
thank you in advance
BR
Tonci
 
How can we "influence" the chunk size?
You cannot; for VM backups it is currently 4 MiB, and for containers there are dynamically sized chunks, but on average you should also get roughly 4 MiB chunks there.

How is your storage formatted (fs/layout)? What is the baseline performance of those disks?
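For illustration, fixed-size, content-addressed chunking roughly works like the following minimal Python sketch (the 4 MiB chunk size and SHA-256 digests match the documentation; the rest, including the file path, is just an example and not the actual implementation, which is written in Rust):

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024   # 4 MiB, fixed for VM (disk image) backups

def chunk_digests(image_path: str):
    """Split a disk image into fixed 4 MiB chunks and yield their SHA-256 digests.

    The digest is the chunk's identity in the datastore: a chunk whose digest
    already exists is not stored again (deduplication), which keeps backups
    fast and small, while a restore has to fetch every referenced chunk."""
    with open(image_path, "rb") as img:
        while True:
            chunk = img.read(CHUNK_SIZE)
            if not chunk:
                break
            yield hashlib.sha256(chunk).hexdigest()

if __name__ == "__main__":
    digests = list(chunk_digests("/tmp/test.img"))   # hypothetical test image
    print(f"{len(digests)} chunks, {len(set(digests))} unique")
```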
 
You cannot; for VM backups it is currently 4 MiB, and for containers there are dynamically sized chunks, but on average you should also get roughly 4 MiB chunks there.

How is your storage formatted (fs/layout)? What is the baseline performance of those disks?
I've set up a new environment: bare-metal PBS, with a datastore consisting of 4x 2T (ZFS RAID10) 7.2k rpm disks:
[attachment: 1610750921283.png]
[attachment: 1610750991878.png]
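Regarding the baseline-performance question above: a rough sequential-read check could look like this minimal Python sketch (the /mnt/datastore/testfile path is hypothetical; the file should be larger than RAM, or freshly written, so the ZFS ARC does not just serve it from memory, and a proper tool like fio would of course be more thorough):

```python
import time

TEST_FILE = "/mnt/datastore/testfile"   # hypothetical large file on the pool
BLOCK = 4 * 1024 * 1024                 # read in 4 MiB blocks, like PBS chunks

def sequential_read_mib_s(path: str) -> float:
    """Read the whole file once and return the average throughput in MiB/s."""
    total = 0
    start = time.time()
    with open(path, "rb", buffering=0) as f:
        while True:
            data = f.read(BLOCK)
            if not data:
                break
            total += len(data)
    return total / 1024**2 / (time.time() - start)

if __name__ == "__main__":
    print(f"{sequential_read_mib_s(TEST_FILE):.1f} MiB/s sequential read")
```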
 
I noticed somewhat strange (to me) disk activity on PBS:
[attachment: 1610829247503.png]
1. I made a VM backup and it lasted until 20:30.
2. At 20:35 I started garbage collection in order to free the PBS datastore from everything but this one VM backup.
3. I waited until 21:08 because of this blue "write" activity, assuming it belonged to the garbage collection.
4. I started a restore of the same VM, and besides the green "read" it started to write at the same time, and kept writing during the whole restore.

Since the only activity on this PBS is this restore, my question would be: why is there much more "write" than "read"? ... As if it were constantly re-positioning chunks (?!)
 
3. I waited until 21:08 because of this blue "write" activity, assuming it belonged to the garbage collection.
No, it really doesn't; garbage collection will not write much to disk, it only reads, possibly deletes, and writes the status to disk.

Can you check with 'iotop' or similar what writes so much on your datastore?
If there is no backup running, there really should not be much write activity.
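If iotop is not available, a minimal Python sketch using the third-party psutil package can give a similar per-process view of write throughput (the sampling interval and output format here are arbitrary, and reading other processes' I/O counters typically needs root):

```python
import time
import psutil

def top_writers(interval: float = 5.0, top_n: int = 5) -> None:
    """Sample per-process write bytes twice and print the biggest writers."""
    before = {}
    for proc in psutil.process_iter(["pid", "name"]):
        try:
            before[proc.pid] = (proc.info["name"], proc.io_counters().write_bytes)
        except (psutil.AccessDenied, psutil.NoSuchProcess):
            continue
    time.sleep(interval)
    deltas = []
    for proc in psutil.process_iter(["pid", "name"]):
        try:
            old = before.get(proc.pid)
            if old is None:
                continue
            written = proc.io_counters().write_bytes - old[1]
            if written > 0:
                deltas.append((written, proc.pid, old[0]))
        except (psutil.AccessDenied, psutil.NoSuchProcess):
            continue
    for written, pid, name in sorted(deltas, reverse=True)[:top_n]:
        print(f"{name} (pid {pid}): {written / 1024**2 / interval:.1f} MiB/s written")

if __name__ == "__main__":
    top_writers()
```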