backup speed vs restore speed

tonci

My PBS setup consists of:
4 x 4T SATA (5400 rpm) -> ZFS RAID10, PBS + datastore dataset /rpool/datastore1
2 x 64G SATADOM SSD -> mirrored special device
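(For reference, the pool topology corresponds roughly to what the commands below would create; the device names are placeholders and this is only a sketch of the layout, not the commands actually used:)

zpool create rpool mirror sda sdb mirror sdc sdd special mirror sde sdf   # striped mirrors + mirrored special device
zfs create rpool/datastore1                                               # dataset used as the PBS datastore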

Backup speed gets up to 85-95% of 1G wire speed, which is quite satisfactory, but restore speed is a little questionable.

When restoring just one VM at a time, restore speed is around 40% of the bandwidth (approx. 400-450 Mbps), but when restoring several VMs concurrently (to the same host), the 1G link gets fully utilized!

So, PBS in this form is capable of serving out restore data at full wire speed.

Is there any reason why a single VM (at a time) cannot be restored at a higher speed than 400 Mbps?

Thank you very much in advance

BR

Tonci
 
what's your target storage to restore to?

what does 'proxmox-backup-client benchmark' say? (best with the repository given, so that the full end-to-end speed can be tested)
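for example (just a sketch, fill in your own repository):

proxmox-backup-client benchmark --repository <user@realm>@<pbs-host>:<datastore>   # run on the PVE host; with the repository given, the network/TLS path is included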
 
Hello,
this is what benchmark says:
[screenshot: proxmox-backup-client benchmark results]

so, I'm restoring 4 different VMs to the same target storage: ZFS RAID10, 4 x 1T SSD (local on the PVE host)
 
ok, so the connection pve <-> pbs should be enough (117 MB/s)
and the remaining chunk operations are also fast enough...

my guess is that either the target storage is not fast enough when writing a single restore stream, or the restore process is somehow at fault.
how fast is your target storage?
 
Now I tested with a HW-RAID volume (LSI 3108, 1 volume -> ZFS RAID0) and the result is the same as with the 4 x 1T server SSD ZFS RAID10.
But we are always talking about one target: all 4 VMs are being concurrently restored from this PBS to the same target... So the target is capable of writing 4 VMs at the same time while they are streaming in at a combined 1 Gbps.
How could I test this target's write speed?
When restoring the same VM from a vzdump archive to the same target, speed gets up to 900 Mbps without any problem.
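(I guess I could also get a rough number directly with fio; a minimal sketch, assuming the target pool is mounted at /data2 and an 8G test file is representative:)

fio --name=seqwrite --filename=/data2/fio-test --rw=write --bs=1M --size=8G --ioengine=psync --end_fsync=1   # sequential write onto the restore target
rm /data2/fio-test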
 
mhmm... can you post the vm config & storage config of the pve server?
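for example the output of (just a sketch, <vmid> is the VM you are restoring):

cat /etc/pve/storage.cfg
qm config <vmid>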
 
this is storage.cfg

dir: local
    path /var/lib/vz
    content images,rootdir
    shared 0

zfspool: data2
    pool data2
    content rootdir,images
    mountpoint /data2
    nodes pve02-company,pve01-company
    sparse 1

zfspool: data3
    pool data3
    content images,rootdir
    mountpoint /data3
    nodes pve01-ivero,pve02-ivero
    sparse 1

pbs: pbs01
    datastore st1-company
    server 10.168.3.8
    content backup
    fingerprint 3c:a9:b6:0b:89:09:df:fa:77:3c:d6:f3:e5:f7:3f:ec:0c:5f:67:43:98
    prune-backups keep-all=1
    username root@pam

nfs: nfs-pbs
    export /rpool/data/nfs/export
    path /mnt/pve/nfs-pbs
    server 10.168.3.8
    content backup,vztmpl,iso
    prune-backups keep-last=2



this is the VM being restored:

agent: 1
balloon: 0
boot: order=virtio0;net0;ide0
cores: 4
ide0: nfs-pbs:iso/virtio-win.iso,media=cdrom,size=528322K
machine: pc-i440fx-6.1
memory: 2048
meta: creation-qemu=6.1.1,ctime=1646467931
name: W2K16-dc
net0: virtio=DE:95:C5:52:21:96,bridge=vmbr1,firewall=1,tag=10
numa: 0
ostype: win10
scsihw: virtio-scsi-pci
smbios1: uuid=e5926879-97c1-460b-9cb4-e94a7d27ac55
sockets: 1
vga: qxl
virtio0: data2:vm-1001-disk-0,size=32G
vmgenid: fa0aba15-5b49-4a5b-8f41-0fa1dee7b32a
#qmdump#map:virtio0:drive-virtio0:data2::
 
hi, sorry for the late answer. sadly, I currently do not have a real idea of what might be wrong, aside from what I wrote above.
does this also happen when you restore a 'normal' vzdump backup, or only from PBS?
 
There is one more thing to point out (after further "combination" testing):
My cluster consists of two powerful hosts (pve1 & pve2), plus a "little" third quorum node (pve3): a Supermicro A2 board with an Atom C3558 and 4 x 4TB SATA 7200 rpm WD Red Pro drives in ZFS RAID10.
This quorum node concurrently runs PVE and PBS (they coexist very well). The PBS datastore is a dataset rpool/data-pbs, newly created on the main rpool. The PVE image store is /var/lib/vz, and there is one more dataset, data-nfs, for the NFS server that holds the vzdump archives.
So every host (3/3) has the NFS server and the PBS server attached for backup (the quorum node has its own PBS and NFS server attached to itself as backup store).
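(For reference, the PBS datastore on that dataset was created roughly like this; the exact options may have differed:)

zfs create rpool/data-pbs
proxmox-backup-manager datastore create data-pbs /rpool/data-pbs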
[screenshots: storage/datastore layout of the quorum node]


So, when I run just one VM restore from PBS onto e.g. PVE1, restore speed gets up to 300-350 Mbps. When one more VM is being restored from PBS to the same host, the restore speed of this second VM also gets up to 300-350 Mbps, but the network is utilized up to 600 Mbps -> so PBS is serving out data at 600 Mbps.
To make a long story short, I ended up restoring 6 VMs at the same time:
2 VMs .. PBS -> PVE1
2 VMs .. PBS -> PVE2
2 VMs .. PBS -> PVE3 (from the quorum host (pve3) back to itself)!
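(For reference, starting such restores from the CLI would look roughly like the lines below; the snapshot timestamps, VMIDs and target storages are placeholders:)

qmrestore pbs01:backup/vm/1001/<timestamp> 1001 --storage data2 &
qmrestore pbs01:backup/vm/1002/<timestamp> 1002 --storage data2 &
wait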

I was very surprised to notice that no restore session affected any other one: the restore speed towards e.g. PVE1 was stable (450 Mbps) during the whole restore process, independent of the fact that PBS was additionally reading and writing two VMs from itself onto itself...
Only the PBS CPU% and IO wait% rose accordingly, but the relationship between CPU and IO wait was pretty good... (picture)


[screenshot: PBS CPU% and IO wait graph]


this is the PBS network speed towards the hosts (pve1 and pve2): approx. 900 Mbps in total, 450 -> pve1 and 450 -> pve2

[screenshot: PBS network throughput graph]



... I did some tests with 2 x server SSD (ZFS RAID1) as a special device and the gain was not impressive: 10-15% for backup/restore. IMHO, in this scenario it is maybe not worth the risk of putting metadata on separate drives for such a low gain (?!)



So, IMHO, this third quorum node (Atom C3558, 4 x 4T SATA HDD) is capable of handling a pretty big read/write load and obviously has enough power to restore one VM at a time at a higher speed than 300-350 Mbps.

Is it maybe possible to add some additional logic to allow restore speed as high as the hardware (and network) can handle? ... for a scenario like this one.


Thank you very much in advance

BR

Tonci
 

one more thing... backup speed was not an issue at all: PBS was receiving data at 950 Mbps while backing up one VM, but restoring the same one ran at 350 Mbps :(


Regarding all of the above, we can say that this PBS is "faster at writing than at reading", which is not that common... write 950 <> read 350...

IMHO, there is a good chance this could be improved...

Thank you
BR
Tonci
 