Backup/restore low speed - what is wrong?

m3a2r1

I'm testing the following setup:
PVE 7.3.4 on SSDs (RAID10), and a QNAP NAS with NVMe SSDs. Both are connected via fiber network adapters.
The QNAP is mounted as an NFS share.
When I test that share with
time sh -c "dd if=/dev/zero of=testfile bs=16k count=320000 && sync"
I get about 400MB/s write speed.
But when I run a backup, the maximum speed is about 110MB/s (average about 100MB/s); restore is faster, about 220MB/s.

proxmox-backup-client shows the values in the attachment.

I have the same issue with my other Proxmox machines and Proxmox Backup Servers. It's even worse when PVE is on HDDs instead of SSDs: restore speed is about 20MB/s.
I like Proxmox very much, but such low backup/restore speed is disqualifying for production use :( If I can't fix this, I will have to switch to another virtualization system, which I really don't want to do (I've been using it for about 3 years, so I know and like it).

What should I fix or change in my setup to get maximum disk transfer rates during backup/restore? I don't have any idea :(
 

Attachments: screenshot.496.jpg (99.6 KB)
Do not use dd to test speed; there are plenty of threads that describe how to benchmark a disk to get realistic results.
 
Yes, exactly. There should be some fio configurations on the forum that mirror the PBS behavior.
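As a rough sketch, something like the following approximates PBS writing its 4MB chunks to the datastore (the directory path is a placeholder, and the exact parameters recommended in those threads may differ):
Code:
# directory is a placeholder - point it at the storage backing the datastore
fio --name=pbs-like-write --directory=/path/to/datastore --size=4g --bs=4M --rw=write --ioengine=libaio --iodepth=4 --numjobs=1 --direct=1 --end_fsync=1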
 
So I'll test all of my nodes and PBS servers (if I can find out what parameters I should use for these tests).
 
I've used
Code:
fio --size=8g --rw=randread --bs=1M --ioengine=libaio --iodepth=32 --runtime=60 --numjobs=1 --direct=1 --time_based --name=WriteAndRead --end_fsync=1 --gtod_reduce=1
on my PBS, but it shows transfers of about 10GB/s. Why?
 
To clarify: is the QNAP/NFS share your VM storage? Change to that folder, run fio, and post the output.
What is your backup storage? Run a proxmox-backup-client benchmark against the repo and post the output; an example command is below.
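For example (the repository below is a placeholder; use your own PBS user, host, and datastore):
Code:
# placeholder repository - replace user, host, and datastore with your own
proxmox-backup-client benchmark --repository backup@pbs@192.168.1.10:datastore1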
 
PBS is a VM on Proxmox; its storage is an NFS share from another VM (TrueNAS) on the same PVE. I was trying to move the disk to local PVE storage, but it would take a few days; transfer is about 60MB/s.
 
I'm totally helpless here; I'm trying to back up a 650GB VM.
The source is file-based storage on RAID60 (HDDs) + a 10Gbit network interface.
The destination is PBS on SSD RAID10 (Ultrastar SS300) + a 10Gbit network interface.

1680271643900.png

Please help me figure out how to diagnose this; I don't even know where to start.
 
You need to rule out storage issues first.
Post the exact storage model + CPU of your QNAP and your PVE host.
Post fio output with the settings direct=1, sync=1, bs=4k, numjobs=1, rw=read on your source storage (a full command along those lines is sketched below).
Then do the same for the destination storage.
Does the VM use "host" as its CPU type?
Check with iperf too.
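Something like this, assuming a test file on the storage being measured (the path is a placeholder):
Code:
# the test file path is a placeholder - put it on the storage you want to measure
fio --name=seqread-4k --filename=/mnt/teststorage/fio.test --size=1g --bs=4k --rw=read --ioengine=libaio --iodepth=1 --numjobs=1 --direct=1 --sync=1 --runtime=60 --time_based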

edit: after re-reading, as floh8 said, clarification is needed: where is the VM storage, and where is the PBS datastore?
 
OK, I'll do it tomorrow. I'll post everything about the source and destination along with the fio tests (I've already checked iperf; it was about 6-7Gbit/s between the two machines).
 
after re-reading, as @floh8 said, clarification is needed: where is the VM storage, and where is the PBS datastore?
As far as I understand:
PBS VM + TrueNAS VM on the same PVE. The PBS VM is then using an NFS share from that TrueNAS VM, which is using ZFS. What's not clear is how the HDDs/SSDs are provided to the PBS/TrueNAS VMs.
 
As far as I understand:
PBS VM + TrueNAS VM on the same PVE. The PBS VM is then using an NFS share from that TrueNAS VM, which is using ZFS. What's not clear is how the HDDs/SSDs are provided to the PBS/TrueNAS VMs.
OK, I'll clarify; this is not the config from the first post.
The machine I'm trying to back up is on my 1st PVE: the physical machine is an HP DL380 G8, 384GB RAM, 2x 8-core CPU, SAS disks on RAID60; storage for VM drives is a directory (so file-based). PBS is on the 2nd PVE: a DL380 G9, 256GB RAM, 2x 6-core CPU, SSD drives on RAID10, storage for VM drives on LVM. There is no TrueNAS here; the disk for PBS is provided directly from PVE.
 
I hope it's only me, because I'm still confused.
Don't forget to run iperf from within the VM too.
 
OK, I'm starting:
Source machine (pve1):
DL380 G8, 2x 8-core CPU, 384GB RAM, RAID60 on SAS HDDs, VM storage as a directory.
fio tests for rw (the same command run once per block size):
fio --name=WriteAndRead --size=1g --bs={4K,64K,1MB} --rw=rw --ioengine=libaio --sync=0 --iodepth=32 --numjobs=1 --direct=1 --end_fsync=1 --gtod_reduce=1 --time_based --runtime=60

(main pve drive - /dev/sda) :
4KB
read: IOPS=13.4k, BW=52.4MiB/s (54.9MB/s)(3142MiB/60011msec)
write: IOPS=13.4k, BW=52.3MiB/s (54.8MB/s)(3137MiB/60011msec)
64KB
read: IOPS=2092, BW=131MiB/s (137MB/s)(7855MiB/60050msec)
write: IOPS=2091, BW=131MiB/s (137MB/s)(7849MiB/60050msec)
1MB
read: IOPS=250, BW=251MiB/s (263MB/s)(14.7GiB/60215msec)
write: IOPS=250, BW=250MiB/s (262MB/s)(14.7GiB/60215msec)

(vm disk storage drive - /dev/sdb):
4KB
read: IOPS=24.1k, BW=94.2MiB/s (98.8MB/s)(5653MiB/60002msec)
write: IOPS=24.1k, BW=94.1MiB/s (98.7MB/s)(5647MiB/60002msec)
64KB
read: IOPS=5293, BW=331MiB/s (347MB/s)(19.4GiB/60056msec)
write: IOPS=5287, BW=330MiB/s (347MB/s)(19.4GiB/60056msec)
1MB
read: IOPS=622, BW=622MiB/s (653MB/s)(36.5GiB/60098msec)
write: IOPS=622, BW=622MiB/s (653MB/s)(36.5GiB/60098msec)

Destination machine (pve2):
DL380 G9, 2x 6-core CPU, 256GB RAM, RAID10 on SSDs (Ultrastar SS300), storage on LVM.
fio tests for rw (same command, run once per block size):
fio --name=WriteAndRead --size=1g --bs={4K,64K,1MB} --rw=rw --ioengine=libaio --sync=0 --iodepth=32 --numjobs=1 --direct=1 --end_fsync=1 --gtod_reduce=1 --time_based --runtime=60

(main pve drive - /dev/sda) :
4KB
read: IOPS=4363, BW=17.0MiB/s (17.9MB/s)(1023MiB/60013msec)
write: IOPS=4370, BW=17.1MiB/s (17.9MB/s)(1025MiB/60013msec)

64KB
read: IOPS=1522, BW=95.2MiB/s (99.8MB/s)(5710MiB/60005msec)
write: IOPS=1518, BW=94.9MiB/s (99.5MB/s)(5694MiB/60005msec)

1MB
read: IOPS=576, BW=577MiB/s (605MB/s)(33.8GiB/60012msec)
write: IOPS=577, BW=577MiB/s (605MB/s)(33.8GiB/60012msec)

The same tests on PBS (a VM on pve2):
4KB
read: IOPS=22.1k, BW=86.2MiB/s (90.4MB/s)(5184MiB/60165msec)
write: IOPS=22.0k, BW=86.1MiB/s (90.3MB/s)(5179MiB/60165msec)

64KB
read: IOPS=15.8k, BW=990MiB/s (1038MB/s)(59.2GiB/61238msec)
write: IOPS=15.8k, BW=988MiB/s (1036MB/s)(59.1GiB/61238msec)

1MB
read: IOPS=2247, BW=2248MiB/s (2357MB/s)(136GiB/61902msec)
write: IOPS=2251, BW=2252MiB/s (2361MB/s)(136GiB/61902msec)

and 4MB (the PBS chunk size, if I understood correctly):
read: IOPS=580, BW=2323MiB/s (2436MB/s)(140GiB/61807msec)
write: IOPS=581, BW=2324MiB/s (2437MB/s)(140GiB/61807msec)

iperf from pve1 to pve2:
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 6.13 GBytes 5.26 Gbits/sec

iperf from pve2 to pve1:
[ ID] Interval Transfer Bandwidth
[ 3] 0.0000-10.0031 sec 3.78 GBytes 3.24 Gbits/sec

iperf from pve1 to pbs:
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 9.64 GBytes 8.28 Gbits/sec

iperf from pbs to pve1:
[ ID] Interval Transfer Bandwidth
[ 3] 0.0000-10.0012 sec 3.75 GBytes 3.22 Gbits/sec

Below, a VM backup from pve1 to PBS:
1680347218883.png

And a VM restore on pve2 from PBS:
1680354499846.png
edit: that's only the first drive (500GB); I'm still waiting for the 2nd drive (150GB) to restore.

Summary: dramatically low backup and restore speeds.
Let me know what I should test, change, or improve.
 
OK. Run the backup benchmark once again against a repo on pve1/sdb.
Do you use mdadm or a hardware controller as the RAID platform on both servers?
 
What do you mean by "with a repo from pve1/sdb"?
The DL380 G8 has a Smart Array P420 RAID controller and the G9 has a Smart HBA H240ar.
 
