Bad backup and restore performance

mauro2306

Hi everyone,
I am going in circles with my PBS system, which does not deliver the performance it should on paper when processing real VM backups.
I can't get any backup from my PVE system to exceed 1 Gbps, even though the network cards on both machines are 10 Gbps and a network speed test confirms that this speed is actually reachable:

root@proxmox:~# iperf -c 192.168.200.213
------------------------------------------------------------
Client connecting to 192.168.200.213, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[ 1] local 192.168.200.210 port 38270 connected with 192.168.200.213 port 5001 (icwnd/mss/irtt=14/1448/153)
[ ID] Interval Transfer Bandwidth
[ 1] 0.0000-10.0146 sec 9.75 GBytes 8.36 Gbits/sec

For the record, a backup was running at around 1 Gbps during this test, as stated above, so the link would hypothetically have reached about 9.3 Gbits/s without it. Fair enough, I guess.
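Since the thread mixes Gbit/s and MB/s figures, a small conversion helps when comparing them. This is only a sketch using decimal units (1 Gbit = 1000 Mbit, 8 bits per byte); the helper name `gbit_to_mbyte_per_s` is made up for illustration, and the input numbers are the ones quoted above.

```python
def gbit_to_mbyte_per_s(gbit_per_s: float) -> float:
    """Convert a rate in Gbit/s to MB/s (decimal units, 8 bits per byte)."""
    return gbit_per_s * 1000 / 8


# Figures quoted in the post
iperf_measured = 8.36    # Gbit/s reported by iperf
backup_in_flight = 1.0   # Gbit/s, approximate rate of the concurrent backup

link_total = iperf_measured + backup_in_flight
print(f"combined link usage: {link_total:.2f} Gbit/s")
print(f"iperf result: {gbit_to_mbyte_per_s(iperf_measured):.0f} MB/s")
print(f"1 Gbit/s:     {gbit_to_mbyte_per_s(1.0):.0f} MB/s")
```

So a backup stuck at about 1 Gbit/s corresponds to roughly 125 MB/s, far below what the link can carry.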

The PVE system runs on an HP ML350 Gen9 server with the following specifications:
2x Intel Xeon E5-2650L v4 @ 1.7 GHz (14 cores with hyperthreading)
256 GB of RAM, PC4 2400 MHz ECC
HP P840ar in HBA mode
4x SSD Samsung PM1633a SAS 7.68 TB in RAID 10 (block size 64K)
12x HDD WD DC H520 SAS 4Kn 14 TB in RAIDZ-2 with 2 vdevs (block size 128K)
SMART status shows no issues
10 Gbps network is an HP 546SFP+ adapter

The PBS system runs on a Dell R730xd with the following specifications:
1x Intel Xeon E5-2640 v4 @ 2.4 GHz (10 cores plus hyperthreading)
256 GB of RAM, PC4 2400 MHz ECC
I tried a Dell PERC H730P Mini in HBA mode and an HBA330... 

I have the impression that I am running a pretty decent setup with enterprise-grade equipment.
I ran the proxmox-backup-client benchmark on both the PBS host and the PVE host to compare results. Here is what I got:

From PVE :
root@proxmox:~# proxmox-backup-client benchmark --repository 192.168.***.***:******
Uploaded 340 chunks in 5 seconds.
Time per request: 14825 microseconds.

┌───────────────────────────────────┬───────────────────┐
│ Name │ Value │
╞═══════════════════════════════════╪═══════════════════╡
│ TLS (maximal backup upload speed) │ 282.91 MB/s (23%) │
├───────────────────────────────────┼───────────────────┤
│ SHA256 checksum computation speed │ 249.45 MB/s (12%) │
├───────────────────────────────────┼───────────────────┤
│ ZStd level 1 compression speed │ 268.12 MB/s (36%) │
├───────────────────────────────────┼───────────────────┤
│ ZStd level 1 decompression speed │ 425.15 MB/s (35%) │
├───────────────────────────────────┼───────────────────┤
│ Chunk verification speed │ 170.31 MB/s (22%) │
├───────────────────────────────────┼───────────────────┤
│ AES256 GCM encryption speed │ 866.01 MB/s (24%) │
└───────────────────────────────────┴───────────────────┘

From PBS itself :
root@pbs:~# proxmox-backup-client benchmark --repository *******
Uploaded 340 chunks in 5 seconds.
Time per request: 14772 microseconds.

┌───────────────────────────────────┬────────────────────┐
│ Name │ Value │
╞═══════════════════════════════════╪════════════════════╡
│ TLS (maximal backup upload speed) │ 283.92 MB/s (23%) │
├───────────────────────────────────┼────────────────────┤
│ SHA256 checksum computation speed │ 254.57 MB/s (13%) │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 compression speed │ 331.72 MB/s (44%) │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 decompression speed │ 480.07 MB/s (40%) │
├───────────────────────────────────┼────────────────────┤
│ Chunk verification speed │ 225.55 MB/s (30%) │
├───────────────────────────────────┼────────────────────┤
│ AES256 GCM encryption speed │ 1084.96 MB/s (30%) │
└───────────────────────────────────┴────────────────────┘

The values look similar, which I presume rules out the network as the cause.
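To put the percentages in perspective, here is a rough sketch that back-computes the speed each percentage seems to be measured against. It assumes the figure in parentheses is the share of a reference system's result, and since the printed percentages are rounded, the outputs are approximate. The dictionary labels and the 1250 MB/s line rate (10 Gbit/s in decimal units) are my own shorthand, not output of the tool.

```python
# (value in MB/s, percent of reference) as printed in the PVE benchmark above
pve_results = {
    "TLS upload": (282.91, 23),
    "SHA256": (249.45, 12),
    "ZStd compression": (268.12, 36),
    "ZStd decompression": (425.15, 35),
    "Chunk verification": (170.31, 22),
    "AES256 GCM": (866.01, 24),
}

LINE_RATE_MB_S = 1250  # 10 Gbit/s expressed in MB/s (assumption: decimal units)


def implied_reference(value_mb_s: float, percent: float) -> float:
    """Speed of the reference system implied by a value and its percentage."""
    return value_mb_s / (percent / 100)


for name, (value, pct) in pve_results.items():
    ref = implied_reference(value, pct)
    note = "below 10G line rate" if value < LINE_RATE_MB_S else ""
    print(f"{name:20s} {value:8.2f} MB/s  (reference ~{ref:5.0f} MB/s) {note}")
```

Every measured stage on this host comes out below what a 10 Gbit/s link could carry, which is at least consistent with the bottleneck being somewhere other than the network.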
When I run a backup job, as you can see in the attachments, I get around 80-90Mbps.
I thought that adding a mirrored special device on NVMe SSDs would solve my problem, so I reconfigured my RAIDZ-2 pool to include it, but it did not change anything at all. I see no difference in performance, even though I can see data being written to the special device.
In the screenshots you can see that IO delay is very low (it never goes above 1%), PVE bandwidth is quite steady at around 90Mbps (the peak was probably a VM doing something), and CPU load is negligible. By the way, backing up a VM whose storage is on the RAID 10 SSDs does not perform any better. Another attached screenshot shows a 64GB image where the job reports a 260MB/s backup, but I never saw the bandwidth exceed 1Gbps, which matches the write bandwidth that never goes above 100MB/s.

I really don't know where this limitation comes from. Can someone give me a hint?
 

Attachments

  • Copie d'écran_20240928_083725.png
  • Copie d'écran_20240928_085625.png
  • Copie d'écran_20240928_090124.png
  • Copie d'écran_20240928_090040.png
  • Copie d'écran_20240928_085926.png

The CPU (Intel Xeon E5-2650L V4) seems to be very slow.
Hi, thanks for the answer.
May I ask on what basis you say that? The benchmark score of that CPU isn't that bad: compared to the faster models of the Xeon E5-26xx v4 generation, it only has 26% less performance. Also, the CPU load averages 5% on my machine. Can you elaborate on why you think this is the bottleneck?
 

Attachments

  • Copie d'écran_20240928_092415.png

After digging through a lot of discussions on the forum, I found this thread: https://forum.proxmox.com/threads/c...xmox-backup-server-performances.131821/page-2 and I assume I am running into the same issue, which is why you pointed me to the slow CPU performance. But from what I understood, the PBS CPU is the one that has to do the compression, right? So I could try swapping the E5-2640 v4 for a Xeon E5-2695 v4, but since the single-thread difference is only 1%, I am not sure it would make any difference.
 
It's not only compression. See the output of "proxmox-backup-client benchmark". For example, the SHA256 checksum computation speed is only 13% of that of the reference system.
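For anyone who wants a second opinion on the SHA256 figure, a quick single-threaded measurement with Python's standard `hashlib` can serve as a sanity check. This is only a sketch: it does not reproduce how proxmox-backup-client measures, the 4 MiB chunk size is an assumption chosen for illustration, and Python's hashing backend may use different CPU instructions than the PBS client does.

```python
import hashlib
import os
import time


def sha256_throughput_mb_s(chunk_size: int = 4 * 1024 * 1024, rounds: int = 32) -> float:
    """Hash `rounds` copies of one random chunk on a single thread; return MB/s."""
    data = os.urandom(chunk_size)  # generated before the timer starts
    start = time.perf_counter()
    for _ in range(rounds):
        hashlib.sha256(data).digest()
    elapsed = time.perf_counter() - start
    return (chunk_size * rounds) / elapsed / 1_000_000


if __name__ == "__main__":
    print(f"single-thread SHA256: {sha256_throughput_mb_s():.0f} MB/s")
```

If the number lands in the same range as the benchmark table, that points at per-core hashing speed on this CPU rather than at storage or network.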
 
Yeah, I figured out what the percentages mean from other threads after writing my posts. One question though: is that a multi-threaded process? Would adding a second CPU to my machine change anything, or do you think it isn't worth doing?
 
I don't think that will change anything...
 
That's sad, really. I understand the use case you describe, where PBS is designed this way so that large infrastructures can restore VMs in a very short time, but on the other hand I am not sure that this is the most common use case out there.
I will paste what I wrote in another post that I found while searching for other people in my situation. I hope some PM will look at it (I am also a PM, but in a very different software industry: physical security software). I added a few arguments compared to the other post, drawing on my experience in the software industry I work in.

I get the point about SSDs eventually replacing HDDs in the long run, but today the price-to-capacity ratio still does not favor SSDs in most cases. For backup purposes you may not need something that reaches multiple GB/s: most network links are 10Gbps, so I think most use cases would be satisfied with at least reaching that speed, which is doable even with HDDs in a well-designed RAIDZ setup. You may just need a bigger volume to hold all your data. As for the argument that compression saves space, it depends on what you back up. Not all data is compressible, which reduces the gain while you still spend CPU on it. For example, in my industry, video surveillance, you can't gain anything on H.264 or H.265 video, so if your customer asks you to back up specific VMs with sensitive recordings, you are screwed.
So either PBS is really made for very high-end infrastructure and doesn't care about the lower end, or some additional settings to remove (or at least adjust) the compression level and a few other things will be required. In my case, for example, replacing a server with a more powerful one is way, way more expensive than adding more storage to compensate for the absence of compression.
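The compressibility point is easy to demonstrate. The sketch below is only an illustration under stated assumptions: it uses `zlib` at level 1 from the Python standard library as a stand-in for ZStd level 1, random bytes as a stand-in for an H.264/H.265 payload (already-compressed video looks close to random to a general-purpose compressor), and a repeated text line as a stand-in for compressible data.

```python
import os
import zlib


def compression_ratio(data: bytes, level: int = 1) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data, level)) / len(data)


size = 1024 * 1024
already_compressed = os.urandom(size)                  # stand-in for video payload
repetitive = b"log line: status=ok\n" * (size // 20)   # stand-in for compressible data

print(f"video-like data: {compression_ratio(already_compressed):.3f}")
print(f"repetitive data: {compression_ratio(repetitive):.3f}")
```

The video-like input comes out at essentially its original size, so the CPU time spent compressing it buys nothing, while the repetitive input shrinks to a small fraction.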
 
But wait a moment: if I don't encrypt my backups, this bad SHA256 result doesn't really matter, does it? The rest makes more sense, right? Then the second bottleneck would be the chunk verification speed, right? Since my pool uses a special device with 2 NVMe SSDs in a mirror, this result doesn't seem right. What do you think?
 
Understood, thanks. One last question: would it make any difference to back up to an NFS volume on a simple Linux distribution instead of Proxmox Backup Server?