Bad backup and restore performance

delpiero3

Active Member
May 17, 2020
Hi everyone,
I am going in circles with my PBS system, which does not deliver the performance it should have on paper when backing up real VMs.
I cannot get any backup from my PVE system to run faster than about 1Gbps, even though the network cards on both machines are 10Gbps and a network speed test confirms that the link performs as expected:

root@proxmox:~# iperf -c 192.168.200.213
------------------------------------------------------------
Client connecting to 192.168.200.213, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[ 1] local 192.168.200.210 port 38270 connected with 192.168.200.213 port 5001 (icwnd/mss/irtt=14/1448/153)
[ ID] Interval Transfer Bandwidth
[ 1] 0.0000-10.0146 sec 9.75 GBytes 8.36 Gbits/sec

For the record, a backup job was running at around 1Gbps during the test, as mentioned above, so an idle link would hypothetically have reached about 9.30Gbits/s. Fair enough, I guess.
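To keep the units straight when comparing the iperf figure with backup throughput, here is a small conversion sketch in Python. It assumes decimal units (1 Gbit/s = 1000 Mbit/s, 8 bits per byte) and ignores protocol overhead; the function name is made up for illustration.

```python
def gbit_to_mbyte_per_s(gbit_per_s: float) -> float:
    """Convert Gbit/s to MB/s (decimal units, no protocol overhead assumed)."""
    return gbit_per_s * 1000 / 8


iperf_gbit = 8.36    # figure reported by iperf above
backup_gbit = 1.0    # approximate backup traffic running during the test
line_gbit = 10.0     # nominal 10GbE link speed

print(f"iperf result : {gbit_to_mbyte_per_s(iperf_gbit):.0f} MB/s")
print(f"backup stream: {gbit_to_mbyte_per_s(backup_gbit):.0f} MB/s")
print(f"10GbE line   : {gbit_to_mbyte_per_s(line_gbit):.0f} MB/s")
```

Under these assumptions a 1Gbps stream corresponds to about 125 MB/s, which is the ceiling the backups appear to be hitting.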

The PVE system runs on an HP ML350 Gen9 server with the following specifications:
2x Intel Xeon E5-2650L V4 @ 1.7GHz (14 cores with hyperthreading)
256GB of RAM PC4 2400MHz ECC
HP P840ar in HBA mode
4x SSD Samsung PM1633a SAS 7.68TB in RAID 10 (block size 64K)
12x HDD WD DC H520 SAS 4Kn 14TB in RAIDZ-2 with 2 vdevs (block size 128K)
SMART status shows no issues
The 10Gbps network adapter is an HP 546SFP+

The PBS system runs on a Dell R730xd with the following specifications:
1x Intel Xeon E5-2640 V4 @ 2.4GHz (10 cores plus hyperthreading)
256GB of RAM PC4 2400MHz ECC
I tried a Dell PERC H730P Mini in HBA mode and an HBA300 without seeing any difference; I kept the HBA300
2x SSD 128GB in RAID 1 for the OS
16x 12TB HGST SAS in RAIDZ-2, also with 2 vdevs, plus 2 NVMe SSDs as a mirrored special device (see screenshots) for metadata and small files
The 10Gbps network is the embedded daughter card Dell X540+I350 (2x 10GbE and 2x 1GbE)

I have the impression that I am running a pretty decent setup with enterprise-grade equipment.
I ran the proxmox-backup-client benchmark on both the PBS host and the PVE host to compare results. Here is what I got:

From PVE:
root@proxmox:~# proxmox-backup-client benchmark --repository 192.168.***.***:******
Uploaded 340 chunks in 5 seconds.
Time per request: 14825 microseconds.

┌───────────────────────────────────┬───────────────────┐
│ Name │ Value │
╞═══════════════════════════════════╪═══════════════════╡
│ TLS (maximal backup upload speed) │ 282.91 MB/s (23%) │
├───────────────────────────────────┼───────────────────┤
│ SHA256 checksum computation speed │ 249.45 MB/s (12%) │
├───────────────────────────────────┼───────────────────┤
│ ZStd level 1 compression speed │ 268.12 MB/s (36%) │
├───────────────────────────────────┼───────────────────┤
│ ZStd level 1 decompression speed │ 425.15 MB/s (35%) │
├───────────────────────────────────┼───────────────────┤
│ Chunk verification speed │ 170.31 MB/s (22%) │
├───────────────────────────────────┼───────────────────┤
│ AES256 GCM encryption speed │ 866.01 MB/s (24%) │
└───────────────────────────────────┴───────────────────┘

From PBS itself:
root@pbs:~# proxmox-backup-client benchmark --repository *******
Uploaded 340 chunks in 5 seconds.
Time per request: 14772 microseconds.

┌───────────────────────────────────┬────────────────────┐
│ Name │ Value │
╞═══════════════════════════════════╪════════════════════╡
│ TLS (maximal backup upload speed) │ 283.92 MB/s (23%) │
├───────────────────────────────────┼────────────────────┤
│ SHA256 checksum computation speed │ 254.57 MB/s (13%) │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 compression speed │ 331.72 MB/s (44%) │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 decompression speed │ 480.07 MB/s (40%) │
├───────────────────────────────────┼────────────────────┤
│ Chunk verification speed │ 225.55 MB/s (30%) │
├───────────────────────────────────┼────────────────────┤
│ AES256 GCM encryption speed │ 1084.96 MB/s (30%) │
└───────────────────────────────────┴────────────────────┘

The values are similar on both hosts, which I presume rules out a network issue.
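To compare the rows more easily, the benchmark table can be parsed and sorted by speed. The following is a rough Python sketch that works on the PVE output pasted above; the regular expression and function name are illustrative assumptions, and the Gbit/s column assumes decimal megabytes (1 MB = 10^6 bytes).

```python
import re

# Rows copied from the PVE benchmark output above.
PVE_BENCH = """
│ TLS (maximal backup upload speed) │ 282.91 MB/s (23%) │
│ SHA256 checksum computation speed │ 249.45 MB/s (12%) │
│ ZStd level 1 compression speed │ 268.12 MB/s (36%) │
│ ZStd level 1 decompression speed │ 425.15 MB/s (35%) │
│ Chunk verification speed │ 170.31 MB/s (22%) │
│ AES256 GCM encryption speed │ 866.01 MB/s (24%) │
"""

# One table row: name, speed in MB/s, percentage in parentheses.
ROW = re.compile(r"│\s*(.+?)\s*│\s*([\d.]+) MB/s \((\d+)%\)\s*│")


def parse_benchmark(text):
    """Return {row name: (speed in MB/s, percentage shown in the table)}."""
    return {m.group(1): (float(m.group(2)), int(m.group(3)))
            for m in ROW.finditer(text)}


results = parse_benchmark(PVE_BENCH)

# Print the rows from slowest to fastest, with speed also shown in Gbit/s.
for name, (mb_s, pct) in sorted(results.items(), key=lambda kv: kv[1][0]):
    print(f"{name:<36} {mb_s:8.2f} MB/s = {mb_s * 8 / 1000:5.2f} Gbit/s ({pct}%)")
```

With these numbers the TLS row works out to roughly 2.26 Gbit/s, so the benchmark's own upload figure is already well below the 10Gbps link speed on both hosts.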
When I run a backup job, as you can see in the attachments, I get around 80-90Mbps.
I thought that adding a mirrored special device on NVMe SSDs would solve my problem, so I rebuilt my RAIDZ-2 pool to include it, but it changed nothing at all. I see no difference in performance, even though I can see data being written to the special device.
In the screenshots you can see that IO delay is very low (it never exceeds 1%), PVE bandwidth is quite steady at around 90Mbps (the peak was probably a VM doing something), and CPU load is ridiculously low. By the way, backing up a VM stored on the RAID 10 SSDs does not perform any better: another attached screenshot shows a 64GB image with a reported backup speed of 260MB/s, but I never saw the bandwidth exceed 1Gbps, which matches the write bandwidth that never goes above 100MB/s.

I really don't know where this limitation comes from. If someone can give me a hint, I would appreciate it.
 

Attachments

  • Copie d'écran_20240928_083725.png
  • Copie d'écran_20240928_085625.png
  • Copie d'écran_20240928_090124.png
  • Copie d'écran_20240928_090040.png
  • Copie d'écran_20240928_085926.png
The CPU (Intel Xeon E5-2650L V4) seems to be very slow.
Hi, thanks for the answer.
May I ask on what basis you say that? That CPU's benchmark score isn't that bad: compared to the faster Xeon E5-26xx V4 models, it only has 26% less performance. Also, the CPU load averages 5% on my machine. Can you elaborate on why you think this is the bottleneck?
 

Attachments

  • Copie d'écran_20240928_092415.png
After digging through a lot of discussions on the forum, I found this thread: https://forum.proxmox.com/threads/c...xmox-backup-server-performances.131821/page-2 and I assume I am running into the same issue, which is why you pointed me to the slow CPU performance. But from what I understood, the PBS CPU is the one that has to do the compression, right? So I could try swapping the E5-2640 V4 for a Xeon E5-2695 V4, but since the single-thread difference is only about 1%, I am not sure it would make any difference.
 
It's not only compression. See the output of "proxmox-backup-client benchmark". For example, the SHA256 checksum computation speed is only 13% of the reference system's.
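To make the percentage concrete: if the figure in parentheses is the measured speed as a share of the benchmark's reference system, the reference speed can be back-calculated. Below is a rough Python sketch; the function name is made up for illustration, and because the table rounds percentages to whole numbers the result is only approximate.

```python
def implied_reference(measured_mb_s: float, percent: int) -> float:
    """Estimate the reference speed from a measured speed and its percentage.

    Assumption: percent = measured / reference * 100, rounded in the table.
    """
    return measured_mb_s / (percent / 100)


# SHA256 rows from the two benchmark runs posted earlier in the thread.
sha256_rows = {"PVE": (249.45, 12), "PBS": (254.57, 13)}

for host, (mb_s, pct) in sha256_rows.items():
    print(f"{host}: {mb_s} MB/s at {pct}% -> reference ~{implied_reference(mb_s, pct):.0f} MB/s")
```

Both rows point to a reference of about 2000 MB/s for SHA256 (roughly 2080 and 1960 MB/s respectively, the gap coming from rounding), against about 250 MB/s measured here.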
 
Yeah, I worked out what the percentages mean from other threads after writing my posts. One question though: is that a multi-threaded process? Would adding a second CPU to my machine change anything, or do you think it isn't worth doing?
 
I don't think that will change anything...
 
