Performance Expectations

Erik Horn

New Member
Jul 25, 2024
I've been evaluating PBS and have found the throughput, especially for single-VM restores, to be lacking at 300 MB/s. The server we are using is what we had available and not ideal for the task. Rather than spending a lot of time trying to make our evaluation system run faster, I figured it would make more sense to ask what kind of performance numbers we could expect if we were to purchase a reasonable server for PBS.

My definition of reasonable would be a current-generation enterprise-class server with a decent CPU, the recommended amount of RAM, and SSDs for the OS and metadata. Main backup storage would be a dozen or more hard drives, because hundreds of TB of enterprise-class SSDs is silly expensive.

My biggest performance concern is the case of restoring a 10-20TB VM quickly after a disaster. In my testing, it appears that PBS doesn't parallelize individual VM tasks very well, which leads to large systems taking forever to complete.
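
For scale, a rough back-of-the-envelope calculation of what that throughput means for a VM this size (a minimal sketch; the faster rate is purely a hypothetical target, not a measured or vendor-claimed number):

```python
# Rough restore-time estimate for a single sequential restore stream.
# 300 MB/s is what I measured; 1000 MB/s is a purely hypothetical
# "reasonable server" target for comparison.
def restore_hours(vm_size_tb: float, throughput_mb_s: float) -> float:
    size_mb = vm_size_tb * 1_000_000  # TB -> MB, decimal units
    return size_mb / throughput_mb_s / 3600

for size_tb in (10, 20):
    for rate in (300, 1000):
        print(f"{size_tb} TB at {rate} MB/s: ~{restore_hours(size_tb, rate):.1f} h")
```

At 300 MB/s, a 20 TB VM works out to roughly 18.5 hours for a single restore stream.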

Thanks,

Erik
 
Have you thought about snapshot or reflink restores of your VMs in seconds (for full-restore "events")?
 
Have you thought about snapshot or reflink restores of your VMs in seconds (for full-restore "events")?
These backups are part of a disaster recovery plan so the assumption is that all the servers are completely lost and rebuilt from tape. Snapshots aren't useful in that situation.
 
What about a second server in another building? Maybe that's your solution, with PBS as it is, but holding "PBS-packed" images rather than ready-to-run ones?
 
Our PVE cluster will span two buildings, plus a third witness site for high-availability purposes. However, high availability is not the same as disaster recovery. We need to be able to quickly recover data in case all of the primary copies have been destroyed.

In our case, the most likely situation where this is important is when malware infects and destroys the PVE hosts.
 
I think @Erik Horn really wants to know whether he can use PBS as a rapid DR tool.

There's always some limit from the underlying hardware (and the storage model chosen), but can you expand on the following (it's been a while since I tested PBS, so I am genuinely asking without adding my bias):

In my testing, it appears that PBS doesn't parallelize individual VM tasks very well, which leads to large systems taking forever to complete.

NB I suspect this will not be directly answered by staff. ;)
 
My biggest performance concern is the case of restoring a 10-20TB VM quickly after a disaster. In my testing, it appears that PBS doesn't parallelize individual VM tasks very well, which leads to large systems taking forever to complete.
Did you notice "live restore"? https://pve.proxmox.com/wiki/Backup_and_Restore#vzdump_restore

This will start the VM a very short time after the restore begins, with reduced performance of course. In my use case I was really amazed at how well it worked. It really helps when restoring one VM at a time, rather than restoring a complete node with several of them.

Your mileage may/will vary...
 
I think @Erik Horn really wants to know whether he can use PBS as a rapid DR tool.

There's always some limit from the underlying hardware (and the storage model chosen), but can you expand on the following (it's been a while since I tested PBS, so I am genuinely asking without adding my bias):



NB I suspect this will not be directly answered by staff. ;)
When reviewing the forums, I did notice that discussions of parallelism were avoided or discouraged. This confuses me because enterprise storage systems have always required a lot of parallelism to achieve optimal performance. More recently this trend is also happening in CPUs, where the increases in aggregate performance are mostly achieved by adding cores, not increasing the performance of individual cores.
 
When reviewing the forums, I did notice that discussions of parallelism were avoided or discouraged.

My opinion only: This forum at times becomes toxic towards anyone who starts asking inconvenient questions.

That said, I really try to stick to the subject matter; people stop arguing when there's no counter-argument. This might, however, explain why you hear silence at times. :)

This confuses me because enterprise storage systems have always required a lot of parallelism to achieve optimal performance.

Don't tell staff that something confuses you out of politeness, I learned my lesson already - you will be told you do not understand "their integration well". ;)

More recently this trend is also happening in CPUs, where the increases in aggregate performance are mostly achieved by adding cores, not increasing the performance of individual cores.

Also, one would think that if Proxmox can roll out a cluster-based solution like PVE, it would be possible to do parallel recovery from a PBS-cluster-like solution. Maybe one day.

To me it all seemed backwards when I first tested PBS: the premise was that PBS can run on modest hardware because the heavy lifting happens at the other end. (I can look for quote attribution for this if challenged.) Well, I do not run my virtualisation solution so that it can be preoccupied with backup streams.
 
Did you notice "live restore"? https://pve.proxmox.com/wiki/Backup_and_Restore#vzdump_restore

This will start the VM a very short time after the restore begins, with reduced performance of course. In my use case I was really amazed at how well it worked. It really helps when restoring one VM at a time, rather than restoring a complete node with several of them.

Your mileage may/will vary...
I have considered live restore for large VMs and would use it for the systems that exceed a certain threshold. That threshold will be determined by the overall restore performance we can achieve. It could be that we need 10 concurrent live restores of VMs that are bigger than 1 TB, or 2-3 VMs that are larger than 5 TB.

And during a real recovery, we'd also be starting not-live restores of the smaller VMs as fast as we could type and click.
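
To make that concrete, here is a minimal, hypothetical sketch of scripting those kick-offs instead of clicking. The VMIDs, backup volume IDs, and target storage are placeholders, and the exact qmrestore options (including --live-restore) should be verified against the man page for your PVE version.

```python
# Hypothetical sketch: kick off several PBS restores in parallel from a PVE node.
# VMIDs, archive volume IDs and the target storage are placeholders; verify the
# exact qmrestore options (e.g. --live-restore) on your PVE version before use.
import subprocess

RESTORES = [
    # (vmid, backup volume on the PBS storage, use live-restore?)
    (101, "pbs-store:backup/vm/101/2024-07-20T02:00:00Z", True),   # large VM -> live restore
    (102, "pbs-store:backup/vm/102/2024-07-20T02:05:00Z", False),  # small VM -> normal restore
]

procs = []
for vmid, volid, live in RESTORES:
    cmd = ["qmrestore", volid, str(vmid), "--storage", "local-zfs"]
    if live:
        cmd += ["--live-restore", "1"]
    # Start each restore without waiting for the previous one to finish.
    procs.append(subprocess.Popen(cmd))

for p in procs:
    p.wait()
```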
 

Just so it's not me being toxic: I actually did not know about this, interesting tip, but it goes against the marketing:

https://www.proxmox.com/en/proxmox-backup-server/features

This literally promises "fast recovery" (ransomware protection), and "Quick Restore" is an entire section there with a reference to "lightning fast." I understand it's a pamphlet, but for a technical product there should also be a proper spec sheet with some benchmarks.
 
My opinion only: This forum at times becomes toxic towards anyone who starts asking inconvenient questions.

That said, I really try to stick to the subject matter; people stop arguing when there's no counter-argument. This might, however, explain why you hear silence at times. :)



Don't tell staff that something confuses you out of politeness, I learned my lesson already - you will be told you do not understand "their integration well". ;)



Also, one would think that if Proxmox can roll out a cluster-based solution like PVE, it would be possible to do parallel recovery from a PBS-cluster-like solution. Maybe one day.

To me it all seemed backwards when I first tested PBS: the premise was that PBS can run on modest hardware because the heavy lifting happens at the other end. (I can look for quote attribution for this if challenged.) Well, I do not run my virtualisation solution so that it can be preoccupied with backup streams.
My understanding of the backup process is that the PVE hosts do most of the backup processing. I'm not sure how the workload is distributed during a restore, but from what I saw, the PBS system does do some work, as I was seeing 10% CPU utilization on it.

However, if the separation of work is the same for both backup and restore, then I'm very surprised my performance is so low, because I consider my PVE hosts to be high-end. The PBS server isn't optimal, but in its previous life it was a backup repository, and it was only removed from production because we outgrew its 24 drive bays.
 
However, if the separation of work is the same for both backup and restore, then I'm very surprised my performance is so low, because I consider my PVE hosts to be high-end. The PBS server isn't optimal, but in its previous life it was a backup repository, and it was only removed from production because we outgrew its 24 drive bays.

So do you think you maxed out the CPU or the network? 300 MB/s is a weird limit...
 
So do you think you maxed out the CPU or the network? 300 MB/s is a weird limit...
I didn't do a lot of analysis because I didn't have enough information about what work happened where to know what to look for, which is why I originally asked what numbers people were seeing.

But if it was CPU on the PBS box, then 10% is one core of the E5-2640 v4 CPU. After reading the forums, I was assuming the issue was HDD latency, since the queue depth was averaging 2 and it would need to be around 20 just to assign one request to each drive. The drives run at 250 MB/s each, so 300 MB/s could be a lack of parallel disk requests.

Network is 10 Gb from PBS to the PVE cluster. The PVE cluster has 2x25 Gb on each of the 4 nodes.
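
To illustrate that reasoning, a crude back-of-the-envelope model (the per-request latency is a guess, so treat the output as order-of-magnitude only):

```python
# Back-of-the-envelope: throughput of a striped HDD array as a function of
# outstanding requests (queue depth). Assumes 2 MiB requests, ~12 ms average
# random-access latency per request, and 23 data-bearing spindles; all of
# these are rough guesses, not measurements.
REQUEST_MB = 2.0
LATENCY_S = 0.012        # seek + rotational latency per random 2 MiB read (guess)
PER_DRIVE_SEQ_MB_S = 250
DRIVES = 23

def estimated_mb_s(queue_depth: int) -> float:
    busy_drives = min(queue_depth, DRIVES)
    # Each busy drive completes roughly 1/LATENCY_S random requests per second,
    # capped by its sequential ceiling.
    per_drive = min(REQUEST_MB / LATENCY_S, PER_DRIVE_SEQ_MB_S)
    return busy_drives * per_drive

for qd in (1, 2, 4, 8, 23):
    print(f"queue depth {qd:2d} -> ~{estimated_mb_s(qd):6.0f} MB/s")
```

With those guesses, a queue depth of 2 lands at roughly 300 MB/s, which lines up with what I'm seeing.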
 
After reading the forums, I was assuming the issue was HDD latency, since the queue depth was averaging 2 and it would need to be around 20 just to assign one request to each drive. The drives run at 250 MB/s each, so 300 MB/s could be a lack of parallel disk requests.

Is the PBS datastore some form of RAIDZ?
 
I didn't do a lot of analysis because I didn't have enough information about what work happened where to know what to look for, which is why I originally asked what numbers people were seeing.
What drive layout are you running on the PBS datastore?
 
The drive layout is RAID 5 managed by a hardware controller, with 23 drives in the configuration.

I did some fio testing. The important parameters were direct I/O, a 2M block size, and random reads over 64 GB of data. The 300 MB/s I'm seeing is consistent with the fio results at a queue depth of 2. Testing a series of queue depths, diminishing returns set in around a queue depth of 128, at about 1750 MB/s.
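
For reference, a minimal sketch of how that queue-depth sweep could be reproduced; the test file path is a placeholder, and the parsing assumes fio's JSON output:

```python
# Hypothetical reproduction of the fio sweep described above: 2 MiB random
# reads with direct I/O over a 64 GiB test file, at increasing queue depths.
# TEST_FILE is a placeholder for a file on the PBS datastore.
import json
import subprocess

TEST_FILE = "/mnt/datastore/fio-testfile"  # placeholder path

for iodepth in (1, 2, 8, 32, 128):
    out = subprocess.run(
        [
            "fio", "--name=qd-sweep", f"--filename={TEST_FILE}",
            "--rw=randread", "--bs=2M", "--direct=1", "--ioengine=libaio",
            "--size=64G", f"--iodepth={iodepth}",
            "--runtime=60", "--time_based", "--output-format=json",
        ],
        capture_output=True, text=True, check=True,
    ).stdout
    bw_kib_s = json.loads(out)["jobs"][0]["read"]["bw"]  # reported in KiB/s
    print(f"iodepth {iodepth:3d}: {bw_kib_s / 1024:.0f} MiB/s")
```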

Something else I realized after my last post is that the CPU in this system was already a previous-generation part (or older) when we originally purchased the system. If the PBS server does anything beyond moving data from disk to network, single-threaded CPU performance could still be a limiting factor.
 
If the PBS server does anything beyond moving data from disk to network, single-threaded CPU performance could still be a limiting factor.

I would like to see someone from staff actually answer this; if not, one can always go read the code, but that just takes longer.
 
I'd just be happy to see some real-world performance numbers. Committing to this project would cost $75-100K USD, depending on the hardware chosen. It's hard to justify a large purchase with "the vendor states that it's fast but provides no additional performance information, and our evaluation shows significant performance bottlenecks."

After doing some research, I may be willing to entertain an all-flash PBS server. When I originally looked, our preferred hardware vendor had SSDs priced at about 12x per TB compared to HDDs. I've since looked into a different vendor with lower-endurance drives at about 2x the cost. I still need to run the numbers to ensure those drives would have sufficient endurance to last the life of the project (a rough sketch of that check is below). My other concern with an all-flash repo is that, based on the data I've seen for single-VM operations, it won't bring huge benefits, at least not proportional to the raw speed difference between an HDD and an SSD.
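
Here is the rough shape of that endurance check; every input below is a placeholder assumption for illustration, not our real figures:

```python
# Rough SSD endurance check: will the drives' rated TBW survive the project?
# All inputs are placeholder assumptions for illustration only.
DAILY_INGEST_TB = 5.0        # new/changed data written to the datastore per day
WRITE_AMPLIFICATION = 1.5    # filesystem/RAID overhead guess
PROJECT_YEARS = 5
DRIVES = 12
DRIVE_TBW_RATING = 3500      # rated terabytes-written per drive (example figure)

total_writes_tb = DAILY_INGEST_TB * WRITE_AMPLIFICATION * 365 * PROJECT_YEARS
per_drive_tb = total_writes_tb / DRIVES   # assumes writes spread evenly across drives

print(f"Writes per drive over {PROJECT_YEARS} years: ~{per_drive_tb:.0f} TB")
print(f"Rated TBW per drive:                        {DRIVE_TBW_RATING} TB")
print("OK" if per_drive_tb <= DRIVE_TBW_RATING else "Insufficient endurance")
```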
 
I'd just be happy to see some real-world performance numbers. Committing to this project would cost $75-100K USD, depending on the hardware chosen. It's hard to justify a large purchase with "the vendor states that it's fast but provides no additional performance information, and our evaluation shows significant performance bottlenecks."

I will try to name-drop @t.lamprecht, in case he wants to make a quick comment on this:

I did some fio testing. The important parameters were direct I/O, a 2M block size, and random reads over 64 GB of data. The 300 MB/s I'm seeing is consistent with the fio results at a queue depth of 2. Testing a series of queue depths, diminishing returns set in around a queue depth of 128, at about 1750 MB/s.

I do not want to debate something whose internals I currently have no idea about, but I think it should also be documented somewhere whether the described behaviour is the expected one. If not, this should then become a bugzilla.proxmox.com report.