[SOLVED] Performance Issue: Improving Timing cached reads

honeyfairy · Sep 17, 2023

Need assistance on an issue I've been facing with my HP ProLiant ML350 Gen10 server running Proxmox. Specifically, I'm interested in improving the "Timing cached reads" performance metric on my server.

Server Specs:

Server Model: HP ProLiant ML350 Gen10

CPU: Intel Xeon Bronze 3106

Storage: HP MO000800JWTBR-MSA-LF - HP 800GB SAS 12G MU LFF SSD for MSA Storage

RAID Controller: HPE Smart Array P408i-a SR Gen10 Controller

Memory: HP DDR4 SmartMemory 16GB

Performance Test Results:
I have conducted performance tests on my server using hdparm -tT to measure the "Timing cached reads" performance. Below are the results from my server and a DigitalOcean instance for comparison:

DigitalOcean:

Code:

/dev/vda:
Timing cached reads: 37718 MB in 2.00 seconds = 18892.70 MB/sec
HDIO_DRIVE_CMD(identify) failed: Inappropriate ioctl for device
Timing buffered disk reads: 3154 MB in 3.00 seconds = 1051.20 MB/sec

My HP Gen10 ML350 Server:

Code:

/dev/sde:
Timing cached reads: 10298 MB in 1.99 seconds = 5184.00 MB/sec
Timing buffered disk reads: 4804 MB in 3.00 seconds = 1598.70 MB/sec

As you can see, there's a significant difference in the "Timing cached reads" performance between my server and the DigitalOcean instance, with DigitalOcean having considerably higher performance.
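
To put a number on that gap, here is a minimal sketch that parses the two "Timing cached reads" lines quoted above and compares the reported rates. The dictionary labels and the helper name `parse_rate` are made up for this illustration; only the two hdparm lines come from the post.

```python
import re

# Lines copied from the two hdparm outputs above.
RESULTS = {
    "DigitalOcean /dev/vda": "Timing cached reads: 37718 MB in 2.00 seconds = 18892.70 MB/sec",
    "HP ML350 Gen10 /dev/sde": "Timing cached reads: 10298 MB in 1.99 seconds = 5184.00 MB/sec",
}

PATTERN = re.compile(r"(\d+) MB in\s+([\d.]+) seconds = ([\d.]+) MB/sec")

def parse_rate(line):
    """Return (megabytes, seconds, reported MB/sec) from one hdparm timing line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError(f"not an hdparm timing line: {line!r}")
    return int(match.group(1)), float(match.group(2)), float(match.group(3))

# Compare the rates hdparm itself reported for the two machines.
rates = {name: parse_rate(line)[2] for name, line in RESULTS.items()}
ratio = rates["DigitalOcean /dev/vda"] / rates["HP ML350 Gen10 /dev/sde"]
print(f"cached-read ratio: {ratio:.2f}x")
```

By this measure the DigitalOcean instance reports roughly 3.6 times the cached-read rate of the ML350.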

My Questions:
What hardware and software changes can I make to improve the "Timing cached reads" performance on my server?
Are there specific RAID controller or SSD settings I should consider optimizing for better cache performance?
Could Proxmox configurations play a role in this performance difference?
Are there any known Proxmox or Linux kernel optimizations that can enhance cached reads performance?
What other diagnostic tools or strategies would you recommend to pinpoint the performance bottleneck?
I greatly appreciate your help and guidance on this matter. Please feel free to share your experiences, suggestions, or any relevant information that can assist in improving the cached reads performance on my HP ProLiant ML350 Gen10 server.
Thank you for your time and support.

leesteken · Sep 17, 2023

You are measuring the time it takes for data to move from the drive's memory cache to your main memory. That depends on hardware: memory speed, controller speed, interface speed (SAS 12G), the speed of the drive electronics, and of course software driver efficiency and CPU speed. Maybe you can improve this a bit, but I have no experience with that.
Note that DigitalOcean gives you a virtual drive, and it is probably caching data in main memory. It is therefore faster because it is not limited by hardware bottlenecks (controllers, interfaces and drive electronics); it is just a main memory copy. If you want to beat this, cache your data on a tmpfs or RAM-drive.
Maybe I'm missing the point and someone can correct me?
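
For reference, the hdparm manual describes -T as reading directly from the Linux buffer cache without disk access, so the figure is essentially an indication of processor, cache and memory throughput rather than of the drive. The sketch below shows the same effect from user space under some assumptions: it writes a temporary file and reads it twice, and because the file was just written, both passes are normally served from the page cache at memory speed. The 64 MiB size and the helper name `timed_read` are arbitrary choices for this illustration.

```python
import os
import tempfile
import time

def timed_read(path, block_size=1024 * 1024):
    """Read the whole file sequentially; return (bytes read, seconds taken)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start

SIZE = 64 * 1024 * 1024  # 64 MiB test file; an arbitrary size chosen for this sketch

# Create the test file; writing it also leaves its contents in the page cache.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(SIZE))
    test_path = tmp.name

try:
    results = [timed_read(test_path) for _ in range(2)]
finally:
    os.remove(test_path)

for label, (count, seconds) in zip(("first read", "second read"), results):
    print(f"{label}: {count / 1e6 / seconds:.0f} MB/s")
```

The absolute numbers will vary from run to run and machine to machine; the point is only that reads satisfied from main memory never touch the controller or the drive.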

honeyfairy · Sep 17, 2023

leesteken said:
If you want to beat this, cache your data on a tmpfs or RAM-drive.

Thanks, any guide on how I can cache on a RAM drive?
I tried adding the single SSD as a logical drive on the RAID controller and enabled the controller cache. That did not help.

leesteken · Sep 17, 2023

honeyfairy said:
I tried adding the single SSD as a logical drive on the RAID controller and enabled the controller cache. That did not help.

Of course not; (single) SSDs are slower than main memory.

honeyfairy said:
Thanks, any guide on how I can cache on a RAM drive?

Proxmox/Linux will automatically cache files in main memory anyway, so I don't see the need to search the internet on how to do this. I don't want to spend my time tricking a meaningless performance test into higher numbers (by faking hardware with main memory). Why do you want to do this?
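
One way to see this automatic caching on a Linux host is the `Cached` field of /proc/meminfo. The following is a minimal sketch, assuming the usual "Name: value kB" layout of that file; the helper name `parse_meminfo` and the sample numbers are invented for this illustration and are only used when /proc/meminfo is not available.

```python
import os

def parse_meminfo(text):
    """Map each /proc/meminfo field name to its numeric value (kB for most fields)."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields

# Sample text with made-up numbers, used only when /proc/meminfo is unavailable.
SAMPLE = """MemTotal:       16384000 kB
MemFree:         2048000 kB
Buffers:          512000 kB
Cached:          8192000 kB
"""

if os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as handle:
        info = parse_meminfo(handle.read())
else:
    info = parse_meminfo(SAMPLE)

print(f"page cache: {info['Cached'] / 1024:.0f} MiB of {info['MemTotal'] / 1024:.0f} MiB")
```

On a host that has been doing disk I/O for a while, a large share of otherwise unused memory typically shows up as cache.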

honeyfairy · Sep 17, 2023

I don't want it for meaningless tests. Of course that's not the aim. The aim is higher performance.

leesteken · Sep 17, 2023

One of the limits on buffered disk reads is the interface speed. SATA III is limited to about 560MB/s, and SAS 12G to twice that. NVMe usually uses a PCIe x4 link, which is limited to 4 times the per-lane speed of your PCIe generation; that is usually faster than the SSD itself and therefore not a bottleneck. Disk reads are also cached in main memory, which makes this benchmark more representative of real-world usage.
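
The interface limits mentioned here can be worked out from the line rate and the encoding overhead. This is a rough sketch: the line rates and encodings (8b/10b for SATA III and SAS 12G, 128b/130b for PCIe 3.0) are the standard figures, while PCIe 3.0 is an assumption, since the thread does not state the PCIe generation of the server. The function name `payload_mb_per_s` is made up for this illustration.

```python
# (line rate in gigatransfers per second, encoding efficiency, number of lanes)
LINKS = {
    "SATA III": (6.0, 8 / 10, 1),        # 8b/10b encoding, one lane
    "SAS 12G": (12.0, 8 / 10, 1),        # 8b/10b encoding, one lane
    "PCIe 3.0 x4": (8.0, 128 / 130, 4),  # 128b/130b encoding, four lanes (assumed generation)
}

def payload_mb_per_s(gigatransfers, efficiency, lanes):
    """Theoretical payload bandwidth in MB/s (decimal megabytes)."""
    bits_per_second = gigatransfers * 1e9 * efficiency * lanes
    return bits_per_second / 8 / 1e6

for name, params in LINKS.items():
    print(f"{name}: {payload_mb_per_s(*params):.0f} MB/s")
```

These are upper bounds on the link itself. Real-world figures, such as the roughly 560MB/s quoted for SATA III, come out lower because of protocol overhead.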

honeyfairy · Sep 17, 2023

leesteken said:
One of the limits on buffered disk reads is the interface speed. SATA III is limited to about 560MB/s, and SAS 12G to twice that. NVMe usually uses a PCIe x4 link, which is limited to 4 times the per-lane speed of your PCIe generation; that is usually faster than the SSD itself and therefore not a bottleneck.

OK, so I installed this NVMe PCIe card: https://documents.westerndigital.co...nvme-series/data-sheet-ultrastar-dc-sn200.pdf

I'm getting around 1000MB/s buffered reads, is there a bottleneck somewhere or is this the max I can get?
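
One thing that can be checked here is whether the card negotiated its full PCIe link, since a link running at reduced speed or width would cap throughput. The sketch below reads the standard PCI sysfs attributes for the link; the path `/sys/class/nvme/nvme0/device` is an assumption (the controller may have a different name on a given system), and the helper names are made up for this illustration. It does not rule out other causes, and a simple sequential test like hdparm -t may not saturate an NVMe drive in any case.

```python
import os

ATTRIBUTES = ("current_link_speed", "current_link_width",
              "max_link_speed", "max_link_width")

def read_link_info(device_dir):
    """Return the PCIe link attributes found in a sysfs PCI device directory."""
    info = {}
    for name in ATTRIBUTES:
        path = os.path.join(device_dir, name)
        if os.path.isfile(path):
            with open(path) as handle:
                info[name] = handle.read().strip()
    return info

def link_is_degraded(info):
    """True when the negotiated speed or width differs from the device maximum."""
    return (info.get("current_link_speed") != info.get("max_link_speed")
            or info.get("current_link_width") != info.get("max_link_width"))

# Assumed location; the controller may not be nvme0 on a given system.
DEVICE_DIR = "/sys/class/nvme/nvme0/device"

if os.path.isdir(DEVICE_DIR):
    link = read_link_info(DEVICE_DIR)
    print(link, "degraded" if link_is_degraded(link) else "at maximum")
else:
    print(f"{DEVICE_DIR} not found; adjust the path for your system")
```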

leesteken said:
Disk reads are also cached in main memory, which makes this benchmark more representative of real-world usage.

I'm not understanding this completely, could you please explain more about this.

leesteken · Sep 17, 2023

honeyfairy said:
I'm not understanding this completely, could you please explain more about this.

The drive has memory that caches data, but it is far away from the CPU, and the data needs to travel via interfaces such as SATA cables, which can only carry about 540MB/s.
The CPU also has memory that caches data (so that it does not need to ask the drive), and that is much closer to the CPU and therefore much faster.
There is no point in optimizing for cached data on the drive when the same data is also cached in main memory that is faster and closer to the CPU.
You are asking other people to help you optimize for something that does not resemble real-world performance cases, which might be fun for some people but not useful in practice.

EDIT: Maybe other people here have other ideas or can find a way to explain it better than me. All of this is unrelated to Proxmox and more of a general computer topic. I'm sure you can learn about caching on Wikipedia or in computer systems/science courses. There is no such thing as "best performance" without a specific use-case.

honeyfairy · Sep 17, 2023

leesteken said:
The drive has memory that caches data, but it is far away from the CPU, and the data needs to travel via interfaces such as SATA cables, which can only carry about 540MB/s.

Understood

leesteken said:
The CPU also has memory that caches data (so that it does not need to ask the drive), and that is much closer to the CPU and therefore much faster.

This I believe is the L3 cache of the Intel CPU which is like 19MB or 24MB, right?

leesteken said:
There is no point in optimizing for cached data on the drive when the same data is also cached in main memory that is faster and closer to the CPU.
You are asking other people to help you optimize for something that does not resemble real-world performance cases, which might be fun for some people but not useful in practice.

So you are saying that whether this value is high or low does not in any way reflect real-world performance. I have tested this on an old server (an HP G7) and I get a higher value, while on my G10 I get a lower value. So I believe that is true.

leesteken said:
Maybe other people here have other ideas or can find a way to explain it better than me.

You have explained it well. I thank you

leesteken · Sep 17, 2023

honeyfairy said:
This I believe is the L3 cache of the Intel CPU which is like 19MB or 24MB, right?

No, I meant main memory, which is many GBs. L3, L2 and L1 caches are progressively faster but smaller.

honeyfairy said:
So you are saying that whether this value is high or low does not in any way reflect real-world performance. I have tested this on an old server (an HP G7) and I get a higher value, while on my G10 I get a lower value. So I believe that is true.

It depends on the cables and controllers, not the main system or the actual drive media.
Just for fun: you could run Proxmox inside a VM on Proxmox and the "cached drive reads" will be faster inside the Proxmox inside Proxmox because there are more layers of caching.

honeyfairy · Sep 17, 2023

Thanks
I would like to mark this as solved. However, I can't seem to find that option.

leesteken · Sep 17, 2023

honeyfairy said:
Thanks
I would like to mark this as solved. However, I can't seem to find that option.

Edit your original first post and select Solved.
