Ultra-low network latency test for Ceph on Proxmox

Sep 14, 2020
How do I go about performing better network latency tests for use with Ceph on Proxmox?

The goal is to use such tests to pick the best network cards, cables, transceivers, and switches for the Ceph cluster network, where the nodes hosting the OSDs sit, and possibly also for the Ceph public network that clients use.

A simple ping doesn't seem to give a very reliable result: with it, I see no difference between a gigabit NIC and a 10GbE NIC, for example.

The manual for my 10GbE fiber NIC says it has low latency, in the range of a few microseconds, but that's not what I got with ping.

I've read online that the ideal tool would be netperf, but I haven't found a way to install it on Proxmox.

Any suggestions?

Thanks.
 
Honestly, I don't think you'll see big differences between NICs or cables,
but a good switch with good ASICs could save you some microseconds.

So a simple ping -f should give you a realistic latency figure.

For example, in production I'm using Mellanox ConnectX-4 NICs with Mellanox SN2100 switches, and I'm at around 0.023 ms average.
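
If you want to compare several NICs or switches, it helps to pull the numbers out of the ping output instead of reading them by eye. Below is a minimal sketch, assuming the summary line format printed by Linux iputils ping (`rtt min/avg/max/mdev = ...`). The function name `parse_ping_summary` and the sample line are hypothetical and only there for illustration; the sample is not a real measurement. You would feed it the output of something like `ping -f -c 10000 <peer>`, which typically needs root.

```python
import re

# Summary line printed by Linux iputils ping. In flood mode the line
# continues with ", ipg/ewma ...", which this pattern simply ignores.
RTT_RE = re.compile(
    r"rtt min/avg/max/mdev = ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms"
)


def parse_ping_summary(output: str) -> dict:
    """Return min/avg/max/mdev in milliseconds from ping's summary line."""
    match = RTT_RE.search(output)
    if match is None:
        raise ValueError("no rtt summary line found")
    keys = ("min", "avg", "max", "mdev")
    return dict(zip(keys, (float(v) for v in match.groups())))


# Hypothetical sample line for illustration only, not a measurement.
SAMPLE = (
    "rtt min/avg/max/mdev = 0.018/0.023/0.094/0.004 ms, "
    "ipg/ewma 0.035/0.023 ms"
)

if __name__ == "__main__":
    stats = parse_ping_summary(SAMPLE)
    print(f"avg RTT: {stats['avg'] * 1000:.0f} us, mdev: {stats['mdev'] * 1000:.0f} us")
```

Running the same test against each candidate NIC or switch and comparing avg and mdev side by side should make small differences easier to spot than a default one-packet-per-second ping.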

But Ceph performance is not only about network latency; it's mainly impacted by CPU processing latency, on both the client/KVM side and the OSD side.

With 3 GHz CPUs on both the client and server side, the CPU frequency forced to max, and 3x replication,
I'm at around 0.500-0.750 ms per I/O operation.

As you can see, network latency is not the major factor in IOPS (with 3x replication, it's around 0.023 * 3 = 0.069 ms vs. 0.500 ms).
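
The same back-of-the-envelope estimate can be written as a small sketch, so you can plug in your own ping result. It uses the same simplification as above (one network round trip per replica, added up), which is an assumption for illustration and not a model of how Ceph actually schedules replication. The function name `network_share` is hypothetical.

```python
def network_share(rtt_ms: float, replicas: int, io_latency_ms: float) -> float:
    """Fraction of one I/O's latency attributed to the network,
    assuming one round trip per replica (a rough simplification)."""
    return (rtt_ms * replicas) / io_latency_ms


RTT_MS = 0.023   # average flood-ping RTT quoted above
REPLICAS = 3     # 3x replication

if __name__ == "__main__":
    # Lower and upper end of the per-I/O latency range quoted above.
    for io_ms in (0.500, 0.750):
        share = network_share(RTT_MS, REPLICAS, io_ms)
        print(
            f"{io_ms:.3f} ms per I/O: "
            f"network = {RTT_MS * REPLICAS:.3f} ms ({share:.1%})"
        )
```

With these numbers the network accounts for roughly 14% of a 0.500 ms operation and about 9% of a 0.750 ms one, so the rest is spent elsewhere, mostly in CPU processing.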
 
I'm also looking at other factors, such as using SSDs to host the OSDs' databases.

Certainly 0.500-0.750 ms per I/O operation is a great result. In fact, I don't think I need to bring my final Ceph access latency down that far. Of course, the lower the better, but since the budget here is tight, I have to work with old hardware.

But how do you force the CPU frequency to the maximum on Proxmox with Ceph? Is there a proper technique for this, or some software on Proxmox?

Some of my CPUs have Intel Turbo Boost, but I don't know how to force them to run at full frequency.

Thanks.
 
