150 MB/s on an NVMe x3 Ceph pool

AlexLup

Well-Known Member
Mar 19, 2018
Hi,
I have invested in three SAMSUNG PM983 (MZ1LB960HAJQ-00007) drives to run a fast pool on. However, I am only getting 150 MB/s out of them.

fio results directly on the NVMes:
https://docs.google.com/spreadsheets/d/1LXupjEUnNdf011QNr24pkAiDBphzpz5_MwM0t9oAl54/edit?usp=sharing
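
For anyone who wants to compare against raw-device numbers, a sequential fio run straight against the NVMe looks roughly like this (destructive, so only on an empty disk; the device path, block size and queue depth are placeholders, not necessarily what the sheet was produced with):

# WARNING: writes directly to the device and destroys its contents
fio --name=seqwrite --filename=/dev/nvme0n1 --rw=write --bs=4M \
    --iodepth=32 --ioengine=libaio --direct=1 \
    --runtime=60 --time_based --group_reporting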

Config and Results of ceph bench:
https://pastebin.com/cScBv7Fv
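
If anyone wants to reproduce a comparable figure, a typical rados bench sequence looks like this (pool name and thread count are placeholders; the actual results are in the pastebin):

# write for 60 s with 16 concurrent ops (default object size is 4 MB)
rados bench -p testpool 60 write -t 16 --no-cleanup
# read the same objects back sequentially / randomly
rados bench -p testpool 60 seq -t 16
rados bench -p testpool 60 rand -t 16
# remove the benchmark objects afterwards
rados -p testpool cleanup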

Appreciate any help you can give me.
A
 
Apologies, I should've said that from the beginning.

The frontend network is 1 Gbit, the backend is 10 Gbit. The CPUs sit at around 20% even though they are weak, and RAM usage across the cluster (3x replica) stays below 70%.
 
I have invested in three SAMSUNG PM983 (MZ1LB960HAJQ-00007) drives to run a fast pool on. However, I am only getting 150 MB/s out of them.
Have a look at the attached image; Ceph uses the public_network for client communication.
(attached image: diagram of Ceph's public and cluster networks)
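
To see which subnets the cluster is actually using, grepping the config is enough (on a standard pveceph setup the file is managed under /etc/pve):

# shows public_network / cluster_network as currently configured
grep -i network /etc/pve/ceph.conf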
 
Disable cephx, then performance should go up... Also use the 10 Gigabit network for both frontend and backend if only that cluster uses Ceph...
 
Hi all,
All very valid points, which is why I ran the tests on the PVE boxes themselves, connected through a MikroTik 10 Gbit switch. The network is not breaking a sweat, and neither is the CPU. I have of course verified beforehand that the network really reaches 10 Gbit, using iperf and RAM disk transfers.
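
(For anyone repeating that check: an iperf3 run between two nodes is enough; the hostname below is a placeholder for the peer's 10 Gbit address.)

# on the first node
iperf3 -s
# on the second node
iperf3 -c pve2 -t 30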

Network interface information, including the 10 Gbit bridge with MTU 9000, can be found here: https://pastebin.com/bvkWwvr3
The Ceph config is in the original link, which I am reposting: https://pastebin.com/cScBv7Fv

I realize this is a Ceph issue and not necessarily a Proxmox issue (Proxmox is a wonderful product that I keep promoting whenever I get the chance), but I really want to put my 750 Euro investment to good use, so thank you for any feedback you might have!

A
 
Do not use:

auth_client_required = cephx
auth_cluster_required = cephx
auth_service_required = cephx

Search the forum, you will find some info about that...

For MTU, maybe 1500 is better than 9000 for more IO; MTU 9000 means Ethernet allows frames of up to 9000 bytes (jumbo frames), but that is not necessarily good for small-block IO...
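
If you want to try that, the [global] section of ceph.conf would look roughly like this; note that this removes authentication inside the cluster and all monitors, OSDs and clients have to be restarted afterwards:

[global]
auth_client_required = none
auth_cluster_required = none
auth_service_required = none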
 
@AlexLup

You need to put both public_network and cluster_network on your 10 Gbit link.

public network: client ---> primary OSD
cluster network: OSD ---> OSD replication

If your public network is only 1 Gbit, you cannot reach more than 1 Gbit/s (around 125 MB/s) from the clients.
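
As a concrete example, with the 10 Gbit interfaces in, say, 10.10.10.0/24 (placeholder subnet), both entries in ceph.conf would point at that subnet. Keep in mind that the monitors bind to addresses in the public network, so changing it is more involved than just editing the file:

[global]
public_network = 10.10.10.0/24
cluster_network = 10.10.10.0/24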
 
