Proxmox VE Ceph Benchmark 2018/02

Alwin

Proxmox Staff Member
a. A container with a virtual disk stored on CephFS, benchmark running on its local /tmp, bandwidth of approx. 450 MB/s
b. The same container with a bind mount exported by the host and the benchmark running on this shared folder, bandwidth of approx. 70 MB/s
This is exactly the reason why we don't recommend CephFS as a directory storage for VM/CT at the moment. Workloads with small updates suffer greatly from the latency introduced. Use RBD instead if you don't need a shared filesystem.
 

Rosario Contarino

New Member
This is exactly the reason why we don't recommend CephFS as a directory storage for VM/CT at the moment. Workloads with small updates suffer greatly from the latency introduced. Use RBD instead if you don't need a shared filesystem.
That's exactly the point: we need shared filesystems. We are currently comparing CephFS performance against GPFS and Nutanix NDFS.

That being said, I am afraid I did not understand your point.

In both case a. and case b. above we are accessing CephFS. The difference is only that in case a. the container's virtual disk is stored on CephFS and the benchmark reads/writes inside this virtual disk, while in case b. the container accesses the underlying CephFS directly via a bind mount.

The benchmark script is the same.

So why do you say that in case b. the benchmark suffers from the latency introduced by CephFS, while in case a. it doesn't?

To be honest, I would have expected the opposite result: b. being faster than a.

Any further details would be greatly appreciated.
 

Alwin

Proxmox Staff Member
In both case a. and case b. above we are accessing CephFS. The difference is only that in case a. the container's virtual disk is stored on CephFS and the benchmark reads/writes inside this virtual disk, while in case b. the container accesses the underlying CephFS directly via a bind mount.
This is exactly the difference. While in case a. you have (depending on your test) one big open file that is written to / read from, the bind mount puts all read/write operations directly onto CephFS, which has to translate filesystem operations into objects (usually 4 MB).

This is why we didn't release it as a general-purpose storage.
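If you want to see this effect on your own setup, a quick sanity check is to time many small synchronous writes on both paths. A rough Python sketch (the two paths below are only examples, adjust them to your container):

    import os
    import time

    # Example paths only: one inside the container's virtual disk (e.g. /tmp),
    # one on the bind-mounted CephFS directory. Adjust to your setup.
    PATHS = ["/tmp/latency-test", "/mnt/cephfs-bind/latency-test"]
    WRITES = 1000
    BLOCK = b"x" * 4096  # 4 KiB per write, i.e. a small-update workload

    for path in PATHS:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        start = time.monotonic()
        for _ in range(WRITES):
            os.write(fd, BLOCK)
            os.fsync(fd)  # force each small update down to the storage layer
        elapsed = time.monotonic() - start
        os.close(fd)
        os.unlink(path)
        print(f"{path}: {WRITES} fsynced 4 KiB writes in {elapsed:.2f}s "
              f"({WRITES / elapsed:.0f} IOPS)")

If latency is the limiting factor, the bind-mounted CephFS path will show far fewer IOPS than the path backed by the virtual disk, even though both end up on the same cluster.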

You can try to tune CephFS by e.g. activating the experimental inline data feature [0] or cache tiering [1]. Both need to be tested carefully, and there is no guarantee that they will have the desired effect.
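As a very rough illustration only (the pool names below are placeholders, not a recommendation), attaching a cache tier to an existing pool boils down to a handful of ceph commands, for example wrapped in a small script:

    import subprocess

    # Placeholder pool names - adjust to your cluster. This only sketches the
    # basic commands from the cache tiering documentation [1]; in practice you
    # also need an SSD-backed crush rule and the sizing parameters (e.g.
    # target_max_bytes) before the agent behaves sensibly. Test on a lab cluster.
    BASE_POOL = "cephfs_data"
    CACHE_POOL = "cephfs_cache"

    def ceph(*args):
        """Run a ceph CLI command and fail loudly on errors."""
        subprocess.run(["ceph", *args], check=True)

    # Create the cache pool, then attach it as a writeback tier in front of
    # the base pool and route client traffic through it.
    ceph("osd", "pool", "create", CACHE_POOL, "64")
    ceph("osd", "tier", "add", BASE_POOL, CACHE_POOL)
    ceph("osd", "tier", "cache-mode", CACHE_POOL, "writeback")
    ceph("osd", "tier", "set-overlay", BASE_POOL, CACHE_POOL)

    # Minimal hit-set configuration so the tiering agent can track object usage.
    ceph("osd", "pool", "set", CACHE_POOL, "hit_set_type", "bloom")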

Please note that this setup is not supported by us, hence no enterprise support [2] (where eligible) can be given.

[0] https://docs.ceph.com/docs/nautilus/cephfs/experimental-features/#inline-data
[1] https://docs.ceph.com/docs/nautilus/rados/operations/cache-tiering/
[2] https://www.proxmox.com/en/proxmox-ve/pricing
 

Rosario Contarino

New Member
Understood.
I shall now replace all HDDs with SSDs, run the same test again to see if anything changes significantly, and publish the results here. Do you expect any performance improvements for CephFS in future releases? Has anything been put on the roadmap yet?
 

Alwin

Proxmox Staff Member
Do you expect any performance improvements for CephFS in future releases? Has anything been put on the roadmap yet?
CephFS was the original idea behind Ceph but was the last part to become production ready. So yes, there are improvements with every Ceph release [0]. See their experimental feature list [1] to get a feel for what is still coming (e.g. LazyIO).

[0] https://docs.ceph.com/docs/master/releases/nautilus/
[1] https://docs.ceph.com/docs/nautilus/cephfs/experimental-features/
 

Joao Correa

New Member
Hello!
Has anyone been able to compare the performance between Proxmox 5.x (Ceph Luminous) and Proxmox 6 (Ceph Nautilus)?
Is there an improvement in performance?
 

davekempe

New Member
We have completed the initial build of our new cluster with 3 nodes:
CPU: dual AMD EPYC 7551 32-core processors
RAM: 512 GB
Disk: 10 x 2 TB Samsung NVMe drives
Case: 1RU Supermicro box with 10 NVMe bays
Network: 4 x 10 Gb fibre NICs per node into a pair of stacked Juniper 4600s (one pair for the Ceph cluster network, one pair for Ceph public traffic)
2 x 10 Gb copper for uplinks to LAN/VM traffic

Ceph is running pretty well, and the network is the bottleneck, as expected. I suppose we could have gone to 100 Gb if we needed it and had the budget.
Before we really start using this in anger, does anyone have any tips for tuning Ceph or Proxmox for speed, reliability, etc.?
Happy to take any input, or do some more benchmarks.
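For reference, the kind of further benchmark we could run is rados bench against a dedicated test pool, roughly along these lines (the pool name is just an example):

    import subprocess

    # Example pool name only - use a dedicated test pool, since "rados bench write"
    # leaves benchmark objects behind unless they are cleaned up afterwards.
    POOL = "bench-test"
    SECONDS = "60"

    def run(*args):
        subprocess.run(list(args), check=True)

    # 4 MB writes with 16 concurrent ops, keeping the objects for the read test.
    run("rados", "bench", "-p", POOL, SECONDS, "write", "-b", "4M", "-t", "16", "--no-cleanup")

    # Sequential reads of the objects written above.
    run("rados", "bench", "-p", POOL, SECONDS, "seq", "-t", "16")

    # Remove the benchmark objects again.
    run("rados", "-p", POOL, "cleanup")

That mostly measures the raw cluster and network side; for guest-level numbers, fio inside a test VM would be the usual complement.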
 
If you need a well-working shared storage, I recommend using croit.io on dedicated Ceph nodes instead.
External pools can easily be integrated into PVE, even erasure-coded pools. (I finally figured out how to do it for multiple pools.)
 

dsh

New Member
I'm interested in a 3-node mesh network setup. How do you connect the nodes? Do you bond two NICs on each node, or give each link a separate IP?
 
