[SOLVED] Ceph performance degradation

EagleBBS

New Member
Jan 4, 2023
Hello,


I've built a 3-node PVE cluster with the following specifications per node:
  • 128 GB RAM
  • 3x 2.5 Gbps network cards (Service / Admin / Ceph) - the plan is to upgrade the Ceph network cards from 2.5 to 10 Gbps in the near future
  • 5x 1 TB OSDs (SSD), each hosting its own OSD cache

Everything was fine for months, but for the last 2-3 weeks all VMs that use Ceph storage have experienced major performance losses.
The SSD activity LEDs now seem to be on continuously on 2 of the 3 nodes (in the past, SSD activity was lower on all nodes),
and VM system responses are very slow.

The status:
  • All OSDs are green
  • Ceph global status is green
  • Numerous events like: mgr.nodename (mgr.1872281) 109502 : cluster 0 pgmap v109794: 129 pgs: 129 active+clean; 1.9 TiB data, 5.8 TiB used, 7.9 TiB / 14 TiB avail; 834 KiB/s rd, 12 MiB/s

Even though I understand that Ceph is oriented towards data integrity rather than raw access performance, this change in behavior is really surprising.

If you have any ideas, I would be really interested in some steps to investigate.

Thanks in advance for your support.
 
Hello!

Please post your pvereport if you can, or provide more details about the devices used. What is the output of:

root@PMX4:~# ceph osd perf
osd commit_latency(ms) apply_latency(ms)
15 0 0
14 0 0
13 0 0
12 0 0
23 0 0
22 0 0
21 0 0
20 0 0
19 0 0
18 0 0
17 0 0
16 0 0

Also check the individual OSDs to see whether one of them performs significantly differently from the others:
root@PMX4:~# ceph tell osd.* bench
osd.12: {
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 0.58571606899999995,
"bytes_per_sec": 1833212166.8323224,
"iops": 437.07184000785884
}
osd.13: {
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 0.57350169600000001,
"bytes_per_sec": 1872255708.2028227,
"iops": 446.3805456645066
}
osd.14: {
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 0.61712510799999998,
"bytes_per_sec": 1739909477.1558056,
"iops": 414.82674530882969
}

Source: https://www.thomas-krenn.com/de/wiki/Ceph_Perfomance_Guide_-_Sizing_&_Testing
This might be useful for you.
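
As an additional check (this is not from the guide above verbatim, and /dev/sdX is only a placeholder), a single-job, queue-depth-1 synchronous write test with fio is a common way to see whether an SSD can sustain Ceph-style journal/WAL writes:

Code:
# WARNING: this writes to the raw device and destroys its contents.
# Only run it against an empty disk that is not an OSD.
fio --name=ceph-sync-write --filename=/dev/sdX --ioengine=libaio \
    --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 \
    --runtime=60 --time_based

Consumer SSDs typically drop to a few hundred IOPS in this test once their cache is exhausted, while enterprise drives with power-loss protection stay much higher.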
 
Thanks for the link and the fast answer,

please find below what I can provide.

ceph osd perf
osd commit_latency(ms) apply_latency(ms)
10 821 821
3 20 20
2 86 86
7 173 173
11 18 18
9 363 363
8 330 330
6 21 21
5 12 12
4 1333 1333
1 23 23
0 29 29

ceph tell osd.* bench

osd.0: {
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 32.015915677000002,
"bytes_per_sec": 33537751.499369677,
"iops": 7.9960230587410157
}
osd.1: {
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 17.793766406,
"bytes_per_sec": 60343706.863429308,
"iops": 14.38706084810002
}

osd.2: {
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 60.188157644,
"bytes_per_sec": 17839752.303949088,
"iops": 4.2533283958313675
}
osd.3: {
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 20.986538949,
"bytes_per_sec": 51163358.884918153,
"iops": 12.198295327405489
}
osd.4: {
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 114.072244674,
"bytes_per_sec": 9412822.7867223993,
"iops": 2.2441918341451643
}
osd.5: {
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 22.998706712000001,
"bytes_per_sec": 46687052.33932808,
"iops": 11.131060681182881
}
osd.6: {
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 22.931379778,
"bytes_per_sec": 46824126.345425181,
"iops": 11.16374167094831
}
osd.7: {
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 91.591547349999999,
"bytes_per_sec": 11723154.101730546,
"iops": 2.7950177435232511
}
osd.8: {
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 90.058720457999996,
"bytes_per_sec": 11922685.760350691,
"iops": 2.8425897980572441
}
osd.9: {
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 98.844957917000002,
"bytes_per_sec": 10862889.181475699,
"iops": 2.5899146035851714
}
osd.10: {
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 107.87684967600001,
"bytes_per_sec": 9953403.6007252969,
"iops": 2.3730763437093012
}
 
Hello, your OSDs are really slow. Please post details about which devices are used. This might be consumer-grade hardware, which does not perform well with Ceph. The values also differ too much from each other: for example, the one disk at 9 MB/s will slow everything down considerably, because Ceph writes the data 3 times (with the default size=3 replication) and only acknowledges a client write once all replicas have committed it. In practice, Ceph performance is defined by the worst-performing disk in the cluster.

Code:
osd.0:
"bytes_per_sec": 33537751.499369677, (33 MB/s)

osd.1:
"bytes_per_sec": 60343706.863429308 (60 MB/s)

osd.2:
"bytes_per_sec": 17839752.303949088  (17 MB/s)

osd.3:
"bytes_per_sec": 51163358.884918153, (51 MB/s)

osd.4:
"bytes_per_sec": 9412822.7867223993 (9,4 MB/s)

osd.5:
"bytes_per_sec": 46687052.33932808 (46 MB/s)

osd.6:
"bytes_per_sec": 46824126.345425181, (46 MB/s)

osd.7:
"bytes_per_sec": 11723154.101730546, (11 MB/s)

osd.8:
"bytes_per_sec": 11922685.760350691, (11 MB/s)

osd.9:
"bytes_per_sec": 10862889.181475699, (10 MB/s)

osd.10:
"bytes_per_sec": 9953403.6007252969, (9 MB/s)
 
Samsung_SSD_870_EVO_1TB node2:sdd osd.6 19%
Samsung_SSD_870_EVO_1TB node2:sde osd.11 21%
Samsung_SSD_870_EVO_1TB node2:sdf osd.0 17%
Samsung_SSD_870_EVO_2TB node1:sda osd.3 7%
Samsung_SSD_870_EVO_2TB node2:sda osd.1 6%
Samsung_SSD_870_EVO_2TB node3:sda osd.5 6%
Samsung_SSD_870_QVO_1TB node1:sdf osd.7 10%
Samsung_SSD_870_QVO_1TB node1:sde osd.10 45%
Samsung_SSD_870_QVO_1TB node3:sdb osd.9 37%
Samsung_SSD_870_QVO_1TB node3:sde osd.4 12%
Samsung_SSD_870_QVO_1TB node3:sdf osd.8 7%
Samsung_SSD_870_QVO_1TB node1:sdb osd.2 7%
 
Samsung_SSD_870_EVO_1TB node2:sdd osd.6 19%
...
Samsung_SSD_870_QVO_1TB node1:sdb osd.2 7%
... and again ... this forum thread has come to the same conclusion as many others: please don't use consumer SSDs with Ceph or ZFS. They will not be fast (or will get very, very slow over time).
 
It's a personal lab installation; unfortunately, my only solution will be to remove the SSDs that are slow.
The eternal performance/cost trade-off between personal and professional environments...


But do you think this performance issue is linked only to slow SSDs? I was moving a lot of data while running those commands.
I'm using 2.5 Gbps network cards, and I've seen that 10 Gbps is recommended for Ceph synchronization. Could the network be another root cause?
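
For reference, a quick way to rule the Ceph network in or out (not from this thread; the address below is a placeholder for a node's Ceph-network IP) is an iperf3 test between two nodes:

Code:
# On node A:
iperf3 -s

# On node B, pointing at node A's Ceph-network IP:
iperf3 -c 192.168.100.11 -t 30

On a 2.5 Gbps link the result should be roughly 2.3-2.4 Gbit/s; anything far below that points to a network problem rather than the disks.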
 
Issue solved with these steps:
  • Reduced the number of OSDs managed by each node: 4 > 3 OSDs per node (each OSD adds its own CPU & RAM load)
  • Replaced the Samsung QVO disks with EVO models (faster)
  • Replaced the switch: 2.5 > 10 Gbps
Thanks for your positive support. I've now learned how to detect I/O latencies easily with your commands and process.


Have a nice week!
 
Code:
# ceph tell osd.12 bench
{
    "bytes_written": 1073741824,
    "blocksize": 4194304,
    "elapsed_sec": 0.45615625700000001,
    "bytes_per_sec": 2353890377.524735,         (2300 MB/s)
    "iops": 561.21119917028784                         (561 IOPS)
}

SAMSUNG MZPLL6T4HMLS-000MV 6.4TB Enterprise NVMe
 
Issue solved with these steps:
  • Replaced the Samsung QVO disks with EVO models (faster)
You'll be back soon enough, I suspect. What little performance you think you gained by swapping in new drives will deteriorate quickly. Something easily avoided by buying the right drives to begin with.

As an example, you can find used Samsung PM863a 1.92TB enterprise drives with PLP for under $100 (as of the time of this post) on eBay...
 
