[SOLVED] Ceph performance degradation

EagleBBS

New Member
Jan 4, 2023
Hello,


I've built a 3-node PVE cluster with the following specifications per node:
  • 128 GB RAM
  • 3x 2.5 Gbps network cards (Service / Admin / Ceph) - the plan is to upgrade the Ceph network cards from 2.5 to 10 Gbps in the near future
  • 5x 1 TB OSDs (SSD), each hosting its own OSD cache

Everything was fine for months, but for the last 2-3 weeks all VMs that use Ceph storage have experienced major performance losses.
The SSD activity LEDs now seem to be on continuously on 2 of the 3 nodes (in the past, SSD activity was lower on all nodes),
and VM system responses are very slow.

The status:
  • All OSDs are green
  • Ceph global status is green
  • Numerous events like: mgr.nodename (mgr.1872281) 109502 : cluster 0 pgmap v109794: 129 pgs: 129 active+clean; 1.9 TiB data, 5.8 TiB used, 7.9 TiB / 14 TiB avail; 834 KiB/s rd, 12 MiB/s

Even though I understand that Ceph is oriented towards data integrity rather than raw access performance, this change in behavior is really surprising.

If you have any ideas, I would be really interested in some steps to investigate.

Thanks in advance for your support.
 
Hello!

Please post your pvereport if you can, or provide more details about the devices used. What is the output of:

root@PMX4:~# ceph osd perf
osd commit_latency(ms) apply_latency(ms)
15 0 0
14 0 0
13 0 0
12 0 0
23 0 0
22 0 0
21 0 0
20 0 0
19 0 0
18 0 0
17 0 0
16 0 0

Also check the individual OSDs to see whether one of them performs significantly differently from the others:
root@PMX4:~# ceph tell osd.* bench
osd.12: {
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 0.58571606899999995,
"bytes_per_sec": 1833212166.8323224,
"iops": 437.07184000785884
}
osd.13: {
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 0.57350169600000001,
"bytes_per_sec": 1872255708.2028227,
"iops": 446.3805456645066
}
osd.14: {
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 0.61712510799999998,
"bytes_per_sec": 1739909477.1558056,
"iops": 414.82674530882969
}

Source: https://www.thomas-krenn.com/de/wiki/Ceph_Perfomance_Guide_-_Sizing_&_Testing
This might be useful for you.
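
As an additional check (this is not from the guide above verbatim, and /dev/sdX is only a placeholder), a single-job, queue-depth-1 synchronous write test with fio is a common way to see whether an SSD can sustain Ceph-style journal/WAL writes:

Code:
# WARNING: this writes to the raw device and destroys its contents.
# Only run it against an empty disk that is not an OSD.
fio --name=ceph-sync-write --filename=/dev/sdX --ioengine=libaio \
    --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 \
    --runtime=60 --time_based

Consumer SSDs typically drop to a few hundred IOPS in this test once their cache is exhausted, while enterprise drives with power-loss protection stay much higher.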
 
Thanks for the link and the fast answer,

please find below what I can provide.

ceph osd perf
osd commit_latency(ms) apply_latency(ms)
10 821 821
3 20 20
2 86 86
7 173 173
11 18 18
9 363 363
8 330 330
6 21 21
5 12 12
4 1333 1333
1 23 23
0 29 29

ceph tell osd.* bench

osd.0: {
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 32.015915677000002,
"bytes_per_sec": 33537751.499369677,
"iops": 7.9960230587410157
}
osd.1: {
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 17.793766406,
"bytes_per_sec": 60343706.863429308,
"iops": 14.38706084810002
}

osd.2: {
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 60.188157644,
"bytes_per_sec": 17839752.303949088,
"iops": 4.2533283958313675
}
osd.3: {
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 20.986538949,
"bytes_per_sec": 51163358.884918153,
"iops": 12.198295327405489
}
osd.4: {
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 114.072244674,
"bytes_per_sec": 9412822.7867223993,
"iops": 2.2441918341451643
}
osd.5: {
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 22.998706712000001,
"bytes_per_sec": 46687052.33932808,
"iops": 11.131060681182881
}
osd.6: {
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 22.931379778,
"bytes_per_sec": 46824126.345425181,
"iops": 11.16374167094831
}
osd.7: {
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 91.591547349999999,
"bytes_per_sec": 11723154.101730546,
"iops": 2.7950177435232511
}
osd.8: {
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 90.058720457999996,
"bytes_per_sec": 11922685.760350691,
"iops": 2.8425897980572441
}
osd.9: {
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 98.844957917000002,
"bytes_per_sec": 10862889.181475699,
"iops": 2.5899146035851714
}
osd.10: {
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 107.87684967600001,
"bytes_per_sec": 9953403.6007252969,
"iops": 2.3730763437093012
}
 
Hello, your OSDs are really slow. Please post details about which devices are used. This might be consumer-grade hardware, which does not perform well with Ceph. The values also differ too much from each other: for example, the one disk at 9 MB/s will slow everything down considerably, because Ceph writes the data 3 times (with the default size=3 replication) and only acknowledges a client write once all replicas have committed it. In practice, Ceph performance is defined by the worst-performing disk in the cluster.

Code:
osd.0:
"bytes_per_sec": 33537751.499369677, (33 MB/s)

osd.1:
"bytes_per_sec": 60343706.863429308 (60 MB/s)

osd.2:
"bytes_per_sec": 17839752.303949088  (17 MB/s)

osd.3:
"bytes_per_sec": 51163358.884918153, (51 MB/s)

osd.4:
"bytes_per_sec": 9412822.7867223993 (9,4 MB/s)

osd.5:
"bytes_per_sec": 46687052.33932808 (46 MB/s)

osd.6:
"bytes_per_sec": 46824126.345425181, (46 MB/s)

osd.7:
"bytes_per_sec": 11723154.101730546, (11 MB/s)

osd.8:
"bytes_per_sec": 11922685.760350691, (11 MB/s)

osd.9:
"bytes_per_sec": 10862889.181475699, (10 MB/s)

osd.10:
"bytes_per_sec": 9953403.6007252969, (9 MB/s)
 
Samsung_SSD_870_EVO_1TB node2:sdd osd.6 19%
Samsung_SSD_870_EVO_1TB node2:sde osd.11 21%
Samsung_SSD_870_EVO_1TB node2:sdf osd.0 17%
Samsung_SSD_870_EVO_2TB node1:sda osd.3 7%
Samsung_SSD_870_EVO_2TB node2:sda osd.1 6%
Samsung_SSD_870_EVO_2TB node3:sda osd.5 6%
Samsung_SSD_870_QVO_1TB node1:sdf osd.7 10%
Samsung_SSD_870_QVO_1TB node1:sde osd.10 45%
Samsung_SSD_870_QVO_1TB node3:sdb osd.9 37%
Samsung_SSD_870_QVO_1TB node3:sde osd.4 12%
Samsung_SSD_870_QVO_1TB node3:sdf osd.8 7%
Samsung_SSD_870_QVO_1TB node1:sdb osd.2 7%
 
Samsung_SSD_870_EVO_1TB node2:sdd osd.6 19%
...
Samsung_SSD_870_QVO_1TB node1:sdb osd.2 7%
... and again ... this forum thread has come to the same conclusion as many others: please don't use consumer SSDs with Ceph or ZFS. They will not be fast (or will get very, very slow over time).
 
It's a personal lab installation; unfortunately, my only solution will be to remove the SSDs that are slow.
The eternal performance/cost trade-off between personal and professional environments...


But do you think this performance issue is linked only to slow SSDs? I was moving a lot of data while running those commands.
I'm using 2.5 Gbps network cards, and I've seen that 10 Gbps is recommended for Ceph synchronization. Could the network be another root cause?
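
For reference, a quick way to rule the Ceph network in or out (not from this thread; the address below is a placeholder for a node's Ceph-network IP) is an iperf3 test between two nodes:

Code:
# On node A:
iperf3 -s

# On node B, pointing at node A's Ceph-network IP:
iperf3 -c 192.168.100.11 -t 30

On a 2.5 Gbps link the result should be roughly 2.3-2.4 Gbit/s; anything far below that points to a network problem rather than the disks.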
 
Issue solved with these steps:
  • Reduced the number of OSDs managed by each node: 4 > 3 OSDs per node (each OSD adds its own CPU & RAM load)
  • Replaced the Samsung QVO disks with EVO models (faster)
  • Replaced the switch: 2.5 > 10 Gbps
Thanks for your positive support. I've now learned how to detect I/O latencies easily with your commands and process.


Have a nice week!
 
Code:
# ceph tell osd.12 bench
{
    "bytes_written": 1073741824,
    "blocksize": 4194304,
    "elapsed_sec": 0.45615625700000001,
    "bytes_per_sec": 2353890377.524735,         (2300 MB/s)
    "iops": 561.21119917028784                         (561 IOPS)
}

SAMSUNG MZPLL6T4HMLS-000MV 6.4TB Enterprise NVMe
 
Issue solved with these steps:
  • Replaced the Samsung QVO disks with EVO models (faster)
You'll be back soon enough, I suspect. What little performance you think you gained by swapping in new drives will deteriorate quickly. Something easily avoided by buying the right drives to begin with.

As an example, you can find used Samsung PM863a 1.92TB enterprise drives with PLP for under $100 (as of the time of this post) on eBay...
 
