Slow Performance on Ceph in LXC Container

Ah_lead
New Member · Jul 11, 2024
Hello everyone,

I am currently experiencing performance issues with my Ceph storage setup inside LXC containers and would greatly appreciate any insights or suggestions. Please note that I am still new to this, so I may well be making beginner mistakes.

Server Hardware:
  • 3 Servers (each with the following specifications):
    • 8 x 1TB SSDs (total 8TB SSD storage per server)
    • 72 logical CPUs: 2 x Intel(R) Xeon(R) E5-2699 v3 @ 2.30GHz (2 sockets, 18 cores / 36 threads each)
    • 10GbE Switch for networking
    • Disk controller: HP Smart HBA
Cluster Configuration:
  • Cluster: 3 Nodes (Servers)
  • OSDs: 21 OSDs (7 per Node)
  • PG Count: 1024
  • Ceph Configuration: 3x Monitor and 3x Manager
  • Networks: Separate NICs for public and cluster network through 10GbE switch

/etc/network/interfaces
iface eno49 inet manual
mtu 9000

iface eno50 inet manual
mtu 9000

auto vmbr0
iface vmbr0 inet static
address 192.168.42.60/24
gateway 192.168.42.1
bridge-ports eno49
bridge-stp off
bridge-fd 0
mtu 9000

auto vmbr1
iface vmbr1 inet static
address 192.168.123.60/24
bridge-ports eno50
bridge-stp off
bridge-fd 0
mtu 9000

source /etc/network/interfaces.d/*
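
To rule out an MTU mismatch along the path (a common cause of odd Ceph latency when using jumbo frames), it is worth checking that 9000-byte frames actually pass between the nodes unfragmented. A minimal check, assuming 192.168.123.61 is a peer node's cluster-network address:

# 8972 = 9000 MTU minus 28 bytes of IP/ICMP headers;
# -M do forbids fragmentation, so this fails if any hop drops jumbo frames
ping -M do -s 8972 -c 3 192.168.123.61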

/etc/ceph/ceph.conf
[global]
auth_client_required = cephx
auth_cluster_required = cephx
auth_service_required = cephx
fsid = 68xxxxxx-xxxe-xxxc-8xxx-xxxx0720xxxx
mon_allow_pool_delete = true
mon_host = 192.168.42.60 192.168.42.61 192.168.42.62
ms_bind_ipv4 = true
ms_bind_ipv6 = false
public_network = 192.168.42.0/24
cluster_network = 192.168.123.0/24

[client]
keyring = /etc/pve/priv/$cluster.$name.keyring

[mds]
keyring = /var/lib/ceph/mds/ceph-$id/keyring

[mon.vrt-ffm1]
public_addr = 192.168.42.60

[mon.vrt-ffm2]
public_addr = 192.168.42.61

[mon.vrt-ffm3]
public_addr = 192.168.42.62
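
To double-check that the daemons actually picked up these networks, the runtime bind addresses can be inspected (osd.0 below is just an example id):

# monitors should be listening on the 192.168.42.0/24 public network
ceph mon dump
# each OSD should report a public front_addr and a 192.168.123.x back_addr
ceph osd metadata 0 | grep -E '"(front|back)_addr"'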

Container Configuration on Each Node:
  • LXC Container: One on each node
  • Resources:
    • 12 cores per container
    • 36 GB RAM per container
    • 800 GB storage per container
  • Ceph Storage: Provided within the LXC containers
OSD Benchmark:
ceph tell osd.* bench
Node 1:
  • OSD.0: 1GB write, blocksize 4MB, 25.68 sec, 40MB/s
  • OSD.3: 1GB write, blocksize 4MB, 73.06 sec, 14.70MB/s
  • OSD.4: 1GB write, blocksize 4MB, 49.53 sec, 21.68MB/s
  • OSD.5: 1GB write, blocksize 4MB, 37.52 sec, 28.62MB/s
  • OSD.6: 1GB write, blocksize 4MB, 3.78 sec, 283.93MB/s
  • OSD.7: 1GB write, blocksize 4MB, 53.76 sec, 19.97MB/s
  • OSD.8: 1GB write, blocksize 4MB, 2.89 sec, 372.13MB/s
Node 2:
  • OSD.1: 1GB write, blocksize 4MB, 5.08 sec, 211.50MB/s
  • OSD.9: 1GB write, blocksize 4MB, 52.37 sec, 20.50MB/s
  • OSD.10: 1GB write, blocksize 4MB, 55.95 sec, 19.19MB/s
  • OSD.11: 1GB write, blocksize 4MB, 59.25 sec, 18.12MB/s
  • OSD.12: 1GB write, blocksize 4MB, 54.70 sec, 19.63MB/s
  • OSD.13: 1GB write, blocksize 4MB, 3.50 sec, 306.57MB/s
  • OSD.14: 1GB write, blocksize 4MB, 74.12 sec, 14.49MB/s
Node 3:
  • OSD.2: 1GB write, blocksize 4MB, 3.65 sec, 294.39MB/s
  • OSD.15: 1GB write, blocksize 4MB, 2.66 sec, 404.06MB/s
  • OSD.16: 1GB write, blocksize 4MB, 59.07 sec, 18.18MB/s
  • OSD.17: 1GB write, blocksize 4MB, 53.96 sec, 19.90MB/s
  • OSD.18: 1GB write, blocksize 4MB, 45.87 sec, 23.41MB/s
  • OSD.19: 1GB write, blocksize 4MB, 3.03 sec, 354.54MB/s
  • OSD.20: 1GB write, blocksize 4MB, 2.95 sec, 363.42MB/s
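
The fast/slow split above often traces back to differences in the physical drives or their write-cache behaviour, so comparing what a slow and a fast OSD sit on can be revealing; osd.3 and /dev/sdX below are placeholders:

# which physical device backs the OSD, and is it detected as non-rotational?
ceph osd metadata 3 | grep -E '"(devices|device_ids|rotational)"'
# drive model, firmware, and SMART health for the device reported above
smartctl -a /dev/sdX
# quick per-node overview of all drives
lsblk -o NAME,MODEL,ROTA,TRAN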

Rados Benchmark:
rados bench -p ceph 10 write --no-cleanup
Write: 4194304 bytes, 274.4 MB/sec, 68 IOPS, 0.22 average latency

rados bench -p ceph 10 seq
Sequential: 4194304 bytes, 1165.1 MB/sec, 291 IOPS, 0.05 average latency

rados bench -p ceph 10 rand
Random: 4194304 bytes, 1129.42 MB/sec, 307 IOPS, 0.05 average latency
FIO Benchmark:
fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=1024k --iodepth=32 --size=150G --readwrite=randrw

IO Depth 32: 41.5 MiB/s write, 41.7 MiB/s read, 41 IOPS write, 41 IOPS read
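
Since a 1 MiB random read/write test at queue depth 32 mostly measures throughput, a small-block, low-depth run may better reflect a typical VM or database workload on Ceph; the filename below is a placeholder for a file on the Ceph-backed storage:

fio --name=4k-randwrite --ioengine=libaio --direct=1 --sync=1 --bs=4k --iodepth=1 --numjobs=1 --rw=randwrite --size=4G --runtime=60 --time_based --filename=/path/on/ceph/testfile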
Iperf3 Test:
Transfer: 1.15 GBytes Bitrate: 9.87 Gbits/sec Congestion Window: 1.55 MBytes
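
The figures above come from a plain node-to-node TCP test along these lines (peer address assumed):

iperf3 -s                     # on the receiving node
iperf3 -c 192.168.123.61      # on the sending node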
Issues Encountered:

The performance of my Ceph storage inside the LXC containers is inconsistent. Some OSDs deliver very high write throughput (e.g., OSD.8, OSD.13, OSD.15), while others are much slower (e.g., OSD.3, OSD.14, OSD.16).

I have already configured the Ceph cluster with a pg_num of 1024 and the bdev_enable_discard option set to true. However, I am still experiencing slow write speeds and inconsistent performance across different OSDs.
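
Assuming those settings were applied through the monitor config database rather than only in ceph.conf, they can be confirmed at runtime ("ceph" is the pool name used in the rados benchmarks above):

# effective value of the discard option as the OSDs see it
ceph config get osd bdev_enable_discard
# current PG count of the benchmark pool
ceph osd pool get ceph pg_num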

Request for Advice:
  • What are the potential causes of the inconsistent performance among the OSDs?
  • Are there additional Ceph or system-level optimizations I can apply to improve performance inside the LXC containers?
  • Could the current configuration of the cluster and containers be affecting performance? If so, what changes would you recommend?
  • Is there a specific fio or other benchmark configuration that would better represent the typical Ceph workload in my setup?

I appreciate any insights or recommendations you can provide. Thank you for your help!