Ceph performance issue

igorkuz

New Member
Nov 16, 2022
Hi all,
I'm new here. I have a 3-node cluster running Ceph with 8x 1TB SSDs per host, but the performance I get out of it is very poor. Testing from a Windows guest with CrystalDiskMark, I get 340 MB/s read and 47 MB/s write.

Any help would be appreciated as I really want it to work.

Here is my setup info:
3x Dell PE R620
8x Samsung 850 Pro SSDs per host, connected via a PERC H310 in non-RAID (passthrough) mode
2x Intel Xeon CPU E5-2689 per host
196 GB RAM per host
Network is 10Gb, host-to-host without a switch (this is the guide I used), and an iperf test shows the full 10Gb
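For reference, the link test was along these lines (plain iperf between two of the hosts; exact flags from memory):

Code:
# on one host
iperf -s
# on another host, towards the first host's cluster IP
iperf -c 10.15.15.51 -t 30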

Below is more info:

Code:
rados -p CephVM bench 10 write
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 10 seconds or 0 objects
Object prefix: benchmark_data_pmh1_86946
  sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg lat(s)
    0       0         0         0         0         0           -           0
    1      16        79        63    251.99       252    0.091274    0.132514
    2      16       101        85   169.981        88   0.0554055    0.140731
    3      16       133       117    155.98       128   0.0486606    0.380463
    4      16       177       161   160.977       176   0.0681243    0.360015
    5      16       220       204   163.176       172    0.772509    0.360257
    6      16       246       230   153.309       104     0.44223    0.346303
    7      16       288       272   155.404       168    0.215485    0.388716
    8      16       331       315   157.474       172   0.0470773    0.373749
    9      16       364       348   154.641       132    0.701008    0.392809
   10      15       387       372   148.775        96   0.0982538     0.38876
   11      13       387       374   135.977         8     0.14039    0.387209
Total time run:         11.4729
Total writes made:      387
Write size:             4194304
Object size:            4194304
Bandwidth (MB/sec):     134.926
Stddev Bandwidth:       63.2961
Max bandwidth (MB/sec): 252
Min bandwidth (MB/sec): 8
Average IOPS:           33
Stddev IOPS:            15.824
Max IOPS:               63
Min IOPS:               2
Average Latency(s):     0.462407
Stddev Latency(s):      0.663262
Max latency(s):         3.14395
Min latency(s):         0.0423993
Cleaning up (deleting benchmark objects)
Removed 387 objects
Clean up completed and total clean up time :0.533057

Config:
Code:
[global]
     auth_client_required = cephx
     auth_cluster_required = cephx
     auth_service_required = cephx
     cluster_network = 10.15.15.51/24
     fsid = b1326e1a-73a8-4418-b5d0-XXXXXXXXXXXXXXXXXXXXX
     mon_allow_pool_delete = true
     mon_host = 10.15.15.51 10.15.15.52 10.15.15.53
     ms_bind_ipv4 = true
     ms_bind_ipv6 = false
     osd_pool_default_min_size = 2
     osd_pool_default_size = 3
     public_network = 10.15.15.51/24

[client]
     keyring = /etc/pve/priv/$cluster.$name.keyring

[mds]
     keyring = /var/lib/ceph/mds/ceph-$id/keyring

[mds.pmh1]
     host = pmh1
     mds_standby_for_name = pve

[mds.pmh2]
     host = pmh2
     mds_standby_for_name = pve

[mds.pmh3]
     host = pmh3
     mds_standby_for_name = pve

[mon.pmh1]
     public_addr = 10.15.15.51

[mon.pmh2]
     public_addr = 10.15.15.52

[mon.pmh3]
     public_addr = 10.15.15.53
 

Attachments

  • Screenshot from 2022-11-16 10-47-40.png
Can you try setting your pool's target ratio so that the autoscaler scales the number of PGs up accordingly? 32 PGs seems a bit on the low end. After the autoscaler has done its work, benchmark again to see if it yields any improvement.
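For example (assuming the pool is called CephVM, as in your rados bench, and that it will hold the bulk of the data, hence a ratio of 1.0; adjust to your layout):

Code:
ceph osd pool set CephVM pg_autoscale_mode on
ceph osd pool set CephVM target_size_ratio 1.0
ceph osd pool autoscale-status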
 
I use the following optimizations in a 5-node 12th-gen Dell cluster using SAS drives (rough example commands for a few of these follow the list):

Set write cache enable (WCE) to 1 on SAS drives
Set VM cache to none
Set VM to use VirtIO-single SCSI controller and enable IO thread and discard option
Set VM CPU type to 'host'
Set VM CPU NUMA if server has 2 or more physical CPU sockets
Set VM VirtIO Multiqueue to number of cores/vCPUs
Set VM to have qemu-guest-agent software installed
Set Linux VMs IO scheduler to none/noop
Set RBD pool to use the 'krbd' option
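
A few of these translate into commands roughly like the following. This is only a sketch: the device /dev/sdX, VM ID 100, storage/pool name CephVM, bridge vmbr0 and the queue count are placeholders you would need to adapt:

Code:
# enable the drive write cache on a SAS disk (sdparm package)
sdparm --set WCE=1 --save /dev/sdX

# VM 100: VirtIO SCSI single controller, IO thread + discard, cache=none,
# host CPU type, NUMA, multiqueue on the NIC, guest agent enabled
qm set 100 --scsihw virtio-scsi-single
qm set 100 --scsi0 CephVM:vm-100-disk-0,iothread=1,discard=on,cache=none
qm set 100 --cpu host --numa 1
qm set 100 --net0 virtio,bridge=vmbr0,queues=4
qm set 100 --agent enabled=1

# have the RBD storage use the kernel client (krbd)
pvesm set CephVM --krbd 1

# inside a Linux guest: set the IO scheduler to none for the virtio disk
echo none > /sys/block/sda/queue/scheduler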

I get write IOPS in the hundreds and reads are usually double/triple write IOPS.

Since you aren't using a switch, I recommend a full-mesh broadcast network: https://pve.proxmox.com/wiki/Full_Mesh_Network_for_Ceph_Server#Broadcast_Setup
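The broadcast setup from that guide boils down to a bond in broadcast mode over the two direct links on each node, roughly like this in /etc/network/interfaces (interface names ens1f0/ens1f1 are placeholders; the address is taken from your cluster network):

Code:
auto bond0
iface bond0 inet static
        address 10.15.15.51/24
        bond-slaves ens1f0 ens1f1
        bond-miimon 100
        bond-mode broadcast
# Ceph cluster/public network, directly meshed to the other two nodes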
 
