Performance issue - Ceph / Proxmox - SSD only pool

SanderM

Member
Oct 21, 2016
I have an SSD-only pool built with 3 nodes and a total of 9 OSDs.
Each node has 3x 960 GB Samsung SM863 SSDs.
My pool is set to size=2 and pg_num=1024.
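(For reference, the pool settings can be double-checked from any node; the pool name below is just a placeholder for whichever pool the VM disks live on:)

ceph osd pool get <poolname> size      # replication size, expected to be 2 here
ceph osd pool get <poolname> pg_num    # placement groups, expected to be 1024 here
ceph osd df                            # per-OSD usage and weight distribution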

For some reason I'm only seeing about 200 MB/s write speed inside my VMs. Read speed is about 300 MB/s.

But these SSDs should be faster; even a single SSD should reach about twice that speed.
So I wonder: what could be the cause?

rados bench -p testpool 20 write shows about 988 MB/s, which is a whole lot faster than the 200 MB/s I'm seeing inside my VMs.

Does anyone have any tips?
 
I noticed that "rados bench -p testpool 20 write" shows 988 MB/s, but it's multi-threaded.
If I do "rados bench -p testpool 20 write -t 1" I get about 200 MB/s, the same value as inside the VM.

So it looks like my Ceph pool is slow with single-threaded I/O. But why? And what can I do about it?
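(For comparison with rados bench -t 1, a single-threaded sequential write test inside the VM could look like the fio sketch below; the test file path and size are just placeholders:)

fio --name=seqwrite --filename=/root/fio-testfile --size=4G \
    --rw=write --bs=4M --ioengine=libaio --direct=1 --iodepth=1 --numjobs=1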
 
What network architecture/speed?

What VM disk subsystem?

What filesystem for OSDs?

What are you using to benchmark in the VM?
 
Ceph is slow with single threads because of latency (network round-trips plus the time it takes to process the data).

Some tips to improve performance:

- disable cephx authentication (this needs a restart of your cluster and guests)
- disable debug logging in /etc/ceph/ceph.conf on your client (see the expanded sketch below):
[global]
debug ms = 0/0
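(As an illustration, a client-side /etc/ceph/ceph.conf combining both of the tips above might look like this. The auth_* keys are the standard way to disable cephx, but whether that is acceptable is an assumption about your setup: it removes authentication, so only do it on a trusted network.)

[global]
# disable cephx authentication (cluster and clients must agree; requires restarts)
auth_cluster_required = none
auth_service_required = none
auth_client_required = none
# silence messenger debug logging on the client
debug ms = 0/0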

Use cache=none on the VM disk to reduce latency. (cache=writeback is better for sequential writes, but slows down reads because of the added latency.)
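(On Proxmox the cache mode can be set per disk, either in the GUI or with qm set; the VM ID, bus and volume below are only placeholders:)

qm set 102 --virtio0 SSDpool:vm-102-disk-1,cache=none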

Use CPUs with a fast clock frequency (Intel, not AMD), something like 3 GHz, both for your Ceph cluster and your clients.

In your guest, use the deadline or noop I/O scheduler.
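(Inside the guest this can be checked and switched at runtime; vda is just an example device name:)

cat /sys/block/vda/queue/scheduler          # current scheduler is shown in brackets
echo noop > /sys/block/vda/queue/scheduler  # switch at runtime (as root)
# make it persistent by adding elevator=noop to the guest's kernel command line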

After that, you can try krbd.
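(On Proxmox, krbd is an option of the RBD storage definition; a sketch of /etc/pve/storage.cfg with placeholder storage name and monitor addresses:)

rbd: ceph-ssd
        monhost 10.0.0.1 10.0.0.2 10.0.0.3
        pool SSDpool
        content images
        krbd 1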



A new feature is coming for QEMU in the next librbd version to avoid some buffer copies between the QEMU block layer and librbd.
I have high hopes for this; it should help a lot.
 
Thanks! This helped a lot and my speed is now actually a lot better ;)

I do have another problem now:

I removed all my test VMs, but there's still 16 GB used in the Ceph pool. When I go to a node -> storage -> content, I can see vm-102-disk-1 of 16 GB there, but when I click it, the Remove button is greyed out.

So, how can I remove it now?
 
I removed it with 'rbd -p HDDpool rm vm-102-disk-1' for now.
But is it normal for the Remove button to be greyed out?
 
> Thanks! This helped a lot and my speed is now actually a lot better ;)

I'm curious to know the difference. Can you post your results?

> I do have another problem now:
>
> I removed all my test VMs, but there's still 16 GB used in the Ceph pool. When I go to a node -> storage -> content, I can see vm-102-disk-1 of 16 GB there, but when I click it, the Remove button is greyed out.
>
> So, how can I remove it now?

Maybe a lock. It seems strange that the volume was not removed when the VM was deleted.
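(If a stale lock is indeed the cause, it can be inspected and cleared with the rbd tool before removing the image; pool and image name as in the thread, the lock ID and locker come from the ls output:)

rbd -p HDDpool lock ls vm-102-disk-1                          # list client locks on the image
rbd -p HDDpool lock remove vm-102-disk-1 <lock-id> <locker>   # clear a stale lock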