Can anyone explain why the krbd module is so much faster than librbd?

hatchi

New Member
Aug 15, 2017
Hi All,

I have been working on optimizing our cluster's performance and I noticed that using krbd gives us roughly a 10x read/write performance gain compared to librbd.

While I am considering the switch, I want to understand why.

I know krbd runs in the kernel rather than as a userspace process, which gives it advantages in memory allocation and other areas, but for me that does not explain a 10x performance gain.
If the gain were 30 to 50% it would be understandable, but we are talking about a 1000% gain here.

I need to understand this before switching to krbd, because to switch I have to disable all the rbd features on the images except layering (and without object-map, deleting big rbd images is slow).
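For reference, the feature changes themselves are just rbd CLI calls along these lines (the pool and image names here are placeholders, not my actual images):
Code:
# keep only layering: drop the features the kernel client cannot handle
rbd feature disable mypool/vm-100-disk-1 fast-diff object-map deep-flatten

# to go back to librbd with full features later
# (note: deep-flatten cannot be re-enabled on an existing image)
rbd feature enable mypool/vm-100-disk-1 object-map fast-diff
rbd object-map rebuild mypool/vm-100-disk-1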


For the test methodology, and how I confirmed that it was only the switch between krbd and librbd that makes the difference, I did this:
1. I tested on a new Ceph pool with a new Proxmox server and found krbd to be approx. 8x better.
2. I created a new Proxmox server, created a new VM on it using librbd, and tested disk performance.
Then, on the same VM on the same server, I switched to krbd, removed some rbd image features, stopped and started the VM, and got a 10x gain.
3. On the same VM from step 2, now running on krbd, I switched back to librbd and lost the performance gains.
4. On our production pool and our production Ceph cluster I did this:
A. Create a VM normally using Proxmox with librbd.
B. Test the VM's performance.
C. Stop the VM.
D. Remove the extra rbd image features that don't work with krbd.
E. Switch the whole Proxmox cluster's rbd storage to krbd, as in the storage.cfg sketch after this list (this means that if any VM rebooted at that moment it could not boot, but that only lasted about a minute while I ran the test).
F. Test performance on the same VM after switching: we got a 10x gain.
G. Revert the rbd storage in Proxmox to non-krbd.
H. Boot the VM again, which was still missing features like object-map, and retest, which gave bad results again.
I. Re-enable the rbd image features that had been removed, manually rebuild the object map for the rbd image, then retest: again no clear gain.
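The krbd switch in step E is just the storage-level toggle in Proxmox; the entry in /etc/pve/storage.cfg looks roughly like this (the storage and pool names are placeholders):
Code:
rbd: ceph-vm
        monhost 10.100.100.1 10.100.100.2
        pool rbd
        content images
        username admin
        krbd 1
Setting krbd 1 maps the images through the kernel client; removing it (or krbd 0) goes back to librbd via QEMU. The images themselves are not touched by this toggle.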

Note: the images we are testing on are 2 TB+; maybe that is a factor that gives krbd an advantage, I am not sure.


So, does anyone have an explanation? I plan to switch everything to krbd but need to understand it first.

Thanks
 
8x seems really huge. I'm only around 1.6-2x slower on Jewel (and that's with 4k random reads: I'm around 70,000 iops with krbd vs 40,000 iops with librbd, per disk with iothread).

The main difference is memory management, with a lot of extra copies in librbd. This has been fixed in the latest librbd with Ceph Luminous (I haven't tested it yet). Those extra copies increase latency, and therefore lower iops.

Some simple tuning tips:

- disable cephx auth
- don't use cache=writeback
- disable debug options in ceph.conf


Can you provide your test method and results, and your ceph.conf?
 
Actually, the krbd results look more like the results we should be getting, considering the size of the cluster.
As for the optimizations:
- disable cephx auth
Done already.
- don't use cache=writeback
Does this give better performance? I don't remember trying without caching; I assumed I had to cache.
- disable debug options in ceph.conf
Done already.

Here is my ceph.conf; it has some fields that I am not sure why they are there :)

And just a note: when using librbd I don't use a ceph.conf on the Proxmox node; I only add it when running with krbd.
Code:
[global]
mon initial members = alm1,alm2
debug_client = 0/0
mon host = 10.100.100.1,10.100.100.2
debug_filer = 0/0
debug_objectcacher = 0/0
ms_dispatch_throttle_bytes = 0
debug_rgw = 0/0
debug_crush = 0/0
debug_mon = 0/0
debug_buffer = 0/0
debug_tp = 0/0
cephx sign messages = False
debug_journaler = 0/0
osd_op_num_shards = 25
filestore_op_threads = 4
debug_journal = 0/0
debug_lockdep = 0/0
auth service required = none
debug_auth = 0/0
debug_objclass = 0/0
fsid = 3f85de46-d14f-406c-a99e-e7e84de13029
filestore_fd_cache_size = 64
debug_asok = 0/0
debug_paxos = 0/0
debug_filestore = 0/0
debug_perfcounter = 0/0
cephx require signatures = False
cluster network = 10.100.100.0/24
debug_ms = 0/0
bluestore fsck on mount = False
debug_timer = 0/0
debug_optracker = 0/0
auth cluster required = none
filestore_xattr_use_omap = True
debug_osd = 0/0
enable experimental unrecoverable data corrupting features = bluestore rocksdb
debug_rados = 0/0
filestore_fd_cache_shards = 32
public network = 10.100.100.0/24
debug_rbd = 0/0
debug_finisher = 0/0
auth supported = none
osd_op_num_threads_per_shard = 1
osd_op_threads = 5
debug_heartbeatmap = 0/0
debug_throttle = 0/0
debug_monc = 0/0
debug_objecter = 0/0
ms_nocrc = True
auth client required = none
debug_context = 0/0
max open files = 131072
throttler_perf_counter = False

[mon]
mon allow pool delete = true


[client.libvirt]
admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok # must be writable by QEMU and allowed by SELinux or AppArmor
log file = /var/log/ceph/qemu-guest-$pid.log # must be writable by QEMU and allowed by SELinux or AppArmor

[osd]
osd_client_message_size_cap = 0
osd mount options xfs = noatime,largeio,inode64,swalloc
osd mkfs options xfs = -f -i size=2048
osd_enable_op_tracker = False
osd_client_message_cap = 0
osd mkfs type = xfs
osd journal size = 5120
osd crush update on start = false
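(Side note: the admin socket defined under [client.libvirt] can be queried while a VM is running to look at the librbd counters; the socket name below is only an example, the real one depends on the QEMU process.)
Code:
# example only: dump the client perf counters from a running guest's admin socket
ceph --admin-daemon /var/run/ceph/ceph-client.libvirt.12345.140711420225536.asok perf dump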
 
>> don't use cache=writeback
>> Does this give better performance? I don't remember trying without caching; I assumed I had to cache.
It only speeds up sequential writes of small blocks (merging them before pushing the object to Ceph); for all other workloads it's an extra copy in memory, so it increases latency.
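In Proxmox the cache mode is set per virtual disk, for example something like this (the VM id, storage and volume names are only examples):
Code:
# redefine the disk with cache=none instead of cache=writeback
qm set 101 --scsi0 ceph-vm:vm-101-disk-1,cache=none
The VM needs a stop/start for the new cache mode to take effect.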

>> And just a note: when using librbd I don't use a ceph.conf on the Proxmox node; I only add it when running with krbd.

You should, at least with the debug_... = 0/0 values.
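Something minimal like this on the Proxmox node is enough; it only repeats values that are already in the cluster ceph.conf you posted:
Code:
[global]
mon host = 10.100.100.1,10.100.100.2
auth client required = none
auth cluster required = none
auth service required = none
debug_ms = 0/0
debug_rbd = 0/0
debug_rados = 0/0
debug_objecter = 0/0
debug_objectcacher = 0/0
debug_client = 0/0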
 
Will try them today.

I never thought about the debug settings in the config file, or about writeback.

Thanks
 
Hi @spirit

I tried with the ceph.conf file on the server and I didn't get any speed advantage.
I also tried with no caching: same thing, and actually performance went down a little.

Here are the numbers:

1- normal librbd current situation
write : bw=5176.1KB/s, iops=1294
read : bw=3199.7KB/s, iops=799


2- librbd + no writeback cache
write : bw=3978.6KB/s, iops=994
read : bw=1729.1KB/s, iops=432


3- KRBD + no writeback cache
write : bw=10393KB/s, iops=2598
read : bw=4893.5KB/s, iops=1223



4- KRBD + optimizations + writeback cache
write : bw=124158KB/s, iops=31039
read : bw=10290KB/s, iops=2572


Those numbers are weird: why is krbd so much faster than librbd?
I also noticed that the read speed and iops are slow compared to other benchmarks I have seen.

I tried with Proxmox 5 and the numbers did improve, but only by maybe 15%, so the issue is not related to the Ceph client version.
Thanks
 
How do you bench? Random or sequential? Block size?
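Can you also post the exact fio command line? Just as an example of the kind of thing I mean (not necessarily what you ran), something like:
Code:
# example only: 4k random write against a test file inside the VM
fio --name=randwrite-test --filename=/root/fio-test --size=4G \
    --ioengine=libaio --direct=1 --rw=randwrite --bs=4k \
    --iodepth=32 --numjobs=1 --runtime=60 --time_based --group_reporting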

I don't see librbd being 8x slower in your results, as you said previously (being twice as slow is expected). I don't know if you have also tested on Proxmox 5 with a Luminous client + Luminous server?
Only the krbd + writeback result seems strange (librbd writeback uses rbd_cache, but krbd does not).
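For reference, the librbd cache is controlled by the rbd_cache options in the [client] section; the values below are just the defaults, as an example:
Code:
[client]
rbd cache = true
# 32 MB cache, 24 MB max dirty (the defaults)
rbd cache size = 33554432
rbd cache max dirty = 25165824
rbd cache writethrough until flush = true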
 
