Ceph performance and latency

Has anyone tried the option filestore flusher = "false"?
Hi,
no - but this blog post says that's not the best idea with XFS:
Code:
By default the filestore flusher is only enabled for operations larger than 64k. By explicitly disabling it, we see a nice performance boost for BTRFS, but the opposite effect for XFS. Enabling journal AIO is again providing a performance boost.
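
For reference, both settings discussed there live in the [osd] section of ceph.conf. A minimal sketch, only to illustrate the options mentioned above (not a recommendation):
Code:
[osd]
    # disabling the flusher helped btrfs but hurt xfs in the linked tests
    filestore flusher = false
    # journal AIO gave a boost in the same tests
    journal aio = true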
 
Regarding "nobarrier should ONLY be used with a monitored BBU and no relearning cycle":
we have BBU units on our 3ware cards.

However, I do not understand "no relearning cycle".

Is that a setting in Ceph or on the card?
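
For context, nobarrier itself is a filesystem mount option on the OSD data mounts (not a Ceph setting), while the relearning cycle is a maintenance feature of the RAID controller's BBU, during which write-back caching can be disabled. A sketch of how the mount option would typically appear, with a placeholder device and mount point:
Code:
# /etc/fstab - hypothetical OSD mount
/dev/sdb1  /var/lib/ceph/osd/ceph-0  xfs  noatime,nobarrier  0  0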
 
Has anyone gotten good performance out of their Ceph cluster? Reading around, it seems that Ceph performs really poorly.

@patrick - did you get better speed in the meantime?

I have now installed a new Ceph cluster (previously I only had a small test cluster running).
3 nodes, each with: 2x 2.6 GHz Xeons, 128 GB RAM, 15 SATA disks for Ceph (journal on the disks), 10 Gbit for external traffic and 10 Gbit for OSD traffic.
With reads I can saturate the 10 Gbit network,
but writes with a replication of 2 give me at most 400 MB/s with rados bench at 4 MB objects (I played with different thread counts).
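
For reference, the kind of rados bench run described here looks roughly like this (pool name, runtime and thread count are placeholders):
Code:
# 4 MB writes, 16 concurrent threads, 60 seconds, against a test pool
rados bench -p testpool 60 write -b 4194304 -t 16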

So with 45 disks / replica 2 = ~22 disks, divided again by 2 for journal & data writes, times 100 MB/s for each disk = ~1100 MB/s theoretical write throughput for the cluster.
But 400 MB/s compared with 1100 MB/s is a huge difference.
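
As a sanity check, that back-of-the-envelope math written out (the 100 MB/s per disk is an assumption, not a measurement):
Code:
# 45 disks, replica 2 (each client write lands on 2 OSDs),
# journal on the same disk (each OSD write hits the disk twice)
echo "45 / 2 / 2 * 100" | bc -l    # ~1125 MB/s theoretical client write ceiling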

Faster SATA disks have sequential writes of around 120 MB/s+, so divided by 2 because of the journal that would be around 60 MB/s -
but I see at most 20 MB/s of writes per disk.

Can somebody confirm that their cluster (without SSD journals) can write at least 40-50 MB/s per disk, or 90-100 MB/s per disk with SSD journals?
 
http://ceph.com/community/ceph-performance-part-1-disk-controller-write-throughput/


So the RAID controller makes a big difference!
I use the following card: a MegaRAID SAS 9271-8i, which also has the LSISAS2208 dual-core RAID-on-chip (ROC) on board. On 4K writes it performs best in those benchmarks, but on 4M writes it is the worst.
I can only use RAID0 per disk, because there is no JBOD mode.
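
For anyone with the same card: single-disk RAID0 volumes are usually created per drive with MegaCli. A rough sketch only - the enclosure:slot IDs are placeholders and the exact cache-policy flags vary between MegaCli versions:
Code:
# create one RAID0 logical drive for the disk in enclosure 252, slot 0
MegaCli -CfgLdAdd -r0 [252:0] WB RA Direct -a0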


So do some of you maybe use similar RAID controllers?
Hi,
I also started with LSI crapware that doesn't support pass-through - I had to create a RAID-0 for each disk...
The first change was to replace these cards with an LSI SAS controller (no RAID). More write performance is possible with an Areca RAID controller (all of which can do pass-through) with a BBU.

Udo
 
Do you have a performance comparison between the old and new controllers?
It should be more than 2 times the speed (and what I would expect from my setup);
only 4K writes should be a little bit slower.
 
Do you have a performance comparison between the old and new controllers?
It should be more than 2 times the speed (and what I would expect from my setup);
only 4K writes should be a little bit slower.
Sorry, no!
I have also changed a lot over time (upgraded Ceph, switched from self-partitioning to ceph-deploy, switched the OS on the OSD nodes, changed mount options, enabled read-ahead buffers) and am still looking for improvements.
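
For reference, the read-ahead change mentioned above is typically done per block device via sysfs; a minimal sketch with a placeholder device name:
Code:
# raise read-ahead for one OSD disk to 4 MB (value is in KB)
echo 4096 > /sys/block/sdb/queue/read_ahead_kb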

Udo
 
OK, but roughly what was the performance gain?
Do you also think that 500 MB/s of writes is bad for 45 disks (journals on the disks) with replica 2?
 
root@ceph2:/var/lib/ceph/osd/ceph-16# dd bs=1M count=2560 if=/dev/zero of=test conv=fdatasync,notrunc
gives me around 170 MB/s per disk, which is OK.
So the theoretical speed should be around 1900 MB/s for 45 disks with replica 2 and the journal on the disks.
If Ceph "ate" only 30% of that speed, it would max out the 10 Gbit network without a problem - but I only get 500 MB/s.

Has somebody experienced this, where the disks normally write 170 MB/s but with Ceph the maximum writes are at 40 MB/s? Or is it just the RAID controller behaving that differently for direct sequential writes?
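
One way to cross-check the per-OSD write path (journal included) rather than the raw disk is Ceph's built-in OSD bench, e.g.:
Code:
# write 1 GiB in 4 MiB chunks through osd.16's normal write path
ceph tell osd.16 bench 1073741824 4194304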
 
Just an FYI: people on the ceph-users mailing list have reported drops in Ceph performance after upgrading from Dumpling to Firefly and from Emperor to Firefly respectively, with decreases of up to 20% in performance. Investigations into this are apparently underway at the moment ("Yeah, it's fighting for attention with a lot of other urgent stuff.").
 
Hi,
I also started with LSI crapware that doesn't support pass-through - I had to create a RAID-0 for each disk...
The first change was to replace these cards with an LSI SAS controller (no RAID). More write performance is possible with an Areca RAID controller (all of which can do pass-through) with a BBU.

Udo

Hello Udo,

I want to purchase a SAS/SATA card to test instead of our older 3ware/LSI cards.

Can you give a suggestion or two for Areca controller series or models? I see quite a few different ones on the Areca homepage.

Also, is there a decent GUI for Areca?
 
Hi,
More write performance is possible with an Areca RAID controller (all of which can do pass-through) with a BBU.
The controller's cache is usually ignored in pass-through mode. Actually, I've never seen it working - only single-drive "RAIDs".
 
