Ceph read speed inside a VM

tigron

New Member
Aug 8, 2014
Hello all,

I'm testing Proxmox with Ceph and set up the following hardware configuration:

- 2 * Intel Journal SSD
- 11 * 6TB Seagate OSD disks + 1 * 2TB WD disk

2x replication - 1024 PGs on a test pool - ceph health OK

I get the following read and write speeds according to rados bench:

Write bench:
Total time run:         100.174462
Total writes made:      6553
Write size:             4194304
Bandwidth (MB/sec):     261.663
Stddev Bandwidth:       49.1988
Max bandwidth (MB/sec): 356
Min bandwidth (MB/sec): 0
Average Latency:        0.244563
Stddev Latency:         0.21535
Max latency:            2.51924
Min latency:            0.015474

Read bench:
Total time run:       100.170912
Total reads made:     24852
Read size:            4194304
Bandwidth (MB/sec):   992.384
Average Latency:      0.06447
Max latency:          0.766223
Min latency:          0.005859
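For reference, results like these come from runs along these lines (the pool name "test" is an assumption; the 100-second duration matches the output above):

ceph osd pool create test 1024 1024          # test pool with 1024 PGs
ceph osd pool set test size 2                # 2x replication
rados bench -p test 100 write --no-cleanup   # 4 MB writes; keep the objects for the read run
rados bench -p test 100 seq                  # sequential reads of the objects written above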


Rados read and write speeds seem to be OK.
When I launch a VM (latest Debian distro) with a Ceph disk (RBD, virtio disk, deadline scheduler on the VM disk, a fresh xfs/ext4 or even btrfs partition), I get the following results:

Write test:
dd if=/dev/zero of=cephtest bs=16k count=1M
1048576+0 records in
1048576+0 records out
17179869184 bytes (17 GB) copied, 83.2059 s, 206 MB/s
Seems reasonable -> disk I/O (confirmed with iostat on the Ceph servers) goes to 90%


Read test:
dd if=cephtest of=/dev/null bs=16k count=1M
1048576+0 records in
1048576+0 records out
17179869184 bytes (17 GB) copied, 168.887 s, 102 MB/s
Not OK -> disk I/O (confirmed with iostat on the Ceph servers) only goes to 20-30%

Does anybody know why the difference is that big?

When I run iostat inside the VM during the read test, I see the following:
Device:  rrqm/s  wrqm/s     r/s   w/s    rkB/s   wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
vdb        3.00    0.00  214.00  5.00 13516.00 1364.00    135.89    154.96  28.55     9.40   848.00   4.57 100.00
vdb        2.00    0.00  186.00  7.00 13396.00 3552.00    175.63    149.38  80.58    10.22  1950.29   5.18 100.00
 
Hi to all,

I am interested in the results of this test once the tuning has been applied.

But before I give a wrong opinion, I want to ask: who has access to the block devices on Ceph, the VM or the PVE host?

Best regards
Cesar
 
Thinking about this problem, I am fairly sure of the following when Ceph lives on the same physical host as PVE (I know that what I am about to say isn't the perfect solution, but the VM will behave better):

The problem is this:
- The VM has its own I/O scheduler
- The host has its own I/O scheduler
Problem: the two I/O schedulers don't talk to each other => the VM believes it has direct control of the hard disk, but it is wrong; the disk is virtual. Its I/O wait times therefore come out different than expected due to that lack of control, and we lose performance waiting on I/O.

Because Red Hat knows about this performance problem, they recommend using a different I/O scheduler in VMs, namely "noop" (meaning "no optimization"). That way the VM doesn't bother trying to optimize disk access itself, and as a side effect it doesn't waste CPU trying to optimize something over which it has no real control.
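A minimal sketch of checking and switching the scheduler inside the guest (the device name vdb is taken from the iostat output above; noop must be available in the guest kernel):

# show the guest's current scheduler
cat /sys/block/vdb/queue/scheduler
noop [deadline] cfq        # example output; the active scheduler is in brackets

# switch to noop at runtime
echo noop > /sys/block/vdb/queue/scheduler

# to make it persistent, boot the guest with elevator=noop
# (add it to GRUB_CMDLINE_LINUX in /etc/default/grub, then run update-grub)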

Best regards and good luck with this situation (awaiting the perfect solution to this problem)
Cesar
Re-edited: if I am wrong about anything, please correct me (especially the user spirit - he is a master of masters)
 
Hi,

you are comparing rados bench with dd; that's apples vs. pears.

dd is purely sequential, with a single thread.

Try something like fio inside the guest, with a queue depth bigger than 1.

Well, I tried several dd processes at once, but I only got it up to 140 MB/sec or so; still only 25 to 30% of my disks' read performance is being used :(
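The parallel attempt was roughly of this shape (file names and reader count are just an example; each cephtestN would have to be written first):

# four sequential readers at once, ~4 GB each
for i in 1 2 3 4; do
    dd if=cephtest$i of=/dev/null bs=16k count=256K &
done
wait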
 
(quoting Cesar's I/O scheduler post above)

Yes, I think the noop or deadline scheduler can be used in a VM with Ceph (and shared storage in general).
The host scheduler is not involved here, because the kvm process talks directly through librbd to the Ceph storage.
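A rough sketch of what that looks like on the qemu/kvm command line (pool and image names are just an example):

# the kvm process opens the image itself via librbd; there is no host
# block device, and therefore no host I/O scheduler, in the path
kvm -drive file=rbd:test/vm-100-disk-1,if=virtio,cache=none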
 
(quoting the scheduler reply above)
It could be interesting to see some benchmarks comparing the default scheduler with noop and deadline.
 
Can you run this bench inside the VM?

"fio --filename=/dev/vdx --direct=1 --rw=read --ioengine=libaio --bs=4k --iodepth=32 --numjobs=1 --name=test --group_reporting"
That line will fail: '--filename=/dev/vdx' requires an existing file, and you cannot read from a block device.

Edit: Wrong assumption. Reading from a block device is possible.
 
