Disk slows down after resize (ceph)

mart.v

Hi all,

I'm facing a strange problem. I'm using the latest Proxmox with a Ceph storage backend (SSD only), 10Gbit network, KVM virtualization, and CentOS in the guest.

When I create a fresh VM with 10 GB of attached Ceph storage (cache disabled, virtio drivers), I get roughly these speeds in fio:

READ: bw=115MiB/s (120MB/s), 115MiB/s-115MiB/s (120MB/s-120MB/s), io=1209MiB (1267MB), run=10552-10552msec
WRITE: bw=49.0MiB/s (51.4MB/s), 49.0MiB/s-49.0MiB/s (51.4MB/s-51.4MB/s), io=518MiB (543MB), run=10552-10552msec

After resizing the storage to 100 GB (I only resize the attached image in the Proxmox interface; I do not touch the filesystem or partition table, so inside the guest there is still a 10 GB partition), the fio benchmark drops to:

READ: bw=20.7MiB/s (21.7MB/s), 20.7MiB/s-20.7MiB/s (21.7MB/s-21.7MB/s), io=504MiB (529MB), run=24359-24359msec
WRITE: bw=9039KiB/s (9256kB/s), 9039KiB/s-9039KiB/s (9256kB/s-9256kB/s), io=215MiB (225MB), run=24359-24359msec

No other changes were made to the system (no reboot, etc.). Proxmox is running as a test cluster and no other VMs affect performance (there is no other workload).
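
For completeness, the resize and the check on the Ceph side look roughly like this (the VM id 100, the pool name ceph-ssd and the disk name are just placeholders for my setup):

qm resize 100 virtio0 +90G # grow the attached image, same as the resize in the GUI
rbd info ceph-ssd/vm-100-disk-0 # check size, object size and features of the RBD image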

Thank you for your tips / advice.
 
This is strange. For new space it could be explained by objects not yet being allocated on Ceph, but since you haven't resized the filesystem, it can't be that.

Do you still have the performance problem after a stop/start of the VM?
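
If it were only an allocation effect, rbd du would show it: the PROVISIONED column should jump to ~100 GiB after the resize while USED stays roughly the same (pool and image name below are just examples):

rbd du ceph-ssd/vm-100-disk-0 # shows provisioned vs actually used space for the image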
 
Thanks for the reply. Yes, the problem persists after a stop/start. I ran it multiple times and the results were similar.

I run only a 3-node cluster with 2 OSDs per host. The drives are Intel S4500s. But I don't know if this is relevant to my problem.
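
If it helps, I can also benchmark the pool directly from one of the nodes to rule out the cluster itself, with something like (pool name is a placeholder):

rados bench -p ceph-ssd 30 write --no-cleanup # raw write throughput of the pool
rados bench -p ceph-ssd 30 rand # random read throughput using the objects written above
rados -p ceph-ssd cleanup # remove the benchmark objects afterwards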
 
What does your fio benchmark look like?
 
fio --filename=/dev/sda --direct=1 --rw=randrw --refill_buffers --norandommap --randrepeat=0 --ioengine=libaio --bs=8k --rwmixread=70 --iodepth=16 --numjobs=16 --runtime=60 --group_reporting --name=8k7030test

But I started to investigate this issue after I found out that regular work with the disk is very slow, so I believe this is not an issue with the benchmark itself.
 
When I increase the block size to 128k, I get speeds like this on the 10 GB VM:
READ: bw=592MiB/s (621MB/s), 592MiB/s-592MiB/s (621MB/s-621MB/s), io=16.9GiB (18.2GB), run=29251-29251msec
WRITE: bw=253MiB/s (265MB/s), 253MiB/s-253MiB/s (265MB/s-265MB/s), io=7393MiB (7752MB), run=29251-29251msec

On the "big" 100 GB VM the results are like this:
READ: bw=152MiB/s (159MB/s), 152MiB/s-152MiB/s (159MB/s-159MB/s), io=5414MiB (5677MB), run=35724-35724msec
WRITE: bw=64.0MiB/s (68.2MB/s), 64.0MiB/s-64.0MiB/s (68.2MB/s-68.2MB/s), io=2322MiB (2435MB), run=35724-35724msec
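
Doing the back-of-envelope math, the gap looks like an IOPS limit rather than a throughput limit: at 8k blocks, 115 MiB/s is roughly 14,700 read IOPS on the small disk versus roughly 2,650 IOPS (20.7 MiB/s) on the resized one, while at 128k blocks the same runs work out to about 4,700 vs 1,200 IOPS.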
 
Can you try a benchmark with 100% read and another with 100% write, to see the difference?

Hmm, interesting. It seems that there is no difference.

fio --name=seqwrite --rw=write --direct=1 --ioengine=libaio --bs=32k --numjobs=4 --size=2G --runtime=600 --group_reporting
Gives me about 50-60 MB/s on both VMs.


fio --name=seqread --rw=read --direct=1 --ioengine=libaio --bs=8k --numjobs=8 --size=1G --runtime=600 --group_reporting
Gives me about 100-110 MB/s on both VMs.

But the original command (fio --filename=/dev/sda --direct=1 --rw=randrw --refill_buffers --norandommap --randrepeat=0 --ioengine=libaio --bs=8k --rwmixread=70 --iodepth=16 --numjobs=16 --runtime=60 --group_reporting --name=8k7030test) still shows a big difference.
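
Since only the mixed random test shows the gap, my next step is to split it into pure random read and pure random write with otherwise identical parameters, e.g.:

fio --filename=/dev/sda --direct=1 --rw=randread --norandommap --randrepeat=0 --ioengine=libaio --bs=8k --iodepth=16 --numjobs=16 --runtime=60 --group_reporting --name=8k-randread
fio --filename=/dev/sda --direct=1 --rw=randwrite --refill_buffers --norandommap --randrepeat=0 --ioengine=libaio --bs=8k --iodepth=16 --numjobs=16 --runtime=60 --group_reporting --name=8k-randwrite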
 
