Poor Ceph performance

frantek

Hi,

I have a Ceph setup which I upgraded to the latest version, and I moved all disks to BlueStore. Now performance is pretty bad: I see an IO delay of about 10 in the worst case.

I use 10 GbE mesh networking for Ceph. The DBs are on SSDs and the OSDs are spinning disks.
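
Whether a DB really ended up on the SSD can be checked via the OSD metadata (assuming a separate DB device was configured), e.g. for osd.0:

ceph osd metadata 0 | grep bluefs_db
# "bluefs_db_rotational": "0" means the DB device is non-rotational, i.e. the SSD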

[global]
auth client required = cephx
auth cluster required = cephx
auth service required = cephx
cluster network = 10.15.15.0/24
filestore xattr use omap = true
fsid = e9a07274-cba6-4c72-9788-a7b65c93e477
keyring = /etc/pve/priv/$cluster.$name.keyring
mon allow pool delete = true
osd journal size = 5120
osd pool default min size = 1
public network = 10.15.15.0/24
[mds]
keyring = /var/lib/ceph/mds/ceph-$id/keyring
[osd]
keyring = /var/lib/ceph/osd/ceph-$id/keyring
[mds.pve02]
host = pve02
mds standby for name = pve
[mds.pve03]
host = pve03
mds standby for name = pve
[mds.pve01]
host = pve01
mds standby for name = pve
[mon.2]
host = pve03
mon addr = 10.15.15.7:6789
[mon.1]
host = pve02
mon addr = 10.15.15.6:6789
[mon.0]
host = pve01
mon addr = 10.15.15.5:6789

cluster:
id: e9a07274-cba6-4c72-9788-a7b65c93e477
health: HEALTH_OK

services:
mon: 3 daemons, quorum 0,1,2
mgr: pve01(active), standbys: pve03, pve02
mds: cephfs-1/1/1 up {0=pve02=up:active}, 2 up:standby
osd: 18 osds: 18 up, 18 in

data:
pools: 4 pools, 1248 pgs
objects: 320.83k objects, 1.19TiB
usage: 3.64TiB used, 4.55TiB / 8.19TiB avail
pgs: 1248 active+clean

io:
client: 115KiB/s rd, 53.2MiB/s wr, 79op/s rd, 182op/s wr

proxmox-ve: 5.4-2 (running kernel: 4.15.18-15-pve)
pve-manager: 5.4-8 (running version: 5.4-8/51d494ca)
pve-kernel-4.15: 5.4-5
pve-kernel-4.15.18-17-pve: 4.15.18-43
pve-kernel-4.15.18-16-pve: 4.15.18-41
pve-kernel-4.15.18-15-pve: 4.15.18-40
pve-kernel-4.15.18-14-pve: 4.15.18-39
pve-kernel-4.15.18-13-pve: 4.15.18-37
pve-kernel-4.15.18-12-pve: 4.15.18-36
pve-kernel-4.15.18-11-pve: 4.15.18-34
pve-kernel-4.15.18-10-pve: 4.15.18-32
ceph: 12.2.12-pve1
corosync: 2.4.4-pve1
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.1-11
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-53
libpve-guest-common-perl: 2.0-20
libpve-http-server-perl: 2.0-13
libpve-storage-perl: 5.0-44
libqb0: 1.0.3-1~bpo9
lvm2: 2.02.168-pve6
lxc-pve: 3.1.0-3
lxcfs: 3.0.3-pve1
novnc-pve: 1.0.0-3
proxmox-widget-toolkit: 1.0-28
pve-cluster: 5.0-37
pve-container: 2.0-39
pve-docs: 5.4-2
pve-edk2-firmware: 1.20190312-1
pve-firewall: 3.0-22
pve-firmware: 2.0-6
pve-ha-manager: 2.0-9
pve-i18n: 1.1-4
pve-libspice-server1: 0.14.1-2
pve-qemu-kvm: 3.0.1-4
pve-xtermjs: 3.12.0-1
qemu-server: 5.0-54
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.13-pve1~bpo2

Situation while doing a W10 setup, started at about 23:35:

[attachment: cpu.png — CPU / IO delay graph]

In "normal" operation (before 23:35) IO delay never drops below 2. In my other, non Ceph setups, it normally is zero. How to fix this?

TIA
 
The IO delay is the outstanding IO of the system; with 6 OSDs (HDDs) in each node, there is considerably more IO going on. Besides the IO delay, how do you determine that Ceph's performance is poor?
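
By the way, the IO delay shown in the PVE graphs corresponds to CPU iowait; you can watch it live on a node with standard tools, for example:

vmstat 5
# the "wa" column is the percentage of CPU time spent waiting on IO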

Please describe your hardware in more detail and run a rados bench.
The benchmark commands, along with reference results, can be found in the Ceph benchmark paper (PDF):
https://forum.proxmox.com/threads/proxmox-ve-ceph-benchmark-2018-02.41761/
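
For reference, the write benchmark is along these lines (pool name "pve" as used later in this thread; --no-cleanup keeps the benchmark objects so a read test can follow):

rados bench -p pve 60 write -t 16 --no-cleanup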
 
Poor means a W10 setup takes about 30 minutes instead of less than 10, due to the slow disks. VMs are slow. With my old PVE 4 setup with Ceph, without BlueStore, on the same hardware, the problem did not exist. The old system was also slower than single nodes with a RAID controller, but not that drastically.

Total time run: 60.602509
Total writes made: 2096
Write size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 138.344
Stddev Bandwidth: 28.1865
Max bandwidth (MB/sec): 204
Min bandwidth (MB/sec): 68
Average IOPS: 34
Stddev IOPS: 7
Max IOPS: 51
Min IOPS: 17
Average Latency(s): 0.462364
Stddev Latency(s): 0.222208
Max latency(s): 2.16543
Min latency(s): 0.110608

"rados bench 60 read -t 16 -p pve" did not work for some reason.

Hardware: 3 nodes, each with two 10 GbE Intel X540-AT2 NICs for the Ceph mesh:

Family: System x
Manufacturer: IBM
Product: System x3650 M3
Total usable RAM: 70.74 GB
Total number of cores: 8
Cores per CPU: 4
Total number of CPUs: 2
Maximum speed: 4.40 GHz
 
What disks are you using?
Different brands and models of 500 GB SATA disks.

And did the rados read test give any errors?
None, it just printed the usage text.

From a quick search, it seems that this server has a RAID card by default. Is Ceph running on it? If so, please read the following link: https://pve.proxmox.com/pve-docs/chapter-pveceph.html#_precondition
Of course not. It runs in JBOD mode.

Again: the problem popped up after the upgrade from PVE 4 to 5 and got even worse after switching to BlueStore.
 

JBOD mode still isn't an HBA and can cause issues. I get that you had OK performance beforehand (although it sounds like it still wasn't where it should be), but it's still not the proper configuration, from my understanding and experience. A true HBA really makes a difference.
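
A rough way to check how transparently the controller passes the disks through (an indicator, not a definitive test) is to see whether the OS gets the plain disk models and SMART data without controller-specific options:

lsblk -o NAME,MODEL,ROTA   # should show the actual disk models
smartctl -i /dev/sda       # if this only works with -d megaraid,N, the controller is still in the IO path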

I noticed from your ceph status that you're seeing about 50 MB/s in writes; not amazing, but not horrible. Do you recall what you were getting before the upgrade? Is that 50 MB/s from the Win10 install, or is it IO created by another process?
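
To narrow down whether the spinners themselves are the bottleneck, it may help to watch per-disk utilization and latency on each node while such a load is running:

iostat -x 5
# sustained high %util and await on the OSD HDDs points at the disks (or controller) rather than the network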
 
I noticed from your ceph status that you're seeing about 50 MB/s in writes; not amazing, but not horrible. Do you recall what you were getting before the upgrade? Is that 50 MB/s from the Win10 install, or is it IO created by another process?
Sadly not. And yes, the 50 MB/s is from the Win10 install.

I had a look at my Nagios graphs, and they prove me wrong:

[attachment: nagios-graph.png — Nagios IO graphs]

Perhaps it's just me, but compared to single nodes with RAID 5 my Ceph cluster is slow.
 
Did you change any other settings? Did you add a pool or increase the PG count?

Any change you made after the upgrade to PVE 5 could cause this problem.
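
Both can be reviewed with the standard Ceph commands:

ceph osd pool ls detail   # per-pool size, min_size and pg_num/pgp_num
ceph osd df tree          # PG count and fill level per OSD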
 
No, I've just followed the instructions in the PVE wiki for the upgrade.
 
