Hi,
I have a Ceph setup which I upgraded to the latest version (Luminous 12.2.12) and moved all disks to BlueStore. Since then performance has been pretty bad; I get an IO delay of around 10 in the worst case.
I use 10GbE mesh networking for Ceph. The DBs are on SSDs and the OSDs are spinning disks.
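The DB partitions are supposed to sit on the SSDs; if it helps, I can post the OSD metadata that shows which device BlueFS actually uses for the DB. This is just a sketch of what I would run (osd.0 as an example ID), not output I have captured here:

# object store type and whether the DB/data devices are rotational (0 = SSD, 1 = spinner)
ceph osd metadata 0 | grep -E 'osd_objectstore|bluefs_db_rotational|bluestore_bdev_rotational'

My ceph.conf: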
[global]
auth client required = cephx
auth cluster required = cephx
auth service required = cephx
cluster network = 10.15.15.0/24
filestore xattr use omap = true
fsid = e9a07274-cba6-4c72-9788-a7b65c93e477
keyring = /etc/pve/priv/$cluster.$name.keyring
mon allow pool delete = true
osd journal size = 5120
osd pool default min size = 1
public network = 10.15.15.0/24
[mds]
keyring = /var/lib/ceph/mds/ceph-$id/keyring
[osd]
keyring = /var/lib/ceph/osd/ceph-$id/keyring
[mds.pve02]
host = pve02
mds standby for name = pve
[mds.pve03]
host = pve03
mds standby for name = pve
[mds.pve01]
host = pve01
mds standby for name = pve
[mon.2]
host = pve03
mon addr = 10.15.15.7:6789
[mon.1]
host = pve02
mon addr = 10.15.15.6:6789
[mon.0]
host = pve01
mon addr = 10.15.15.5:6789
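To see whether a single OSD or host is the bottleneck, I can also capture per-OSD latency and utilization while the load is running. Again only the commands for now, I don't have that output at hand:

# commit/apply latency per OSD in ms
ceph osd perf
# usage, weight and PG count per OSD, grouped by host
ceph osd df tree

Output of ceph -s: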
  cluster:
    id:     e9a07274-cba6-4c72-9788-a7b65c93e477
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum 0,1,2
    mgr: pve01(active), standbys: pve03, pve02
    mds: cephfs-1/1/1 up {0=pve02=up:active}, 2 up:standby
    osd: 18 osds: 18 up, 18 in

  data:
    pools:   4 pools, 1248 pgs
    objects: 320.83k objects, 1.19TiB
    usage:   3.64TiB used, 4.55TiB / 8.19TiB avail
    pgs:     1248 active+clean

  io:
    client:   115KiB/s rd, 53.2MiB/s wr, 79op/s rd, 182op/s wr
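If raw throughput numbers are needed, I can run a short rados bench against one of the pools. This is only a sketch; <pool> is a placeholder for whichever pool I test against:

# 60 s of 4 MB writes with 16 concurrent ops, keep the objects for the read test
rados bench -p <pool> 60 write -b 4M -t 16 --no-cleanup
# sequential read of the objects written above, then remove them
rados bench -p <pool> 60 seq -t 16
rados -p <pool> cleanup

pveversion -v: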
proxmox-ve: 5.4-2 (running kernel: 4.15.18-15-pve)
pve-manager: 5.4-8 (running version: 5.4-8/51d494ca)
pve-kernel-4.15: 5.4-5
pve-kernel-4.15.18-17-pve: 4.15.18-43
pve-kernel-4.15.18-16-pve: 4.15.18-41
pve-kernel-4.15.18-15-pve: 4.15.18-40
pve-kernel-4.15.18-14-pve: 4.15.18-39
pve-kernel-4.15.18-13-pve: 4.15.18-37
pve-kernel-4.15.18-12-pve: 4.15.18-36
pve-kernel-4.15.18-11-pve: 4.15.18-34
pve-kernel-4.15.18-10-pve: 4.15.18-32
ceph: 12.2.12-pve1
corosync: 2.4.4-pve1
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.1-11
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-53
libpve-guest-common-perl: 2.0-20
libpve-http-server-perl: 2.0-13
libpve-storage-perl: 5.0-44
libqb0: 1.0.3-1~bpo9
lvm2: 2.02.168-pve6
lxc-pve: 3.1.0-3
lxcfs: 3.0.3-pve1
novnc-pve: 1.0.0-3
proxmox-widget-toolkit: 1.0-28
pve-cluster: 5.0-37
pve-container: 2.0-39
pve-docs: 5.4-2
pve-edk2-firmware: 1.20190312-1
pve-firewall: 3.0-22
pve-firmware: 2.0-6
pve-ha-manager: 2.0-9
pve-i18n: 1.1-4
pve-libspice-server1: 0.14.1-2
pve-qemu-kvm: 3.0.1-4
pve-xtermjs: 3.12.0-1
qemu-server: 5.0-54
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.13-pve1~bpo2
Situation while doing a Windows 10 setup that started at about 23:35:
In "normal" operation (before 23:35) the IO delay never drops below 2. On my other, non-Ceph setups it is normally zero. How can I fix this?
TIA