ceph rebalance osd

ilia987

How can I force a rebalance?
I have 3 nodes with 4 SSDs each (SAS3 Seagate Nytro 3.8 TB).

The performance is amazing, I get over 2 GB/s read/write (at real usage load peak),

but I got some warnings because the balance is not ideal.
Any idea what I should do?
 

Attachments

  • Screenshot from 2020-04-07 11-52-31.png
  • Screenshot from 2020-04-07 11-54-34.png
  • Screenshot from 2020-04-07 11-55-26.png
You will need more OSDs, even if you can distribute the PGs better. The cluster cannot sustain an OSD failure or more data without going read-only.
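For reference, a quick way to see how the data and PGs are spread across the OSDs, and how close each one is to the full ratio, is something along these lines; just standard status commands, the output will of course differ per cluster:
Code:
# per-OSD utilisation, PG count and variance
ceph osd df tree

# overall pool usage and the exact health warnings
ceph df
ceph health detail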
 
I know, we are in the process of ordering more;
I am still looking for the best performance/value for our company.

Currently there are no good deals on fast SAS3 drives.
 
Then it is best to remove some data from Ceph to reduce the fill level. A TRIM inside the VMs could already help.

EDIT: you could also create a slower pool with some spinners.
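If it helps, a rough sketch of how the TRIM part could look on a Proxmox node; the VM ID 100 and the storage/volume names are placeholders, and it assumes the disk is attached via SCSI with the QEMU guest agent installed:
Code:
# make sure discard is enabled on the VM disk so TRIM reaches Ceph
# (VM ID, storage and volume name below are just examples)
qm set 100 --scsi0 ceph-vm:vm-100-disk-0,discard=on

# then trim inside the guest ...
fstrim -av

# ... or trigger it from the host via the guest agent
qm guest cmd 100 fstrim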
 
A slower pool is not relevant for this, because this storage has two main tasks: hosting our VMs and providing data for our computational grid.
 
I got an error:
Code:
root@pve-srv3:~# ceph balancer mode upmap
Error EPERM: min_compat_client "jewel" < "luminous", which is required for pg-upmap. Try "ceph osd set-require-min-compat-client luminous" before enabling this mode
 
pg-upmap was introduced in Luminous. Setting this will result in Jewel clients no longer being able to connect to this Ceph cluster.
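If it turns out all clients really are Luminous or newer, the usual sequence to enable the upmap balancer looks roughly like this (a sketch, based on the standard Ceph balancer commands):
Code:
# only safe once no pre-Luminous clients are connected
ceph osd set-require-min-compat-client luminous

# enable the balancer module in upmap mode
ceph balancer mode upmap
ceph balancer on

# check what it is doing
ceph balancer status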
 
Try running "ceph features" to see if you have any old clients.
You can paste the result in this thread.

There is a chance you have a CephFS client running on an older kernel version.
What Proxmox version are you using?
What are the kernel versions on your nodes accessing the CephFS?

There was an issue in the CephFS implementation in Linux kernels <5.3 where the version was misreported:
https://lists.ceph.io/hyperkitty/li...d/RUBXOY2L4JD7AYXHTTTXNJI4BPE6S7TX/?sort=date

Cheers,
Liviu
 
Try running "ceph features" to see if you have any old clients.
You can paste the result in this thread.


Code:
{
    "mon": [
        {
            "features": "0x3ffddff8ffacffff",
            "release": "luminous",
            "num": 3
        }
    ],
    "mds": [
        {
            "features": "0x3ffddff8ffacffff",
            "release": "luminous",
            "num": 3
        }
    ],
    "osd": [
        {
            "features": "0x3ffddff8ffacffff",
            "release": "luminous",
            "num": 12
        }
    ],
    "client": [
        {
            "features": "0x27018fb86aa42ada",
            "release": "jewel",
            "num": 5
        },
        {
            "features": "0x2f018fb86aa42ada",
            "release": "luminous",
            "num": 19
        },
        {
            "features": "0x3ffddff8ffacffff",
            "release": "luminous",
            "num": 3
        }
    ],
    "mgr": [
        {
            "features": "0x3ffddff8ffacffff",
            "release": "luminous",
            "num": 3
        }
    ]
}



What Proxmox version are you using?
I think it is the latest:
Code:
proxmox-ve: 6.1-2 (running kernel: 5.3.13-1-pve)
pve-manager: 6.1-5 (running version: 6.1-5/9bf06119)
pve-kernel-5.3: 6.1-1
pve-kernel-helper: 6.1-1
pve-kernel-5.0: 6.0-11
pve-kernel-4.15: 5.4-12
pve-kernel-5.3.13-1-pve: 5.3.13-1
pve-kernel-5.0.21-5-pve: 5.0.21-10
pve-kernel-5.0.21-3-pve: 5.0.21-7
pve-kernel-4.15.18-24-pve: 4.15.18-52
pve-kernel-4.15.18-21-pve: 4.15.18-48
pve-kernel-4.15.18-20-pve: 4.15.18-46
pve-kernel-4.15.18-9-pve: 4.15.18-30
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.2-pve4
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.13-pve1
libpve-access-control: 6.0-5
libpve-apiclient-perl: 3.0-2
libpve-common-perl: 6.0-9
libpve-guest-common-perl: 3.0-3
libpve-http-server-perl: 3.0-3
libpve-storage-perl: 6.1-3
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve3
lxc-pve: 3.2.1-1
lxcfs: 3.0.3-pve60
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.1-2
pve-cluster: 6.1-2
pve-container: 3.0-16
pve-docs: 6.1-3
pve-edk2-firmware: 2.20191127-1
pve-firewall: 4.0-9
pve-firmware: 3.0-4
pve-ha-manager: 3.0-8
pve-i18n: 2.0-3
pve-qemu-kvm: 4.1.1-2
pve-xtermjs: 3.13.2-1
qemu-server: 6.1-4
smartmontools: 7.0-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.2-pve2
 
Interesting, it looks like you have 5 clients reporting version "jewel":
Code:
"client": [
    {
        "features": "0x27018fb86aa42ada",
        "release": "jewel",
        "num": 5
    },

Most probably you are mounting the cephfs from some client using an old kernel version.
Can you make sure that all clients mounting the cephfs are using at least kernel-3.15?

If you can confirm that you do not have any clients using an older Linux version, you should be able to force it by using:
Code:
ceph osd set-require-min-compat-client luminous

If you get the following error, you can add "--yes-i-really-mean-it" to the command to force it:
Code:
Error EPERM: cannot set require_min_compat_client to luminous: 4 connected client(s) look like jewel (missing 0xa00000000200000)
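One way to narrow down which clients those are is to dump the monitor sessions and look for the jewel feature set (0x27018fb86aa42ada from your output), and to check the kernel on each external CephFS client. A sketch, assuming the mon ID matches the hostname (pve-srv3 here):
Code:
# on a monitor node; look for sessions with features 0x27018fb86aa42ada
ceph daemon mon.pve-srv3 sessions

# on each external CephFS client: kernel version and mount type
uname -r
mount | grep ceph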
 
I just looked again: I have 5 clients (that mount the CephFS) and they are outside of Proxmox.
They are:
  • Ubuntu 18.04 LTS (GNU/Linux 4.15.0-88-generic x86_64)
  • ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous (stable)

But it is Luminous.

As far as I know, this is the setup we have.


There are 4 use cases in our Ceph cluster:
  1. LXC/VM disks inside Proxmox
  2. CephFS data storage (internal to Proxmox, used by the LXCs)
  3. CephFS mount for 5 machines outside Proxmox
  4. One of the five machines re-shares it for read-only access for clients through another network
 
I had no issues on my home Proxmox cluster :)

Put it this way: the worst thing that can happen is that the 5 CephFS clients will lose access to the CephFS. In that case you would need to set min-compat-client back to jewel. :)

But as with anything, you should test it in a dev or test environment first :)
 
Unfortunately we don't have a test environment :(
We are a small company, this is all that we have.

I think we will just wait until the new SSDs arrive.
 
