How to slow down CEPH Rebalance

merkkg

Member
Sep 7, 2020
Hi,

I have a 3-node cluster that is working perfectly for my requirements.

Each node is identical, with the setup below:
CPU: Dual Socket Intel Xeon 4214 CPU @ 2.20GHz
Memory: 256GB RAM
Network Card: 2 x 100G ports
Disk Controller: MegaRAID® SAS 9380-8e RAID Controller (JBOD Mode) connected to Supermicro CSE-216BE1C-R741JBOD
Disks: 16 x Samsung MZ7LH3T8HMLT PM883 3.84 TB
The problem is that when a host goes down, or rebalancing has to happen for some other reason, it runs so fast that all VMs on the cluster have issues, because the apply/commit latency on the OSDs goes above 500.

How can I slow down the rebalancing so that it does not affect real usage, or prioritize real usage over rebalancing?
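For reference, the knobs usually mentioned for this are the OSD recovery and backfill options. Below is a minimal sketch that only builds and prints the matching `ceph config set` commands without running them. It assumes a Ceph release with the centralized config database (Mimic or later); the values and the `throttle_commands` helper are illustrative assumptions, not tested recommendations for this cluster.

```python
# Conservative throttle values. These are assumptions to tune, not recommendations.
THROTTLE = {
    "osd_max_backfills": "1",        # concurrent backfills per OSD
    "osd_recovery_max_active": "1",  # concurrent recovery ops per OSD
    "osd_recovery_sleep_ssd": "0.1", # pause (seconds) between recovery ops on SSD OSDs
    "osd_recovery_op_priority": "1", # lower value = lower priority than client I/O
}


def throttle_commands(settings=THROTTLE):
    """Return one argv list per option, each applying the option to all OSDs."""
    return [
        ["ceph", "config", "set", "osd", name, value]
        for name, value in settings.items()
    ]


if __name__ == "__main__":
    # Dry run: print the commands so they can be reviewed before being run by hand.
    for argv in throttle_commands():
        print(" ".join(argv))
```

Printing instead of executing keeps the sketch safe to run anywhere; the commands can then be applied one at a time while watching the OSD latency.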

The rebalancing is only running at 4GB/s; I would expect the cluster to handle a lot more.

Another issue is that during rebalancing the CPU usage goes very high, mainly from the ceph-osd processes, at times reaching 90% on each host. The main pool, which holds most of the data (60T), is a CephFS pool with 2048 placement groups, size 2/1.
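To put the placement group count in perspective, here is a rough calculation of how many PG replicas of this pool each OSD carries. It assumes one OSD per disk (so 16 OSDs per node, which is my assumption and not stated above) and counts only the main pool.

```python
NODES = 3
OSDS_PER_NODE = 16  # assumption: one OSD per disk
PG_NUM = 2048
REPLICA_SIZE = 2


def pg_replicas_per_osd(pg_num, size, osd_count):
    """Average number of PG replicas each OSD holds for a single pool."""
    return pg_num * size / osd_count


all_up = pg_replicas_per_osd(PG_NUM, REPLICA_SIZE, NODES * OSDS_PER_NODE)
one_down = pg_replicas_per_osd(PG_NUM, REPLICA_SIZE, (NODES - 1) * OSDS_PER_NODE)

print(f"{all_up:.1f} PG replicas per OSD with all nodes up")
print(f"{one_down:.1f} PG replicas per OSD with one node down")
```

With one node down, the same 4096 PG replicas are spread over 32 OSDs instead of 48, so each surviving OSD carries roughly half again as many.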
 
