Ceph very slow rebalancing (~300 KiB/s)

Y0ngg4n

Jul 25, 2023
Hi, I have recreated an OSD in my hyperconverged cluster. I have a 10 Gbit link, so rebalancing should be really fast. But it seems to rebalance at only a few kilobytes per second:
ceph.png

I have already set

ceph tell 'osd.*' injectargs '--osd-recovery-max-active 4'
ceph tell 'osd.*' injectargs '--osd-max-backfills 16'

But that does not seem to change anything.
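
One possible reason the two injectargs calls above show no effect: on Ceph Quincy and later, OSDs default to the mClock scheduler, which manages the backfill and recovery limits itself unless an override is enabled. The thread does not state the Ceph version, so treat the following as a sketch under that assumption, not a confirmed fix. The helper names `run` and `boost_recovery` are made up for illustration, and the override option is only present on releases that ship it. The script prints the commands by default; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch: give recovery/backfill more priority under the mClock scheduler.
# Assumption: Ceph Quincy or later (the thread does not state the version).
# DRY_RUN defaults to 1, so commands are only printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

boost_recovery() {
    # Show which op scheduler the OSDs use (mclock_scheduler or wpq).
    run ceph config get osd osd_op_queue
    # Prefer recovery traffic over client I/O while the rebalance runs.
    run ceph config set osd osd_mclock_profile high_recovery_ops
    # Allow manually set backfill/recovery limits to take effect under mClock.
    run ceph config set osd osd_mclock_override_recovery_settings true
    # Same values as in the post above, stored in the cluster config database.
    run ceph config set osd osd_max_backfills 16
    run ceph config set osd osd_recovery_max_active 4
}

boost_recovery
```

Once the cluster is healthy again, each setting can be dropped with `ceph config rm osd <option>` so client I/O gets its normal priority back. If the first command reports wpq instead of mclock_scheduler, the mClock-specific lines do not apply.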

Maybe somebody can help me?
 
I get hundreds of write IOPS, and read IOPS are 2x-3x the write IOPS, using SAS 10K RPM HDDs. This is with a 5-node cluster of Dell 12th-gen 16-drive-bay servers using 10GbE.

I am guessing you are using consumer SSDs? They bottleneck really quickly once their internal cache is filled. You'll want enterprise SSDs with PLP (power-loss protection).
 
No, the OSDs are 3x Crucial NVMe M.2 PCIe Gen3 2TB drives. So they should not be slow, especially with a 10GbE connection.
 

Crucial P3 2TB M.2 PCIe Gen3 NVMe internal SSD, up to 3500 MB/s - CT2000P3SSD8


This is one of the cheapest consumer SSDs and is not suited for Ceph. (You will see countless reports here in the community of similar issues with consumer hardware and Ceph.)
 
OK, can you give me an example of a good NVMe drive for Ceph?
It seems the rebalancing has currently settled at 20-30 MiB per second.
What is a normal speed?
1690275855287.png
 
Anything that has power-loss protection (PLP) is a good start. See for example: https://geizhals.at/?cat=hdssd&xf=4643_Power-Loss+Protection

Since the SSD has capacitors to provide enough power to write everything in cache down to non-volatile memory, it can ACK sync writes much quicker. Ceph issues a lot of sync writes.
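
To see how a given drive handles this kind of load, a single-threaded 4 KiB synchronous write test with fio is a common rough check. This is only a sketch: `TEST_FILE` is a hypothetical path that needs to point at a file on the SSD under test (not on tmpfs, since direct I/O will fail there), and the helper name `sync_write_test` is made up for illustration. The script prints the fio command by default; set DRY_RUN=0 to run it.

```shell
#!/bin/sh
# Sketch: measure 4 KiB synchronous write performance with fio.
# TEST_FILE is a hypothetical path; place it on the SSD you want to test.
TEST_FILE="${TEST_FILE:-./fio-sync-test.bin}"
# DRY_RUN defaults to 1, so the command is only printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

sync_write_test() {
    # One job, queue depth 1, O_DIRECT + O_SYNC: every write must be
    # acknowledged as durable before the next one is issued.
    set -- fio --name=sync-write-test --filename="$TEST_FILE" --size=1G \
        --rw=write --bs=4k --direct=1 --sync=1 \
        --numjobs=1 --iodepth=1 --runtime=60 --time_based
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

sync_write_test
```

The write IOPS and latency that fio reports for this job give a rough idea of how quickly the drive can acknowledge sync writes. Drives with PLP tend to do much better here than their headline sequential throughput would suggest, while drives without it tend to do much worse.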
 
I am having a similar issue. Oddly, though, it was initially recovering at 500-800 MB/s and has now slowed down to 5% of that. All of my volumes (even the slow SATA drives) are backed by NVMe, although the NVMe drives are not (yet) enterprise grade (I am installing those next week).

For my bulk storage, though, I am using 3.5-inch SATA drives. They are all enterprise-quality drives, but they don't have capacitors like an SSD can have. For these scenarios, would using at least an enterprise SSD for the WAL improve performance? I am trying to get a handle on this before I reconfigure everything.



1698969181010.png
 
