Slow CEPH performance

deepcloud

Hi,

I have Proxmox 8.1 running and will be upgrading to Proxmox 8.2.2 soon.

We had one of the nodes crash and we are seeing very slow rebuild speeds.

Each node has 2x AMD EPYC 7002-series 64-core CPUs, 2 TB RAM, and 4x 15.36 TB WD enterprise-grade SN650 NVMe SSDs. We have 10G for inter-VM communication and a dedicated dual-redundant active/passive 100G network for the Ceph cluster.

So there is enough horsepower, I am sure; I would like to understand how to speed this up.

Thanks in advance

(attached screenshot: 1716403338721.png)
 
This always happens toward the end of a rebalance, as there are fewer and fewer OSD targets left. The default tuning is meant to ensure that rebalance doesn't clobber client IO, so toward the end it ends up being very conservative.

In your case, there are two things you can do.

1. Increase your in-flight rebalance IOs. The tunables are osd_max_backfills and osd_recovery_max_active. If your devices are set up with proper device classes, they should already be a bit higher than default, but you can keep raising the values until your guest IO begins to suffer (see the example commands after this list).

https://www.thomas-krenn.com/en/wiki/Ceph_-_increase_maximum_recovery_&_backfilling_speed

2. Redeploy your OSDs with multiple OSDs per drive. NVMe drives can handle a lot of IO, but as long as we're still limited to BlueStore as the storage backend for OSDs, each drive gets a single logical queue. You can benefit by splitting each drive into 4-8 OSDs, which makes your "last OSD" that much smaller and lets you more fully utilize your NVMe's throughput.
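
As a rough sketch of how those tunables can be raised at runtime (the numbers below are placeholder starting points, not recommendations for your cluster; on newer Ceph releases that use the mClock scheduler, the override flag may also be needed before these values take effect):

ceph config set osd osd_mclock_override_recovery_settings true   # only if on an mClock-based release
ceph config set osd osd_max_backfills 4                          # placeholder value; raise gradually
ceph config set osd osd_recovery_max_active 8                    # placeholder value; watch guest IO
ceph config show osd.0 | grep -E 'osd_max_backfills|osd_recovery_max_active'   # verify on a sample OSD

Back the values off (or reset them to defaults with ceph config rm) once the rebalance finishes.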
 
Hi Alex,
Thanks for the above info. How can we redeploy with multiple OSDs per drive? Any inputs on this?
 
The RIGHT way is to remove OSDs one by one, but in your case that is not practical (your pool is too full for that).

So what remains is either you remove about 15TB of raw data (~5.5TB used) before you start, or you change WHOLE NODES at a time, which would have you operating degraded until the rebalance is complete. If it's an option, wiping the whole storage and starting from scratch / restoring from backup would be the safest and probably quickest route to full health, but it would require some downtime.
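
If it helps in judging which route is viable, per-OSD and per-pool utilization can be checked with the standard tools before deciding:

ceph osd df tree    # %USE per OSD, grouped by host
ceph df detail      # stored vs. raw usage per pool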

I'll draw up the basic process here:
1. Mark an OSD out. Wait for the rebalance to complete.
2. Stop (down) and destroy the OSD once it is empty.
3. ceph-volume lvm batch --osds-per-device 4 /dev/nvmeXn1
4. If necessary, ceph-volume lvm activate --all
5. Wait until the rebalance is complete.

Repeat for all drives; a sketch of the full cycle for one drive follows.
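
Roughly, assuming the drive being redone currently hosts osd.12 (a placeholder ID; substitute your own, plus the real device path), and using ceph osd purge as one way to remove the emptied OSD:

ceph osd out osd.12                              # mark it out so data drains off
# wait until `ceph -s` shows all PGs active+clean again
systemctl stop ceph-osd@12                       # now take the daemon down
ceph osd purge osd.12 --yes-i-really-mean-it     # remove it from CRUSH, auth, and the OSD map
ceph-volume lvm zap /dev/nvmeXn1 --destroy       # wipe the old LVM/BlueStore metadata
ceph-volume lvm batch --osds-per-device 4 /dev/nvmeXn1
ceph-volume lvm activate --all                   # if the new OSDs don't come up on their own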

Edit: whoops, flipped down and out. Cue George Carlin's football and baseball routine.
 
