ceph recovery slow and won't change number of backfills

ewaldo

Member
Jan 1, 2020
I have a 3-node cluster and I replaced 3x 8TB drives. The recovery/rebalance is going to take 20+ days and maxes out at about 10 MiB/s, which seems exceptionally slow. I've changed all the relevant parameters; setting them at the global level has no effect, but the values do change when set at the OSD level. Even so, it still won't perform more than 3 backfills at a time. The cluster is on a 100GbE network, so network speed is definitely not the limit.
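For reference, this is roughly how I set the values cluster-wide and then per OSD (assuming the centralized config database; osd.5 is just one of the OSDs, and the exact commands may differ slightly from what I ran):

# Cluster-wide, stored in the mon config database - this did not seem to take effect:
ceph config set osd osd_max_backfills 10
ceph config set osd osd_recovery_max_active_hdd 10
ceph config set osd osd_mclock_override_recovery_settings true

# Injected into the running daemons - this is what actually changed the values shown below:
ceph tell 'osd.*' injectargs '--osd_max_backfills=10 --osd_recovery_max_active_hdd=10'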

root@host3:~# ceph-conf --show-config | egrep "osd_recovery_max_active|osd_recovery_op_priority|osd_max_backfills|osd_mclock_override_recovery_settings"

osd_max_backfills = 1
osd_mclock_override_recovery_settings = false
osd_recovery_max_active = 0
osd_recovery_max_active_hdd = 3
osd_recovery_max_active_ssd = 10
osd_recovery_op_priority = 3
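As far as I understand, ceph-conf --show-config only reads the local ceph.conf plus the compiled-in defaults, not the mon config database, which might explain why it disagrees with the running daemons. This is how I cross-checked (osd.5 as an example):

# Values as stored in the centralized config database:
ceph config get osd osd_max_backfills
ceph config get osd osd_recovery_max_active_hdd
ceph config get osd osd_mclock_override_recovery_settings

# Everything the running daemon is actually using, filtered to the same options:
ceph daemon osd.5 config show | egrep "osd_max_backfills|osd_recovery_max_active|osd_mclock"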


All OSDs are set the same:
root@host3:~# ceph daemon osd.5 config get osd_recovery_max_active
{
"osd_recovery_max_active": "10"
}
root@host3:~# ceph daemon osd.5 config get osd_recovery_op_priority
{
"osd_recovery_op_priority": "3"
}
root@host3:~# ceph daemon osd.5 config get osd_max_backfills
{
"osd_max_backfills": "10"
}
root@host3:~# ceph daemon osd.5 config get osd_mclock_override_recovery_settings
{
"osd_mclock_override_recovery_settings": "true"
}
root@host3:~# ceph daemon osd.5 config get osd_recovery_max_active_hdd
{
"osd_recovery_max_active_hdd": "10"
}
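Since osd_mclock_override_recovery_settings is in play, I assume the OSDs are running the mClock scheduler, where the active profile can also limit recovery/backfill. This is how I'd check the scheduler, the profile, and the actual recovery activity (osd.5 as an example again; just a sketch of the commands):

# Which op queue / scheduler the OSD is using (wpq vs mclock_scheduler):
ceph daemon osd.5 config get osd_op_queue

# The active mClock profile (e.g. balanced, high_client_ops, high_recovery_ops):
ceph daemon osd.5 config get osd_mclock_profile

# Current recovery/backfill throughput for comparison:
ceph -s
ceph osd pool stats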
 
