This may not be a question for the PVE forums, but there's no harm in asking.
My experience with Ceph's treatment of OSDs is that the subsystem does not fail OSDs unless they are completely dropped from the bus. Failing disks (e.g. with read failures, even with trapped sense key errors) do NOT get dropped, and the OSD remains up and in, even if it bounces up and down and slows down all transactions. If I manually out a failing OSD, it eventually leads to slow ops, but the OSD STILL does not get failed out, and I have to go and down it manually.
Is this behavior controllable at all?