How can I find out how much data CEPH repaired on my disks?

devedse

Member
Aug 20, 2023
Hi everyone,

One of the reasons we use CEPH is that it can repair itself in case of bitrot (just like ZFS and BTRFS). What I can't figure out, though, is how to see which repairs were done historically and which disks had errors.

In ZFS you can quite easily see that a specific disk had read errors in the past x days, which can be a good indication that it needs to be replaced. For CEPH I can't figure out how to see this.
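To make concrete the kind of per-disk view I'm after, here is a minimal sketch, not a verified answer. It assumes that `ceph pg dump --format json` reports a `num_shards_repaired` counter for each entry in `osd_stats`, and that the stats may or may not be nested under a `pg_map` key depending on the release. The function names are my own. As far as I understand, this counter is not a long-term history and may reset when an OSD restarts.

```python
import json
import subprocess


def repaired_shards_per_osd(pg_dump: dict) -> dict:
    """Map OSD id -> repaired shard count from a parsed `ceph pg dump` JSON document."""
    # Assumption: some releases nest the stats under "pg_map", others do not.
    root = pg_dump.get("pg_map", pg_dump)
    counts = {}
    for osd in root.get("osd_stats", []):
        # Assumption: the counter may be missing on releases that predate it,
        # so fall back to 0 instead of raising KeyError.
        counts[osd["osd"]] = osd.get("num_shards_repaired", 0)
    return counts


def load_pg_dump() -> dict:
    """Run the ceph CLI and parse its JSON output (needs a working admin keyring)."""
    out = subprocess.run(
        ["ceph", "pg", "dump", "--format", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)
```

On a cluster node this would be used as `repaired_shards_per_osd(load_pg_dump())`, printing one count per OSD. A non-zero count on a single OSD would be the rough equivalent of the per-disk error counters ZFS shows.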

Another thing I'm confused about is that in some places it's mentioned that CEPH should be "self healing", whereas this documentation page says you need to repair a PG manually:
https://docs.ceph.com/en/pacific/rados/operations/pg-repair/

Claude AI gave me this command:
[Attached screenshot: 1760017147963.png]

But that would mean it repaired 391 GB of data, which feels a bit high to me. Also, I've never seen a message in Proxmox from CEPH saying there was data corruption. However, I can hardly imagine that not a single bit was ever corrupted in my storage. (One of my SSDs has a wear level of 120%, so I'm kind of starting to expect it to fail.)