Ceph Timeout on one node

ScottDavis

May 23, 2024
One node is very slow to respond, and when I try to add an OSD (or do similar operations) on it, I get timeout errors.

ceph -s shows this ...
Code:
root@pmox01-scan-hq:~# ceph -s
  cluster:
    id:     7363a620-944a-4321-ad70-d12dd688bac7
    health: HEALTH_WARN
            clock skew detected on mon.pmox03-scan-hq, mon.pmox01-scan-hq
            Degraded data redundancy: 128 pgs undersized
            17304 slow ops, oldest one blocked for 80760 sec, mon.pmox01-scan-hq has slow ops
 
  services:
    mon: 3 daemons, quorum pmox02-scan-hq,pmox03-scan-hq,pmox01-scan-hq (age 10h)
    mgr: pmox02-scan-hq(active, since 23h), standbys: pmox03-scan-hq
    osd: 4 osds: 4 up (since 22h), 4 in (since 22h); 1 remapped pgs
 
  data:
    pools:   2 pools, 129 pgs
    objects: 2 objects, 1.0 MiB
    usage:   110 MiB used, 7.0 TiB / 7.0 TiB avail
    pgs:     2/6 objects misplaced (33.333%)
             128 active+undersized
             1   active+clean+remapped

Any ideas as to what is causing the issue? The other two nodes are fine.
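
To put the slow-ops figure in perspective, here is a minimal sketch that converts the "blocked for N sec" value from the health line above into hours. The function name `oldest_blocked_hours` is only an illustration, not part of any Ceph tooling.

```python
import re

# Health line copied verbatim from the `ceph -s` output above.
line = "17304 slow ops, oldest one blocked for 80760 sec, mon.pmox01-scan-hq has slow ops"


def oldest_blocked_hours(health_line: str) -> float:
    """Return how long the oldest slow op has been blocked, in hours."""
    match = re.search(r"blocked for (\d+) sec", health_line)
    if match is None:
        raise ValueError("no 'blocked for N sec' field found")
    return int(match.group(1)) / 3600


hours = oldest_blocked_hours(line)
print(f"oldest slow op blocked for about {hours:.1f} h")
```

That works out to roughly 22.4 hours, which is close to the "4 up (since 22h)" shown for the OSDs. This suggests, but does not prove, that the monitor on pmox01-scan-hq has been stuck since about the time the OSDs came up.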
 
Does the filesystem for /var/lib/ceph on that node have any issues?
What kind of storage is used there?
How do I check that?

It's just basic BlueStore OSDs on Ceph storage.

The monitors show as running, but the manager on that node is still 'unknown', and I get a timeout when I try to view monitor or storage info on that node. There is no replication either.
 
No, each MON has a local database stored in /var/lib/ceph/mon/…
The filesystem holding that directory is crucial for MON performance.

I have seen Ceph clusters where this filesystem was stored on cheap SD cards that could not deliver the performance needed for MON operation.
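
One rough way to check this is to time small synchronous writes on the filesystem in question and compare the result with one of the healthy nodes. The sketch below is only an illustration under that assumption: `fsync_latencies` is a made-up helper, not a Ceph tool, and the write count and block size are arbitrary. It writes a temporary file, so the directory must be writable and have a little free space.

```python
import os
import statistics
import tempfile
import time


def fsync_latencies(directory: str, writes: int = 50, size: int = 4096) -> list[float]:
    """Write `writes` blocks of `size` bytes to a temp file in `directory`,
    calling fsync after each one, and return per-write latencies in ms."""
    payload = os.urandom(size)
    latencies = []
    # The temporary file is deleted automatically when the block exits.
    with tempfile.NamedTemporaryFile(dir=directory) as tmp:
        for _ in range(writes):
            start = time.perf_counter()
            tmp.write(payload)
            tmp.flush()
            os.fsync(tmp.fileno())  # force the data down to the device
            latencies.append((time.perf_counter() - start) * 1000)
    return latencies


if __name__ == "__main__":
    # Placeholder target: replace with a directory on the filesystem that
    # holds the MON database, e.g. /var/lib/ceph (run as root there).
    target = tempfile.gettempdir()
    result = fsync_latencies(target)
    print(f"{target}: median {statistics.median(result):.2f} ms, "
          f"max {max(result):.2f} ms over {len(result)} synced writes")
```

Run it on each node against the same path. The thread gives no threshold for what counts as too slow, so the useful signal is the difference between the slow node and the two that are fine.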
 
Interesting. The install was done on SD cards, with the Ceph storage on enterprise SSDs.