PGs incomplete

miro-zamiro

Hi!

I have a problem with my Ceph cluster: 34 PGs are incomplete. I searched the documentation but couldn't find anything about it. Could you help me out? Please tell me what information you need.

Code:
root@4-dell:~# ceph -s
  cluster:
    id:     8bc79f4e-ce8c-4036-9384-9d9b6212d60e
    health: HEALTH_WARN
            1 filesystem is degraded
            2 MDSs report slow metadata IOs
            Reduced data availability: 34 pgs inactive, 34 pgs incomplete
            6 daemons have recently crashed
            4 slow ops, oldest one blocked for 12102 sec, daemons [osd.4,osd.5,osd.8] have slow ops.
 
  services:
    mon: 4 daemons, quorum 1-asus,2-hp,4-dell,pve1 (age 10h)
    mgr: 2-hp(active, since 10h), standbys: 1-asus, 4-dell, pve1
    mds: 2/2 daemons up, 2 standby
    osd: 7 osds: 7 up (since 10h), 7 in (since 10h)
 
  data:
    volumes: 1/2 healthy, 1 recovering
    pools:   7 pools, 800 pgs
    objects: 855.18k objects, 1.3 TiB
    usage:   2.6 TiB used, 1.7 TiB / 4.3 TiB avail
    pgs:     4.250% pgs not active
             766 active+clean
             34  incomplete
 
root@4-dell:~#
Code:
root@4-dell:~# ceph health detail
HEALTH_WARN 1 filesystem is degraded; 2 MDSs report slow metadata IOs; Reduced data availability: 34 pgs inactive, 34 pgs incomplete; 6 daemons have recently crashed; 4 slow ops, oldest one blocked for 12244 sec, daemons [osd.4,osd.5,osd.8] have slow ops.
[WRN] FS_DEGRADED: 1 filesystem is degraded
    fs nextcloud is degraded
[WRN] MDS_SLOW_METADATA_IO: 2 MDSs report slow metadata IOs
    mds.pve1-1(mds.0): 4 slow metadata IOs are blocked > 30 secs, oldest blocked for 13278 secs
    mds.pve1-0(mds.0): 3 slow metadata IOs are blocked > 30 secs, oldest blocked for 13276 secs
[WRN] PG_AVAILABILITY: Reduced data availability: 34 pgs inactive, 34 pgs incomplete
    pg 1.1 is incomplete, acting [2,8]
    pg 1.2 is incomplete, acting [2,3]
    pg 1.e is incomplete, acting [4,2]
    pg 1.15 is incomplete, acting [2,4]
    pg 1.1b is incomplete, acting [2,8]
    pg 1.1c is incomplete, acting [3,2]
    pg 1.2a is incomplete, acting [3,2]
    pg 1.2f is incomplete, acting [4,2]
    pg 1.30 is incomplete, acting [2,8]
    pg 1.31 is incomplete, acting [2,3]
    pg 1.32 is incomplete, acting [2,3]
    pg 1.34 is incomplete, acting [2,3]
    pg 1.36 is incomplete, acting [4,2]
    pg 1.3c is incomplete, acting [2,4]
    pg 1.42 is incomplete, acting [2,4]
    pg 1.4a is incomplete, acting [2,4]
    pg 1.50 is incomplete, acting [2,3]
    pg 1.5b is incomplete, acting [2,8]
    pg 1.5c is incomplete, acting [3,2]
    pg 1.6a is incomplete, acting [3,2]
    pg 1.6f is incomplete, acting [4,2]
    pg 1.70 is incomplete, acting [2,8]
    pg 1.71 is incomplete, acting [2,3]
    pg 1.72 is incomplete, acting [2,3]
    pg 1.74 is incomplete, acting [2,3]
    pg 1.76 is incomplete, acting [4,2]
    pg 1.7c is incomplete, acting [2,4]
    pg 3.1a is incomplete, acting [0,5]
    pg 3.44 is incomplete, acting [5,2]
    pg 3.49 is incomplete, acting [2,5]
    pg 3.58 is incomplete, acting [3,6]
    pg 3.71 is incomplete, acting [4,2]
    pg 5.b is incomplete, acting [2,5]
    pg 5.16 is incomplete, acting [8,2]
[WRN] RECENT_CRASH: 6 daemons have recently crashed
    osd.1 crashed on host 1-asus at 2025-11-26T17:10:55.887786Z
    osd.1 crashed on host 1-asus at 2025-11-27T17:54:25.533846Z
    osd.1 crashed on host 1-asus at 2025-11-28T01:22:35.391438Z
    mds.pve0-1 crashed on host 4-dell at 2025-11-29T09:12:59.292542Z
    mds.pve0-0 crashed on host 4-dell at 2025-11-29T15:19:01.472965Z
    osd.1 crashed on host 1-asus at 2025-11-30T08:01:48.118021Z
[WRN] SLOW_OPS: 4 slow ops, oldest one blocked for 12244 sec, daemons [osd.4,osd.5,osd.8] have slow ops.
root@4-dell:~#
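To see whether the incomplete PGs share an OSD, the acting sets in the `ceph health detail` output above can be tallied. This is a minimal shell sketch that assumes the line format shown above; `tally_incomplete_osds` is a hypothetical helper name, not a Ceph command.

```shell
# Count how many incomplete PGs each OSD appears in, based on the
# "pg X is incomplete, acting [a,b]" lines of `ceph health detail`.
# Reads the health output on stdin, prints "osd.<id> <count>" per line,
# most frequent OSD first.
tally_incomplete_osds() {
    grep -E 'pg [0-9]+\.[0-9a-f]+ is incomplete' \
        | sed -E 's/.*acting \[([0-9,]+)\].*/\1/' \
        | tr ',' '\n' \
        | sort -n \
        | uniq -c \
        | sort -k1,1nr -k2,2n \
        | awk '{ printf "osd.%s %s\n", $2, $1 }'
}

# Usage on a live cluster (needs the ceph CLI and an admin keyring):
#   ceph health detail | tally_incomplete_osds
```

In the listing above, osd.2 appears in nearly all of the acting sets, which makes it the first OSD to look at.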
 
I managed to fix it using these steps:
https://medium.com/opsops/recoverin...-3-pgs-inactive-3-pgs-incomplete-b97cbcb4b5a1
DO THIS ONLY IF THE INACTIVE PGs ARE EMPTY (OBJECTS = 0 in the output of ceph pg ls incomplete)
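The emptiness check can be scripted rather than read by eye. This is a rough sketch that assumes the plain-text output of `ceph pg ls incomplete` has the PG id in column 1 and the object count in column 2; verify that against your Ceph release before relying on it. `nonempty_incomplete_pgs` is a hypothetical helper name.

```shell
# Read `ceph pg ls incomplete` output on stdin and print every PG that
# still holds objects, as "<pgid> <objects>". The exit status is 0 only
# when all listed PGs are empty.
# Assumption: column 1 = PG id, column 2 = OBJECTS.
nonempty_incomplete_pgs() {
    awk '
        # Skip the header and any footer notes: only rows whose first
        # field looks like a PG id and whose second field is a number.
        $1 ~ /^[0-9]+\.[0-9a-f]+$/ && $2 ~ /^[0-9]+$/ && $2 > 0 {
            print $1, $2
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    '
}

# Usage on a live cluster:
#   ceph pg ls incomplete | nonempty_incomplete_pgs \
#       && echo "all incomplete PGs are empty" \
#       || echo "some PGs still hold objects, do not continue"
```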
1. Stop the OSD(s) holding the inactive PG.
2. Mark that PG as complete with ceph-objectstore-tool:
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-2 --op mark-complete --pgid 2.50
3. Restart the OSD(s).
4. Sometimes this resulted in unfound objects. In that case, stop the OSD again and remove the PG:
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-2 --op remove --pgid 2.19 --force
5. Start the OSD(s) and immediately force-create that PG:
ceph osd force-create-pg 2.19
After that, the cluster should return to a healthy state.
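Steps 1 to 3 can be written down as a dry run that only prints the commands, so they can be reviewed before anything is touched. This is a sketch under assumptions not stated in the post: the OSDs run as `ceph-osd@<id>` systemd units (not in containers) and use the default data path `/var/lib/ceph/osd/ceph-<id>`. `print_mark_complete_plan` is a hypothetical helper name.

```shell
# Print (do not run) the commands for marking one PG complete on one OSD.
# Arguments: <osd-id> <pgid>, for example: 2 2.50
# Assumptions: systemd-managed OSDs named ceph-osd@<id> and the default
# OSD data path. Adjust both for your deployment.
print_mark_complete_plan() {
    plan_osd_id="$1"
    plan_pgid="$2"

    # Reject anything that is not a plain numeric OSD id.
    case "$plan_osd_id" in
        ''|*[!0-9]*) echo "invalid OSD id: $plan_osd_id" >&2; return 1 ;;
    esac
    # PG ids look like <pool>.<hex>, e.g. 2.50 or 1.1b.
    if ! printf '%s\n' "$plan_pgid" | grep -Eq '^[0-9]+\.[0-9a-f]+$'; then
        echo "invalid PG id: $plan_pgid" >&2
        return 1
    fi

    printf 'systemctl stop ceph-osd@%s\n' "$plan_osd_id"
    printf 'ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-%s --op mark-complete --pgid %s\n' \
        "$plan_osd_id" "$plan_pgid"
    printf 'systemctl start ceph-osd@%s\n' "$plan_osd_id"
}

# Usage:
#   print_mark_complete_plan 2 2.50
```

Printing instead of executing is deliberate: `mark-complete` changes on-disk PG state, so each command is better pasted by hand after checking the OSD id and PG id. For step 5, recent Ceph releases may also ask for a confirmation flag (`--yes-i-really-mean-it`) on `ceph osd force-create-pg`.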
Best,
Miro