Reduced data availability: 40 pgs inactive, 42 pgs incomplete

Magneto

In a 5-node cluster, I had to replace some failed SSDs, and now the Ceph cluster is stuck with "Reduced data availability: 40 pgs inactive, 42 pgs incomplete".


Code:
Reduced data availability: 40 pgs inactive, 42 pgs incomplete

pg 2.57 is incomplete, acting [1,35,14] (reducing pool CephFS_data min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 15.2a is incomplete, acting [41,27,0] (reducing pool SSD_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 15.2c is incomplete, acting [48,26,40] (reducing pool SSD_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 15.64 is incomplete, acting [24,27,46] (reducing pool SSD_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.0 is incomplete, acting [7,23,29] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.2 is stuck inactive for 6h, current state undersized+degraded+remapped+backfilling+peered, last acting [4]
pg 17.3 is incomplete, acting [15,2,26] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.6 is stuck inactive for 2h, current state undersized+degraded+remapped+backfilling+peered, last acting [37]
pg 17.7 is stuck inactive for 2h, current state undersized+degraded+remapped+backfilling+peered, last acting [21]
pg 17.8 is incomplete, acting [37,28,20] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.b is incomplete, acting [18,32,10] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.c is stuck inactive for 3h, current state undersized+degraded+remapped+backfilling+peered, last acting [36]
pg 17.d is incomplete, acting [70,45,38] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.10 is incomplete, acting [51,46,11] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.11 is stuck inactive for 4h, current state undersized+degraded+remapped+backfilling+peered, last acting [51]
pg 17.14 is stuck inactive for 6h, current state undersized+degraded+remapped+backfilling+peered, last acting [3]
pg 17.17 is incomplete, acting [48,70,14] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.1d is incomplete, acting [12,35,14] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.1e is incomplete, acting [24,3,48] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.20 is stuck inactive for 3h, current state undersized+degraded+remapped+backfilling+peered, last acting [3]
pg 17.22 is incomplete, acting [7,18,20] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.26 is incomplete, acting [30,48,20] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.27 is incomplete, acting [31,24,58] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.29 is stuck inactive for 6h, current state undersized+degraded+remapped+backfilling+peered, last acting [68]
pg 17.2c is incomplete, acting [1,7,30] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.2e is incomplete, acting [15,5,44] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.30 is stuck inactive for 111m, current state undersized+degraded+remapped+backfilling+peered, last acting [30]
pg 17.32 is incomplete, acting [48,31,3] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.3a is incomplete, acting [16,25,46] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.3b is incomplete, acting [2,7,1] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.3d is incomplete, acting [48,70,33] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.40 is incomplete, acting [11,45,43] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.44 is incomplete, acting [43,18,6] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.4a is incomplete, acting [14,22,35] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.4d is stuck inactive for 6h, current state undersized+degraded+remapped+backfilling+peered, last acting [50]
pg 17.4f is incomplete, acting [0,44,15] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.53 is incomplete, acting [33,27,26] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.54 is incomplete, acting [31,41,14] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.55 is incomplete, acting [27,44,14] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.56 is incomplete, acting [41,55,27] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.57 is stuck inactive for 6h, current state undersized+degraded+remapped+backfilling+peered, last acting [14]
pg 17.58 is incomplete, acting [35,9,50] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.59 is incomplete, acting [58,25,31] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.6c is incomplete, acting [27,20,25] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.70 is stuck inactive for 11h, current state undersized+degraded+remapped+backfilling+peered, last acting [36]
pg 17.71 is incomplete, acting [0,1,35] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.74 is stuck inactive for 5h, current state undersized+degraded+remapped+backfilling+peered, last acting [23]
pg 17.76 is incomplete, acting [40,27,70] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.77 is incomplete, acting [6,9,20] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.79 is incomplete, acting [35,27,25] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.7b is stuck inactive for 6h, current state undersized+degraded+remapped+backfilling+peered, last acting [3]
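
To see at a glance which pools are affected, the PG lines above can be tallied by pool ID (the number before the dot in the PG ID). This is only a minimal sketch, assuming the `ceph health detail` text has been saved or pasted as a string; the function name `tally_pgs` and the regular expression are my own and not part of any Ceph tooling.

```python
import re
from collections import Counter

# Matches lines such as "pg 17.2 is stuck inactive for 6h, ..." or
# "pg 2.57 is incomplete, acting [1,35,14] ...".
PG_LINE = re.compile(r"^\s*pg (\d+)\.[0-9a-f]+ is (incomplete|stuck \w+)")


def tally_pgs(text):
    """Count PG lines per (pool ID, reported condition)."""
    counts = Counter()
    for line in text.splitlines():
        match = PG_LINE.match(line)
        if match:
            counts[(int(match.group(1)), match.group(2))] += 1
    return counts


# A few lines copied from the output above, used as sample input.
sample = """\
pg 2.57 is incomplete, acting [1,35,14] (reducing pool CephFS_data min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 15.2a is incomplete, acting [41,27,0] (reducing pool SSD_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.0 is incomplete, acting [7,23,29] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 17.2 is stuck inactive for 6h, current state undersized+degraded+remapped+backfilling+peered, last acting [4]
"""

for (pool, condition), n in sorted(tally_pgs(sample).items()):
    print(f"pool {pool}: {n} pg(s) {condition}")
```

Run against the full output instead of the four sample lines, it would show how the incomplete and stuck PGs are spread across pools 2, 15 and 17.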


Code:
Degraded data redundancy: 284751/3351945 objects degraded (8.495%), 38 pgs degraded, 38 pgs undersized

pg 15.1b is stuck undersized for 10m, current state active+undersized+degraded+remapped+backfilling, last acting [1,37]
pg 15.77 is stuck undersized for 10m, current state active+undersized+degraded+remapped+backfilling, last acting [50,31]
pg 17.2 is stuck undersized for 106m, current state undersized+degraded+remapped+backfilling+peered, last acting [4]
pg 17.6 is stuck undersized for 88m, current state undersized+degraded+remapped+backfilling+peered, last acting [37]
pg 17.7 is stuck undersized for 23m, current state undersized+degraded+remapped+backfilling+peered, last acting [21]
pg 17.c is stuck undersized for 106m, current state undersized+degraded+remapped+backfilling+peered, last acting [36]
pg 17.e is stuck undersized for 10m, current state active+undersized+degraded+remapped+backfilling, last acting [68,52]
pg 17.11 is stuck undersized for 10m, current state undersized+degraded+remapped+backfilling+peered, last acting [51]
pg 17.14 is stuck undersized for 10m, current state undersized+degraded+remapped+backfilling+peered, last acting [3]
pg 17.19 is stuck undersized for 3h, current state active+undersized+degraded+remapped+backfilling, last acting [36,8]
pg 17.1a is stuck undersized for 90m, current state active+undersized+degraded+remapped+backfilling, last acting [35,3]
pg 17.1f is stuck undersized for 10m, current state active+undersized+degraded+remapped+backfilling, last acting [50,46]
pg 17.20 is stuck undersized for 2h, current state undersized+degraded+remapped+backfilling+peered, last acting [3]
pg 17.24 is stuck undersized for 3h, current state active+undersized+degraded+remapped+backfilling, last acting [3,68]
pg 17.29 is stuck undersized for 11m, current state undersized+degraded+remapped+backfilling+peered, last acting [68]
pg 17.2a is stuck undersized for 2h, current state active+undersized+degraded+remapped+backfilling, last acting [14,37]
pg 17.2d is stuck undersized for 2h, current state active+undersized+degraded+remapped+backfilling, last acting [36,7]
pg 17.30 is stuck undersized for 89m, current state undersized+degraded+remapped+backfilling+peered, last acting [30]
pg 17.31 is stuck undersized for 11m, current state active+undersized+degraded+remapped+backfilling, last acting [21,46]
pg 17.35 is stuck undersized for 2h, current state active+undersized+degraded+remapped+backfilling, last acting [68,5]
pg 17.3f is stuck undersized for 101m, current state active+undersized+degraded+remapped+backfilling, last acting [3,26]
pg 17.43 is stuck undersized for 2h, current state active+undersized+degraded+remapped+backfilling, last acting [29,51]
pg 17.48 is stuck undersized for 10m, current state active+undersized+degraded+remapped+backfilling, last acting [43,37]
pg 17.49 is stuck undersized for 3h, current state active+undersized+degraded+remapped+backfilling, last acting [8,30]
pg 17.4d is stuck undersized for 88m, current state undersized+degraded+remapped+backfilling+peered, last acting [50]
pg 17.57 is stuck undersized for 89m, current state undersized+degraded+remapped+backfilling+peered, last acting [14]
pg 17.61 is stuck undersized for 2h, current state undersized+degraded+remapped+backfilling+peered, last acting [8]
pg 17.62 is stuck undersized for 2h, current state active+undersized+degraded+remapped+backfilling, last acting [37,21]
pg 17.65 is stuck undersized for 11m, current state undersized+degraded+remapped+backfilling+peered, last acting [21]
pg 17.66 is stuck undersized for 10m, current state undersized+degraded+remapped+backfilling+peered, last acting [33]
pg 17.69 is stuck undersized for 10m, current state active+undersized+degraded+remapped+backfilling, last acting [40,46]
pg 17.6b is stuck undersized for 10m, current state undersized+degraded+remapped+backfilling+peered, last acting [42]
pg 17.6f is stuck undersized for 59m, current state active+undersized+degraded+remapped+backfilling, last acting [11,58]
pg 17.70 is stuck undersized for 11m, current state undersized+degraded+remapped+backfilling+peered, last acting [36]
pg 17.74 is stuck undersized for 101m, current state undersized+degraded+remapped+backfilling+peered, last acting [23]
pg 17.7a is stuck undersized for 10m, current state active+undersized+degraded+remapped+backfilling, last acting [7,42]
pg 17.7b is stuck undersized for 91m, current state undersized+degraded+remapped+backfilling+peered, last acting [3]
pg 17.7c is stuck undersized for 3h, current state active+undersized+degraded+remapped+backfilling, last acting [3,46]


Code:
root@PVE1:~#  ceph osd tree
ID   CLASS  WEIGHT    TYPE NAME       STATUS  REWEIGHT  PRI-AFF
 -1         27.96579  root default                             
 -3          7.02707      host PVE1                           
  0    ssd   0.43939          osd.0       up   1.00000  1.00000
  2    ssd   0.43619          osd.2       up   1.00000  1.00000
  3    ssd   0.90919          osd.3       up   1.00000  1.00000
  4    ssd   0.43939          osd.4       up   1.00000  1.00000
  5    ssd   0.43939          osd.5       up   1.00000  1.00000
  9    ssd   0.43939          osd.9       up   1.00000  1.00000
 11    ssd   0.43619          osd.11      up   1.00000  1.00000
 14    ssd   0.43619          osd.14      up   1.00000  1.00000
 18    ssd   0.43639          osd.18      up   1.00000  1.00000
 21    ssd   0.43939          osd.21      up   1.00000  1.00000
 25    ssd   0.87279          osd.25      up   1.00000  1.00000
 28    ssd   0.43619          osd.28      up   1.00000  1.00000
 29    ssd   0.42760          osd.29      up   1.00000  1.00000
 30    ssd   0.43939          osd.30      up   1.00000  1.00000
 -5          5.75449      host PVE2                           
  6    ssd   0.45459          osd.6       up   1.00000  1.00000
  7    ssd   0.87279          osd.7       up   1.00000  1.00000
 10    ssd   0.43619          osd.10      up   1.00000  1.00000
 12    ssd   0.43619          osd.12      up   1.00000  1.00000
 15    ssd   0.45459          osd.15      up   1.00000  1.00000
 22    ssd   0.43939          osd.22      up   1.00000  1.00000
 27    ssd   0.87279          osd.27      up   1.00000  1.00000
 45    ssd   0.45459          osd.45      up   1.00000  1.00000
 48    ssd   0.45459          osd.48      up   1.00000  1.00000
 57    ssd   0.43939          osd.57    down         0  1.00000
 58    ssd   0.43939          osd.58      up   1.00000  1.00000
 -7          8.16016      host PVE3                           
  1    ssd   0.43649          osd.1       up   1.00000  1.00000
  8    ssd   0.43939          osd.8       up   1.00000  1.00000
 16    ssd   0.45459          osd.16      up   1.00000  1.00000
 17    ssd   0.45430          osd.17      up   1.00000  1.00000
 20    ssd   0.43649          osd.20      up   1.00000  1.00000
 24    ssd   0.43619          osd.24      up   1.00000  1.00000
 32    ssd   0.45459          osd.32      up   1.00000  1.00000
 33    ssd   0.43649          osd.33      up   1.00000  1.00000
 34    ssd   0.43639          osd.34      up   1.00000  1.00000
 38    ssd   0.43649          osd.38      up   1.00000  1.00000
 40    ssd   0.43649          osd.40      up   1.00000  1.00000
 41    ssd   0.43649          osd.41      up   1.00000  1.00000
 42    ssd   0.43649          osd.42      up   1.00000  1.00000
 43    ssd   0.43649          osd.43      up   1.00000  1.00000
 44    ssd   0.43639          osd.44      up   1.00000  1.00000
 47    ssd   0.25000          osd.47      up   0.34996  1.00000
 50    ssd   0.43939          osd.50      up   1.00000  1.00000
 51    ssd   0.43939          osd.51      up   1.00000  1.00000
 52    ssd   0.42760          osd.52      up   1.00000  1.00000
-13          7.02408      host PVE4                           
 13    ssd   0.43939          osd.13      up   1.00000  1.00000
 19    ssd   0.43939          osd.19      up   1.00000  1.00000
 23    ssd   0.43619          osd.23      up   1.00000  1.00000
 26    ssd   0.43939          osd.26      up   0.85004  1.00000
 31    ssd   0.90919          osd.31      up   1.00000  1.00000
 35    ssd   0.87279          osd.35      up   1.00000  1.00000
 36    ssd   0.43619          osd.36      up   1.00000  1.00000
 37    ssd   0.43619          osd.37      up   0.85004  1.00000
 39    ssd   0.43619          osd.39      up   1.00000  1.00000
 46    ssd   0.87279          osd.46      up   1.00000  1.00000
 55    ssd   0.42760          osd.55      up   1.00000  1.00000
 68    ssd   0.43939          osd.68      up   1.00000  1.00000
 70    ssd   0.43939          osd.70      up   1.00000  1.00000
 
Code:
root@PVE1:~# ceph health detail
HEALTH_WARN Reduced data availability: 40 pgs inactive, 42 pgs incomplete; Degraded data redundancy: 278376/3351945 objects degraded (8.305%), 35 pgs degraded, 36 pgs undersized; 34 slow ops, oldest one blocked for 13483 sec, daemons [osd.16,osd.27,osd.48,osd.6,osd.7] have slow ops.
[WRN] PG_AVAILABILITY: Reduced data availability: 40 pgs inactive, 42 pgs incomplete
    pg 2.57 is incomplete, acting [1,35,14] (reducing pool CephFS_data min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 15.2a is incomplete, acting [41,27,0] (reducing pool SSD_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 15.2c is incomplete, acting [48,26,40] (reducing pool SSD_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 15.64 is incomplete, acting [24,27,46] (reducing pool SSD_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.0 is incomplete, acting [7,23,29] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.2 is stuck inactive for 6h, current state undersized+degraded+remapped+backfilling+peered, last acting [4]
    pg 17.3 is incomplete, acting [15,2,26] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.6 is stuck inactive for 2h, current state undersized+degraded+remapped+backfilling+peered, last acting [37]
    pg 17.7 is stuck inactive for 2h, current state undersized+degraded+remapped+backfilling+peered, last acting [21]
    pg 17.8 is incomplete, acting [37,28,20] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.b is incomplete, acting [18,32,10] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.c is stuck inactive for 3h, current state undersized+degraded+remapped+backfilling+peered, last acting [36]
    pg 17.d is incomplete, acting [70,45,38] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.10 is incomplete, acting [51,46,11] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.11 is stuck inactive for 4h, current state undersized+degraded+remapped+backfilling+peered, last acting [51]
    pg 17.14 is stuck inactive for 6h, current state undersized+degraded+remapped+backfilling+peered, last acting [3]
    pg 17.17 is incomplete, acting [48,70,14] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.1d is incomplete, acting [12,35,14] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.1e is incomplete, acting [24,3,48] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.20 is stuck inactive for 3h, current state undersized+degraded+remapped+backfilling+peered, last acting [3]
    pg 17.22 is incomplete, acting [7,18,20] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.26 is incomplete, acting [30,48,20] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.27 is incomplete, acting [31,24,58] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.29 is stuck inactive for 6h, current state undersized+degraded+remapped+backfilling+peered, last acting [68]
    pg 17.2c is incomplete, acting [1,7,30] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.2e is incomplete, acting [15,5,44] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.30 is stuck inactive for 115m, current state undersized+degraded+remapped+backfilling+peered, last acting [30]
    pg 17.32 is incomplete, acting [48,31,3] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.3a is incomplete, acting [16,25,46] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.3b is incomplete, acting [2,7,1] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.3d is incomplete, acting [48,70,33] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.40 is incomplete, acting [11,45,43] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.44 is incomplete, acting [43,18,6] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.4a is incomplete, acting [14,22,35] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.4d is stuck inactive for 6h, current state undersized+degraded+remapped+backfilling+peered, last acting [50]
    pg 17.4f is incomplete, acting [0,44,15] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.53 is incomplete, acting [33,27,26] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.54 is incomplete, acting [31,41,14] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.55 is incomplete, acting [27,44,14] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.56 is incomplete, acting [41,55,27] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.57 is stuck inactive for 6h, current state undersized+degraded+remapped+backfilling+peered, last acting [14]
    pg 17.58 is incomplete, acting [35,9,50] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.59 is incomplete, acting [58,25,31] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.6c is incomplete, acting [27,20,25] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.70 is stuck inactive for 11h, current state undersized+degraded+remapped+backfilling+peered, last acting [36]
    pg 17.71 is incomplete, acting [0,1,35] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.74 is stuck inactive for 5h, current state undersized+degraded+remapped+backfilling+peered, last acting [23]
    pg 17.76 is incomplete, acting [40,27,70] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.77 is incomplete, acting [6,9,20] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.79 is incomplete, acting [35,27,25] (reducing pool Ceph_Storage min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 17.7b is stuck inactive for 6h, current state undersized+degraded+remapped+backfilling+peered, last acting [3]

[WRN] PG_DEGRADED: Degraded data redundancy: 278376/3351945 objects degraded (8.305%), 35 pgs degraded, 36 pgs undersized
    pg 17.2 is stuck undersized for 107m, current state undersized+degraded+remapped+backfilling+peered, last acting [4]
    pg 17.6 is stuck undersized for 90m, current state undersized+degraded+remapped+backfilling+peered, last acting [37]
    pg 17.7 is stuck undersized for 25m, current state undersized+degraded+remapped+backfilling+peered, last acting [21]
    pg 17.c is stuck undersized for 107m, current state undersized+degraded+remapped+backfilling+peered, last acting [36]
    pg 17.e is stuck undersized for 11m, current state active+undersized+degraded+remapped+backfilling, last acting [68,52]
    pg 17.11 is stuck undersized for 11m, current state undersized+degraded+remapped+backfilling+peered, last acting [51]
    pg 17.14 is stuck undersized for 11m, current state undersized+degraded+remapped+backfilling+peered, last acting [3]
    pg 17.19 is stuck undersized for 3h, current state active+undersized+degraded+remapped+backfilling, last acting [36,8]
    pg 17.1a is stuck undersized for 92m, current state active+undersized+degraded+remapped+backfilling, last acting [35,3]
    pg 17.1f is stuck undersized for 11m, current state active+undersized+remapped+backfilling, last acting [50,46]
    pg 17.20 is stuck undersized for 2h, current state undersized+degraded+remapped+backfilling+peered, last acting [3]
    pg 17.24 is stuck undersized for 3h, current state active+undersized+degraded+remapped+backfilling, last acting [3,68]
    pg 17.29 is stuck undersized for 13m, current state undersized+degraded+remapped+backfilling+peered, last acting [68]
    pg 17.2a is stuck undersized for 2h, current state active+undersized+degraded+remapped+backfilling, last acting [14,37]
    pg 17.2d is stuck undersized for 2h, current state active+undersized+degraded+remapped+backfilling, last acting [36,7]
    pg 17.30 is stuck undersized for 90m, current state undersized+degraded+remapped+backfilling+peered, last acting [30]
    pg 17.31 is stuck undersized for 13m, current state active+undersized+degraded+remapped+backfilling, last acting [21,46]
    pg 17.35 is stuck undersized for 2h, current state active+undersized+degraded+remapped+backfilling, last acting [68,5]
    pg 17.3f is stuck undersized for 103m, current state active+undersized+degraded+remapped+backfilling, last acting [3,26]
    pg 17.43 is stuck undersized for 2h, current state active+undersized+degraded+remapped+backfilling, last acting [29,51]
    pg 17.48 is stuck undersized for 11m, current state active+undersized+degraded+remapped+backfilling, last acting [43,37]
    pg 17.49 is stuck undersized for 3h, current state active+undersized+degraded+remapped+backfilling, last acting [8,30]
    pg 17.4d is stuck undersized for 90m, current state undersized+degraded+remapped+backfilling+peered, last acting [50]
    pg 17.57 is stuck undersized for 90m, current state undersized+degraded+remapped+backfilling+peered, last acting [14]
    pg 17.61 is stuck undersized for 2h, current state undersized+degraded+remapped+backfilling+peered, last acting [8]
    pg 17.62 is stuck undersized for 2h, current state active+undersized+degraded+remapped+backfilling, last acting [37,21]
    pg 17.65 is stuck undersized for 13m, current state undersized+degraded+remapped+backfilling+peered, last acting [21]
    pg 17.66 is stuck undersized for 11m, current state undersized+degraded+remapped+backfilling+peered, last acting [33]
    pg 17.69 is stuck undersized for 11m, current state active+undersized+degraded+remapped+backfilling, last acting [40,46]
    pg 17.6b is stuck undersized for 11m, current state undersized+degraded+remapped+backfilling+peered, last acting [42]
    pg 17.6f is stuck undersized for 61m, current state active+undersized+degraded+remapped+backfilling, last acting [11,58]
    pg 17.70 is stuck undersized for 13m, current state undersized+degraded+remapped+backfilling+peered, last acting [36]
    pg 17.74 is stuck undersized for 103m, current state undersized+degraded+remapped+backfilling+peered, last acting [23]
    pg 17.7a is stuck undersized for 11m, current state active+undersized+degraded+remapped+backfilling, last acting [7,42]
    pg 17.7b is stuck undersized for 92m, current state undersized+degraded+remapped+backfilling+peered, last acting [3]
    pg 17.7c is stuck undersized for 3h, current state active+undersized+degraded+remapped+backfilling, last acting [3,46]
[WRN] SLOW_OPS: 34 slow ops, oldest one blocked for 13483 sec, daemons [osd.16,osd.27,osd.48,osd.6,osd.7] have slow ops.
root@PVE1:~#

This has been going on for about 40 hours now.
Code:
root@PVE1:~#  ceph -s
  cluster:
    id:     3b7897cf-7da1-4f33-85a0-d68d5a58ba2f
    health: HEALTH_WARN
            1 nearfull osd(s)
            Reduced data availability: 40 pgs inactive, 42 pgs incomplete
            Degraded data redundancy: 269098/3351945 objects degraded (8.028%), 34 pgs degraded, 35 pgs undersized
            5 pool(s) nearfull
            34 slow ops, oldest one blocked for 13618 sec, daemons [osd.16,osd.27,osd.48,osd.6,osd.7] have slow ops.
 
  services:
    mon: 5 daemons, quorum PVE5,PVE1,PVE3,PVE4,PVE2 (age 65m)
    mgr: PVE5(active, since 13h), standbys: PVE2, PVE1, PVE3, PVE4
    mds: 1/1 daemons up, 4 standby
    osd: 57 osds: 56 up (since 14m), 56 in (since 13m); 49 remapped pgs
 
  data:
    volumes: 1/1 healthy
    pools:   5 pools, 417 pgs
    objects: 1.12M objects, 4.2 TiB
    usage:   12 TiB used, 16 TiB / 28 TiB avail
    pgs:     14.388% pgs not active
             269098/3351945 objects degraded (8.028%)
             247132/3351945 objects misplaced (7.373%)
             326 active+clean
             42  incomplete
             18  undersized+degraded+remapped+backfilling+peered
             16  active+undersized+degraded+remapped+backfilling
             14  active+remapped+backfilling
             1   active+undersized+remapped+backfilling
 
  io:
    recovery: 292 MiB/s, 75 objects/s
 
  progress:
    Global Recovery Event (12h)
      [=====================.......] (remaining: 3h)
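
As a sanity check on the numbers in this status output, the percentages can be recomputed from the raw counts. This is a rough sketch only: it assumes a PG counts as "not active" when `active` is not one of its state flags, which is my reading of the output and not something the output itself states. The helper name `percent` is made up for the example.

```python
# PG counts per state, copied from the "pgs:" section of the `ceph -s` output.
pg_states = {
    "active+clean": 326,
    "incomplete": 42,
    "undersized+degraded+remapped+backfilling+peered": 18,
    "active+undersized+degraded+remapped+backfilling": 16,
    "active+remapped+backfilling": 14,
    "active+undersized+remapped+backfilling": 1,
}

total_pgs = sum(pg_states.values())

# Assumption: a PG is "not active" when "active" is not among its state flags.
not_active = sum(
    n for state, n in pg_states.items() if "active" not in state.split("+")
)


def percent(part, whole):
    """Return part/whole as a percentage rounded to three decimals."""
    return round(100.0 * part / whole, 3)


print(total_pgs, not_active, percent(not_active, total_pgs))
print(percent(269098, 3351945))  # degraded object copies
print(percent(247132, 3351945))  # misplaced object copies
```

Under that assumption the states add up to the 417 PGs reported, and the 60 PGs that are not active (42 incomplete plus 18 peered but not active) account for the 14.388% shown.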
 
Hello,

How many OSDs did you remove? And from which nodes?

What is the output of `ceph osd df tree` and of `cat /etc/ceph/ceph.conf`?
 
ceph osd df tree:


Code:
root@PVE2:~# ceph osd df tree
ID   CLASS  WEIGHT    REWEIGHT  SIZE     RAW USE  DATA     OMAP     META     AVAIL    %USE   VAR   PGS  STATUS  TYPE NAME     
 -1         29.71176         -   16 TiB  6.9 TiB  6.9 TiB   91 MiB   31 GiB  8.9 TiB      0     0    -          root default 
 -3          7.02707         -  7.0 TiB  3.5 TiB  3.5 TiB  235 KiB   15 GiB  3.5 TiB  49.97  1.14    -              host PVE1
  0    ssd   0.43939   1.00000  450 GiB  286 GiB  286 GiB   17 KiB  616 MiB  164 GiB  63.63  1.45   25      up          osd.0
  2    ssd   0.43619   1.00000  447 GiB  204 GiB  203 GiB   11 KiB  1.1 GiB  243 GiB  45.67  1.04   16      up          osd.2
  3    ssd   0.90919   1.00000  931 GiB  596 GiB  594 GiB   15 KiB  1.6 GiB  335 GiB  63.99  1.46   40      up          osd.3
  4    ssd   0.43939   1.00000  450 GiB  317 GiB  316 GiB   30 KiB  1.4 GiB  133 GiB  70.46  1.60   23      up          osd.4
  5    ssd   0.43939   1.00000  450 GiB  218 GiB  217 GiB   20 KiB  967 MiB  232 GiB  48.43  1.10   23      up          osd.5
  9    ssd   0.43939   1.00000  450 GiB  128 GiB  127 GiB   33 KiB  844 MiB  322 GiB  28.51  0.65   13      up          osd.9
 11    ssd   0.43619   1.00000  447 GiB  147 GiB  146 GiB   11 KiB  606 MiB  300 GiB  32.89  0.75   21      up          osd.11
 14    ssd   0.43619   1.00000  447 GiB  213 GiB  212 GiB   10 KiB  1.2 GiB  234 GiB  47.63  1.08   21      up          osd.14
 18    ssd   0.43639   1.00000  447 GiB  220 GiB  219 GiB   12 KiB  1.3 GiB  226 GiB  49.34  1.12   26      up          osd.18
 21    ssd   0.43939   1.00000  450 GiB  219 GiB  218 GiB   16 KiB  723 MiB  231 GiB  48.67  1.11   18      up          osd.21
 25    ssd   0.87279   1.00000  894 GiB  471 GiB  470 GiB   14 KiB  1.7 GiB  422 GiB  52.75  1.20   41      up          osd.25
 28    ssd   0.43619   1.00000  447 GiB  152 GiB  151 GiB    9 KiB  1.1 GiB  295 GiB  33.97  0.77   11      up          osd.28
 29    ssd   0.42760   1.00000  438 GiB  154 GiB  153 GiB   19 KiB  1.2 GiB  284 GiB  35.25  0.80   18      up          osd.29
 30    ssd   0.43939   1.00000  450 GiB  270 GiB  269 GiB   18 KiB  776 MiB  180 GiB  59.99  1.36   17      up          osd.30
 -5          7.50046         -  1.3 TiB  2.6 GiB  1.3 GiB      0 B  1.2 GiB  1.3 TiB      0     0    -              host PVE2
  6    ssd   0.45459   1.00000  465 GiB  163 GiB  162 GiB    9 KiB  790 MiB  302 GiB  35.07  0.80   22      up          osd.6
  7    ssd   0.87279   1.00000  894 GiB  352 GiB  351 GiB   23 KiB  1.0 GiB  541 GiB  39.43  0.90   39      up          osd.7
 10    ssd   0.43619   1.00000  447 GiB  167 GiB  166 GiB   15 KiB  1.2 GiB  280 GiB  37.42  0.85   30      up          osd.10
 12    ssd   0.43619   1.00000  447 GiB  176 GiB  175 GiB   14 KiB  635 MiB  271 GiB  39.38  0.90   17      up          osd.12
 15    ssd   0.45459   1.00000  465 GiB  158 GiB  158 GiB    9 KiB  529 MiB  307 GiB  34.01  0.77   19      up          osd.15
 22    ssd   0.43939   1.00000  450 GiB  117 GiB  116 GiB   10 KiB  1.2 GiB  333 GiB  26.05  0.59   12      up          osd.22
 27    ssd   0.87279   1.00000  894 GiB  683 GiB  681 GiB   18 KiB  1.6 GiB  211 GiB  76.38  1.74   37      up          osd.27
 45    ssd   0.45459   1.00000  465 GiB  238 GiB  237 GiB   13 KiB  1.1 GiB  228 GiB  51.08  1.16   19      up          osd.45
 48    ssd   0.45459   1.00000  465 GiB  147 GiB  146 GiB   10 KiB  708 MiB  319 GiB  31.49  0.72   21      up          osd.48
 49    ssd   0.43649   1.00000      0 B      0 B      0 B      0 B      0 B      0 B      0     0    0    down          osd.49
 53    ssd   0.43649   1.00000  447 GiB  991 MiB  545 MiB      0 B  446 MiB  446 GiB   0.22  0.00    9      up          osd.53
 54    ssd   0.43649   1.00000  447 GiB  859 MiB  439 MiB      0 B  420 MiB  446 GiB   0.19  0.00    8      up          osd.54
 56    ssd   0.43649   1.00000  447 GiB  773 MiB  375 MiB      0 B  398 MiB  446 GiB   0.17  0.00    3      up          osd.56
 57    ssd   0.43939   0.85004  450 GiB  300 GiB  299 GiB   19 KiB  983 MiB  150 GiB  66.77  1.52   18      up          osd.57
 58    ssd   0.43939   1.00000  450 GiB  288 GiB  287 GiB   34 MiB  1.1 GiB  162 GiB  63.93  1.45   18      up          osd.58
 -7          8.16016         -  8.8 TiB  3.4 TiB  3.4 TiB   58 MiB   16 GiB  5.4 TiB  38.53  0.88    -              host PVE3
  1    ssd   0.43649   1.00000  447 GiB  193 GiB  192 GiB   10 KiB  1.0 GiB  254 GiB  43.12  0.98   17      up          osd.1
  8    ssd   0.43939   1.00000  450 GiB  232 GiB  231 GiB   54 KiB  989 MiB  218 GiB  51.58  1.17   23      up          osd.8
 16    ssd   0.45459   1.00000  465 GiB  246 GiB  245 GiB    9 KiB  648 MiB  220 GiB  52.78  1.20   17      up          osd.16
 17    ssd   0.45430   1.00000  465 GiB  181 GiB  180 GiB   11 KiB  961 MiB  284 GiB  38.98  0.89   15      up          osd.17
 20    ssd   0.43649   1.00000  447 GiB  253 GiB  253 GiB   26 MiB  643 MiB  194 GiB  56.65  1.29   20      up          osd.20
 24    ssd   0.43619   1.00000  447 GiB  155 GiB  153 GiB   15 KiB  1.3 GiB  292 GiB  34.60  0.79   20      up          osd.24
 32    ssd   0.45459   1.00000  465 GiB  249 GiB  248 GiB   11 KiB  861 MiB  216 GiB  53.51  1.22   14      up          osd.32
 33    ssd   0.43649   1.00000  447 GiB  262 GiB  261 GiB   10 KiB  709 MiB  185 GiB  58.58  1.33   19      up          osd.33
 34    ssd   0.43639   1.00000  447 GiB  193 GiB  192 GiB   25 MiB  1.2 GiB  254 GiB  43.15  0.98   17      up          osd.34
 38    ssd   0.43649   1.00000  447 GiB   98 GiB   97 GiB   10 KiB  890 MiB  349 GiB  21.87  0.50   19      up          osd.38
 40    ssd   0.43649   1.00000  447 GiB  132 GiB  131 GiB  320 KiB  833 MiB  315 GiB  29.55  0.67   13      up          osd.40
 41    ssd   0.43649   1.00000  447 GiB  130 GiB  129 GiB   10 KiB  1.2 GiB  317 GiB  29.10  0.66   15      up          osd.41
 42    ssd   0.43649   1.00000  447 GiB  222 GiB  221 GiB   13 KiB  793 MiB  225 GiB  49.64  1.13   23      up          osd.42
 43    ssd   0.43649   1.00000  447 GiB  178 GiB  176 GiB   15 KiB  1.2 GiB  269 GiB  39.75  0.90   17      up          osd.43
 44    ssd   0.43639   1.00000  447 GiB  123 GiB  123 GiB    9 KiB  630 MiB  324 GiB  27.59  0.63   13      up          osd.44
 47    ssd   0.25000   0.34996  894 GiB   28 GiB   28 GiB   27 KiB  221 MiB  866 GiB   3.10  0.07    3      up          osd.47
 50    ssd   0.43939   1.00000  450 GiB  252 GiB  252 GiB  7.0 MiB  737 MiB  198 GiB  56.09  1.28   23      up          osd.50
 51    ssd   0.43939   1.00000  450 GiB  207 GiB  206 GiB  233 KiB  912 MiB  243 GiB  45.93  1.04   12      up          osd.51
 52    ssd   0.42760   1.00000  438 GiB  132 GiB  131 GiB   36 KiB  828 MiB  306 GiB  30.22  0.69   18      up          osd.52
-13          7.02408         -  7.0 TiB  3.5 TiB  3.5 TiB   33 MiB   14 GiB  3.5 TiB  50.16  1.14    -              host PVE4
 13    ssd   0.43939   1.00000  450 GiB   99 GiB   98 GiB   11 KiB  1.1 GiB  351 GiB  22.06  0.50   23      up          osd.13
 19    ssd   0.43939   1.00000  450 GiB   98 GiB   98 GiB   10 KiB  656 MiB  352 GiB  21.81  0.50   20      up          osd.19
 23    ssd   0.43619   1.00000  447 GiB  346 GiB  344 GiB  132 KiB  1.4 GiB  101 GiB  77.38  1.76   20      up          osd.23
 26    ssd   0.43939   0.85004  450 GiB  345 GiB  344 GiB   12 KiB  1.2 GiB  105 GiB  76.66  1.74   20      up          osd.26
 31    ssd   0.90919   1.00000  931 GiB  534 GiB  533 GiB   19 KiB  1.3 GiB  397 GiB  57.35  1.30   41      up          osd.31
 35    ssd   0.87279   1.00000  894 GiB  338 GiB  336 GiB  7.0 MiB  1.6 GiB  556 GiB  37.82  0.86   36      up          osd.35
 36    ssd   0.43619   1.00000  447 GiB  249 GiB  248 GiB  141 KiB  806 MiB  198 GiB  55.73  1.27   16      up          osd.36
 37    ssd   0.43619   0.85004  447 GiB  311 GiB  309 GiB   25 MiB  1.3 GiB  136 GiB  69.58  1.58   17      up          osd.37
 39    ssd   0.43619   1.00000  447 GiB  118 GiB  117 GiB   19 KiB  813 MiB  329 GiB  26.40  0.60   12      up          osd.39
 46    ssd   0.87279   1.00000  894 GiB  452 GiB  451 GiB   54 KiB  1.7 GiB  441 GiB  50.60  1.15   44      up          osd.46
 55    ssd   0.42760   1.00000  438 GiB  147 GiB  146 GiB  162 KiB  929 MiB  291 GiB  33.56  0.76   24      up          osd.55
 68    ssd   0.43939   1.00000  450 GiB  351 GiB  351 GiB   11 KiB  617 MiB   99 GiB  78.05  1.78   23      up          osd.68
 70    ssd   0.43939   1.00000  450 GiB  220 GiB  219 GiB   10 KiB  1.0 GiB  230 GiB  48.92  1.11   20      up          osd.70
                         TOTAL   30 TiB   13 TiB   13 TiB  125 MiB   58 GiB   17 TiB  43.97                                   
MIN/MAX VAR: 0/1.78  STDDEV: 18.83
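
To make a listing like the one above easier to scan, a short Python sketch along these lines could pull out the OSDs that are not up and the ones that are filling up. The helper names (`parse_osd_rows`, `flag_osds`) and the 75% threshold are illustrative assumptions, not anything provided by Ceph, and the parser only relies on the trailing columns (%USE, VAR, PGS, STATUS, NAME) as they appear in this output.

```python
from typing import List, NamedTuple, Tuple


class OsdRow(NamedTuple):
    name: str        # e.g. "osd.49"
    status: str      # "up" or "down"
    pct_used: float  # the %USE column
    pgs: int         # the PGS column


def parse_osd_rows(text: str) -> List[OsdRow]:
    """Parse OSD rows from pasted `ceph osd df tree` text.

    Host and TOTAL rows are skipped because their last token does not
    start with "osd.". Columns are read from the end of each line.
    """
    rows = []
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) < 5 or not tokens[-1].startswith("osd."):
            continue
        rows.append(
            OsdRow(
                name=tokens[-1],
                status=tokens[-2],
                pct_used=float(tokens[-5]),
                pgs=int(tokens[-3]),
            )
        )
    return rows


def flag_osds(rows: List[OsdRow], full_pct: float = 75.0) -> Tuple[List[str], List[str]]:
    """Return (names of OSDs that are not up, names of up OSDs at or above full_pct).

    full_pct is an arbitrary illustrative threshold, not a Ceph ratio.
    """
    not_up = [r.name for r in rows if r.status != "up"]
    filling = [r.name for r in rows if r.status == "up" and r.pct_used >= full_pct]
    return not_up, filling


# A few rows copied from the output above.
SAMPLE = """\
 -5          7.50046         -  1.3 TiB  2.6 GiB  1.3 GiB      0 B  1.2 GiB  1.3 TiB      0     0    -              host PVE2
 27    ssd   0.87279   1.00000  894 GiB  683 GiB  681 GiB   18 KiB  1.6 GiB  211 GiB  76.38  1.74   37      up          osd.27
 49    ssd   0.43649   1.00000      0 B      0 B      0 B      0 B      0 B      0 B      0     0    0    down          osd.49
 53    ssd   0.43649   1.00000  447 GiB  991 MiB  545 MiB      0 B  446 MiB  446 GiB   0.22  0.00    9      up          osd.53
 68    ssd   0.43939   1.00000  450 GiB  351 GiB  351 GiB   11 KiB  617 MiB   99 GiB  78.05  1.78   23      up          osd.68
"""

rows = parse_osd_rows(SAMPLE)
not_up, filling = flag_osds(rows)
print("not up:", not_up)
print("at or above threshold:", filling)
```

On the sample rows this flags osd.49 as not up and osd.27 and osd.68 as the fullest, which matches what can be read off the table by eye.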

cat /etc/ceph/ceph.conf

Code:
root@PVE2:~# cat /etc/ceph/ceph.conf
[global]
         auth_client_required = cephx
         auth_cluster_required = cephx
         auth_service_required = cephx
         cluster_network = 10.0.0.11/24
         fsid = 3b7897cf-7da1-4f33-85a0-d68d5a58ba2f
         mon_allow_pool_delete = true
         mon_host = 10.0.0.15 10.0.0.12 10.0.0.11 10.0.0.13 10.0.0.14
         ms_bind_ipv4 = true
         ms_bind_ipv6 = false
         osd_pool_default_min_size = 2
         osd_pool_default_size = 3
         public_network = 10.0.0.11/24

[client]
         keyring = /etc/pve/priv/$cluster.$name.keyring

[mds]
         keyring = /var/lib/ceph/mds/ceph-$id/keyring

[mds.PVE1]
         host = PVE1
         mds_standby_for_name = pve

[mds.PVE2]
         host = PVE2
         mds_standby_for_name = pve

[mds.PVE3]
         host = PVE3
         mds_standby_for_name = pve

[mds.PVE4]
         host = PVE4
         mds_standby_for_name = pve

[mds.PVE5]
         host = PVE5
         mds standby for name = pve

[mon.PVE1]
         public_addr = 10.0.0.11

[mon.PVE2]
         public_addr = 10.0.0.12

[mon.PVE3]
         public_addr = 10.0.0.13

[mon.PVE4]
         public_addr = 10.0.0.14

[mon.PVE5]
         public_addr = 10.0.0.15
 
