Reduced data availability: 1 pg inactive, 1 pg stale

jsterr
Renowned Member
Jul 24, 2020
I'm seeing this on my testing cluster:

Code:
Reduced data availability: 1 pg inactive, 1 pg stale
pg 1.0 is stuck stale for 29h, current state stale+undersized+degraded+peered, last acting [3]

How can I get rid of this error? I restarted osd.3 and also deleted and re-created it, but the error is still active.
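For reference, commands along these lines should show where the PG maps and why it is stuck (a rough sketch; pg 1.0 and osd.3 come from the health message above, adjust them for your own cluster):

Code:
# overall health plus details on the stuck PG
ceph health detail
ceph pg dump_stuck stale

# where pg 1.0 maps and what its peering state looks like
ceph pg map 1.0
ceph pg 1.0 query

# restart the single acting OSD on its node (systemd unit on Proxmox)
systemctl restart ceph-osd@3

The ceph.conf of the test cluster: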

Code:
[global]
     auth_client_required = cephx
     auth_cluster_required = cephx
     auth_service_required = cephx
     cluster_network = 10.100.100.231/24
     fsid = b0fabee0-fa12-403e-97a3-b3d6009479fa
     mon_allow_pool_delete = true
     mon_host = 10.100.100.231 10.100.100.233 10.100.100.232
     osd_pool_default_min_size = 2
     osd_pool_default_size = 3
     public_network = 10.100.100.231/24

[client]
     keyring = /etc/pve/priv/$cluster.$name.keyring

[mds]
     keyring = /var/lib/ceph/mds/ceph-$id/keyring

[mds.pve01]
     host = pve01
     mds_standby_for_name = pve

[mds.pve02]
     host = pve02
     mds_standby_for_name = pve

[mds.pve03]
     host = pve03
     mds_standby_for_name = pve

[mon.pve01]
     public_addr = 10.100.100.231

[mon.pve02]
     public_addr = 10.100.100.232

[mon.pve03]
     public_addr = 10.100.100.233
 
Also, in case the docs do not make this clear enough: I strongly suggest you do hardware testing on the drive(s) which held the malfunctioning PG(s).
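A minimal way to do that check, assuming smartmontools is installed and /dev/sdX stands in for the disk behind the affected OSD:

Code:
# SMART health summary and full attribute dump
smartctl -H /dev/sdX
smartctl -a /dev/sdX

# the kernel log often shows I/O errors from a failing drive
dmesg | grep -iE 'sdX|error'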
 
See Troubleshooting PGs for the steps to attempt recovery or delete that placement group.
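For a stale PG whose only copy is gone, the options usually boil down to something like this (a sketch; the last two commands discard whatever was in pg 1.0, so only run them once that data is written off):

Code:
# try a repair first
ceph pg repair 1.0

# give up objects that can no longer be recovered ...
ceph pg 1.0 mark_unfound_lost delete

# ... or recreate the PG as empty
ceph osd force-create-pg 1.0 --yes-i-really-mean-it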

Sorry, I read the complete document but could not find a way to delete the PG. I tried the repair etc., but nothing changed. Any tips?

Edit: Would still be nice to know how, though. I deleted all OSDs and recreated the pool, and now everything is fine (I needed to restart the ceph-mgr because of the inactive PGs).
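For anyone hitting the same thing, that sequence corresponds roughly to the following on a Proxmox node (a sketch; the pool name, device path and the mgr instance name pve01 are examples based on the config above):

Code:
# remove the pool and the OSDs it lived on (destroys the data on them)
pveceph pool destroy <poolname>
ceph osd out 3
systemctl stop ceph-osd@3
pveceph osd destroy 3 --cleanup

# recreate the OSDs and the pool
pveceph osd create /dev/sdX
pveceph pool create <poolname>

# the mgr may need a restart before the inactive-PG warning clears
systemctl restart ceph-mgr@pve01
ceph -s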
 
Last edited: