Hi,
Today there was an unexpected power outage at the datacenter where my servers are co-located; the entire facility went dark. Luckily I had fresh backups, so for the most part I could simply restore.
However, I have an issue with one OSD on one server: the OSD is stuck in "active+recovery_wait+degraded". I have tried repair etc., but nothing is helping, and Google wasn't my friend in this case. Does anyone have a suggestion on how to proceed? The drive does not seem to have been physically damaged by the power failure.
ceph health detail
HEALTH_ERR 1 osds down; 5681/115284 objects misplaced (4.928%); 61/38428 objects unfound (0.159%); 4 scrub errors; Possible data damage: 3 pgs inconsistent; Degraded data redundancy: 160/115284 objects degraded (0.139%), 53 pgs degraded; 3 stuck requests are blocked > 4096 sec. Implicated osds 1
OSD_DOWN 1 osds down
osd.11 (root=default,host=proxmox4) is down
OBJECT_MISPLACED 5681/115284 objects misplaced (4.928%)
OBJECT_UNFOUND 61/38428 objects unfound (0.159%)
pg 1.236 has 1 unfound objects
pg 1.232 has 1 unfound objects
pg 1.230 has 1 unfound objects
pg 1.223 has 1 unfound objects
pg 1.212 has 1 unfound objects
pg 1.20c has 1 unfound objects
pg 1.20b has 1 unfound objects
pg 1.1f7 has 1 unfound objects
pg 1.1f6 has 1 unfound objects
pg 1.1ef has 1 unfound objects
pg 1.1e0 has 1 unfound objects
pg 1.1d5 has 1 unfound objects
pg 1.1cb has 2 unfound objects
pg 1.1c0 has 1 unfound objects
pg 1.1b3 has 1 unfound objects
pg 1.1a7 has 2 unfound objects
pg 1.19a has 2 unfound objects
pg 1.c0 has 2 unfound objects
pg 1.b2 has 1 unfound objects
pg 1.ab has 3 unfound objects
pg 1.9c has 1 unfound objects
pg 1.9b has 1 unfound objects
pg 1.9a has 1 unfound objects
pg 1.90 has 1 unfound objects
pg 1.86 has 1 unfound objects
pg 1.84 has 1 unfound objects
pg 1.79 has 1 unfound objects
pg 1.76 has 1 unfound objects
pg 1.74 has 1 unfound objects
pg 1.6a has 1 unfound objects
pg 1.c has 1 unfound objects
pg 1.10 has 1 unfound objects
pg 1.4e has 1 unfound objects
pg 1.ca has 1 unfound objects
pg 1.dd has 1 unfound objects
pg 1.e5 has 1 unfound objects
pg 1.e9 has 1 unfound objects
pg 1.f8 has 1 unfound objects
pg 1.100 has 1 unfound objects
pg 1.10c has 1 unfound objects
pg 1.116 has 1 unfound objects
pg 1.11f has 1 unfound objects
pg 1.131 has 1 unfound objects
pg 1.14a has 1 unfound objects
pg 1.151 has 1 unfound objects
pg 1.15b has 1 unfound objects
pg 1.170 has 1 unfound objects
pg 1.17a has 2 unfound objects
pg 1.17c has 1 unfound objects
pg 1.17e has 1 unfound objects
pg 1.181 has 1 unfound objects
(additional pgs left out for brevity)
OSD_SCRUB_ERRORS 4 scrub errors
PG_DAMAGED Possible data damage: 3 pgs inconsistent
pg 1.dc is active+clean+inconsistent, acting [1,4,12]
pg 1.163 is active+clean+remapped+inconsistent, acting [3,14,8]
pg 1.1c2 is active+clean+remapped+inconsistent, acting [0,14,6]
PG_DEGRADED Degraded data redundancy: 160/115284 objects degraded (0.139%), 53 pgs degraded
pg 1.c is active+recovery_wait+degraded, acting [14,0,6], 1 unfound
pg 1.10 is active+recovery_wait+degraded, acting [4,9,14], 1 unfound
pg 1.4e is active+recovery_wait+degraded, acting [4,9,1], 1 unfound
pg 1.6a is active+recovery_wait+degraded, acting [3,8,1], 1 unfound
pg 1.74 is active+recovery_wait+degraded, acting [7,14,9], 1 unfound
pg 1.76 is active+recovery_wait+degraded, acting [3,1,14], 1 unfound
pg 1.79 is active+recovery_wait+degraded, acting [14,8,4], 1 unfound
pg 1.84 is active+recovery_wait+degraded, acting [8,5,1], 1 unfound
pg 1.86 is active+recovery_wait+degraded, acting [14,9,5], 1 unfound
pg 1.90 is active+recovery_wait+degraded, acting [8,1,5], 1 unfound
pg 1.9a is active+recovery_wait+degraded, acting [9,0,5], 1 unfound
pg 1.9b is active+recovery_wait+degraded, acting [8,4,1], 1 unfound
pg 1.9c is active+recovery_wait+degraded, acting [5,1,9], 1 unfound
pg 1.ab is active+recovery_wait+degraded, acting [5,9,14], 3 unfound
pg 1.b2 is active+recovery_wait+degraded, acting [8,4,14], 1 unfound
pg 1.c0 is active+recovery_wait+degraded, acting [9,3,14], 2 unfound
pg 1.ca is active+recovery_wait+degraded, acting [4,8,14], 1 unfound
pg 1.dd is active+recovery_wait+degraded, acting [3,8,1], 1 unfound
pg 1.e5 is active+recovery_wait+degraded, acting [8,3,1], 1 unfound
pg 1.e9 is active+recovery_wait+degraded, acting [14,7,9], 1 unfound
pg 1.f8 is active+recovery_wait+degraded, acting [9,3,0], 1 unfound
pg 1.100 is active+recovery_wait+degraded, acting [9,0,15], 1 unfound
pg 1.10c is active+recovery_wait+degraded+remapped, acting [1,4,9], 1 unfound
pg 1.116 is active+recovery_wait+degraded, acting [1,8,14], 1 unfound
pg 1.11f is active+recovery_wait+degraded, acting [5,9,1], 1 unfound
pg 1.131 is active+recovery_wait+degraded, acting [9,0,15], 1 unfound
pg 1.14a is active+recovery_wait+degraded, acting [1,7,8], 1 unfound
pg 1.151 is active+recovery_wait+degraded+remapped, acting [14,1,8], 1 unfound
pg 1.15b is active+recovery_wait+degraded, acting [14,1,4], 1 unfound
pg 1.170 is active+recovery_wait+degraded, acting [5,8,14], 1 unfound
pg 1.17a is active+recovery_wait+degraded+remapped, acting [14,4,1], 2 unfound
pg 1.17c is active+recovery_wait+degraded+remapped, acting [5,1,9], 1 unfound
pg 1.17e is active+recovery_wait+degraded+remapped, acting [7,0,9], 1 unfound
pg 1.181 is active+recovery_wait+degraded, acting [9,5,1], 1 unfound
pg 1.19a is active+recovery_wait+degraded+remapped, acting [4,14,8], 2 unfound
pg 1.1a7 is active+recovery_wait+degraded, acting [1,8,4], 2 unfound
pg 1.1b3 is active+recovery_wait+degraded, acting [9,7,14], 1 unfound
pg 1.1c0 is active+recovery_wait+degraded, acting [14,4,9], 1 unfound
pg 1.1cb is active+recovery_wait+degraded, acting [8,14,1], 2 unfound
pg 1.1d5 is active+recovery_wait+degraded, acting [9,0,15], 1 unfound
pg 1.1e0 is active+recovery_wait+degraded+remapped, acting [4,1,15], 1 unfound
pg 1.1ef is active+recovery_wait+degraded+remapped, acting [15,0,7], 1 unfound
pg 1.1f6 is active+recovery_wait+degraded, acting [8,4,14], 1 unfound
pg 1.1f7 is active+recovery_wait+degraded, acting [3,1,9], 1 unfound
pg 1.20b is active+recovery_wait+degraded+remapped, acting [3,14,1], 1 unfound
pg 1.20c is active+recovery_wait+degraded, acting [1,8,3], 1 unfound
pg 1.212 is active+recovery_wait+degraded, acting [1,9,3], 1 unfound
pg 1.223 is active+recovery_wait+degraded, acting [3,9,1], 1 unfound
pg 1.230 is active+recovery_wait+degraded, acting [4,9,1], 1 unfound
pg 1.232 is active+recovery_wait+degraded, acting [14,8,7], 1 unfound
pg 1.236 is active+recovery_wait+degraded, acting [1,3,14], 1 unfound
REQUEST_STUCK 3 stuck requests are blocked > 4096 sec. Implicated osds 1
3 ops are blocked > 33554.4 sec
osd.1 has stuck requests > 33554.4 sec
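To make the PG listing above easier to read, here is a rough sketch that tallies how often each OSD appears in the acting set of a PG that reports unfound objects. It assumes the output keeps exactly the line format pasted above; the helper names `parse_pg_lines` and `osd_tally` are made up for illustration and are not part of any Ceph tooling, and the sample is just a few lines copied from the listing.

```python
import re
from collections import Counter

# Matches lines such as:
#   pg 1.c is active+recovery_wait+degraded, acting [14,0,6], 1 unfound
# The trailing ", N unfound" part is optional (inconsistent PGs lack it).
PG_LINE = re.compile(
    r"pg (?P<pgid>\S+) is (?P<state>[\w+]+), "
    r"acting \[(?P<acting>[\d,]+)\](?:, (?P<unfound>\d+) unfound)?"
)


def parse_pg_lines(text):
    """Return a list of (pgid, states, acting_osds, unfound) tuples."""
    rows = []
    for line in text.splitlines():
        m = PG_LINE.search(line)
        if not m:
            continue  # skip headers and anything in another format
        acting = [int(x) for x in m.group("acting").split(",")]
        unfound = int(m.group("unfound") or 0)
        states = m.group("state").split("+")
        rows.append((m.group("pgid"), states, acting, unfound))
    return rows


def osd_tally(rows):
    """Count, per OSD, the PGs with unfound objects it is acting for."""
    counts = Counter()
    for _pgid, _states, acting, unfound in rows:
        if unfound:
            counts.update(acting)
    return counts


# A few lines copied from the output above (hypothetical subset).
SAMPLE = """\
pg 1.dc is active+clean+inconsistent, acting [1,4,12]
pg 1.c is active+recovery_wait+degraded, acting [14,0,6], 1 unfound
pg 1.10 is active+recovery_wait+degraded, acting [4,9,14], 1 unfound
pg 1.ab is active+recovery_wait+degraded, acting [5,9,14], 3 unfound
"""

rows = parse_pg_lines(SAMPLE)
tally = osd_tally(rows)
for osd, n in tally.most_common():
    print(f"osd.{osd}: {n} PGs with unfound objects")
```

Feeding it the full `ceph health detail` text instead of `SAMPLE` would give the same tally for the whole cluster. The count only shows which OSDs serve the affected PGs, not which one is at fault.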
Br/Joel