Hi Guys,
Long shot, but we have an old 2/1 pool on our hyperconverged Proxmox install and have lost an OSD. I now have 3 stale PGs which, funnily enough, show as being on this OSD.
Is there any way I can try to recover the data from the failed disk and import it back in? The disk still shows up in the node, but the OSD never starts normally and fails with an input/output error. I've put a rough sketch of what I was thinking of trying after the health output below.
ceph -s:
root@vms-ceph121:/home/richard.admin# ceph -s
cluster:
id: 93cc6f61-f2dd-4346-8813-71de1fa1c221
health: HEALTH_WARN
noout,nobackfill,norecover flag(s) set
2 osds down
1 host (1 osds) down
2 nearfull osd(s)
Reduced data availability: 3 pgs stale
Degraded data redundancy: 318198/47393185 objects degraded (0.671%), 555 pgs degraded, 214 pgs undersized
2 pool(s) nearfull
768 slow ops, oldest one blocked for 15390 sec, daemons [osd.1701,osd.1706,osd.509] have slow ops.
services:
mon: 9 daemons, quorum vms-ceph110,vms-ceph112,vms-ceph113,vms-ceph114,vms-ceph117,vms-ceph120,vms-ceph119,vms-ceph106,vms-ceph121 (age 29m)
mgr: vms-ceph113(active, since 7w), standbys: vms-ceph106, vms-ceph110, vms-ceph114, vms-ceph117, vms-ceph119, vms-ceph120, vms-ceph121, vms-ceph112
osd: 107 osds: 104 up (since 28m), 106 in (since 88m); 235 remapped pgs
flags noout,nobackfill,norecover
data:
pools: 5 pools, 2625 pgs
objects: 16.84M objects, 63 TiB
usage: 179 TiB used, 130 TiB / 309 TiB avail
pgs: 318198/47393185 objects degraded (0.671%)
179854/47393185 objects misplaced (0.379%)
1900 active+clean
321 active+recovery_wait+degraded
169 active+recovery_wait+undersized+degraded+remapped
99 active+recovery_wait
60 active+clean+remapped
37 active+undersized+degraded
19 active+recovering+degraded
11 active+recovering
5 active+recovering+undersized+degraded+remapped
3 stale+active+undersized+degraded
1 active+recovering+degraded+remapped
io:
client: 59 MiB/s rd, 26 MiB/s wr, 698 op/s rd, 548 op/s wr
recovery: 679 B/s, 0 objects/s
progress:
Global Recovery Event (5h)
[====================........] (remaining: 112m)
root@vms-ceph121:/home/richard.admin# ceph health detail
HEALTH_WARN noout,nobackfill,norecover flag(s) set; 2 osds down; 1 host (1 osds) down; 2 nearfull osd(s); Reduced data availability: 3 pgs stale; Degraded data redundancy: 318175/47393245 objects degraded (0.671%), 542 pgs degraded, 214 pgs undersized; 2 pool(s) nearfull; 768 slow ops, oldest one blocked for 15441 sec, daemons [osd.1701,osd.1706,osd.509] have slow ops.
[WRN] OSDMAP_FLAGS: noout,nobackfill,norecover flag(s) set
[WRN] OSD_DOWN: 2 osds down
osd.1203 (root=youtrack-general-root,host=vms-ceph112_youtrack-general_rows) is down
osd.2112 (root=ha-ssd-root,host=vms-ceph121_ha-ssd) is down
[WRN] OSD_HOST_DOWN: 1 host (1 osds) down
host vms-ceph121_ha-ssd (root=ha-ssd-root) (1 osds) is down
[WRN] OSD_NEARFULL: 2 nearfull osd(s)
osd.1406 is near full
osd.2005 is near full
[WRN] PG_AVAILABILITY: Reduced data availability: 3 pgs stale
pg 8.11 is stuck stale for 5h, current state stale+active+undersized+degraded, last acting [1203]
pg 8.1e0 is stuck stale for 5h, current state stale+active+undersized+degraded, last acting [1203]
pg 8.351 is stuck stale for 5h, current state stale+active+undersized+degraded, last acting [1203]
[WRN] PG_DEGRADED: Degraded data redundancy: 318175/47393245 objects degraded (0.671%), 542 pgs degraded, 214 pgs undersized
pg 58.380 is active+recovery_wait+degraded, acting [1202,704,604]
pg 58.386 is stuck undersized for 28m, current state active+recovery_wait+undersized+degraded+remapped, last acting [1601,1904]
pg 58.388 is active+recovery_wait+degraded, acting [1208,1501,1901]
pg 58.393 is stuck undersized for 57m, current state active+recovery_wait+undersized+degraded+remapped, last acting [1007,1107]
pg 58.394 is active+recovery_wait+degraded, acting [1208,801,1604]
pg 58.395 is active+recovery_wait+degraded, acting [1201,801,1104]
pg 58.397 is stuck undersized for 28m, current state active+recovery_wait+undersized+degraded+remapped, last acting [801,604]
pg 58.398 is active+recovery_wait+degraded, acting [1201,704,1708]
pg 58.39c is active+recovery_wait+degraded, acting [1201,1607,1708]
pg 58.39d is stuck undersized for 28m, current state active+recovery_wait+undersized+degraded+remapped, last acting [607,1104]
pg 58.3a1 is active+recovery_wait+degraded, acting [1904,1205,601]
pg 58.3a4 is stuck undersized for 28m, current state active+recovery_wait+undersized+degraded+remapped, last acting [1604,2101]
pg 58.3a5 is stuck undersized for 57m, current state active+recovery_wait+undersized+degraded+remapped, last acting [1901,1007]
pg 58.3a6 is stuck undersized for 57m, current state active+recovery_wait+undersized+degraded+remapped, last acting [804,1607]
pg 58.3a9 is active+recovery_wait+degraded, acting [601,401,1208]
pg 58.3ad is active+recovery_wait+degraded, acting [2107,1202,1001]
pg 58.3ae is active+recovery_wait+degraded, acting [1104,1708,1204]
pg 58.3af is active+recovery_wait+degraded, acting [2104,1702,1004]
pg 58.3c0 is active+recovery_wait+degraded, acting [2104,1202,407]
pg 58.3c1 is stuck undersized for 28m, current state active+recovery_wait+undersized+degraded+remapped, last acting [707,1705]
pg 58.3c2 is active+recovery_wait+degraded, acting [1204,404,1708]
pg 58.3c3 is active+recovery_wait+degraded, acting [1101,807,1208]
pg 58.3c7 is active+recovery_wait+degraded, acting [1004,1507,1207]
pg 58.3c8 is stuck undersized for 28m, current state active+recovery_wait+undersized+degraded+remapped, last acting [1504,2101]
pg 58.3c9 is stuck undersized for 57m, current state active+recovery_wait+undersized+degraded+remapped, last acting [1702,1504]
pg 58.3ca is stuck undersized for 28m, current state active+recovery_wait+undersized+degraded+remapped, last acting [1604,804]
pg 58.3cb is active+recovery_wait+degraded, acting [804,1901,1205]
pg 58.3cc is stuck undersized for 28m, current state active+recovery_wait+undersized+degraded+remapped, last acting [1507,1607]
pg 58.3cd is active+recovery_wait+degraded, acting [1208,1004,1104]
pg 58.3ce is stuck undersized for 28m, current state active+recovery_wait+undersized+degraded+remapped, last acting [1507,1104]
pg 58.3cf is stuck undersized for 28m, current state active+recovery_wait+undersized+degraded+remapped, last acting [804,1708]
pg 58.3d2 is active+recovery_wait+degraded, acting [2107,1702,1107]
pg 58.3d6 is active+recovery_wait+degraded, acting [404,1208,2104]
pg 58.3d8 is active+recovery_wait+degraded, acting [2104,804,707]
pg 58.3db is stuck undersized for 28m, current state active+recovery_wait+undersized+degraded+remapped, last acting [1705,1601]
pg 58.3de is stuck undersized for 28m, current state active+recovery_wait+undersized+degraded+remapped, last acting [701,1501]
pg 58.3df is active+recovery_wait+degraded, acting [1504,2104,607]
pg 58.3e0 is active+recovery_wait+degraded, acting [701,1204,1904]
pg 58.3e7 is active+recovery_wait+degraded, acting [1607,1202,1901]
pg 58.3e9 is active+recovery_wait+degraded, acting [1208,1104,1507]
pg 58.3eb is active+recovery_wait+degraded, acting [607,401,1205]
pg 58.3ee is active+recovery_wait+degraded, acting [1705,807,1201]
pg 58.3f1 is active+recovery_wait+degraded, acting [1201,1507,1107]
pg 58.3f3 is stuck undersized for 57m, current state active+recovery_wait+undersized+degraded+remapped, last acting [801,1901]
pg 58.3f5 is stuck undersized for 28m, current state active+recovery_wait+undersized+degraded+remapped, last acting [704,2101]
pg 58.3f6 is stuck undersized for 28m, current state active+recovery_wait+undersized+degraded+remapped, last acting [401,607]
pg 58.3f8 is stuck undersized for 57m, current state active+recovery_wait+undersized+degraded+remapped, last acting [1507,701]
pg 58.3f9 is active+recovery_wait+degraded, acting [1205,607,1901]
pg 58.3fa is active+recovery_wait+degraded, acting [1202,1104,707]
pg 58.3fd is active+recovery_wait+degraded, acting [1208,1705,404]
pg 58.3ff is active+recovery_wait+degraded, acting [1705,807,1207]
[WRN] POOL_NEARFULL: 2 pool(s) nearfull
pool 'youtrack-general-pool' is nearfull
pool 'device_health_metrics' is nearfull
[WRN] SLOW_OPS: 768 slow ops, oldest one blocked for 15441 sec, daemons [osd.1701,osd.1706,osd.509] have slow ops.
root@vms-ceph121:/home/richard.admin#
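What I was hoping to attempt, if the disk is readable at all, is roughly the following (untested, so please correct me; the device names, target OSD and file paths are only placeholders, the data path is just the usual /var/lib/ceph/osd/... location, and the PG ID is one of the three stale ones from the health detail above):

# 1) clone the failing disk block-for-block first, so everything else runs against the copy
ddrescue -d /dev/sdX /dev/sdY /root/osd1203-rescue.map

# 2) with the dead OSD's daemon stopped, list the PGs still on it and export the stale ones
#    (pointing --data-path at the OSD directory backed by the rescued copy)
systemctl stop ceph-osd@1203
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-1203 --op list-pgs
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-1203 --pgid 8.11 --op export --file /root/pg-8.11.export

# 3) import each export onto a healthy (stopped) OSD in the same pool, then start it again
systemctl stop ceph-osd@<target-osd>
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<target-osd> --op import --file /root/pg-8.11.export
systemctl start ceph-osd@<target-osd>

Does that sound like the right direction, or is there a better way to get those three PGs back?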