Hello Experts,
I recently set up a new Proxmox 6 environment with Ceph. I removed 3 OSDs from one of the nodes, and since then Ceph has stopped recovering even though it reports Degraded Data Redundancy. I can't figure out why it won't recover.
~# pveversion
pve-manager/6.1-8/806edfe1 (running kernel: 5.3.18-2-pve)
root@dellr730-1:~#
root@dellr730-1:~# ceph -v
ceph version 14.2.8 (0f8245b6b446e041c5cdc40ec82d4d2263c2895b) nautilus (stable)
root@dellr730-1:~#
I have tried ceph pg repair, restarting the OSDs, and scrubbing; nothing works. Here are the relevant outputs.
Code:
# ceph -w
  cluster:
    id:     01bcf2d5-6e96-4d50-81ec-cd6bb55c500e
    health: HEALTH_WARN
            Degraded data redundancy: 440934/1611642 objects degraded (27.359%), 420 pgs degraded, 420 pgs undersized
  services:
    mon: 3 daemons, quorum dellr730-1,dellr730-2,hp1 (age 7h)
    mgr: dellr730-1(active, since 7h)
    osd: 14 osds: 14 up (since 5m), 14 in (since 46m); 92 remapped pgs
  data:
    pools:   1 pools, 512 pgs
    objects: 537.21k objects, 2.0 TiB
    usage:   4.5 TiB used, 8.2 TiB / 13 TiB avail
    pgs:     440934/1611642 objects degraded (27.359%)
             96280/1611642 objects misplaced (5.974%)
             420 active+undersized+degraded
             92  active+clean+remapped
  io:
    client:   3.7 MiB/s rd, 5.9 MiB/s wr, 74 op/s rd, 135 op/s wr
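A quick sanity check on the numbers in the status output above, written as a minimal Python sketch. It assumes the denominator of the "objects degraded" ratio counts object copies (objects multiplied by the pool's replica count), so dividing it by the object count hints at how many copies the pool wants. The variable names are illustrative only, and the object count is the rounded "537.21k" that ceph prints.

```python
# Figures copied from the status output in this post.
object_copies = 1_611_642    # denominator of "objects degraded"
degraded_copies = 440_934    # numerator of "objects degraded"
objects_approx = 537_210     # "537.21k objects" (rounded by ceph)

# Assumption: object_copies = objects * pool size.
copies_per_object = object_copies / objects_approx
implied_size = round(copies_per_object)

# Recompute the percentage ceph reports, as a cross-check.
degraded_pct = 100 * degraded_copies / object_copies

print(f"copies per object: {copies_per_object:.4f} (implied size {implied_size})")
print(f"degraded: {degraded_pct:.3f}%")
```

If that assumption holds, the pool is asking for 3 copies of each object, which is worth comparing with the number of OSDs in each acting set reported by ceph health detail.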
Code:
# ceph osd crush tree
ID  CLASS WEIGHT   TYPE NAME
 -1       12.73599 root default
-10        6.36800     host dellr730-1
 10   ssd  0.90999         osd.10
 11   ssd  0.90999         osd.11
 12   ssd  0.90999         osd.12
 13   ssd  0.90999         osd.13
 14   ssd  0.90999         osd.14
 15   ssd  0.90999         osd.15
 16   ssd  0.90999         osd.16
 -7        6.36800     host dellr730-2
  3   ssd  0.90999         osd.3
  4   ssd  0.90999         osd.4
  5   ssd  0.90999         osd.5
  6   ssd  0.90999         osd.6
  7   ssd  0.90999         osd.7
  8   ssd  0.90999         osd.8
  9   ssd  0.90999         osd.9
 -3              0     host hp1
root@dellr730-1:~#
Code:
# ceph health detail
HEALTH_WARN Degraded data redundancy: 440934/1611642 objects degraded (27.359%), 420 pgs degraded, 420 pgs undersized
PG_DEGRADED Degraded data redundancy: 440934/1611642 objects degraded (27.359%), 420 pgs degraded, 420 pgs undersized
pg 1.1bb is active+undersized+degraded, acting [14,6]
pg 1.1bc is stuck undersized for 6352.642140, current state active+undersized+degraded, last acting [12,5]
pg 1.1bd is stuck undersized for 918.274133, current state active+undersized+degraded, last acting [4,11]
pg 1.1be is stuck undersized for 1081.936833, current state active+undersized+degraded, last acting [8,10]
pg 1.1bf is stuck undersized for 6356.685735, current state active+undersized+degraded, last acting [7,14]
pg 1.1c0 is stuck undersized for 918.276848, current state active+undersized+degraded, last acting [7,11]
pg 1.1c2 is stuck undersized for 6356.699629, current state active+undersized+degraded, last acting [8,15]
pg 1.1c3 is stuck undersized for 6360.756731, current state active+undersized+degraded, last acting [3,14]
pg 1.1c4 is stuck undersized for 6356.702880, current state active+undersized+degraded, last acting [4,13]
pg 1.1c5 is stuck undersized for 6356.699650, current state active+undersized+degraded, last acting [5,13]
pg 1.1c6 is stuck undersized for 6356.698034, current state active+undersized+degraded, last acting [8,14]
pg 1.1c8 is stuck undersized for 1081.938997, current state active+undersized+degraded, last acting [5,10]
pg 1.1c9 is stuck undersized for 6356.702559, current state active+undersized+degraded, last acting [4,16]
pg 1.1ca is stuck undersized for 6360.764538, current state active+undersized+degraded, last acting [13,4]
pg 1.1cb is stuck undersized for 1081.928116, current state active+undersized+degraded, last acting [10,8]
pg 1.1cd is stuck undersized for 6352.641605, current state active+undersized+degraded, last acting [12,6]
pg 1.1cf is stuck undersized for 6356.699738, current state active+undersized+degraded, last acting [5,16]
pg 1.1d1 is stuck undersized for 6356.701899, current state active+undersized+degraded, last acting [16,9]
pg 1.1d5 is stuck undersized for 918.277097, current state active+undersized+degraded, last acting [9,11]
pg 1.1d6 is stuck undersized for 918.263862, current state active+undersized+degraded, last acting [11,9]
pg 1.1d7 is stuck undersized for 918.265974, current state active+undersized+degraded, last acting [11,3]
pg 1.1d8 is stuck undersized for 918.259707, current state active+undersized+degraded, last acting [11,6]
pg 1.1d9 is stuck undersized for 6360.765421, current state active+undersized+degraded, last acting [13,7]
pg 1.1da is stuck undersized for 918.277137, current state active+undersized+degraded, last acting [9,11]
pg 1.1dd is stuck undersized for 6352.625143, current state active+undersized+degraded, last acting [14,8]
pg 1.1de is stuck undersized for 918.266185, current state active+undersized+degraded, last acting [11,4]
pg 1.1df is stuck undersized for 6352.644790, current state active+undersized+degraded, last acting [16,9]
pg 1.1e1 is stuck undersized for 932.058143, current state active+undersized+degraded, last acting [9,14]
pg 1.1e2 is stuck undersized for 1081.926149, current state active+undersized+degraded, last acting [10,7]
pg 1.1e7 is stuck undersized for 1081.935331, current state active+undersized+degraded, last acting [4,10]
pg 1.1e8 is stuck undersized for 918.262368, current state active+undersized+degraded, last acting [11,7]
pg 1.1e9 is stuck undersized for 6360.749248, current state active+undersized+degraded, last acting [14,5]
pg 1.1ea is stuck undersized for 6356.700886, current state active+undersized+degraded, last acting [8,13]
pg 1.1eb is stuck undersized for 6352.644904, current state active+undersized+degraded, last acting [6,16]
pg 1.1ec is stuck undersized for 6360.753516, current state active+undersized+degraded, last acting [8,14]
pg 1.1ed is stuck undersized for 6356.701374, current state active+undersized+degraded, last acting [8,12]
pg 1.1ee is stuck undersized for 6360.755001, current state active+undersized+degraded, last acting [8,16]
pg 1.1ef is stuck undersized for 6352.640852, current state active+undersized+degraded, last acting [12,5]
pg 1.1f0 is stuck undersized for 6356.686830, current state active+undersized+degraded, last acting [14,5]
pg 1.1f2 is stuck undersized for 1098.912748, current state active+undersized+degraded, last acting [13,3]
pg 1.1f3 is stuck undersized for 6352.621163, current state active+undersized+degraded, last acting [14,9]
pg 1.1f4 is stuck undersized for 6356.700767, current state active+undersized+degraded, last acting [16,7]
pg 1.1f5 is stuck undersized for 6360.753729, current state active+undersized+degraded, last acting [8,12]
pg 1.1f7 is stuck undersized for 918.273982, current state active+undersized+degraded, last acting [4,11]
pg 1.1f9 is stuck undersized for 6356.701051, current state active+undersized+degraded, last acting [12,4]
pg 1.1fa is stuck undersized for 6356.701741, current state active+undersized+degraded, last acting [16,7]
pg 1.1fb is stuck undersized for 6360.756217, current state active+undersized+degraded, last acting [15,6]
pg 1.1fc is stuck undersized for 6356.704673, current state active+undersized+degraded, last acting [13,3]
pg 1.1fd is stuck undersized for 6360.755293, current state active+undersized+degraded, last acting [16,9]
pg 1.1fe is stuck undersized for 6360.759098, current state active+undersized+degraded, last acting [9,15]
pg 1.1ff is stuck undersized for 6352.646041, current state active+undersized+degraded, last acting [13,7]
root@dellr730-1:~#
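To see where the surviving copies live, the acting sets from ceph health detail can be mapped onto the hosts listed by ceph osd crush tree. The following is a minimal Python sketch, not a Ceph tool: OSD_HOST, SAMPLE and acting_hosts are hypothetical names, the host mapping is typed in by hand from the crush tree in this post, and only three of the PG lines are included to keep it short.

```python
import re

# Host membership copied by hand from the `ceph osd crush tree` output in this post.
OSD_HOST = {
    **{osd: "dellr730-2" for osd in range(3, 10)},    # osd.3 .. osd.9
    **{osd: "dellr730-1" for osd in range(10, 17)},   # osd.10 .. osd.16
}

# Three lines copied from `ceph health detail`; paste more to cover other PGs.
SAMPLE = """\
pg 1.1bb is active+undersized+degraded, acting [14,6]
pg 1.1bc is stuck undersized for 6352.642140, current state active+undersized+degraded, last acting [12,5]
pg 1.1bd is stuck undersized for 918.274133, current state active+undersized+degraded, last acting [4,11]
"""

# Matches both the "acting [..]" and "last acting [..]" line variants.
PG_LINE = re.compile(r"^pg (\S+) .*acting \[([\d,]+)\]$")


def acting_hosts(text):
    """Return {pg_id: sorted list of hosts holding its acting OSDs}."""
    result = {}
    for line in text.splitlines():
        match = PG_LINE.match(line.strip())
        if match:
            osds = [int(x) for x in match.group(2).split(",")]
            result[match.group(1)] = sorted({OSD_HOST[osd] for osd in osds})
    return result


summary = acting_hosts(SAMPLE)
for pg, hosts in summary.items():
    print(pg, hosts)
```

In these samples every PG has one OSD on dellr730-1 and one on dellr730-2, and none on hp1, which carries weight 0 in the crush tree. One possible reading, assuming the pool's CRUSH rule places each replica on a different host and the pool wants more than two copies, is that there is no third host with OSDs to take the extra copy. Checking the pool's size and its CRUSH rule would confirm or rule this out.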
Code:
# ceph osd crush tree --show-shadow
ID  CLASS WEIGHT   TYPE NAME
-11  ssd2        0 root default~ssd2
 -8  ssd2        0     host dellr730-1~ssd2
 -4  ssd2        0     host dellr730-2~ssd2
 -2  ssd2        0     host hp1~ssd2
 -6   ssd 12.73984 root default~ssd
-12   ssd  6.36992     host dellr730-1~ssd
 10   ssd  0.90999         osd.10
 11   ssd  0.90999         osd.11
 12   ssd  0.90999         osd.12
 13   ssd  0.90999         osd.13
 14   ssd  0.90999         osd.14
 15   ssd  0.90999         osd.15
 16   ssd  0.90999         osd.16
 -9   ssd  6.36992     host dellr730-2~ssd
  3   ssd  0.90999         osd.3
  4   ssd  0.90999         osd.4
  5   ssd  0.90999         osd.5
  6   ssd  0.90999         osd.6
  7   ssd  0.90999         osd.7
  8   ssd  0.90999         osd.8
  9   ssd  0.90999         osd.9
 -5   ssd        0     host hp1~ssd
 -1       12.73599 root default
-10        6.36800     host dellr730-1
 10   ssd  0.90999         osd.10
 11   ssd  0.90999         osd.11
 12   ssd  0.90999         osd.12
 13   ssd  0.90999         osd.13
 14   ssd  0.90999         osd.14
 15   ssd  0.90999         osd.15
 16   ssd  0.90999         osd.16
 -7        6.36800     host dellr730-2
  3   ssd  0.90999         osd.3
  4   ssd  0.90999         osd.4
  5   ssd  0.90999         osd.5
  6   ssd  0.90999         osd.6
  7   ssd  0.90999         osd.7
  8   ssd  0.90999         osd.8
  9   ssd  0.90999         osd.9
 -3              0     host hp1
root@dellr730-1:~#