Greetings,
Our Proxmox cluster uses a Ceph pool replicated across 3 OSDs on 3 servers. Since a power failure, however, one of the OSDs appears to refuse to sync at all, and every object on it shows as degraded. In fact, I freed some space on the cluster by deleting an old, no-longer-used VM, yet that OSD still reports its old usage (~85.5%), while the other two have dropped to the correct new level (~63%, both as seen in the Ceph OSD tab).
The OS reports the OSD as mounted read-write (via the mount command), and the server has in fact been rebooted normally a few times since.
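In case it helps with diagnosis, these are the kinds of commands I can run on the affected node (I'm using osd.2 below purely as a placeholder for the stuck OSD; output omitted):

# confirm the OSD data directory is mounted read-write
mount | grep ceph-2
# per-OSD utilisation, weight and placement
ceph osd df
ceph osd tree
# list the stuck placement groups and their up/acting sets
ceph pg dump_stuck unclean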
The cluster status report is:
cluster 5a36c253-d38a-4e7e-bbb2-929f58639662
health HEALTH_WARN
64 pgs backfill_toofull
64 pgs degraded
64 pgs stuck degraded
64 pgs stuck unclean
64 pgs stuck undersized
64 pgs undersized
recovery 149738/449178 objects degraded (33.336%)
recovery 149726/449178 objects misplaced (33.333%)
1 near full osd(s)
monmap e3: 3 mons at {0=10.2.2.242:6789/0,1=10.2.2.240:6789/0,2=10.2.2.243:6789/0}
election epoch 488, quorum 0,1,2 1,0,2
osdmap e394: 3 osds: 3 up, 3 in; 64 remapped pgs
pgmap v32555208: 64 pgs, 1 pools, 575 GB data, 146 kobjects
1959 GB used, 818 GB / 2778 GB avail
149738/449178 objects degraded (33.336%)
149726/449178 objects misplaced (33.333%)
64 active+undersized+degraded+remapped+backfill_toofull
client io 274 kB/s rd, 625 kB/s wr, 153 op/s
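If I am reading the status right, the backfill_toofull flags line up with the stuck OSD sitting at ~85.5%, which is right at the default osd_backfill_full_ratio (and nearfull ratio) of 0.85, so backfill onto it would never start. I can double-check the thresholds the daemon is actually using with something like this on the OSD's host (again, osd.2 is only a placeholder):

# show the running full/backfill thresholds for this OSD
ceph daemon osd.2 config show | grep -E 'backfill_full|nearfull|full_ratio'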
The Ceph versions are:
osd.0: {
"version": "ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)"
}
osd.1: {
"version": "ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)"
}
osd.2: {
"version": "ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)"
}
How can we restore this node?
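One thing we have considered, but not yet tried, is temporarily raising the backfill threshold so the remapped PGs can backfill onto the near-full OSD, along these lines (0.92 is just an illustrative value, not a recommendation):

# temporarily allow backfill onto the near-full OSD; revert once recovery finishes
ceph tell osd.* injectargs '--osd-backfill-full-ratio 0.92'

Is that a sane approach here, or is there something else wrong with this OSD that we should address first?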