I have a 2-node Ceph cluster backing a pool.
The pool is configured with 2 replicas (size=2).
I took one node down for maintenance while the other node was still serving.
Later, a power outage caused that second node to restart. The problem is that it had RAID-card write-back caching enabled, and the cache battery had degraded, so the cache contents were lost.
The node came back up once we cleared the write-back cache, but at that point the pool was no longer functional at all, since some data had been lost along with the cache.
We then started the first node, the one we had taken down for maintenance. When it came up, a few PGs became recovery_unfound, so I later ran ceph pg <pgid> mark_unfound_lost delete on them. This cleared the errors, but all the PGs then went into the down state.
Apart from this, only the OSDs of one node stay up at a time: whenever I try to start the OSDs on one node, the OSDs on the other node go down automatically.
Now, I am fine with losing any data written after the point where we shut the first node down for maintenance.
All I am hoping for is to bring the pool up so that I can at least recover the data still present on that first node.
I have tried mounting the Ceph BlueStore OSDs via FUSE and recovering data from the PG contents that are present, but still no luck, as I have no idea where to find the RBD image hash/id.
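For what it's worth, RBD (format 2) data objects follow a predictable naming scheme, which may help when browsing a BlueStore OSD mounted via FUSE: each image has an id, visible as the block_name_prefix in rbd info <pool>/<image> (or readable from the rbd_id.<image-name> object in the pool via rados get), and its data lives in objects named rbd_data.<id>.<16-hex-digit object number>. A minimal sketch of that naming, with a made-up example id:

```python
# Sketch of how RBD data object names are derived, to help locate image
# data among PG contents. The image id below is a HYPOTHETICAL example;
# on a real cluster it comes from the block_name_prefix shown by
# `rbd info <pool>/<image>`.

def rbd_data_object(image_id: str, object_no: int) -> str:
    """Name of the Nth data object of a format-2 RBD image."""
    return f"rbd_data.{image_id}.{object_no:016x}"

def object_for_offset(offset_bytes: int, object_size: int = 4 << 20) -> int:
    """Which data object holds a given byte offset (default 4 MiB objects)."""
    return offset_bytes // object_size

image_id = "86e2e7238e1f29"  # hypothetical example id

print(rbd_data_object(image_id, 0))                          # first data object
print(rbd_data_object(image_id, object_for_offset(10 << 20)))  # object holding byte 10 MiB
```

So once the image id is known, the image's objects can be enumerated in order and their contents concatenated (accounting for sparse, i.e. missing, objects) to rebuild the image.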
Any help would be greatly appreciated.