Hi Everyone,
I'm in a bit of a situation here.
We identified a failing drive (still running) and decided we needed to remove it. We followed the instructions below, believing it would work without a hitch and that all our containers/VMs would continue to run. Unfortunately, that was not the case.
https://ceph.com/geen-categorie/admin-guide-replacing-a-failed-disk-in-a-ceph-cluster/
I suspect we went wrong because we didn't mark the drive 'down' first; we went straight for 'out'. Now we have PGs stuck in 'activating+remapped'.
How can I get all this recovered? It's on a production server. And where did we go wrong?
Any help is very much appreciated!