VM/Containers not running after removing bad OSD

dburleson

New Member
Jun 6, 2018
Hi Everyone,

I'm in a bit of a situation here.

We identified a bad drive (still running, but failing) and decided we needed to remove it. We followed the instructions below, believing the replacement would go off without a hitch and all our containers/VMs would keep running. Unfortunately, that was not the case.

https://ceph.com/geen-categorie/admin-guide-replacing-a-failed-disk-in-a-ceph-cluster/

I suspect we went wrong because we didn't mark the drive 'down' first; we went straight for 'out'. Now we are in a situation where we have PGs stuck in 'activating+remapped'.
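
For reference, this is roughly the sequence we ran, reconstructed from the guide above (with <ID> standing in for the OSD number, so treat it as a rough reconstruction rather than an exact transcript):

# we never ran 'ceph osd down <ID>' or stopped the ceph-osd daemon before this
ceph osd out <ID>
ceph osd crush remove osd.<ID>
ceph auth del osd.<ID>
ceph osd rm <ID>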

How can I recover from this? It's a production server, and where did we go wrong?
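
If it helps with diagnosing, I assume the useful starting point is output from something like the commands below, and I'm happy to post whatever is needed:

ceph -s
ceph health detail
ceph pg dump_stuck inactive
ceph osd tree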

Any help would be very much appreciated!