Hi,
I have a small three node cluster. There are two pools with three OSDs each. Each node hosts one OSD from each pool (one HDD, one SSD). Replication rule is 3/2.
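(By 3/2 I mean size=3 and min_size=2 on both pools, i.e. roughly the equivalent of the following, with <pool-name> standing in for my actual pool names:

  ceph osd pool set <pool-name> size 3
  ceph osd pool set <pool-name> min_size 2
)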
When one of the OSDs in one of the nodes started acting up, I decided not just to replace the OSD but to replace the entire node (for other reasons).
So I set up a new node with identical OSDs and added it to the cluster. Ceph started copying PGs to the new node. After a while I shut down the old node, so that the cluster again had three running nodes. Ceph continued copying (now rebalancing/backfilling, I think) PGs to the new node.
But then it suddenly stopped and now just complains about one node with two OSDs being down. It reports a number of PGs as undersized but doesn't do anything about them.
Any ideas why? Can I force Ceph to continue rebalancing?
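In case it's relevant, these are the kinds of status commands I'm looking at (standard Ceph CLI, output omitted here):

  ceph -s
  ceph health detail
  ceph osd tree
  ceph pg dump_stuck undersized

The last one is where I see the undersized PGs listed.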
Thanks!