The way I read it, it is to take that node (which has some issues of high load, affecting node1+2) out of the equation while he goes and fixes/reinstalls Node3.
The high load may be the result of all the rebalancing Ceph is trying to do. Eric's original post says the cluster lost 33% of its disks, but we do not know what caused the loss. After the loss I believe he marked the OSDs OUT, which started rebalancing, and as far as I can understand it never finished. High IO is obviously normal while Ceph is rebalancing, and it will slow the cluster down significantly, especially on a small 3-node cluster with 3 replicas.
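As a rough sketch (standard Ceph CLI; the exact output and flags depend on the Ceph release), this is roughly how you would check whether rebalancing is still running and keep it from kicking off again while Node3 is down for repair:

    # Check recovery/backfill progress and overall health
    ceph status
    ceph health detail

    # Watch placement-group state changes in real time
    ceph -w

    # Prevent OSDs from being marked out (and a new rebalance from starting)
    # while the node is down for repair/reinstall
    ceph osd set noout

    # Re-enable normal behaviour once the node is back and healthy
    ceph osd unset noout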
If you have never tried with data center grade SSDs, how can you then assume journals on SSD do not noticeably increase performance?

I personally, after doing a bunch of tests on my test machine, then the test cluster, then verifying on the office production cluster, and finally moving to our storage clusters, have completely moved off SSD-backed journals and run them on their OSDs instead. We now use those SSDs as replicated SSD cache tiers for all pools instead.
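For anyone curious, a minimal sketch of setting up such a replicated SSD cache tier in front of an existing pool looks roughly like this (the pool name "rbd", the cache pool name "rbd-cache", and the CRUSH rule "ssd" are just illustrative; use whatever matches your cluster):

    # Create the cache pool on the SSD-backed CRUSH rule (names are examples)
    ceph osd pool create rbd-cache 128 128 replicated ssd

    # Attach it as a writeback cache tier in front of the backing pool
    ceph osd tier add rbd rbd-cache
    ceph osd tier cache-mode rbd-cache writeback
    ceph osd tier set-overlay rbd rbd-cache

    # Basic hit-set and sizing parameters for the cache tier
    ceph osd pool set rbd-cache hit_set_type bloom
    ceph osd pool set rbd-cache target_max_bytes 100000000000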